[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2017-10-10 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199434#comment-16199434
 ] 

Haibo Chen commented on HADOOP-12436:
-

[~aw] [~mattpaduano] This seems to be an incompatible change, given that 
GlobFilter and RegexFilter are Public Evolving. Hence, I have added an 
incompatible tag. Feel free to remove it if you disagree.

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.2.0, 2.7.1
> Reporter: Matthew Paduano
> Assignee: Matthew Paduano
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch, HADOOP-12436.04.patch, HADOOP-12436.05.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java     100.0  01:18.85
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.
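
The consecutive-wildcard case in the description above can be illustrated with a small sketch. It is not taken from the attached patches, and the class and method names (GlobSketch, toRegex, matches) are hypothetical. It assumes a simplified glob syntax in which only * and ? are wildcards, and shows one possible mitigation: collapsing a run of *'s into a single ".*" before handing the pattern to java.util.regex, so that "file**name" does not become the backtracking-prone "file.*.*name".

```java
import java.util.regex.Pattern;

// Illustrative only: a simplified glob-to-regex translation, not Hadoop's GlobPattern.
final class GlobSketch {
    // Regex metacharacters that this simplified glob treats as plain literals.
    private static final String REGEX_META = "\\.+()[]{}^$|";

    /** Translates a glob into a regex string; only '*' and '?' are wildcards (assumption). */
    static String toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        boolean lastWasStar = false;
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                // A run of '*' matches the same names as a single '*',
                // so emit ".*" only once per run.
                if (!lastWasStar) {
                    sb.append(".*");
                }
                lastWasStar = true;
                continue;
            }
            lastWasStar = false;
            if (c == '?') {
                sb.append('.');
            } else if (REGEX_META.indexOf(c) >= 0) {
                // Escape characters that would otherwise be regex operators.
                sb.append('\\').append(c);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns true if the whole name matches the glob. */
    static boolean matches(String glob, String name) {
        return Pattern.compile(toRegex(glob), Pattern.DOTALL).matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(toRegex("file**name"));
        System.out.println(matches("file**name", "file-some-name"));
    }
}
```

Collapsing runs only addresses adjacent *'s. A pattern with many separated wildcards can still backtrack heavily in a backtracking engine, so this sketch should not be read as a general fix.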



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2016-04-22 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255017#comment-15255017
 ] 

Harsh J commented on HADOOP-12436:
--

This change also subtly fixes the issue described in HADOOP-13051 (a test case 
was added there to guard against regressions).



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969608#comment-14969608
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/569/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969505#comment-14969505
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #1305 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1305/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-project/pom.xml
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969975#comment-14969975
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #2464 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2464/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-project/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969865#comment-14969865
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #527 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/527/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969873#comment-14969873
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2516 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2516/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* LICENSE.txt
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969174#comment-14969174
 ] 

Hadoop QA commented on HADOOP-12436:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 20s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 11s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 17s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 0s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 30s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-22 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12767470/HADOOP-12436.05.patch |
| JIRA Issue | HADOOP-12436 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  xml  findbugs  checkstyle  compile  |
| uname | Linux c46774885f7c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/patchprocess/apache-yetus-28a3a3d/dev-support/personality/hadoop.sh |
| git revision | trunk / 381610d |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  

[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969308#comment-14969308
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-trunk-Commit #8689 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8689/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969379#comment-14969379
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #584 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/584/])
HADOOP-12436. GlobPattern regex library has performance issues with  (aw: rev 
4c0bae240bea9a475e8ee9a0b081bfce6d1cd1e5)
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959151#comment-14959151
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #1273 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1273/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* LICENSE.txt
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/pom.xml


> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.2.0, 2.7.1
>Reporter: Matthew Paduano
>Assignee: Matthew Paduano
> Fix For: 3.0.0
>
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch, HADOOP-12436.04.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java     100.0  01:18.85
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.
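
The quoted description ties the slowdown to runs of consecutive unescaped * characters in the glob. As a minimal sketch of one possible mitigation (not the change made in the HADOOP-12436 patches), the glob could be normalized before it is translated to a regex, so that a run of wildcards becomes a single wildcard. The class and method names below are hypothetical, and the sketch assumes a backslash escapes the following character, as in the example commands above.

```java
// Hypothetical helper, not part of Hadoop: normalizes a glob string before
// it is handed to whatever code converts it to a regular expression.
class StarCollapseSketch {

  /**
   * Collapses each run of unescaped '*' into a single '*'.
   * A backslash is assumed to escape the next character, so "\*" is kept
   * as a literal star and never merged with its neighbours.
   */
  static String collapseStars(String glob) {
    StringBuilder out = new StringBuilder();
    boolean prevWasStar = false;
    for (int i = 0; i < glob.length(); i++) {
      char c = glob.charAt(i);
      if (c == '\\' && i + 1 < glob.length()) {
        // Copy the escape sequence through unchanged.
        out.append(c).append(glob.charAt(++i));
        prevWasStar = false;
      } else if (c == '*') {
        // Emit only the first wildcard of a run.
        if (!prevWasStar) {
          out.append(c);
        }
        prevWasStar = true;
      } else {
        out.append(c);
        prevWasStar = false;
      }
    }
    return out.toString();
  }
}
```

With this sketch, collapseStars("file**name") returns "file*name", while escaped stars such as \*\* are left as they are. Whether this alone avoids the slow path for every pattern is not established here; the description itself notes that not every string of *'s triggers the problem.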





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959087#comment-14959087
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-trunk-Commit #8644 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8644/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* LICENSE.txt




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959111#comment-14959111
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #550 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/550/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959085#comment-14959085
 ] 

Allen Wittenauer commented on HADOOP-12436:
---

So a few things:

a) we are clearly missing test coverage in common, since this issue wasn't 
detected there. Those tests should probably be either moved or at least 
replicated over in common for better, more complete testing.

b) we're hitting a (documented!) incompatibility between 
com.google.re2j.PatternSyntaxException and 
java.util.regex.PatternSyntaxException

c) GlobPattern is Private, Evolving. GlobFilter is Public, Evolving, but it 
converts the PatternSyntaxException to IOException, so even though this is an 
incompatibility, no deprecation should be required.  That said, we should 
definitely scan the source for any other calls into GlobPattern to see if they 
are processing PatternSyntaxException.
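
Point (c) relies on callers seeing an IOException rather than a library-specific exception. A minimal sketch of that wrapping pattern follows; the class and method names are hypothetical, it uses only java.util.regex, and it is not the actual GlobFilter code. It illustrates why a catch clause that names java.util.regex.PatternSyntaxException is the spot that needs attention when the regex library is swapped, since a different library throws its own exception class.

```java
import java.io.IOException;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch, not Hadoop code: hides the regex library's
// exception type behind IOException.
class PatternCompileSketch {

  /**
   * Compiles a regex and reports syntax errors as IOException, so callers
   * never depend on a library-specific exception class.
   */
  static Pattern compileOrIOException(String regex) throws IOException {
    try {
      return Pattern.compile(regex);
    } catch (PatternSyntaxException e) {
      // If the regex library changes, the exception class caught here
      // changes too; callers of this method are unaffected.
      throw new IOException("Illegal pattern: " + regex, e);
    }
  }

  /** Returns true if compiling the regex is rejected with an IOException. */
  static boolean isRejected(String regex) {
    try {
      compileOrIOException(regex);
      return false;
    } catch (IOException e) {
      return true;
    }
  }
}
```

Code that instead catches java.util.regex.PatternSyntaxException directly around a GlobPattern call would stop catching syntax errors once a different library's exception is thrown, which is the kind of call site the scan suggested above would look for.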



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959576#comment-14959576
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #502 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/502/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-project/pom.xml
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959159#comment-14959159
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2486 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2486/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959184#comment-14959184
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #537 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/537/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-project/pom.xml
* LICENSE.txt




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959450#comment-14959450
 ] 

Hudson commented on HADOOP-12436:
-

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2439 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2439/])
Revert "HADOOP-12436. GlobPattern regex library has performance issues (aw: rev 
dc45a7a7c4920a60424d60aca07a72a9eb909fe2)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957143#comment-14957143
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-trunk-Commit #8632 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8632/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957478#comment-14957478
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #529 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/529/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* LICENSE.txt
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957376#comment-14957376
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/541/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957296#comment-14957296
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2477 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2477/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* hadoop-project/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* LICENSE.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt


> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.2.0, 2.7.1
>Reporter: Matthew Paduano
>Assignee: Matthew Paduano
> Fix For: 3.0.0
>
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch, HADOOP-12436.04.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.
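
To make the description above concrete, here is a minimal sketch of how a glob could be turned into a regular expression. It is a simplified stand-in and not Hadoop's actual GlobPattern code: the class name GlobToRegexSketch and the method toRegex are hypothetical, and only the * and ? wildcards are handled (no character classes, braces or escapes). The point is that each unescaped * becomes its own ".*", so "file**name" turns into two adjacent ".*" groups that a backtracking engine such as java.util.regex can split in many different ways.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified illustration; not the Hadoop GlobPattern implementation.
class GlobToRegexSketch {
    // Characters that must be escaped to be treated literally by java.util.regex.
    private static final String REGEX_META = "\\.[]{}()+-^$|";

    // Translate a glob that uses only '*' and '?' wildcards into a regex string.
    static String toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");           // every star becomes its own ".*"
            } else if (c == '?') {
                regex.append('.');            // '?' matches exactly one character
            } else if (REGEX_META.indexOf(c) >= 0) {
                regex.append('\\').append(c); // escape regex metacharacters
            } else {
                regex.append(c);
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("file**name");
        System.out.println(regex);
        System.out.println(Pattern.matches(regex, "file_2015_name"));
    }
}
```

In this sketch, literal characters such as + are escaped, which is why the report distinguishes stars that are "properly escaped as literals" from stars that are treated as wildcards.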



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957512#comment-14957512
 ] 

Hudson commented on HADOOP-12436:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #1265 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1265/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* hadoop-project/pom.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* LICENSE.txt




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957660#comment-14957660
 ] 

Hudson commented on HADOOP-12436:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #2433 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2433/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java
* hadoop-project/pom.xml




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957928#comment-14957928
 ] 

Hudson commented on HADOOP-12436:
-

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #496 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/496/])
HADOOP-12436. GlobPattern regex library has performance issues with (aw: rev 
0d77e85f0aa503fdb826886d867fe61c9e984073)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/AbstractPatternFilter.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GlobPattern.java
* hadoop-project/pom.xml
* LICENSE.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/RegexFilter.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/pom.xml
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGlobPattern.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/filter/GlobFilter.java




[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956164#comment-14956164
 ] 

Hadoop QA commented on HADOOP-12436:


\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  21m 19s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m  3s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  12m 46s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 30s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 28s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   2m  4s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 49s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   8m 46s | Tests passed in hadoop-common. |
| | |  59m 10s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12766441/HADOOP-12436.04.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 40cac59 |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7807/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7807/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7807/console |


This message was automatically generated.

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.2.0, 2.7.1
> Reporter: Matthew Paduano
> Assignee: Matthew Paduano
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch, HADOOP-12436.04.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955873#comment-14955873
 ] 

Allen Wittenauer commented on HADOOP-12436:
---

*sigh*

Need a rebased patch. :(

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.2.0, 2.7.1
> Reporter: Matthew Paduano
> Assignee: Matthew Paduano
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951464#comment-14951464
 ] 

Hadoop QA commented on HADOOP-12436:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m 27s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 17s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m 31s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit |   0m 18s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 10s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 38s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 53s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   6m 36s | Tests failed in hadoop-common. |
| | |  51m 28s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestDNS |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12765930/HADOOP-12436.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / def374e |
| Release Audit | https://builds.apache.org/job/PreCommit-HADOOP-Build/7790/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7790/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7790/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7790/console |


This message was automatically generated.



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951296#comment-14951296
 ] 

Allen Wittenauer commented on HADOOP-12436:
---

Argh. I forgot to mention the 'paperwork' component. The re2j license should be 
added to the LICENSE.txt file at the root of the tree. There are some examples 
there for other, non-Apache projects.


> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.2.0, 2.7.1
> Reporter: Matthew Paduano
> Assignee: Matthew Paduano
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951564#comment-14951564
 ] 

Hadoop QA commented on HADOOP-12436:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 41s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m 24s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit |   0m 20s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m  6s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   6m 54s | Tests passed in hadoop-common. |
| | |  48m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12765930/HADOOP-12436.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / def374e |
| Release Audit | https://builds.apache.org/job/PreCommit-HADOOP-Build/7792/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7792/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7792/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7792/console |


This message was automatically generated.



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951021#comment-14951021
 ] 

Allen Wittenauer commented on HADOOP-12436:
---

bq. +  <version>1.0</version>

The version should be parameterized from hadoop-project/pom.xml rather than 
hard-coded here.

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.2.0, 2.7.1
> Reporter: Matthew Paduano
> Assignee: Matthew Paduano
> Attachments: HADOOP-12436.01.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PID    COMMAND  %CPU   TIME
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-10-05 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943888#comment-14943888
 ] 

Colin Patrick McCabe commented on HADOOP-12436:
---

Thanks for this, [~mattpaduano].

I don't think there are any API issues, since {{GlobPattern}} is annotated 
{{@InterfaceAudience.Private}}.

I do think adding a new dependency could be messy, and we should consider 
shading it, since it seems like a small utility library.



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-09-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935157#comment-14935157
 ] 

Daniel Templeton commented on HADOOP-12436:
---

The tests passed, so that's a good sign.  Have you tried spinning up a cluster 
with the patch and banging on it a bit?



[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-09-23 Thread Matthew Paduano (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905231#comment-14905231
 ] 

Matthew Paduano commented on HADOOP-12436:
--

I propose replacing java.util.regex with com.google.re2j in GlobPattern.

One possible concern: the public interface of GlobPattern permits users to 
obtain a reference to the underlying Pattern object. re2j does not claim to 
be a drop-in replacement, so this might break something somewhere.

Please find the proposed patch file attached.
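
As a rough illustration of why an engine with linear-time matching is attractive here, the sketch below counts the work done by a toy backtracking matcher when a glob with several consecutive stars fails to match. This is an assumption-laden illustration only: the matcher is neither java.util.regex's nor re2j's actual code, and the names BacktrackingCostSketch, matches and countSteps are hypothetical. The step count grows quickly with the number of stars, which mirrors the "time scales with number of *'s" observation in the report.

```java
// Toy backtracking glob matcher used only to count work; hypothetical names throughout.
class BacktrackingCostSketch {
    private static long steps;

    // '*' matches any run of characters (including empty); all other characters are literal.
    private static boolean match(String glob, int gi, String text, int ti) {
        steps++;
        if (gi == glob.length()) {
            return ti == text.length();
        }
        char c = glob.charAt(gi);
        if (c == '*') {
            // Try every possible split point for this star; on failure, back up and retry.
            for (int next = ti; next <= text.length(); next++) {
                if (match(glob, gi + 1, text, next)) {
                    return true;
                }
            }
            return false;
        }
        return ti < text.length()
                && text.charAt(ti) == c
                && match(glob, gi + 1, text, ti + 1);
    }

    static boolean matches(String glob, String text) {
        steps = 0;
        return match(glob, 0, text, 0);
    }

    // Number of recursive calls needed to decide whether glob matches text.
    static long countSteps(String glob, String text) {
        matches(glob, text);
        return steps;
    }

    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "a" + repeat('x', 20);   // never ends in 'b', so every attempt fails
        for (int stars = 1; stars <= 5; stars++) {
            String glob = "a" + repeat('*', stars) + "b";
            System.out.println(stars + " star(s): " + countSteps(glob, text) + " steps");
        }
    }
}
```

re2j advertises matching in time linear in the size of the input, so it avoids this kind of blow-up by construction; the trade-off, as noted above, is that its Pattern class is a different type from java.util.regex.Pattern.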

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.2.0, 2.7.1
>Reporter: Matthew Paduano
>Assignee: Matthew Paduano
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PIDCOMMAND  %CPU   TIME  
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.





[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2015-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905557#comment-14905557
 ] 

Hadoop QA commented on HADOOP-12436:


\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 24s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m  4s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 11s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 36s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 21s | Tests passed in hadoop-common. |
| | |  64m 18s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12761973/HADOOP-12436.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1f707ec |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7695/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7695/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7695/console |


This message was automatically generated.



