[jira] [Comment Edited] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725976#comment-16725976 ]

Jason Lowe edited comment on HADOOP-16007 at 12/20/18 4:06 PM:
---------------------------------------------------------------

This was fixed by HADOOP-15973.

was (Author: jlowe):
This was fixed by HADOOP-15554.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15973:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 2.9.3
                  3.1.2
                  3.0.4
                  3.2.0
                  2.10.0
          Status: Resolved  (was: Patch Available)

Thanks, Eric! I committed this to trunk, branch-3.2, branch-3.2.0, branch-3.1, branch-3.0, branch-2 and branch-2.9.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>             Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2, 2.9.3
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved HADOOP-16007.
---------------------------------
    Resolution: Duplicate

This was fixed by HADOOP-15554.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725938#comment-16725938 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

+1 for the trunk, branch-3, and branch-2 patches. Committing this.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725151#comment-16725151 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for updating the patch! I agree the TestSSLFactory failure is unrelated. I also saw it sporadically fail in HADOOP-15995 with the same error. I filed HADOOP-16016 to track that error.

Speaking of preserving behaviors, I noticed the old include handling behavior was to silently ignore cases where it could not build a stream reader if quiet was true. That's what loadResource does and it used to be leveraged to process include files. This behavior was not preserved in the new version, but I've always been confused about the use-case where we don't want to complain/throw when we can't load something. It seems like that would make things very hard to debug in practice. I wanted to point this out to confirm the omission of quiet mode suppression in the new include handling was intentional and see if anyone can think of a reason why this type of error would want to be suppressed.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
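To make the quiet-mode behavior the comment describes concrete, here is a minimal self-contained sketch of the pattern; the class and method names are hypothetical, not the actual Configuration source:

{code:java}
import java.io.InputStream;

/**
 * Minimal sketch of the quiet-mode pattern described above (hypothetical
 * names, not the actual loadResource code): a resource that cannot be
 * opened is silently skipped when quiet is true, and an error otherwise.
 */
final class QuietLoadSketch {
  static InputStream openResource(String name, boolean quiet) {
    InputStream in = QuietLoadSketch.class.getResourceAsStream(name);
    if (in == null) {
      if (quiet) {
        return null; // quiet mode: silently skip the unloadable resource
      }
      throw new RuntimeException(name + " not found");
    }
    return in;
  }
}
{code}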
[jira] [Commented] (HADOOP-16016) TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
[ https://issues.apache.org/jira/browse/HADOOP-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725148#comment-16725148 ]

Jason Lowe commented on HADOOP-16016:
-------------------------------------

Full stack trace:
{noformat}
[ERROR] testServerWeakCiphers(org.apache.hadoop.security.ssl.TestSSLFactory)  Time elapsed: 0.079 s  <<< FAILURE!
java.lang.AssertionError: Expected to find 'no cipher suites in common' but got unexpected exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
	at sun.security.ssl.Handshaker.activate(Handshaker.java:509)
	at sun.security.ssl.SSLEngineImpl.kickstartHandshake(SSLEngineImpl.java:714)
	at sun.security.ssl.SSLEngineImpl.writeAppRecord(SSLEngineImpl.java:1212)
	at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:1165)
	at javax.net.ssl.SSLEngine.wrap(SSLEngine.java:469)
	at org.apache.hadoop.security.ssl.TestSSLFactory.wrap(TestSSLFactory.java:246)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:220)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:240)
Caused by: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
	at org.apache.hadoop.security.ssl.TestSSLFactory.wrap(TestSSLFactory.java:246)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:220)
{noformat}
Looks like maybe the exception message changed in the Java libraries? The test is looking for "no cipher suites in common" but the exception message is "No appropriate protocol" instead.

> TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-16016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16016
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.3.0
>            Reporter: Jason Lowe
>            Priority: Major
>
> I have seen a couple of precommit builds across JIRAs fail in
> TestSSLFactory#testServerWeakCiphers with the error:
> {noformat}
> [ERROR] TestSSLFactory.testServerWeakCiphers:240 Expected to find 'no
> cipher suites in common' but got unexpected
> exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol
> (protocol is disabled or cipher suites are inappropriate)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
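If the JDK message did change, one possible hardening of the assertion is to accept either message. This is a hypothetical sketch, not the committed fix:

{code:java}
import static org.junit.Assert.assertTrue;

/**
 * Hypothetical hardening of the test assertion (not the committed fix):
 * accept both the old and the new JDK handshake-failure messages when
 * verifying that the weak-cipher handshake was rejected.
 */
public class WeakCipherMessageCheck {
  public static void assertWeakCipherFailure(Exception e) {
    String msg = String.valueOf(e.getMessage());
    assertTrue("Unexpected handshake failure message: " + msg,
        msg.contains("no cipher suites in common")
            || msg.contains("No appropriate protocol"));
  }
}
{code}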
[jira] [Created] (HADOOP-16016) TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
Jason Lowe created HADOOP-16016:
-----------------------------------

             Summary: TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
                 Key: HADOOP-16016
                 URL: https://issues.apache.org/jira/browse/HADOOP-16016
             Project: Hadoop Common
          Issue Type: Bug
          Components: test
    Affects Versions: 3.3.0
            Reporter: Jason Lowe


I have seen a couple of precommit builds across JIRAs fail in TestSSLFactory#testServerWeakCiphers with the error:
{noformat}
[ERROR] TestSSLFactory.testServerWeakCiphers:240 Expected to find 'no cipher suites in common' but got unexpected exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724210#comment-16724210 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for the patch! In addition to the unit test added, I manually verified that this patch fixes HADOOP-16007.

Nit: We don't need to declare MalformedURLException as being thrown by handleInclude or handleStartElement since it's a subclass of IOException which was added to the throws list. And once that's removed we no longer need to import it.

handleInclude doesn't handle the case where getStreamReader returns null. It should probably throw in a similar way to how loadResource does when the reader is null.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
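A minimal sketch of the null handling suggested in the review above; the class and method names are hypothetical, not the actual patch:

{code:java}
import java.io.InputStream;

/**
 * Hypothetical sketch of the suggested null handling (not the actual
 * patch): an include that cannot be opened is treated as an error,
 * mirroring how loadResource reacts to a null reader.
 */
final class IncludeNullCheckSketch {
  static InputStream openInclude(String href) {
    InputStream in = IncludeNullCheckSketch.class.getResourceAsStream(href);
    if (in == null) {
      // Mirror loadResource: fail loudly instead of silently dropping it.
      throw new RuntimeException(href + " not found");
    }
    return in;
  }
}
{code}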
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724070#comment-16724070 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

This is not quite the same issue as discovered during the RC0 voting period, as that's HADOOP-15973. Eric and I have been discussing this quite a bit offline, and he said that rolling back to the commit before HADOOP-15554 did not fix HADOOP-15973, so they are related but slightly different issues. We _think_ there's a way to fix both of them with the same change, and Eric is actively working on that.

I agree that we should hold the RC for these fixes, as not loading the intended config settings properly could lead to very bad behavior depending upon the property which was accidentally, silently dropped after upgrading.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723015#comment-16723015 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

The behavior will only be noticed if an included resource overrides a previously set property from the same resource doing the include. If the include was overriding a value from a previously parsed resource (like core-default.xml) then the problem does not manifest.

The parser directly sets the included properties on the conf as a side-effect of parsing, but the non-included properties are returned as a parse result and those results are iterated to set them. The sideband processing of includes effectively reverses the order in which properties are processed if the xinclude appears after the property setting in the original resource.

Here's the simple code I used to test it:
{code:title=testconf.java}
import org.apache.hadoop.conf.Configuration;

class testconf {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    System.out.println("myconf = " + conf.get("myconf"));
  }
}
{code}
Using this sample code with a core-site.xml and included file set up as described in the JIRA description, the following shows what I get at two adjacent commits on the trunk line:
{noformat}
$ git log -1
commit f51da9c4d1423c2ac92eb4f40e973264e7e968cc
Author: Andrew Wang
Date:   Mon Jul 2 18:31:21 2018 +0200

    HADOOP-15554. Improve JIT performance for Configuration parsing. Contributed by Todd Lipcon.

$ mvn clean && mvn install -Pdist -DskipTests -DskipShade -Dmaven.javadoc.skip -am -pl :hadoop-common
[...]
$ java -cp "hadoop/testconf:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/*:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/lib/*:." testconf
myconf = val1
{noformat}
So the above shows the broken behavior. core-site.xml set myconf to val1 then xincluded another file which set it to val2, yet the property acts as if the xinclude occurred at the top of core-site.xml.

Moving one commit earlier in time shows the expected behavior:
{noformat}
$ git checkout HEAD~1
Previous HEAD position was f51da9c... HADOOP-15554. Improve JIT performance for Configuration parsing. Contributed by Todd Lipcon.
HEAD is now at 5d748bd... HDFS-13702. Remove HTrace hooks from DFSClient to reduce CPU usage. Contributed by Todd Lipcon.
$ mvn clean && mvn install -Pdist -DskipTests -DskipShade -Dmaven.javadoc.skip -am -pl :hadoop-common
[...]
$ java -cp "hadoop/testconf:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/*:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/lib/*:." testconf
myconf = val2
{noformat}

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15554) Improve JIT performance for Configuration parsing
[ https://issues.apache.org/jira/browse/HADOOP-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721850#comment-16721850 ]

Jason Lowe commented on HADOOP-15554:
-------------------------------------

FYI I noticed an ordering issue with parsing includes and tracked it down to this commit as the cause. See HADOOP-16007.

> Improve JIT performance for Configuration parsing
> --------------------------------------------------
>
>                 Key: HADOOP-15554
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15554
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf, performance
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 3.2.0, 3.1.1, 3.0.4
>
>         Attachments: HADOOP-15554.branch-3.0.patch,
> HADOOP-15554.branch-3.0.patch, hadoop-15554.patch, hadoop-15554.patch
>
> In investigating a performance regression for small tasks between Hadoop 2
> and Hadoop 3, we found that the amount of time spent in JIT was significantly
> higher. Using jitwatch we were able to determine that, due to a combination
> of switching from DOM to SAX style parsing and just having more configuration
> key/value pairs, Configuration.loadResource is now getting compiled with the
> C2 compiler and taking quite some time. Breaking that very large function up
> into several smaller ones and eliminating some redundant bits of code
> improves the JIT performance measurably.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721844#comment-16721844 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

I tracked this behavior change down to HADOOP-15554. I believe the problem stems from Parser#handleInclude parsing the included sub-resource directly into the Configuration {{properties}} member which bypasses the ordering of properties returned by the Parser#parse method.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
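A toy model of the reversal described above may help; the names are invented and this is not Hadoop's Parser code, just a demonstration of why side-effect application of includes loses to results applied afterward:

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the ordering bug described above (hypothetical, not the
// Configuration source). Included properties are applied as a side effect
// during parsing, while top-level properties are collected and applied
// afterwards, which reverses their effective order.
public class IncludeOrderDemo {
  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    List<String[]> results = new ArrayList<>();

    // Outer file: myprop=val1 is parsed first and queued for later.
    results.add(new String[] {"myprop", "val1"});
    // The xinclude comes next but is applied to conf immediately (the bug).
    conf.put("myprop", "val2");

    // After parsing, the queued results are applied, clobbering the include.
    for (String[] kv : results) {
      conf.put(kv[0], kv[1]);
    }
    System.out.println(conf.get("myprop")); // prints val1; val2 was expected
  }
}
{code}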
[jira] [Created] (HADOOP-16007) Order of property settings is incorrect when includes are processed
Jason Lowe created HADOOP-16007:
-----------------------------------

             Summary: Order of property settings is incorrect when includes are processed
                 Key: HADOOP-16007
                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 3.1.1, 3.2.0, 3.0.4
            Reporter: Jason Lowe


If a configuration file contains a setting for a property then later includes another file that also sets that property to a different value then the property will be parsed incorrectly. For example, consider the following configuration file:
{noformat}
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>myprop</name>
    <value>val1</value>
  </property>
  <xi:include href="/some/other/file.xml"/>
</configuration>
{noformat}
with the contents of /some/other/file.xml as:
{noformat}
<configuration>
  <property>
    <name>myprop</name>
    <value>val2</value>
  </property>
</configuration>
{noformat}
Parsing this configuration should result in myprop=val2, but it actually results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721781#comment-16721781 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for the patch!

I'm not sure returning the full properties is correct. IIUC that will snapshot not only the properties that were loaded as part of parsing the input stream but also all previously loaded properties from prior resources. That will cause problems if a previously parsed resource is changed and the user tries to refresh configs, as this snapshot will have old property values from that resource that could clobber the new values trying to be refreshed.

As I understand it, we need input streams to cache what was loaded from the input stream, including anything loaded from include directives found in that stream, and nothing else. It appears the bug is in Configuration.Parser#handleInclude since any properties loaded via an include are not returned in the list of properties returned as a result of the parse method. If those were included in {{results}} then I think we'd cache the proper amount which is what was found as a result of a full parse of the input stream.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
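One way to picture the caching rule argued above, as a hedged sketch with invented types (not Configuration internals): the snapshot cached for a stream resource is exactly what parsing that stream produced, top-level and included properties both, and nothing inherited from earlier resources.

{code:java}
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration of the caching rule described above (invented
 * types, not Hadoop code): the cached snapshot for a stream resource holds
 * only what that stream's parse produced, includes and all.
 */
final class StreamSnapshotSketch {
  static List<String[]> snapshotFor(List<String[]> topLevel,
                                    List<String[]> included) {
    List<String[]> snapshot = new ArrayList<>(topLevel);
    snapshot.addAll(included); // includes belong to this stream's snapshot
    return snapshot;           // earlier resources stay out of it
  }
}
{code}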
[jira] [Updated] (HADOOP-15974) Upgrade Curator version to 2.13.0 to fix ZK tests
[ https://issues.apache.org/jira/browse/HADOOP-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15974:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 3.2.1
                  3.1.2
                  3.3.0
                  3.0.4
          Status: Resolved  (was: Patch Available)

Thanks, [~ajisakaa]! I committed this to trunk, branch-3.2, branch-3.1, and branch-3.0.

> Upgrade Curator version to 2.13.0 to fix ZK tests
> --------------------------------------------------
>
>                 Key: HADOOP-15974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15974
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>            Reporter: Jason Lowe
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.0.4, 3.3.0, 3.1.2, 3.2.1
>
>         Attachments: YARN-8937.01.patch
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to
> start and eventually gets killed by the surefire timeout.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15973:
--------------------------------
    Priority: Critical  (was: Major)

Increasing the priority since this can be a very nasty bug in practice. The omission of xincluded properties from an input stream is _silent_ and only occurs _after_ the first parse. It's difficult to debug and can lead to some very bad behavior depending upon the nature of the properties omitted when a Configuration object ends up reparsing its resources.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Priority: Critical
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Moved] (HADOOP-15974) Upgrade Curator version to 2.13.0 to fix ZK tests
[ https://issues.apache.org/jira/browse/HADOOP-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe moved YARN-8937 to HADOOP-15974:
-------------------------------------------
    Affects Version/s:     (was: 3.3.0)
                       3.1.2
                       3.3.0
                       3.0.4
                       3.2.0
     Target Version/s: 3.0.4, 3.3.0, 3.1.2, 3.2.1  (was: 3.0.4, 3.1.2, 3.3.0, 3.2.1)
          Component/s:     (was: test)
                  Key: HADOOP-15974  (was: YARN-8937)
              Project: Hadoop Common  (was: Hadoop YARN)

> Upgrade Curator version to 2.13.0 to fix ZK tests
> --------------------------------------------------
>
>                 Key: HADOOP-15974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15974
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>            Reporter: Jason Lowe
>            Assignee: Akira Ajisaka
>            Priority: Major
>
>         Attachments: YARN-8937.01.patch
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to
> start and eventually gets killed by the surefire timeout.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15816) Upgrade Apache Zookeeper version due to security concerns
[ https://issues.apache.org/jira/browse/HADOOP-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661045#comment-16661045 ]

Jason Lowe commented on HADOOP-15816:
-------------------------------------

This change caused at least one unit test to break, see YARN-8937.

> Upgrade Apache Zookeeper version due to security concerns
> ----------------------------------------------------------
>
>                 Key: HADOOP-15816
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15816
>             Project: Hadoop Common
>          Issue Type: Task
>    Affects Versions: 3.1.1, 3.0.3
>            Reporter: Boris Vulikh
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>
>         Attachments: HADOOP-15816.01.patch
>
> * [CVE-2018-8012|https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2018-8012]
> * [CVE-2017-5637|https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2017-5637]
> We should upgrade the dependency to version 3.4.11 or the latest, if possible.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660834#comment-16660834 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

Wondering if we should just use a LinkedHashSet here. That way we preserve the order things were added to the ACL and dump the string in that same order while preserving the same lookup performance we had before.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
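A minimal sketch of the LinkedHashSet idea floated above, assuming a simplified ACL shape (OrderedAcl is hypothetical, not the AccessControlList source): insertion order is preserved for string output while membership checks stay O(1).

{code:java}
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the AccessControlList source: LinkedHashSet
// keeps insertion order for iteration/toString while contains() remains a
// constant-time hash lookup like HashSet.
public class OrderedAcl {
  private final Set<String> users = new LinkedHashSet<>();

  public void addUser(String user) {
    users.add(user);
  }

  public boolean isUserAllowed(String user) {
    return users.contains(user); // hash lookup, same cost as HashSet
  }

  public Collection<String> getUsers() {
    return Collections.unmodifiableSet(users); // iterates in insertion order
  }
}
{code}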
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660730#comment-16660730 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

bq. this information is supplied trough a configuration

If we want to get pedantic about ordering then shouldn't the order specified in the configuration be the order that is preserved? That order isn't necessarily lexicographical.

If there's a pressing need to order the results of getAclString then fine, we can do that. But that does not require the AccessControlList implementation to preserve that order at all times. As I understand it, isUserAllowed is the critical path on this class and getAclString is relatively rare. That makes me think this patch is optimizing for the rare case at the expense of the common case.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660675#comment-16660675 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

Yes, this should be changed back to HashSet. I'm all for making the unit tests more resilient to different implementations, but I don't see why we would want to choose TreeSet over HashSet here. The API never said anything about ordering, and I don't think we should start ordering it just to make some lazy unit tests happy. Changing the implementation in a way that does not fix a bug in the implementation just adds risk.

I propose we revert this change and put up another patch if there's still things worth changing without the TreeSet modification in place (like maybe the unmodifiable set change).

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659741#comment-16659741 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

bq. I think the right thing would be to fix the unit tests to not rely on the order?

Agree unit tests should not rely on order, but it looks like the fix may be misplaced here. If I understand this change properly, a HashSet was changed to a TreeSet not because it was incorrect from an API semantic point of view but because tests were expecting a certain order. IMHO that's not a good change unless the API docs explicitly said it would preserve order. TreeSet is notoriously problematic from a performance point of view relative to HashSet.

The getUsers method returns a Collection and no order should be implied there. If tests want to simplify their assertions then they can dump the collection into a temporary tree set for comparisons, but we shouldn't force the implementation to pay the performance penalty all the time so unit tests can do easy collection comparisons.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
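The comparison-time ordering suggested above might look like this hedged JUnit helper sketch (names are hypothetical, not the actual test code):

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical test-side pattern (not the actual tests): impose an order at
// comparison time so the production collection can remain an unordered
// HashSet.
public class AclOrderTestSketch {
  static void assertSameUsers(Collection<String> actual, String... expected) {
    assertEquals(new TreeSet<>(Arrays.asList(expected)),
        new TreeSet<>(actual));
  }
}
{code}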
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659085#comment-16659085 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

This broke tests in other projects, see YARN-8928 and MAPREDUCE-7155.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 3.1.2
                  3.0.4
                  2.9.2
                  3.2.0
                  2.10.0
          Status: Resolved  (was: Patch Available)

Thanks for the review, Kihwal! I committed this to trunk, branch-3.2, branch-3.1, branch-3.0, branch-2, and branch-2.9.

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Status: Patch Available  (was: Open)

Attached a patch that removes the JNI setting of the remaining field per Ben's analysis above and cleans up the naming re: objects vs. classes in the JNI function arguments.

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha2, 2.9.0
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
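For illustration of the class-versus-instance distinction the cleanup addresses, a hedged Java-side sketch (NativeInitSketch is hypothetical, not the Hadoop source):

{code:java}
// Hypothetical illustration, not the Hadoop source: a static native method
// hands the JNI side a jclass, so there is no instance whose fields it can
// set; an instance native method hands it a jobject. Writing an instance
// field through the jclass handle is what corrupts memory.
public class NativeInitSketch {
  private int remaining;

  // JNI signature: (JNIEnv*, jclass) -- no instance, so the native side
  // must not try to write instance fields such as "remaining" here.
  private static native void init();

  // JNI signature: (JNIEnv*, jobject) -- instance fields are reachable here.
  private native void initInstance();

  void reset() {
    init();
    remaining = 0; // Java-side reset makes a native field write unnecessary
  }
}
{code}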
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Attachment: HADOOP-15859.001.patch

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Affects Version/s: 2.9.0
                       3.0.0-alpha2
     Target Version/s: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645035#comment-16645035 ]

Jason Lowe commented on HADOOP-15820:
-------------------------------------

Thanks, [~jojochuang]! Sorry for missing that commit.

> ZStandardDecompressor native code sets an integer field as a long
> ------------------------------------------------------------------
>
>                 Key: HADOOP-15820
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15820
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
>         Attachments: HADOOP-15820.001.patch
>
> Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in
> ZStandardDecompressor.c sets the {{remaining}} field as a long when it
> actually is an integer.
> Kudos to Ben Lau from our HBase team for discovering this issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642522#comment-16642522 ]

Jason Lowe commented on HADOOP-15822:
-------------------------------------

Looked into the unit test failures.
* TestNameNodeMetadataConsistency failure is an existing issue tracked by HDFS-11439
* TestBalancer test has been failing in other precommit builds, filed HDFS-13975
* TestStandbyCheckpoints does not look related and does not reproduce locally
* TestHAAppend is an inode create timeout that does not look related and does not reproduce locally
* TestDirectoryScanner is a timeout that does not look related and does not reproduce locally
* TestTimelineReaderWebServicesHBaseStorage has been failing in nightly builds, filed YARN-8856

> zstd compressor can fail with a small output buffer
> ----------------------------------------------------
>
>                 Key: HADOOP-15822
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15822
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>
>         Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine
> with the latest zstd library (1.3.5). Compression can fail to successfully
> finalize the stream when a small output buffer is used resulting in a failed
> to init error, and decompression with a direct buffer can fail with an
> invalid src size error.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642104#comment-16642104 ]

Jason Lowe commented on HADOOP-15822:
-------------------------------------

bq. do you think it's related? Or is it something different, maybe MR-specific?

I do not think it is related. The MapOutput buffer code is miscalculating how much buffer space is remaining before it forces a spill. In this failure case the buffer involved is not dealing with compressed data, so it should not matter what codec is being used. Have you tried reproducing it with lz4 or no codec at all?

I'll dig a bit into the Jenkins test failures to see if they are somehow related.

> zstd compressor can fail with a small output buffer
> ----------------------------------------------------
>
>                 Key: HADOOP-15822
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15822
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>
>         Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine
> with the latest zstd library (1.3.5). Compression can fail to successfully
> finalize the stream when a small output buffer is used resulting in a failed
> to init error, and decompression with a direct buffer can fail with an
> invalid src size error.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640436#comment-16640436 ] Jason Lowe commented on HADOOP-15820: - Thanks for moving this to a Blocker, [~leftnoteasy]. This issue can be particularly nasty since it corrupts the JVM process memory, which can result in a difficult-to-debug JVM crash much later in the process lifetime. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640310#comment-16640310 ] Jason Lowe commented on HADOOP-15822: - Minor fix to move the libzstd addition in the Dockerfile to its proper lexicographical place. > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Attachment: HADOOP-15822.002.patch > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Status: Patch Available (was: Open) The compression flushing failure has to do with how the JNI wrapper code was invoking the zstd library. When using a small output buffer, sometimes flushStream or endStream needs to be called successively to finish flushing everything, but the JNI code would always invoke the compressStream method on a null input buffer before invoking the flush/end call. Older versions of zstd apparently were OK with this, but the new ones are not. This patch skips calling compressStream if there is nothing in the input buffer to compress, so the zstd library will see a contiguous sequence of end stream calls towards the end of compression when using small output buffers. The decompress direct test failure is a bug in the interface between the Java layer and the JNI layer. The function takes a buffer pointer, a buffer length, and a buffer offset as arguments, but the Java layer was using remaining() instead of limit() to send down the size of the buffer. Occasionally during the test remaining() can be smaller than position(), and the zstd library rightfully complains that we are asking it to use a buffer past the end of the reported length. In addition, the test would sometimes fail to flip the output buffer, which would break the test when that occurred. These tests also were not running during precommit because the zstandard libraries were missing from the build environment, so this patch adds the libzstd package to the build environment Dockerfile. > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0, 2.9.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
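To make the remaining()-versus-limit() bug above concrete, here is a minimal, self-contained Java sketch of the direct-buffer values involved. The buffer size and the framing of the native contract are illustrative assumptions, not the actual Hadoop JNI signature:
{code:java}
import java.nio.ByteBuffer;

public class DirectBufferLengthSketch {
  public static void main(String[] args) {
    // Simulate a direct buffer whose position has advanced, as happens
    // mid-stream during decompression.
    ByteBuffer src = ByteBuffer.allocateDirect(4096);
    src.position(100);

    // The native side is handed the buffer's address, an offset, and a
    // reported size. Sending remaining() as the size under-reports the
    // usable region whenever position() > 0; limit() is the stable
    // bound of the buffer contents, per the explanation above.
    int wrongLen = src.remaining(); // 3996 -- shrinks as position grows
    int rightLen = src.limit();     // 4096 -- the actual content bound
    System.out.println("remaining=" + wrongLen + ", limit=" + rightLen);
  }
}
{code}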
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Attachment: HADOOP-15822.001.patch > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640275#comment-16640275 ] Jason Lowe commented on HADOOP-15822: - Sample test failures: {noformat} [INFO] Running org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor [ERROR] Tests run: 19, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.758 s <<< FAILURE! - in org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor [ERROR] testCompressingWithOneByteOutputBuffer(org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor) Time elapsed: 0.108 s <<< ERROR! java.lang.InternalError: Context should be init first at org.apache.hadoop.io.compress.zstd.ZStandardCompressor.deflateBytesDirect(Native Method) at org.apache.hadoop.io.compress.zstd.ZStandardCompressor.compress(ZStandardCompressor.java:216) at org.apache.hadoop.io.compress.CompressorStream.compress(CompressorStream.java:81) at org.apache.hadoop.io.compress.CompressorStream.finish(CompressorStream.java:92) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.testCompressingWithOneByteOutputBuffer(TestZStandardCompressorDecompressor.java:300) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) [ERROR] testZStandardDirectCompressDecompress(org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor) Time elapsed: 0.014 s <<< ERROR! 
java.lang.InternalError: Src size is incorrect at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native Method) at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateDirect(ZStandardDecompressor.java:264) at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor$ZStandardDirectDecompressor.decompress(ZStandardDecompressor.java:307) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.compressDecompressLoop(TestZStandardCompressorDecompressor.java:416) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.testZStandardDirectCompressDecompress(TestZStandardCompressorDecompressor.java:385) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at
[jira] [Created] (HADOOP-15822) zstd compressor can fail with a small output buffer
Jason Lowe created HADOOP-15822: --- Summary: zstd compressor can fail with a small output buffer Key: HADOOP-15822 URL: https://issues.apache.org/jira/browse/HADOOP-15822 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.9.0 Reporter: Jason Lowe Assignee: Jason Lowe TestZStandardCompressorDecompressor fails a couple of tests on my machine with the latest zstd library (1.3.5). Compression can fail to successfully finalize the stream when a small output buffer is used resulting in a failed to init error, and decompression with a direct buffer can fail with an invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.2 3.0.4 2.9.2 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks again to [~benlau] for identifying the issue and to [~kihwal] and [~Jim_Brennan] for reviews! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Description: Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. Kudos to Ben Lau from our HBase team for discovering this issue. was:Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. Thanks for the reviews, [~kihwal] and [~Jim_Brennan]! Committing this. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Status: Patch Available (was: Open) Attaching a patch that changes the setting from a long field to an int field. Oddly this was done correctly in the Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_inflateBytesDirect function but was wrong in the init function. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0-alpha2, 2.9.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Attachment: HADOOP-15820.001.patch > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
Jason Lowe created HADOOP-15820: --- Summary: ZStandardDecompressor native code sets an integer field as a long Key: HADOOP-15820 URL: https://issues.apache.org/jira/browse/HADOOP-15820 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0-alpha2, 2.9.0 Reporter: Jason Lowe Assignee: Jason Lowe Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15755: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.2 3.0.4 2.9.2 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks [~dineshchitlangia] and [~ljain]! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15755.001.patch, HADOOP-15755.002.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619703#comment-16619703 ] Jason Lowe commented on HADOOP-15755: - Thanks for updating the patch! +1 lgtm. Committing this. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Dinesh Chitlangia >Priority: Major > Attachments: HADOOP-15755.001.patch, HADOOP-15755.002.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614840#comment-16614840 ] Jason Lowe commented on HADOOP-15755: - Thanks for the report and patch! Fix looks fine. It could use Collections.emptyList but that's not a must-fix. Would you mind adding a unit test? It's trivial in this case since the test just needs to invoke the method with a null args parameter. That way if someone later refactors the method a test will verify this doesn't regress. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HADOOP-15755.001.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
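For illustration, a minimal sketch of the null-safe shape being suggested above; the helper name is hypothetical, since the real fix lives inside createStartupShutdownMessage:
{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NullSafeArgsSketch {
  // Hypothetical helper: Arrays.asList(null) throws NullPointerException,
  // so fall back to an immutable empty list when no args were passed.
  static List<String> toList(String[] args) {
    return args == null ? Collections.<String>emptyList() : Arrays.asList(args);
  }

  public static void main(String[] unused) {
    System.out.println(toList(null));                     // []
    System.out.println(toList(new String[] {"-format"})); // [-format]
  }
}
{code}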
[jira] [Commented] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609359#comment-16609359 ] Jason Lowe commented on HADOOP-15738: - I added you as a contributor to the HADOOP and MAPREDUCE projects. > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved HADOOP-15738. - Resolution: Duplicate > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609314#comment-16609314 ] Jason Lowe commented on HADOOP-15738: - A "Fixed" resolution should only be used for cases where a patch has been committed. Marking this as a duplicate of MAPREDUCE-7137. Note that in the future JIRAs can be moved between projects rather than refiling them from scratch. See the "Move" action under the "More" tab at the top of the JIRA. > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Reopened] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened HADOOP-15738: - > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15722) regression: Hadoop 2.7.7 release breaks spark submit
[ https://issues.apache.org/jira/browse/HADOOP-15722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607592#comment-16607592 ] Jason Lowe commented on HADOOP-15722: - I haven't been able to reproduce the issue yet, but looking closer at the logs I think it's related to variable expansion. Another aspect of restricted parsing is that restricted resources are unable to access system properties or environment variables from the config, since those could potentially contain secrets. Looking at the following log snippets for the good and bad runs, the user.name system property is not getting expanded in the bad run because the conf resource is untrusted: Log excerpt from the session with hadoop 2.7.3: {noformat} 18/09/06 08:12:04 INFO SessionState: Created HDFS directory: /tmp/hive-admin/user_b/799640f8-3d34-4cb7-90fe-5368c22881d5 {noformat} Log excerpt from the session with hadoop 2.7.7: {noformat} 18/09/06 07:23:09 INFO SessionState: Created HDFS directory: /tmp/hive-${user.name}/user_b {noformat} [~yumwang] would you mind running with the following patch to Hadoop 2.7.7's Configuration to see if this fixes the issue or at least gets significantly farther? That would help validate my theory as to what's going on here. The patch keeps XML directives restricted for untrusted sources but re-enables system property access. {noformat} diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java index 5ce3e65..4df8491 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java @@ -905,7 +905,7 @@ public synchronized void reloadConfiguration() { private synchronized void addResourceObject(Resource resource) { resources.add(resource); // add to resources -restrictSystemProps |= resource.isParserRestricted(); +restrictSystemProps = false; reloadConfiguration(); } {noformat} If it is indeed the issue, then we may need to reconsider the restriction on system properties. Choices include: - Removing the property expansion restriction completely so all system and env properties are available, and it would be up to admins to sanitize these when starting proxy servers - Allowing system properties but restricting environment variables, if we feel env variables are more common for passing secrets - Using a whitelist for system properties > regression: Hadoop 2.7.7 release breaks spark submit > > > Key: HADOOP-15722 > URL: https://issues.apache.org/jira/browse/HADOOP-15722 > Project: Hadoop Common > Issue Type: Bug > Components: build, conf, security >Affects Versions: 2.7.7 >Reporter: Steve Loughran >Priority: Major > > SPARK-25330 highlights that upgrading spark to hadoop 2.7.7 is causing a > regression in client setup, with things only working when > {{Configuration.getRestrictParserDefault(Object resource)}} = false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
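As a quick illustration of the expansion behavior in question, a sketch of how Configuration normally substitutes system properties when a value is read back; the property name is made up for the example:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ExpansionSketch {
  public static void main(String[] args) {
    // No default resources; set a value containing a variable reference.
    Configuration conf = new Configuration(false);
    conf.set("example.scratch.dir", "/tmp/hive-${user.name}");

    // With unrestricted parsing, get() expands ${user.name} from the JVM
    // system properties (e.g. /tmp/hive-admin). The bad run above shows
    // what it looks like when expansion is suppressed for an untrusted
    // resource: the literal ${user.name} survives into the path.
    System.out.println(conf.get("example.scratch.dir"));
  }
}
{code}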
[jira] [Commented] (HADOOP-15722) regression: Hadoop 2.7.7 release breaks spark submit
[ https://issues.apache.org/jira/browse/HADOOP-15722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607336#comment-16607336 ] Jason Lowe commented on HADOOP-15722: - The getRestrictParserDefault method was added to address CVE-2017-15713 and shipped as part of 2.7.5. The idea behind the fix is to restrict the parsing of XML entities from a configuration Resource when that Resource may not be trusted. Resources that are parsed from the classpath are trusted, but resources that come from file streams as a proxy user are not. When parsing configs outside of the classpath as a proxy user, the contents are likely coming from conf data provided by a cluster user, and we would need to restrict certain XML entities in those cases. Failing to do so could expose the contents of local files on the server to the cluster user, which is the crux of the CVE. I'll try to work through the repro steps listed in SPARK-25330 to see if I can reproduce the issue locally. If successful, it should be relatively straightforward to see where the suspect conf is coming from and why it breaks when parsing of that conf is restricted. Note that restricted parsing doesn't mean the contents are not parsed at all, but rather that the parser won't honor certain requested directives in the XML stream. > regression: Hadoop 2.7.7 release breaks spark submit > > > Key: HADOOP-15722 > URL: https://issues.apache.org/jira/browse/HADOOP-15722 > Project: Hadoop Common > Issue Type: Bug > Components: build, conf, security >Affects Versions: 2.7.7 >Reporter: Steve Loughran >Priority: Major > > SPARK-25330 highlights that upgrading spark to hadoop 2.7.7 is causing a > regression in client setup, with things only working when > {{Configuration.getRestrictParserDefault(Object resource)}} = false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15614) TestGroupsCaching.testExceptionOnBackgroundRefreshHandled reliably fails
[ https://issues.apache.org/jira/browse/HADOOP-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546917#comment-16546917 ] Jason Lowe commented on HADOOP-15614: - It fails reliably when run in isolation on this line: {noformat} assertEquals(startingRequestCount, FakeGroupMapping.getRequestCount()); {noformat} but it also sporadically fails on this last code line below when run with the other tests: {noformat} // Now sleep for a short time and re-check the request count. It should have // increased, but the exception means the cache will not have updated Thread.sleep(50); FakeGroupMapping.setThrowException(false); assertEquals(startingRequestCount + 1, FakeGroupMapping.getRequestCount()); assertEquals(groups.getGroups("me").size(), 2); {noformat} The 50msec sleep screams racy test to me. > TestGroupsCaching.testExceptionOnBackgroundRefreshHandled reliably fails > > > Key: HADOOP-15614 > URL: https://issues.apache.org/jira/browse/HADOOP-15614 > Project: Hadoop Common > Issue Type: Bug >Reporter: Kihwal Lee >Priority: Major > > When {{testExceptionOnBackgroundRefreshHandled}} is run individually, it > reliably fails. It seems like a fundamental bug in the test or groups caching. > A similar issue was dealt with in HADOOP-13375. [~cheersyang], do you have > any insight into this? > This test case was added in HADOOP-13263. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
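If the race theory holds, one conventional deflaking approach is to poll for the expected state rather than sleeping a fixed 50 msec. A self-contained sketch assuming the Supplier-based GenericTestUtils.waitFor overload; the counter here stands in for FakeGroupMapping's request count, and the timeouts are illustrative:
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.test.GenericTestUtils;

public class PollInsteadOfSleepSketch {
  public static void main(String[] args) throws Exception {
    final AtomicInteger requestCount = new AtomicInteger();

    // Background refresh that finishes at some unpredictable time.
    new Thread(() -> {
      try { Thread.sleep(200); } catch (InterruptedException ignored) { }
      requestCount.incrementAndGet();
    }).start();

    // Poll for the condition instead of hoping a fixed sleep was enough.
    GenericTestUtils.waitFor(() -> requestCount.get() == 1,
        10 /* check interval msec */, 5000 /* timeout msec */);
    System.out.println("request count = " + requestCount.get());
  }
}
{code}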
[jira] [Commented] (HADOOP-15528) Deprecate ContainerLaunch#link by using FileUtil#SymLink
[ https://issues.apache.org/jira/browse/HADOOP-15528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537672#comment-16537672 ] Jason Lowe commented on HADOOP-15528: - It may be useful to have a config to disable this, given that it could cause difficulties, but it'd also be nice if we could avoid users shooting themselves in the foot with this config. If we know using this config with some container executors makes no sense, then it'd be nice to either fail fast on NM startup, warn it's being ignored, or otherwise do something smarter than just failing every container execution in a difficult-to-debug manner. > Deprecate ContainerLaunch#link by using FileUtil#SymLink > > > Key: HADOOP-15528 > URL: https://issues.apache.org/jira/browse/HADOOP-15528 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: HADOOP-15528-HADOOP-15461.v1.patch, > HADOOP-15528-HADOOP-15461.v2.patch, HADOOP-15528-HADOOP-15461.v3.patch > > > {{ContainerLaunch}} currently uses its own utility to create links (including > winutils). > This should be deprecated and rely on {{FileUtil#SymLink}} which is already > multi-platform and pure Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15121) Encounter NullPointerException when using DecayRpcScheduler
[ https://issues.apache.org/jira/browse/HADOOP-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15121: Fix Version/s: 3.1.0 3.0.1 2.8.5 2.9.2 2.10.0 Thanks, [~Tao Jie]! I recently ran across this in 2.8 and saw it still wasn't fixed in branch-2.8, so I committed this to branch-2, branch-2.9, and branch-2.8. I also saw the Fix Version field in the JIRA wasn't set when this was committed, so I updated that as well to reflect the 3.x versions. > Encounter NullPointerException when using DecayRpcScheduler > --- > > Key: HADOOP-15121 > URL: https://issues.apache.org/jira/browse/HADOOP-15121 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.2 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Fix For: 3.1.0, 2.10.0, 3.0.1, 2.9.2, 2.8.5 > > Attachments: HADOOP-15121.001.patch, HADOOP-15121.002.patch, > HADOOP-15121.003.patch, HADOOP-15121.004.patch, HADOOP-15121.005.patch, > HADOOP-15121.006.patch, HADOOP-15121.007.patch, HADOOP-15121.008.patch > > > I set ipc.8020.scheduler.impl to org.apache.hadoop.ipc.DecayRpcScheduler, but > got excetion in namenode: > {code} > 2017-12-15 15:26:34,662 ERROR impl.MetricsSourceAdapter > (MetricsSourceAdapter.java:getMetrics(202)) - Error getting metrics from > source DecayRpcSchedulerMetrics2.ipc.8020 > java.lang.NullPointerException > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getMetrics(DecayRpcScheduler.java:781) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:199) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:182) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:155) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) > at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:66) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:100) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:233) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.registerMetrics2Source(DecayRpcScheduler.java:709) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.(DecayRpcScheduler.java:685) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getInstance(DecayRpcScheduler.java:693) > at > org.apache.hadoop.ipc.DecayRpcScheduler.(DecayRpcScheduler.java:236) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.CallQueueManager.createScheduler(CallQueueManager.java:102) > at > org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:76) > at org.apache.hadoop.ipc.Server.(Server.java:2612) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958) > at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:374) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:349) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:415) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:755) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:697) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:905) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:884) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1610) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1678) > {code} > It seems that
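The stack trace above shows MetricsProxy registering itself with the metrics system from inside its own constructor, so getMetrics() can run before the scheduler being proxied has been hooked up. A self-contained sketch of that shape; the class layout is an assumption for illustration, not the actual Hadoop code:
{code:java}
import java.lang.ref.WeakReference;

public class ProxyInitOrderSketch {
  interface Source { String getMetrics(); }

  static class MetricsProxy implements Source {
    private WeakReference<Source> delegate = new WeakReference<Source>(null);

    MetricsProxy() {
      // Registration from the constructor can trigger a metrics fetch
      // before setDelegate() has ever been called.
      System.out.println("register-time fetch: " + getMetrics());
    }

    void setDelegate(Source s) { delegate = new WeakReference<Source>(s); }

    @Override
    public String getMetrics() {
      Source s = delegate.get();
      // Without this null check, dereferencing s here is the NPE in the
      // stack trace; with it, early fetches degrade gracefully.
      return s == null ? "(no scheduler yet)" : s.getMetrics();
    }
  }

  public static void main(String[] args) {
    MetricsProxy proxy = new MetricsProxy();
    proxy.setDelegate(() -> "scheduler metrics");
    System.out.println("post-init fetch: " + proxy.getMetrics());
  }
}
{code}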
[jira] [Commented] (HADOOP-15528) Deprecate ContainerLaunch#link by using FileUtil#SymLink
[ https://issues.apache.org/jira/browse/HADOOP-15528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535409#comment-16535409 ] Jason Lowe commented on HADOOP-15528: - Sorry for the delay in replying, as I recently got back from an extended vacation and am catching up on things. bq. However, the new behavior is the symlink operation is executed by NM itself, which is executed as a child process under NM itself, it shares the same execution environment as NM. This cannot work in a secure environment. Well at least the one we have today on Linux with the native container executor. In that secure environment the container is running as the user and therefore has access to things that the NM user does not. The container working directory is one of those things. Normally the NM user has no need or reason to be able to see the contents of the container working directory nor be able to modify it. > Deprecate ContainerLaunch#link by using FileUtil#SymLink > > > Key: HADOOP-15528 > URL: https://issues.apache.org/jira/browse/HADOOP-15528 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: HADOOP-15528-HADOOP-15461.v1.patch, > HADOOP-15528-HADOOP-15461.v2.patch, HADOOP-15528-HADOOP-15461.v3.patch > > > {{ContainerLaunch}} currently uses its own utility to create links (including > winutils). > This should be deprecated and rely on {{FileUtil#SymLink}} which is already > multi-platform and pure Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15571) Multiple FileContexts created with the same configuration object should be allowed to have different umask
[ https://issues.apache.org/jira/browse/HADOOP-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535342#comment-16535342 ] Jason Lowe commented on HADOOP-15571: - branch-3 should not exist; branch-3.0 is what you want. Please cherry-pick this to branch-3.0 and update the fix versions to include 3.0.4. > Multiple FileContexts created with the same configuration object should be > allowed to have different umask > -- > > Key: HADOOP-15571 > URL: https://issues.apache.org/jira/browse/HADOOP-15571 > Project: Hadoop Common > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: HADOOP-15571.1.txt, HADOOP-15571.txt > > > Ran into a super hard-to-debug issue due to this. [Edit: Turns out the same > issue as YARN-5749 that [~Tao Yang] ran into] > h4. Issue > Configuration conf = new Configuration(); > fc1 = FileContext.getFileContext(uri1, conf); > fc2 = FileContext.getFileContext(uri2, conf); > fc.setUMask(umask_for_fc1); // Screws up umask for fc2 also! > This was not the case before HADOOP-13440. > h4. Symptoms: > h5. Scenario I ran into > When trying to localize a HDFS directory (hdfs:///my/dir/1.txt), NodeManager > tries to replicate the directory structure on the local file-system > ($yarn-local-dirs/filecache/my/dir/1.txt). > Now depending on whether NM has ever done a log-aggregation (completely > unrelated code that sets umask to be 137 for its own files on HDFS), the > directories /my and /my/dir on local-fs may have different permissions. In > the specific case where NM did log-aggregation, /my/dir was created with 137 > umask and so localization of 1.txt completely failed due to absent directory > executable permissions! > h5. Previous scenarios: > We ran into this before in test-cases and instead of fixing the root-cause, > we just fixed the test-cases: YARN-5679 / YARN-5749 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
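For reference, a short self-contained sketch of the failure mode described in the report. The URIs are illustrative, and the final comment describes the pre-fix behavior, where setUMask wrote through to the shared Configuration:
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.permission.FsPermission;

public class SharedUmaskSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileContext fc1 = FileContext.getFileContext(URI.create("file:///"), conf);
    FileContext fc2 = FileContext.getFileContext(URI.create("file:///"), conf);

    fc1.setUMask(new FsPermission((short) 0137));
    // Before the fix this printed 0137: fc1's setUMask stored the umask
    // in the shared Configuration object, so fc2 silently picked it up.
    System.out.println("fc2 umask = " + fc2.getUMask());
  }
}
{code}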
[jira] [Updated] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation
[ https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15450: Target Version/s: 2.8.4, 3.1.1, 2.9.2, 3.0.3 Fix Version/s: (was: 2.8.4) > Avoid fsync storm triggered by DiskChecker and handle disk full situation > - > > Key: HADOOP-15450 > URL: https://issues.apache.org/jira/browse/HADOOP-15450 > Project: Hadoop Common > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Arpit Agarwal >Priority: Blocker > > Fix disk checker issues reported by [~kihwal] in HADOOP-13738: > # When space is low, the os returns ENOSPC. Instead simply stop writing, the > drive is marked bad and replication happens. This make cluster-wide space > problem worse. If the number of "failed" drives exceeds the DFIP limit, the > datanode shuts down. > # There are non-hdfs users of DiskChecker, who use it proactively, not just > on failures. This was fine before, but now it incurs heavy I/O due to > introduction of fsync() in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450026#comment-16450026 ] Jason Lowe commented on HADOOP-15385: - The failures are unrelated and pass for me locally on branch-2 with or without the patch applied. The ASF warnings are for hs_err pid files, so apparently the JVM was crashing on the system at least twice during the tests. That could easily explain why the test was failing with timeouts and other errors if the launched distcp didn't complete properly. I'm not sure that's so much a problem with the test as it is with the setup of the Jenkins host. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. 
> {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15398) StagingTestBase uses methods not available in Mockito 1.8.5
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449975#comment-16449975 ] Jason Lowe commented on HADOOP-15398: - I discovered HADOOP-12427 which is essentially proposing the same thing. It looks like there's a chance this update could break some existing tests, so we really should verify no test starts breaking as a result of this change. [~arshadmohammad] can you look into this? I'll try to find some time to run tests on my end as well. > StagingTestBase uses methods not available in Mockito 1.8.5 > --- > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: HADOOP-15398.001.patch > > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
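For reference, the 1.8.5-compatible replacement for {{getArgumentAt(int, Class)}} is an explicit cast over {{InvocationOnMock#getArguments()}}, which exists in both Mockito versions. A minimal sketch, not the actual StagingTestBase code; the String argument type is illustrative:
{code:java}
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;

// Illustrative Answer showing the Mockito 1.8.5-compatible idiom.
public class FirstArgAnswer implements Answer<String> {
  @Override
  public String answer(InvocationOnMock invocation) {
    // Mockito 2.x:   invocation.getArgumentAt(0, String.class)
    // Mockito 1.8.5: index into getArguments() and cast explicitly.
    String arg = (String) invocation.getArguments()[0];
    return arg;
  }
}
{code}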
[jira] [Updated] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
[ https://issues.apache.org/jira/browse/HADOOP-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15406: Assignee: Jason Lowe Target Version/s: 3.2.0, 3.1.1, 3.0.3 Status: Patch Available (was: Open) > hadoop-nfs dependencies for mockito and junit are not test scope > > > Key: HADOOP-15406 > URL: https://issues.apache.org/jira/browse/HADOOP-15406 > Project: Hadoop Common > Issue Type: Bug > Components: nfs >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15406.001.patch > > > hadoop-nfs asks for mockito-all and junit for its unit tests but it does not > mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
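For context, restricting a dependency to test compilation and execution is a one-line scope change in the module pom. A sketch of what the fix presumably looks like; version elements are omitted on the assumption they are managed by the parent pom:
{code:xml}
<!-- hadoop-nfs/pom.xml (sketch): limit test-only dependencies to test scope -->
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
  <scope>test</scope>
</dependency>
{code}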
[jira] [Updated] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
[ https://issues.apache.org/jira/browse/HADOOP-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15406: Attachment: HADOOP-15406.001.patch > hadoop-nfs dependencies for mockito and junit are not test scope > > > Key: HADOOP-15406 > URL: https://issues.apache.org/jira/browse/HADOOP-15406 > Project: Hadoop Common > Issue Type: Bug > Components: nfs >Reporter: Jason Lowe >Priority: Major > Attachments: HADOOP-15406.001.patch > > > hadoop-nfs asks for mockito-all and junit for its unit tests but it does not > mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12427) [JDK8] Upgrade Mockito version to 1.10.19
[ https://issues.apache.org/jira/browse/HADOOP-12427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448931#comment-16448931 ] Jason Lowe commented on HADOOP-12427: - Ran across this as part of analyzing HADOOP-15398. Was there anything that kept this from going in? HADOOP-15398 proposes the same fix for its transient compile issue -- upgrading from 1.8.5 to 1.10.19. > [JDK8] Upgrade Mockito version to 1.10.19 > - > > Key: HADOOP-12427 > URL: https://issues.apache.org/jira/browse/HADOOP-12427 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: HADOOP-12427.v0.patch > > > The current version is 1.8.5 - inserted in 2011. > JDK 8 has been supported since 1.10.0. > https://github.com/mockito/mockito/blob/master/doc/release-notes/official.md > "Compatible with JDK8 with exception of defender methods, JDK8 support will > improve in 2.0" > http://mockito.org/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15398) StagingTestBase uses methods not available in Mockito 1.8.5
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15398: --- Assignee: Mohammad Arshad Summary: StagingTestBase uses methods not available in Mockito 1.8.5 (was: Compilation error in trunk in hadoop-aws ) Thanks for the patch! +1 lgtm. I'll commit this tomorrow if there are no objections. > StagingTestBase uses methods not available in Mockito 1.8.5 > --- > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: HADOOP-15398.001.patch > > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
Jason Lowe created HADOOP-15406: --- Summary: hadoop-nfs dependencies for mockito and junit are not test scope Key: HADOOP-15406 URL: https://issues.apache.org/jira/browse/HADOOP-15406 Project: Hadoop Common Issue Type: Bug Components: nfs Reporter: Jason Lowe hadoop-nfs asks for mockito-all and junit for its unit tests but it does not mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15403) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/HADOOP-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448602#comment-16448602 ] Jason Lowe commented on HADOOP-15403: - bq. would a change in config be ok? A change in the default value for a config is arguably the same thing as a code change that changes the default behavior from the perspective of a user. To be clear I'm not saying we can't ever change the default behavior, but we need to be careful about the ramifications. If we do, it needs to be marked as an incompatible change and have a corresponding release note that clearly explains the potential for silent data loss relative to the old behavior and what users can do to restore the old behavior. Given the behavior for non-recursive has been this way for quite a long time, either users aren't running into this very often or they've set the value to recursive. That leads me to suggest adding the ability to ignore directories but _not_ making it the default. Then we don't have a backward incompatibility, and the Hive case you're trying to support can still work once the config is updated (or Hive can run the job with that setting automatically if it makes sense for that use case). > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: HADOOP-15403 > URL: https://issues.apache.org/jira/browse/HADOOP-15403 > Project: Hadoop Common > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
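To make the opt-in suggestion concrete, here is a sketch of a check the listing code could apply; the config key name and helper are hypothetical, and the default preserves today's failing behavior:
{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical helper: decide whether a FileStatus belongs in the input list.
// The config key is made up for illustration; false keeps the old behavior.
public class InputFilterSketch {
  static boolean keepAsInput(JobConf job, FileStatus stat, boolean recursive) {
    if (!stat.isDirectory()) {
      return true;   // plain files are always input candidates
    }
    if (recursive) {
      return false;  // directories go to dirsNeedingRecursiveCalls instead
    }
    // Opt-in: silently skip sub-directories rather than failing in getSplits().
    return !job.getBoolean(
        "mapreduce.input.fileinputformat.input.dir.nonrecursive.ignore.subdirs",
        false);
  }
}
{code}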
[jira] [Commented] (HADOOP-15403) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/HADOOP-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448264#comment-16448264 ] Jason Lowe commented on HADOOP-15403: - Does this have backward compatibility ramifications? The default for mapreduce.input.fileinputformat.input.dir.recursive is false, so unless users changed it the jobs are failing today if the input contains directories. If we change the behavior to ignore directories, that could lead to silent data loss if the job tried to consume an input location that now suddenly contains some directories. In short: is it OK to assume the users will be aware of and agree with the new behavior? Is there any way for users to revert to the old behavior if they do not want any inputs to be silently ignored? > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: HADOOP-15403 > URL: https://issues.apache.org/jira/browse/HADOOP-15403 > Project: Hadoop Common > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15385: Status: Patch Available (was: Open) Here's a patch that uses a unique path for each test suite that won't collide with the hadoop.tmp.dir set during the unit test runs. This is something we would need to do for parallel unit tests anyway. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
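For context, the usual shape of such a fix is to derive each suite's directory from the test class name instead of sharing target/tmp. A minimal sketch, not the attached patch; test.build.data is the conventional Hadoop test-data property:
{code:java}
import java.io.File;

// Sketch: per-suite test root that cannot collide with hadoop.tmp.dir,
// which the unit-test runs point at target/tmp.
public class TestDirSketch {
  static File testRootFor(Class<?> suite) {
    File root = new File(
        System.getProperty("test.build.data", "target/test/data"),
        suite.getSimpleName());
    root.mkdirs();   // best effort; tests fail later if creation failed
    return root;
  }
}
{code}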
[jira] [Updated] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15385: Attachment: HADOOP-15385-branch-2.001.patch > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15385: --- Assignee: Jason Lowe Affects Version/s: (was: 2.8.3) 2.8.2 Target Version/s: 2.10.0, 2.9.1, 2.8.4 (was: 2.9.1, 2.8.4) Thanks for the analysis, [~tasanuma0829]! I took a quick look at this, and the failures are an unfortunate collision between the test directory and the staging directory. Both of these end up using 'target/tmp' as a directory, so the staging directory used during the launch of the distcp job ends up showing up in the test directory and confuses the test when it checks how many files are in its test directory. I can put up a patch shortly. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. 
> {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Moved] (HADOOP-15398) Compilation error in trunk in hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe moved HDFS-13472 to HADOOP-15398: Target Version/s: 3.2.0, 3.1.1 (was: 3.2.0) Key: HADOOP-15398 (was: HDFS-13472) Project: Hadoop Common (was: Hadoop HDFS) > Compilation error in trunk in hadoop-aws > - > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution
[ https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15357: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.3 2.9.2 3.1.1 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks to [~Jim_Brennan] for the contribution and to [~lmccay] for additional review! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > Configuration.getPropsWithPrefix no longer does variable substitution > - > > Key: HADOOP-15357 > URL: https://issues.apache.org/jira/browse/HADOOP-15357 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3 > > Attachments: HADOOP-15357.001.patch, HADOOP-15357.002.patch, > HADOOP-15357.003.patch > > > Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the > Configuration.get() method to get the value of the variables. After > [HADOOP-13556], it now uses props.getProperty(). > The difference is that Configuration.get() does deprecation handling and more > importantly variable substitution on the value. So if a property has a > variable specified with ${variable_name}, it will no longer be expanded when > retrieved via getPropsWithPrefix(). > Was this change in behavior intentional? I am using this function in the fix > for [MAPREDUCE-7069], but we do want variable expansion to happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
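For reference, the behavioral difference comes down to which accessor collects the values. A sketch of the substituting variant, not the committed patch:
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Sketch: gather prefixed properties via Configuration.get(), which applies
// deprecation handling and ${var} substitution; Properties.getProperty()
// returns the raw, unexpanded value.
public class PrefixPropsSketch {
  static Map<String, String> propsWithPrefix(Configuration conf, String prefix) {
    Map<String, String> result = new HashMap<>();
    for (Map.Entry<String, String> entry : conf) { // Configuration is Iterable
      String name = entry.getKey();
      if (name.startsWith(prefix)) {
        // conf.get(name) expands ${variable}; entry.getValue() would not.
        result.put(name.substring(prefix.length()), conf.get(name));
      }
    }
    return result;
  }
}
{code}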
[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution
[ https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15357: Affects Version/s: 2.9.0 3.0.0 Target Version/s: 3.2.0, 3.1.1, 2.9.2, 3.0.3 +1 the latest patch looks good to me as well. I'll commit this later today if there are no objections. > Configuration.getPropsWithPrefix no longer does variable substitution > - > > Key: HADOOP-15357 > URL: https://issues.apache.org/jira/browse/HADOOP-15357 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: HADOOP-15357.001.patch, HADOOP-15357.002.patch, > HADOOP-15357.003.patch > > > Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the > Configuration.get() method to get the value of the variables. After > [HADOOP-13556], it now uses props.getProperty(). > The difference is that Configuration.get() does deprecation handling and more > importantly variable substitution on the value. So if a property has a > variable specified with ${variable_name}, it will no longer be expanded when > retrieved via getPropsWithPrefix(). > Was this change in behavior intentional? I am using this function in the fix > for [MAPREDUCE-7069], but we do want variable expansion to happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15372) Race conditions and possible leaks in the Shell class
[ https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15372: --- Assignee: Eric Badger > Race conditions and possible leaks in the Shell class > - > > Key: HADOOP-15372 > URL: https://issues.apache.org/jira/browse/HADOOP-15372 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 3.2.0 >Reporter: Miklos Szegedi >Assignee: Eric Badger >Priority: Minor > > YARN-5641 introduced some cleanup code in the Shell class. It has a race > condition. {{Shell. > runCommand()}} can be called while/after {{Shell.getAllShells()}} returned > all the shells to be cleaned up. This new thread can avoid the clean up, so > that the process held by it can be leaked causing leaked localized files/etc. > I see another issue as well. {{Shell.runCommand()}} has a finally block with > a {{ > process.destroy();}} to clean up. However, the try catch block does not cover > all instructions after the process is started, so for example we can exit the > thread and leak the process, if {{ > timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an > exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15372) Race conditions and possible leaks in the Shell class
[ https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430747#comment-16430747 ] Jason Lowe commented on HADOOP-15372: - Thanks for the report, Miklos! bq. runCommand()}} can be called while/after Shell.getAllShells() returned all the shells to be cleaned up. This new thread can avoid the clean up, so that the process held by it can be leaked causing leaked localized files/etc. Yes, there's a small bug where Shell#runCommand should be synchronizing on CHILD_SHELLS as it starts the subprocess. That way it either won't be started before destroyAllShells is called or it will be part of the list once it's been started. However I would argue it's outside the scope of the getAllShells and destroyAllShells APIs to prevent all future shells from being launched, as there may be a use case where someone wants to clean up all current shells but still launch future ones. Its job is to kill all active ones, which client code outside of Shell cannot do. In the specific case of localizing, it looks like we need a second destroy pass after awaiting the shutdown of the executor to catch any shell that was trying to launch just as we destroyed the active ones. {quote}I see another issue as well. [...] the try catch block does not cover all instructions after the process is started, so for example we can exit the thread and leak the process {quote} Yes, that appears to be an issue as well. [~ebadger] do you have some cycles to look into this? > Race conditions and possible leaks in the Shell class > - > > Key: HADOOP-15372 > URL: https://issues.apache.org/jira/browse/HADOOP-15372 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 3.2.0 >Reporter: Miklos Szegedi >Priority: Minor > > YARN-5641 introduced some cleanup code in the Shell class. It has a race > condition. {{Shell. > runCommand()}} can be called while/after {{Shell.getAllShells()}} returned > all the shells to be cleaned up. This new thread can avoid the clean up, so > that the process held by it can be leaked causing leaked localized files/etc. > I see another issue as well. {{Shell.runCommand()}} has a finally block with > a {{ > process.destroy();}} to clean up. However, the try catch block does not cover > all instructions after the process is started, so for example we can exit the > thread and leak the process, if {{ > timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an > exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
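To illustrate the two fixes discussed above in one place, here is a heavily simplified sketch; all names are illustrative and the real Shell class has far more machinery:
{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of both fixes: (1) register the shell under the same lock that
// guards the cleanup list, so a launch cannot slip in between getAllShells()
// and the kill; (2) keep everything after process start inside try/finally
// so the process cannot leak if later setup throws.
public class ShellSketch {
  private static final Set<ShellSketch> CHILD_SHELLS =
      Collections.synchronizedSet(new HashSet<ShellSketch>());

  private Process process;

  public void runCommand(String... cmd) throws IOException {
    synchronized (CHILD_SHELLS) {
      process = new ProcessBuilder(cmd).start();
      CHILD_SHELLS.add(this);       // registered before the lock is released
    }
    try {
      // schedule timeouts, pump stdout/stderr, wait for exit ...
      process.waitFor();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      process.destroy();            // runs even if scheduling/reading throws
      CHILD_SHELLS.remove(this);
    }
  }

  public static void destroyAllShells() {
    synchronized (CHILD_SHELLS) {
      for (ShellSketch shell : new HashSet<>(CHILD_SHELLS)) {
        shell.process.destroy();
      }
    }
  }
}
{code}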
[jira] [Reopened] (HADOOP-13500) Concurrency issues when using Configuration iterator
[ https://issues.apache.org/jira/browse/HADOOP-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened HADOOP-13500: - This is not a duplicate of HADOOP-13556. That JIRA only changed the getPropsWithPrefix method, which was not involved in the error reported by this JIRA or TEZ-3413. AFAICT iterating a shared Configuration object is still unsafe. > Concurrency issues when using Configuration iterator > > > Key: HADOOP-13500 > URL: https://issues.apache.org/jira/browse/HADOOP-13500 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Major > > It is possible to encounter a ConcurrentModificationException while trying to > iterate a Configuration object. The iterator method tries to walk the > underlying Properties object without proper synchronization, so another thread > simultaneously calling the set method can trigger it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
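Until proper synchronization lands, a common workaround is to iterate a private copy; since the underlying Properties is a synchronized Hashtable, taking the snapshot itself is consistent. A minimal sketch:
{code:java}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Workaround sketch: iterate a private snapshot so a concurrent set() on the
// shared Configuration cannot throw ConcurrentModificationException here.
public class ConfIterSketch {
  static void dumpProps(Configuration shared) {
    Configuration snapshot = new Configuration(shared); // copy constructor
    for (Map.Entry<String, String> entry : snapshot) {
      System.out.println(entry.getKey() + "=" + entry.getValue());
    }
  }
}
{code}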
[jira] [Commented] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.2
[ https://issues.apache.org/jira/browse/HADOOP-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411600#comment-16411600 ] Jason Lowe commented on HADOOP-15336: - Could you elaborate in the description on the nature of the NPE? A sample stacktrace would be immensely helpful here. > NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication > between 2.7 and 3.2 > - > > Key: HADOOP-15336 > URL: https://issues.apache.org/jira/browse/HADOOP-15336 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0, 3.2.0 >Reporter: Sherwood Zheng >Assignee: Sherwood Zheng >Priority: Major > Labels: backward-incompatible, common > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391522#comment-16391522 ] Jason Lowe commented on HADOOP-15206: - skipBytes is decremented because of the read() call. The skip() call is not guaranteed to be able to skip, and the workaround in that case is to try to read(). If the read() is successful then we were able to skip one more byte and need to account for that in the total number of bytes trying to be skipped. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 2.7.6, 3.0.2 > > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
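For readers following the exchange, the pattern in question is the standard skip-with-read fallback; a simplified sketch of the loop (names are illustrative, not the CBZip2InputStream code):
{code:java}
import java.io.IOException;
import java.io.InputStream;

// Simplified sketch: skip() may legally skip zero bytes, so a one-byte read()
// is the fallback, and a successful read counts as one more byte skipped.
public class SkipSketch {
  static void skipFully(InputStream in, long skipBytes) throws IOException {
    while (skipBytes > 0) {
      long skipped = in.skip(skipBytes);
      if (skipped == 0) {
        if (in.read() == -1) {
          break;            // end of stream; nothing left to skip
        }
        skipped = 1;        // the read() consumed one byte
      }
      skipBytes -= skipped; // decremented for read() exactly as for skip()
    }
  }
}
{code}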
[jira] [Updated] (HADOOP-15284) Docker launch fails when user private filecache directory is missing
[ https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15284: Affects Version/s: 3.1.0 Summary: Docker launch fails when user private filecache directory is missing (was: Could not determine real path of mount) ContainerLocalizer, which is run for every user-specific localization (i.e.: PRIVATE and APPLICATION visibility), creates both the usercache/_user_/filecache and usercache/_user_/appcache directories whenever it runs (see ContainerLocalizer#initDirs). If this directory is missing, then I'm wondering if this is a case where _nothing_ was localized for this user, not just PRIVATE but also no APPLICATION visibility resources (i.e.: only public resources or no resources at all). The only reason this would have worked before YARN-7815 is because the container executor creates the container work directory, which exists under the usercache/_user_ directory, and that's what it used to mount before the changes in YARN-7815. > Docker launch fails when user private filecache directory is missing > > > Key: HADOOP-15284 > URL: https://issues.apache.org/jira/browse/HADOOP-15284 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Eric Yang >Priority: Major > > Docker container is failing to launch in trunk. The root cause is: > {code} > [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: > [2018-03-02 23:26:09.196]Exception from container-launch. > Container id: container_1520032931921_0001_01_20 > Exit code: 29 > Exception message: image: hadoop/centos:latest is trusted in hadoop registry. > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Invalid docker mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache', > realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache > Error constructing docker command, docker error code=12, error > message='Invalid docker mount' > Shell output: main : command provided 4 > main : run as user is hbase > main : requested yarn user is hbase > Creating script paths... > Creating local dirs... > [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 > 23:26:09.240] > [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29. > [2018-03-02 23:26:39.278]Could not find > nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid > in any of the directories > [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down > now... > {code} > The filecache cannot be mounted because it doesn't exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15284) Could not determine real path of mount
[ https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386208#comment-16386208 ] Jason Lowe commented on HADOOP-15284: - Looks like this was caused by YARN-7815. The user's directory that was mounted before is always going to be there because the container executor creates the underlying container directory, but the user's filecache directory for resources with PRIVATE visibility may not be there. One straightforward fix is to have the container executor ensure the user's filecache directory is present when launching Docker containers, but there may be cleaner alternatives. > Could not determine real path of mount > -- > > Key: HADOOP-15284 > URL: https://issues.apache.org/jira/browse/HADOOP-15284 > Project: Hadoop Common > Issue Type: Bug >Reporter: Eric Yang >Priority: Major > > Docker container is failing to launch in trunk. The root cause is: > {code} > [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: > [2018-03-02 23:26:09.196]Exception from container-launch. > Container id: container_1520032931921_0001_01_20 > Exit code: 29 > Exception message: image: hadoop/centos:latest is trusted in hadoop registry. > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Invalid docker mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache', > realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache > Error constructing docker command, docker error code=12, error > message='Invalid docker mount' > Shell output: main : command provided 4 > main : run as user is hbase > main : requested yarn user is hbase > Creating script paths... > Creating local dirs... > [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 > 23:26:09.240] > [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29. > [2018-03-02 23:26:39.278]Could not find > nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid > in any of the directories > [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down > now... > {code} > The filecache cannot be mounted because it doesn't exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
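A sketch of the straightforward fix mentioned above, shown at the Java layer rather than in the native container-executor; the path layout mirrors the log output and the helper name is made up:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: ensure the user's private filecache directory exists before it is
// offered to Docker as a bind mount; realpath resolution fails otherwise.
public class FilecacheSketch {
  static Path ensureUserFilecache(String localDir, String user)
      throws IOException {
    Path filecache = Paths.get(localDir, "usercache", user, "filecache");
    Files.createDirectories(filecache); // no-op if it already exists
    return filecache;
  }
}
{code}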
[jira] [Commented] (HADOOP-15279) increase maven heap size recommendations
[ https://issues.apache.org/jira/browse/HADOOP-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382687#comment-16382687 ] Jason Lowe commented on HADOOP-15279: - Thanks, Allen! +1 lgtm. > increase maven heap size recommendations > > > Key: HADOOP-15279 > URL: https://issues.apache.org/jira/browse/HADOOP-15279 > Project: Hadoop Common > Issue Type: Improvement > Components: build, test >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Allen Wittenauer >Priority: Minor > Attachments: HADOOP-15279.00.patch > > > 1G is just a bit too low for JDK8+surefire 2.20+hdfs unit tests running in > parallel. Bump it up a bit more. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15206: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.2 2.7.6 2.8.4 2.9.1 2.10.0 3.1.0 Status: Resolved (was: Patch Available) Thanks, [~tanakahda]! I committed this to trunk, branch-3.1, branch-3.0, branch-2, branch-2.9, branch-2.8, and branch-2.7. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 2.7.6, 3.0.2 > > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367851#comment-16367851 ] Jason Lowe commented on HADOOP-15206: - The TestMRJobs failure is unrelated and tracked by MAPREDUCE-7053. Committing this. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366237#comment-16366237 ] Jason Lowe commented on HADOOP-15206: - bq. Since this is the first time to apply a patch to the community, I apologize for having bothered you. No worries whatsoever. It is very common to go back and forth on a number of patches before anything is committed, so this is simply development as usual from my perspective. I deeply appreciate your contribution on this subtle and tricky issue! +1 for the latest patch, pending Jenkins. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365954#comment-16365954 ] Jason Lowe commented on HADOOP-15206: - bq. Deleted comments in the code Sorry, I didn't mean the entire comment needs to be deleted. I think the comments were very helpful to explain why the logic is there, but I just didn't see the need to call out the specific JIRA number. That is something trivially obtained from git. Speaking of comments, when they are reinstated I noticed that this comment is slightly incorrect: {code} // HADOOP-15206: When we're in BYBLOCK mode and the start position // is >=0 and < HEADER_LEN + SUB_HEADER_LEN, we should also skip // to right after the BZip2 header to avoid duplicated records skipPos = HEADER_LEN + SUB_HEADER_LEN + 1 - this.startingPos; {code} "Skip to right after the BZip2 header" may lead someone to think there's an off-by-one bug in the code. We need to skip to right after the start of the first bz2 block (which occurs right after the bz2 header). Nit: skipPos is not really a position but rather the number of bytes being skipped, so it looks incorrect when the code calls updateReportedByteCount on what appears to be a position rather than a byte delta. Something like numSkipped or numBytesSkipped would be a less confusing name. It would be nice to fix the checkstyle warning about line length on the comment. The unit test failures appear to be unrelated, and they pass for me locally with the patch applied. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15206: --- Assignee: Aki Tanaka > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15206: Status: Patch Available (was: Open) Thanks for updating the patch! Looks good overall, just a few nits. I think we're close, so moving this to Patch Available so the QA bot can comment on this as well. Why are we only skipping one byte at a time instead of trying to skip the rest of the way in one call? The code can track the remaining bytes in skipBytes, decrement that by the number of bytes skipped in the loop, then loop while skipBytes > 0. There is trailing whitespace on a couple of lines which it would be nice to clean up. I expect the QA bot to flag this in its whitespace check. I'm not sure it's necessary to call out the JIRA in the comments. That's what {{git blame}} is for. ;) Otherwise the code would be littered with JIRA numbers in every bugfix change. "steam is on BZip2 header" should be "a split is before the first BZip2 block" > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0, 2.8.3 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
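The loop the review asks for could be shaped like this. It is a sketch rather than the committed code; it simply wraps java.io.InputStream.skip, which per its javadoc may skip fewer bytes than requested.

{code:java}
import java.io.IOException;
import java.io.InputStream;

final class SkipUtil {
  // Skip exactly skipBytes bytes, looping because InputStream.skip() is
  // allowed to skip fewer bytes than asked.
  static void skipFully(InputStream in, long skipBytes) throws IOException {
    while (skipBytes > 0) {
      long skipped = in.skip(skipBytes);
      if (skipped <= 0) {
        // skip() made no progress; force progress with a single-byte read
        if (in.read() < 0) {
          break; // end of stream
        }
        skipped = 1;
      }
      skipBytes -= skipped;
    }
  }
}
{code}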
[jira] [Commented] (HADOOP-15227) add mapreduce.outputcommitter.factory.scheme.s3a to core-default
[ https://issues.apache.org/jira/browse/HADOOP-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364275#comment-16364275 ] Jason Lowe commented on HADOOP-15227: - Yeah, mapred-default and mapred-site aren't loaded until the JobConf class is loaded. A common mistake is for code to create a plain Configuration early in {{main}} and try to look up mapred properties (or even hdfs or yarn properties) expecting to get the default if they are not set by the user. The easy fix is to create a JobConf instead of a Configuration if the code knows it wants to do mapred stuff. > add mapreduce.outputcommitter.factory.scheme.s3a to core-default > > > Key: HADOOP-15227 > URL: https://issues.apache.org/jira/browse/HADOOP-15227 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > > Need to add this property to core-default.xml. It's documented as being > there, but it isn't. > {code} > > mapreduce.outputcommitter.factory.scheme.s3a > org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory > > The committer factory to use when writing data to S3A filesystems. > > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
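A small illustration of that pitfall and the suggested fix; the class names are the standard Hadoop ones, while the property used for the lookup is just an example chosen from mapred-default.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

public class ConfLookup {
  public static void main(String[] args) {
    // Pitfall: a plain Configuration only loads core-default/core-site,
    // so a mapred property lookup here returns null unless the user set
    // it explicitly -- mapred-default.xml has not been registered yet.
    Configuration conf = new Configuration();
    System.out.println(conf.get("mapreduce.task.timeout")); // likely null

    // Fix: constructing a JobConf loads the JobConf class, which
    // registers mapred-default.xml and mapred-site.xml as default
    // resources, so the defaults become visible.
    JobConf jobConf = new JobConf();
    System.out.println(jobConf.get("mapreduce.task.timeout")); // default value
  }
}
{code}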
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363203#comment-16363203 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! I believe the latest patch will break CONTINUOUS mode since it will no longer strip the bzip2 file header in that case. I don't think it will be OK to remove calling readStreamHeader when reset() is called. We're resetting the codec state to start afresh, and that means potentially reading a new file header (e.g.: concatenated bzip2 files). My thinking is that we need to read the header, but we should not report the byte position being updated when doing so while we're in BLOCK mode (i.e.: split processing). I think we need to revert the stream header reading logic to the original behavior. Instead we can put a small change in the BZip2InputStream constructor to handle the special case of small splits that can start at or before the first bz2 block. If the read mode is BLOCK and 0 < startingPos <= HEADER_LEN + SUB_HEADER_LEN then we skip bytes until we get to the HEADER_LEN + SUB_HEADER_LEN + 1 offset in the stream. The bufferedIn.skip method will be useful here, but it needs to be called in a loop in case the skip fails to skip everything in one call (per the javadoc). > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
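A rough sketch of that constructor special case, for illustration only: READ_MODE.BYBLOCK, HEADER_LEN, SUB_HEADER_LEN, startingPos, and bufferedIn are names from the code under review, and the exact bounds are as described in the comment above.

{code:java}
// Illustrative sketch, not the committed patch: when processing splits
// (BYBLOCK mode) and the split starts at or inside the stream header,
// skip forward to one byte past the start of the first bz2 block. The
// skip is looped because bufferedIn.skip() may not skip everything in
// one call (per its javadoc).
if (readMode == READ_MODE.BYBLOCK
    && startingPos > 0 && startingPos <= HEADER_LEN + SUB_HEADER_LEN) {
  long remaining = HEADER_LEN + SUB_HEADER_LEN + 1 - startingPos;
  while (remaining > 0) {
    long skipped = bufferedIn.skip(remaining);
    if (skipped <= 0) {
      break; // end of stream reached before the first block
    }
    remaining -= skipped;
  }
}
{code}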
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15227) add mapreduce.outputcommitter.factory.scheme.s3a to core-default
[ https://issues.apache.org/jira/browse/HADOOP-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363017#comment-16363017 ] Jason Lowe commented on HADOOP-15227: - Does this go in core-default or mapred-default? The property name implies it would not belong in core-default, and it currently has the proper value in mapred-default. So maybe the documentation is what needs to be corrected instead? > add mapreduce.outputcommitter.factory.scheme.s3a to core-default > > > Key: HADOOP-15227 > URL: https://issues.apache.org/jira/browse/HADOOP-15227 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > > Need to add this property to core-default.xml. It's documented as being > there, but it isn't. > {code} > > mapreduce.outputcommitter.factory.scheme.s3a > org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory > > The committer factory to use when writing data to S3A filesystems. > > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361615#comment-16361615 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! bq. In the current implementation, read only "BZ" header when the read mode is CONTINUOUS. Do you think we should keep this? Yes, because it's not important to read the header when the codec is in BLOCK mode. IIUC the main difference between CONTINUOUS and BLOCK mode is that BLOCK mode will be used when processing splits and CONTINUOUS mode is used when we're simply trying to decompress the data in one big chunk (i.e.: no splits). BLOCK mode always will scan for the start of the bz2 block, so it will automatically skip a bz2 file header while searching for the start of the first bz2 block from the specified start offset. Given the splittable codec is always scanning for the block and doesn't really care what bytes are being skipped, I'm now thinking we can go back to a much simpler implementation. I think the code can check if we're in BLOCK mode to know whether we are processing splits or not. If we are in BLOCK mode we avoid advertising the byte position if start offset is zero just as the previous patches. In BLOCK mode we should also skip to file offset HEADER_LEN + SUB_HEADER_LEN + 1 if the start position is >=0 and < HEADER_LEN + SUB_HEADER_LEN. That will put us one byte past the start of the first bz2 block, and BLOCK mode will automatically scan forward to the next block. This proposal is very similar to what was implemented in patch 003. I think we just need to make it only do the position adjustment if we're in BLOCK mode. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
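The position-reporting half of that proposal could be sketched as follows. The names follow the thread; that readStreamHeader returns the number of bytes it consumed is an assumption made purely for illustration.

{code:java}
// Illustrative sketch: always read the stream header, but when we are in
// BYBLOCK (split-processing) mode with a start offset of zero, do not
// advertise the consumed header bytes. Keeping the reported position at 0
// ensures the first split still owns the first record.
int headerBytes = readStreamHeader(); // assumed to return bytes consumed
if (!(readMode == READ_MODE.BYBLOCK && startingPos == 0)) {
  updateReportedByteCount(headerBytes);
}
{code}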
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358919#comment-16358919 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! It seems the basic problem is that split 0, the first split, is _always_ responsible for the first record even if that record is technically past the byte offset of the end of the split. That's because all other splits will unconditionally throw away the first (potentially partial) record under the assumption the previous split is responsible for it. Therefore we need to do two things to avoid drops and duplicates: * If the first split ends before the start of the first bz2 block then we need to avoid advertising the updated byte position until we have started to consume the first bz2 block. This avoids the dropped record. * If subsequent splits start before the first bz2 block begins then we need to make sure any split that starts before the first block is artificially pushed past that first block. This avoids the duplicates. I'm wondering if it gets cleaner if we move this logic into readStreamHeader() and always call it. That method can check the starting position and do one of the following: * check for and read the full header if it is at starting position 0 * do nothing if start pos is past the full header + 1 * verify the bytes being skipped are the expected header bytes if start pos between 0 and full_header+1. If they are not the expected bytes then we reset the buffered input (just like starting pos 0 logic does today if header is not there). In the constructor we should be able to avoid updating the reported position if starting position is 0 (so we will always read into the first bz2 block), otherwise we advertise after reading any header so subsequent splits always start at least one byte after the start of the first bz2 block. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
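An outline of that three-case readStreamHeader might look like this. It is a sketch only: checkAndReadFullHeader and verifyAndSkipHeaderTail are hypothetical helpers standing in for the logic described above, not methods in the actual code.

{code:java}
// Illustrative outline of the three cases described above.
private void readStreamHeader() throws IOException {
  final long fullHeader = HEADER_LEN + SUB_HEADER_LEN;
  if (startingPos == 0) {
    // Case 1: at the very start -- check for and read the full header,
    // resetting the buffered input if it is not there (as today).
    checkAndReadFullHeader();
  } else if (startingPos > fullHeader + 1) {
    // Case 2: well past the header -- nothing to do.
  } else {
    // Case 3: between 0 and fullHeader + 1 -- verify the bytes being
    // skipped are the expected header bytes; if not, reset the buffered
    // input just like the starting-pos-0 logic does today.
    verifyAndSkipHeaderTail();
  }
}
{code}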
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354606#comment-16354606 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! {quote}Because 4 is a position of the first bz2 block marker, and an input stream will start reading the first bz2 block if the start position of the input stream is 4. {quote} Ah, right. Thanks for the explanation. {quote}So, if the input stream tries to read from position 1-4, it will drop the first BZ2 block even though the block marker position is 4. {quote} This doesn't just drop the first bzip2 block, it drops the entire split. This goes back to my previous comment about the code assuming splits that start between bytes 1-4 are always tiny. Splits do not have to be equally sized, so theoretically there could be just two splits where the first split is a two-byte split starting at offset 0 and the other split is the rest of the file. I believe this change would cause all records to be dropped in that scenario. To fix that I think we only need to report a position that is one byte beyond the start of the first bzip2 block rather than at the end of the entire split (i.e.: header_len + 1 rather than end + 1). The logic regarding the header seems backwards. If the header is stripped then that means there was a header present, yet the logic is only adding up bytes for a header length if it was *not* stripped which is the case when the header is not there. I'm wondering how it's working since I think the header is always there in the unit tests. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
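Concretely, the suggested correction amounts to something like the following sketch against the patch code quoted in the message below; the field names come from that snippet, and the exact assignment is illustrative rather than the final fix.

{code:java}
// Illustrative: instead of pushing a tiny split's reported position to
// the end of the split (end + 1), which can silently drop a large split
// that happens to start inside the header, report a position one byte
// past the start of the first bz2 block.
long firstBlockStart = HEADER_LEN + SUB_HEADER_LEN; // block begins after the header
this.compressedStreamPosition = firstBlockStart + 1;
{code}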
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354074#comment-16354074 ] Jason Lowe commented on HADOOP-15206: - Thanks for the patch! {code:java} if (this.startingPos > 0 && this.startingPos <= 4) { this.startingPos = end + 1; this.compressedStreamPosition = end + 1; } {code} The code above makes the following assumptions, which I believe may not hold in some cases: * The bzip2 file header is always present at starting pos 0. I think it should be checking isHeaderStripped/isSubHeaderStripped. * If the split starts after byte 0 but before byte 5 then it must also end on or before byte 5. (Splits are not required to be equally sized.) If the bzip2 file header is four bytes, why is the condition {{<= 4}} instead of {{< 4}}? Should this code leverage HEADER_LEN and SUB_HEADER_LEN here? Nit: "emptry" s/b "empty" > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351044#comment-16351044 ] Jason Lowe commented on HADOOP-15206: - I found a bit of time to look into this, so I'm dumping my notes here. I'm not sure when I'll get some more time to work on it, so if someone feels brave enough to step in feel free. Here's how I believe records get dropped with very small split sizes: # There's only one bz2 block in the file # The split size is smaller than 4 bytes # First split starts to read the data. It consumes the 'BZh9' magic header then updates the reported byte position of the stream to be 4 # At this point the first split reader is beyond the end of the split before it ever read a single record, so it ends up returning with no records. # The second split starts in the middle of the 'BZh9' magic header and scans forward to find the start of a bz2 block and starts processing the split # Since this is not the first split, it throws away the first record with the assumption the previous split is responsible for it # Second split reader proceeds to consume all remaining data, since byte position is not updated until the next bz2 block and there's only one block # End result is first record is lost since first split never consumed it. I think we can fix this scenario by not advertising a new byte position after reading the 'BZh9' header and only updating the byte position when we read the bz2 block header following the current bz2 block. I didn't get as much time to look into the duplicated record scenario, but I suspect multiple splits end up discovering the beginning of the bz2 block and think it is their block to consume. Not sure yet how we can easily distinguish which split is the one, true split that is responsible for consuming the bz2 block given we're hiding the true byte offset from the upper layers most of the time. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15170: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) Thanks, [~ajayydv]! I committed this to trunk. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Fix For: 3.1.0 > > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch, HADOOP-15170.004.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350699#comment-16350699 ] Jason Lowe commented on HADOOP-15170: - Thanks for updating the patch! +1 lgtm. Committing this. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch, HADOOP-15170.004.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15200) Missing DistCpOptions constructor breaks downstream DistCp projects in 3.0
[ https://issues.apache.org/jira/browse/HADOOP-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15200: Target Version/s: 3.1.0, 3.0.1 > Missing DistCpOptions constructor breaks downstream DistCp projects in 3.0 > -- > > Key: HADOOP-15200 > URL: https://issues.apache.org/jira/browse/HADOOP-15200 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 3.0.0 >Reporter: Kuhu Shukla >Priority: Critical > > Post HADOOP-14267, the constructor for DistCpOptions was removed and will > break any project using it for java based implementation/usage of DistCp. > This JIRA would track next steps required to reconcile/fix this > incompatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347207#comment-16347207 ] Jason Lowe commented on HADOOP-15170: - Thanks for updating the patch! I tested this out on a manually created tarball with some symlinks, and the link targets are being mishandled. For example: {noformat} $ mkdir testdir $ cd testdir $ ln -s a b $ ln -s /tmp/foo c $ ls -l total 0 lrwxrwxrwx. 1 nobody nobody 1 Jan 31 10:40 b -> a lrwxrwxrwx. 1 nobody nobody 8 Jan 31 10:40 c -> /tmp/foo $ cd .. $ tar zcf testdir.tgz testdir {noformat} When I unpack this tarball to a destination directory of "output" with unTarUsingJava, the symlinks are all relative to the top-level output directory, which is incorrect: {noformat} $ ls -l output/testdir total 0 lrwxrwxrwx. 1 nobody nobody 8 Jan 31 10:41 b -> output/a lrwxrwxrwx. 1 nobody nobody 14 Jan 31 10:41 c -> output/tmp/foo {noformat} The fix is to just take the symlink target as-is rather than trying to make it relative to something else. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
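That fix amounts to roughly the following inside the untar entry loop. This is a sketch under assumed names: entry stands for the org.apache.commons.compress TarArchiveEntry being unpacked and outputDir for the extraction root directory, neither of which is shown in the thread.

{code:java}
// Inside the tar-entry loop (uses java.nio.file.Files / Paths and
// org.apache.commons.compress.archivers.tar.TarArchiveEntry):
// create the link at its location in the output tree, but take the
// recorded target verbatim. Resolving the target against the output
// directory is what turned "b -> a" into "b -> output/a" above.
if (entry.isSymbolicLink()) {
  Files.createSymbolicLink(
      Paths.get(outputDir.getPath(), entry.getName()), // where the link lives
      Paths.get(entry.getLinkName()));                 // target taken as-is
}
{code}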
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341831#comment-16341831 ] Jason Lowe commented on HADOOP-15170: - Thanks for the patch! Apologies for the delay. The VisibleForTesting import looks shady (pun intended). We should be pulling this in from the normal location. Actually even better would be to simply make this package-private rather than public. Then I would argue the visibility marker isn't necessary. The createSymbolicLinkUsingJava does not seem worth it given it's private and it's less typing to call the Files method directly. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve sequential read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15027: Fix Version/s: (was: 3.0.1) (was: 2.9.1) (was: 2.10.0) I reverted this from branch-3.0, branch-2 and branch-2.9 since it broke the builds there: {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-aliyun: Compilation failure: Compilation failure: [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[46,30] cannot find symbol [ERROR] symbol: class BlockingThreadPoolExecutorService [ERROR] location: package org.apache.hadoop.util [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[53,30] cannot find symbol [ERROR] symbol: class SemaphoredDelegatingExecutor [ERROR] location: package org.apache.hadoop.util [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[332,30] cannot find symbol [ERROR] symbol: variable BlockingThreadPoolExecutorService [ERROR] location: class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[549,13] cannot find symbol [ERROR] symbol: class SemaphoredDelegatingExecutor [ERROR] location: class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem {noformat} If this needs to go into those releases, please revisit the patches for those branches. Looks like this patch depends upon HADOOP-15039 which only went into 3.1.0. > AliyunOSS: Support multi-thread pre-read to improve sequential read from > Hadoop to Aliyun OSS performance > - > > Key: HADOOP-15027 > URL: https://issues.apache.org/jira/browse/HADOOP-15027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/oss >Affects Versions: 3.0.0 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Fix For: 3.1.0 > > Attachments: HADOOP-15027.001.patch, HADOOP-15027.002.patch, > HADOOP-15027.003.patch, HADOOP-15027.004.patch, HADOOP-15027.005.patch, > HADOOP-15027.006.patch, HADOOP-15027.007.patch, HADOOP-15027.008.patch, > HADOOP-15027.009.patch, HADOOP-15027.010.patch, HADOOP-15027.011.patch, > HADOOP-15027.012.patch, HADOOP-15027.013.patch, HADOOP-15027.014.patch > > > Currently, AliyunOSSInputStream uses single thread to read data from > AliyunOSS, so we can do some refactoring by using multi-thread pre-read to > improve read performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org