[jira] [Comment Edited] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725976#comment-16725976 ]

Jason Lowe edited comment on HADOOP-16007 at 12/20/18 4:06 PM:
---------------------------------------------------------------

This was fixed by HADOOP-15973.

was (Author: jlowe):
This was fixed by HADOOP-15554.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15973:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 2.9.3
                  3.1.2
                  3.0.4
                  3.2.0
                  2.10.0
          Status: Resolved  (was: Patch Available)

Thanks, Eric! I committed this to trunk, branch-3.2, branch-3.2.0, branch-3.1, branch-3.0, branch-2 and branch-2.9.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>             Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2, 2.9.3
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved HADOOP-16007.
---------------------------------
    Resolution: Duplicate

This was fixed by HADOOP-15554.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725938#comment-16725938 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

+1 for the trunk, branch-3, and branch-2 patches. Committing this.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725151#comment-16725151 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for updating the patch! I agree the TestSSLFactory failure is unrelated. I also saw it sporadically fail in HADOOP-15995 with the same error. I filed HADOOP-16016 to track that error.

Speaking of preserving behaviors, I noticed the old include handling behavior was to silently ignore cases where it could not build a stream reader if quiet was true. That's what loadResource does and it used to be leveraged to process include files. This behavior was not preserved in the new version, but I've always been confused about the use-case where we don't want to complain/throw when we can't load something. It seems like that would make things very hard to debug in practice. I wanted to point this out to confirm the omission of quiet mode suppression in the new include handling was intentional and see if anyone can think of a reason why this type of error would want to be suppressed.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch,
> HADOOP-15973.003.branch-2.patch, HADOOP-15973.003.branch-3.0.patch,
> HADOOP-15973.003.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
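To make the quiet-mode behavior the comment describes concrete, here is a minimal self-contained sketch of the pattern; the class and method names are hypothetical, not the actual Configuration source:

{code:java}
import java.io.InputStream;

/**
 * Minimal sketch of the quiet-mode pattern described above (hypothetical
 * names, not the actual loadResource code): a resource that cannot be
 * opened is silently skipped when quiet is true, and an error otherwise.
 */
final class QuietLoadSketch {
  static InputStream openResource(String name, boolean quiet) {
    InputStream in = QuietLoadSketch.class.getResourceAsStream(name);
    if (in == null) {
      if (quiet) {
        return null; // quiet mode: silently skip the unloadable resource
      }
      throw new RuntimeException(name + " not found");
    }
    return in;
  }
}
{code}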
[jira] [Commented] (HADOOP-16016) TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
[ https://issues.apache.org/jira/browse/HADOOP-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725148#comment-16725148 ]

Jason Lowe commented on HADOOP-16016:
-------------------------------------

Full stack trace:
{noformat}
[ERROR] testServerWeakCiphers(org.apache.hadoop.security.ssl.TestSSLFactory)  Time elapsed: 0.079 s  <<< FAILURE!
java.lang.AssertionError: Expected to find 'no cipher suites in common' but got unexpected exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
	at sun.security.ssl.Handshaker.activate(Handshaker.java:509)
	at sun.security.ssl.SSLEngineImpl.kickstartHandshake(SSLEngineImpl.java:714)
	at sun.security.ssl.SSLEngineImpl.writeAppRecord(SSLEngineImpl.java:1212)
	at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:1165)
	at javax.net.ssl.SSLEngine.wrap(SSLEngine.java:469)
	at org.apache.hadoop.security.ssl.TestSSLFactory.wrap(TestSSLFactory.java:246)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:220)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:240)
Caused by: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
	at org.apache.hadoop.security.ssl.TestSSLFactory.wrap(TestSSLFactory.java:246)
	at org.apache.hadoop.security.ssl.TestSSLFactory.testServerWeakCiphers(TestSSLFactory.java:220)
{noformat}
Looks like maybe the exception message changed in the Java libraries? The test is looking for "no cipher suites in common" but the exception message is "No appropriate protocol" instead.

> TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-16016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16016
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.3.0
>            Reporter: Jason Lowe
>            Priority: Major
>
> I have seen a couple of precommit builds across JIRAs fail in
> TestSSLFactory#testServerWeakCiphers with the error:
> {noformat}
> [ERROR] TestSSLFactory.testServerWeakCiphers:240 Expected to find 'no
> cipher suites in common' but got unexpected
> exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol
> (protocol is disabled or cipher suites are inappropriate)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
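If the JDK message did change, one possible hardening of the assertion is to accept either message. This is a hypothetical sketch, not the committed fix:

{code:java}
import static org.junit.Assert.assertTrue;

/**
 * Hypothetical hardening of the test assertion (not the committed fix):
 * accept both the old and the new JDK handshake-failure messages when
 * verifying that the weak-cipher handshake was rejected.
 */
public class WeakCipherMessageCheck {
  public static void assertWeakCipherFailure(Exception e) {
    String msg = String.valueOf(e.getMessage());
    assertTrue("Unexpected handshake failure message: " + msg,
        msg.contains("no cipher suites in common")
            || msg.contains("No appropriate protocol"));
  }
}
{code}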
[jira] [Created] (HADOOP-16016) TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
Jason Lowe created HADOOP-16016:
-----------------------------------

             Summary: TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds
                 Key: HADOOP-16016
                 URL: https://issues.apache.org/jira/browse/HADOOP-16016
             Project: Hadoop Common
          Issue Type: Bug
          Components: test
    Affects Versions: 3.3.0
            Reporter: Jason Lowe


I have seen a couple of precommit builds across JIRAs fail in TestSSLFactory#testServerWeakCiphers with the error:
{noformat}
[ERROR] TestSSLFactory.testServerWeakCiphers:240 Expected to find 'no cipher suites in common' but got unexpected exception:javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724210#comment-16724210 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for the patch! In addition to the unit test added, I manually verified that this patch fixes HADOOP-16007.

Nit: We don't need to declare MalformedURLException as being thrown by handleInclude or handleStartElement since it's a subclass of IOException which was added to the throws list. And once that's removed we no longer need to import it.

handleInclude doesn't handle the case where getStreamReader returns null. It should probably throw in a similar way to how loadResource does when the reader is null.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch, HADOOP-15973.002.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
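A minimal sketch of the null handling suggested in the review above; the class and method names are hypothetical, not the actual patch:

{code:java}
import java.io.InputStream;

/**
 * Hypothetical sketch of the suggested null handling (not the actual
 * patch): an include that cannot be opened is treated as an error,
 * mirroring how loadResource reacts to a null reader.
 */
final class IncludeNullCheckSketch {
  static InputStream openInclude(String href) {
    InputStream in = IncludeNullCheckSketch.class.getResourceAsStream(href);
    if (in == null) {
      // Mirror loadResource: fail loudly instead of silently dropping it.
      throw new RuntimeException(href + " not found");
    }
    return in;
  }
}
{code}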
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724070#comment-16724070 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

This is not quite the same issue as discovered during the RC0 voting period, as that's HADOOP-15973. Eric and I have been discussing this quite a bit offline, and he said that rolling back to the commit before HADOOP-15554 did not fix HADOOP-15973, so they are related but slightly different issues. We _think_ there's a way to fix both of them with the same change, and Eric is actively working on that.

I agree that we should hold the RC for these fixes, as not loading the intended config settings properly could lead to very bad behavior depending upon the property which was accidentally, silently dropped after upgrading.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723015#comment-16723015 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

The behavior will only be noticed if an included resource overrides a previously set property from the same resource doing the include. If the include was overriding a value from a previously parsed resource (like core-default.xml) then the problem does not manifest.

The parser directly sets the included properties on the conf as a side-effect of parsing, but the non-included properties are returned as a parse result and those results are iterated to set them. The sideband processing of includes effectively reverses the order in which properties are processed if the xinclude appears after the property setting in the original resource.

Here's the simple code I used to test it:
{code:title=testconf.java}
import org.apache.hadoop.conf.Configuration;

class testconf {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    System.out.println("myconf = " + conf.get("myconf"));
  }
}
{code}
Using this sample code with a core-site.xml and included file set up as described in the JIRA description, the following shows what I get at two adjacent commits on the trunk line:
{noformat}
$ git log -1
commit f51da9c4d1423c2ac92eb4f40e973264e7e968cc
Author: Andrew Wang
Date:   Mon Jul 2 18:31:21 2018 +0200

    HADOOP-15554. Improve JIT performance for Configuration parsing. Contributed by Todd Lipcon.

$ mvn clean && mvn install -Pdist -DskipTests -DskipShade -Dmaven.javadoc.skip -am -pl :hadoop-common
[...]
$ java -cp "hadoop/testconf:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/*:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/lib/*:." testconf
myconf = val1
{noformat}
So the above shows the broken behavior. core-site.xml set myconf to val1 then xincluded another file which set it to val2, yet the property acts as if the xinclude occurred at the top of core-site.xml.

Moving one commit earlier in time shows the expected behavior:
{noformat}
$ git checkout HEAD~1
Previous HEAD position was f51da9c... HADOOP-15554. Improve JIT performance for Configuration parsing. Contributed by Todd Lipcon.
HEAD is now at 5d748bd... HDFS-13702. Remove HTrace hooks from DFSClient to reduce CPU usage. Contributed by Todd Lipcon.
$ mvn clean && mvn install -Pdist -DskipTests -DskipShade -Dmaven.javadoc.skip -am -pl :hadoop-common
[...]
$ java -cp "hadoop/testconf:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/*:hadoop/apache/hadoop/hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT/share/hadoop/common/lib/*:." testconf
myconf = val2
{noformat}

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15554) Improve JIT performance for Configuration parsing
[ https://issues.apache.org/jira/browse/HADOOP-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721850#comment-16721850 ]

Jason Lowe commented on HADOOP-15554:
-------------------------------------

FYI I noticed an ordering issue with parsing includes and tracked it down to this commit as the cause. See HADOOP-16007.

> Improve JIT performance for Configuration parsing
> --------------------------------------------------
>
>                 Key: HADOOP-15554
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15554
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf, performance
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 3.2.0, 3.1.1, 3.0.4
>
>         Attachments: HADOOP-15554.branch-3.0.patch,
> HADOOP-15554.branch-3.0.patch, hadoop-15554.patch, hadoop-15554.patch
>
> In investigating a performance regression for small tasks between Hadoop 2
> and Hadoop 3, we found that the amount of time spent in JIT was significantly
> higher. Using jitwatch we were able to determine that, due to a combination
> of switching from DOM to SAX style parsing and just having more configuration
> key/value pairs, Configuration.loadResource is now getting compiled with the
> C2 compiler and taking quite some time. Breaking that very large function up
> into several smaller ones and eliminating some redundant bits of code
> improves the JIT performance measurably.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16007) Order of property settings is incorrect when includes are processed
[ https://issues.apache.org/jira/browse/HADOOP-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721844#comment-16721844 ]

Jason Lowe commented on HADOOP-16007:
-------------------------------------

I tracked this behavior change down to HADOOP-15554. I believe the problem stems from Parser#handleInclude parsing the included sub-resource directly into the Configuration {{properties}} member which bypasses the ordering of properties returned by the Parser#parse method.

> Order of property settings is incorrect when includes are processed
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.2.0, 3.1.1, 3.0.4
>            Reporter: Jason Lowe
>            Priority: Blocker
>
> If a configuration file contains a setting for a property then later includes
> another file that also sets that property to a different value then the
> property will be parsed incorrectly. For example, consider the following
> configuration file:
> {noformat}
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>   <property>
>     <name>myprop</name>
>     <value>val1</value>
>   </property>
>   <xi:include href="/some/other/file.xml"/>
> </configuration>
> {noformat}
> with the contents of /some/other/file.xml as:
> {noformat}
> <configuration>
>   <property>
>     <name>myprop</name>
>     <value>val2</value>
>   </property>
> </configuration>
> {noformat}
> Parsing this configuration should result in myprop=val2, but it actually
> results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
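A toy model of the reversal described above may help; the names are invented and this is not Hadoop's Parser code, just a demonstration of why side-effect application of includes loses to results applied afterward:

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the ordering bug described above (hypothetical, not the
// Configuration source). Included properties are applied as a side effect
// during parsing, while top-level properties are collected and applied
// afterwards, which reverses their effective order.
public class IncludeOrderDemo {
  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    List<String[]> results = new ArrayList<>();

    // Outer file: myprop=val1 is parsed first and queued for later.
    results.add(new String[] {"myprop", "val1"});
    // The xinclude comes next but is applied to conf immediately (the bug).
    conf.put("myprop", "val2");

    // After parsing, the queued results are applied, clobbering the include.
    for (String[] kv : results) {
      conf.put(kv[0], kv[1]);
    }
    System.out.println(conf.get("myprop")); // prints val1; val2 was expected
  }
}
{code}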
[jira] [Created] (HADOOP-16007) Order of property settings is incorrect when includes are processed
Jason Lowe created HADOOP-16007:
-----------------------------------

             Summary: Order of property settings is incorrect when includes are processed
                 Key: HADOOP-16007
                 URL: https://issues.apache.org/jira/browse/HADOOP-16007
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 3.1.1, 3.2.0, 3.0.4
            Reporter: Jason Lowe


If a configuration file contains a setting for a property then later includes another file that also sets that property to a different value then the property will be parsed incorrectly. For example, consider the following configuration file:
{noformat}
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>myprop</name>
    <value>val1</value>
  </property>
  <xi:include href="/some/other/file.xml"/>
</configuration>
{noformat}
with the contents of /some/other/file.xml as:
{noformat}
<configuration>
  <property>
    <name>myprop</name>
    <value>val2</value>
  </property>
</configuration>
{noformat}
Parsing this configuration should result in myprop=val2, but it actually results in myprop=val1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721781#comment-16721781 ]

Jason Lowe commented on HADOOP-15973:
-------------------------------------

Thanks for the patch!

I'm not sure returning the full properties is correct. IIUC that will snapshot not only the properties that were loaded as part of parsing the input stream but also all previously loaded properties from prior resources. That will cause problems if a previously parsed resource is changed and the user tries to refresh configs, as this snapshot will have old property values from that resource that could clobber the new values trying to be refreshed.

As I understand it, we need input streams to cache what was loaded from the input stream, including anything loaded from include directives found in that stream, and nothing else. It appears the bug is in Configuration.Parser#handleInclude since any properties loaded via an include are not returned in the list of properties returned as a result of the parse method. If those were included in {{results}} then I think we'd cache the proper amount which is what was found as a result of a full parse of the input stream.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>
>         Attachments: HADOOP-15973.001.patch
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
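One way to picture the caching rule argued above, as a hedged sketch with invented types (not Configuration internals): the snapshot cached for a stream resource is exactly what parsing that stream produced, top-level and included properties both, and nothing inherited from earlier resources.

{code:java}
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration of the caching rule described above (invented
 * types, not Hadoop code): the cached snapshot for a stream resource holds
 * only what that stream's parse produced, includes and all.
 */
final class StreamSnapshotSketch {
  static List<String[]> snapshotFor(List<String[]> topLevel,
                                    List<String[]> included) {
    List<String[]> snapshot = new ArrayList<>(topLevel);
    snapshot.addAll(included); // includes belong to this stream's snapshot
    return snapshot;           // earlier resources stay out of it
  }
}
{code}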
[jira] [Updated] (HADOOP-15974) Upgrade Curator version to 2.13.0 to fix ZK tests
[ https://issues.apache.org/jira/browse/HADOOP-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15974:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 3.2.1
                  3.1.2
                  3.3.0
                  3.0.4
          Status: Resolved  (was: Patch Available)

Thanks, [~ajisakaa]! I committed this to trunk, branch-3.2, branch-3.1, and branch-3.0.

> Upgrade Curator version to 2.13.0 to fix ZK tests
> --------------------------------------------------
>
>                 Key: HADOOP-15974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15974
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>            Reporter: Jason Lowe
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.0.4, 3.3.0, 3.1.2, 3.2.1
>
>         Attachments: YARN-8937.01.patch
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to
> start and eventually gets killed by the surefire timeout.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15973) Configuration: Included properties are not cached if resource is a stream
[ https://issues.apache.org/jira/browse/HADOOP-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15973:
--------------------------------
    Priority: Critical  (was: Major)

Increasing the priority since this can be a very nasty bug in practice. The omission of xincluded properties from an input stream is _silent_ and only occurs _after_ the first parse. It's difficult to debug and can lead to some very bad behavior depending upon the nature of the properties omitted when a Configuration object ends up reparsing its resources.

> Configuration: Included properties are not cached if resource is a stream
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15973
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eric Payne
>            Priority: Critical
>
> If a configuration resource is a bufferedinputstream and the resource has an
> included xml file, the properties from the included file are read and stored
> in the properties of the configuration, but they are not stored in the
> resource cache. So, if a later resource is added to the config and the
> properties are recalculated from the first resource, the included properties
> are lost.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Moved] (HADOOP-15974) Upgrade Curator version to 2.13.0 to fix ZK tests
[ https://issues.apache.org/jira/browse/HADOOP-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe moved YARN-8937 to HADOOP-15974:
-------------------------------------------
    Affects Version/s:     (was: 3.3.0)
                       3.1.2
                       3.3.0
                       3.0.4
                       3.2.0
     Target Version/s: 3.0.4, 3.3.0, 3.1.2, 3.2.1  (was: 3.0.4, 3.1.2, 3.3.0, 3.2.1)
          Component/s:     (was: test)
                  Key: HADOOP-15974  (was: YARN-8937)
              Project: Hadoop Common  (was: Hadoop YARN)

> Upgrade Curator version to 2.13.0 to fix ZK tests
> --------------------------------------------------
>
>                 Key: HADOOP-15974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15974
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>            Reporter: Jason Lowe
>            Assignee: Akira Ajisaka
>            Priority: Major
>
>         Attachments: YARN-8937.01.patch
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to
> start and eventually gets killed by the surefire timeout.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15816) Upgrade Apache Zookeeper version due to security concerns
[ https://issues.apache.org/jira/browse/HADOOP-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661045#comment-16661045 ]

Jason Lowe commented on HADOOP-15816:
-------------------------------------

This change caused at least one unit test to break, see YARN-8937.

> Upgrade Apache Zookeeper version due to security concerns
> ----------------------------------------------------------
>
>                 Key: HADOOP-15816
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15816
>             Project: Hadoop Common
>          Issue Type: Task
>    Affects Versions: 3.1.1, 3.0.3
>            Reporter: Boris Vulikh
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.2.0, 3.0.4, 3.3.0, 3.1.2
>
>         Attachments: HADOOP-15816.01.patch
>
> * [CVE-2018-8012|https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2018-8012]
> * [CVE-2017-5637|https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2017-5637]
> We should upgrade the dependency to version 3.4.11 or the latest, if possible.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660834#comment-16660834 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

Wondering if we should just use a LinkedHashSet here. That way we preserve the order things were added to the ACL and dump the string in that same order while preserving the same lookup performance we had before.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
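A minimal sketch of the LinkedHashSet idea floated above, assuming a simplified ACL shape (OrderedAcl is hypothetical, not the AccessControlList source): insertion order is preserved for string output while membership checks stay O(1).

{code:java}
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the AccessControlList source: LinkedHashSet
// keeps insertion order for iteration/toString while contains() remains a
// constant-time hash lookup like HashSet.
public class OrderedAcl {
  private final Set<String> users = new LinkedHashSet<>();

  public void addUser(String user) {
    users.add(user);
  }

  public boolean isUserAllowed(String user) {
    return users.contains(user); // hash lookup, same cost as HashSet
  }

  public Collection<String> getUsers() {
    return Collections.unmodifiableSet(users); // iterates in insertion order
  }
}
{code}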
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660730#comment-16660730 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

bq. this information is supplied trough a configuration

If we want to get pedantic about ordering then shouldn't the order specified in the configuration be the order that is preserved? That order isn't necessarily lexicographical.

If there's a pressing need to order the results of getAclString then fine, we can do that. But that does not require the AccessControlList implementation to preserve that order at all times. As I understand it, isUserAllowed is the critical path on this class and getAclString is relatively rare. That makes me think this patch is optimizing for the rare case at the expense of the common case.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660675#comment-16660675 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

Yes, this should be changed back to HashSet. I'm all for making the unit tests more resilient to different implementations, but I don't see why we would want to choose TreeSet over HashSet here. The API never said anything about ordering, and I don't think we should start ordering it just to make some lazy unit tests happy. Changing the implementation in a way that does not fix a bug in the implementation just adds risk.

I propose we revert this change and put up another patch if there's still things worth changing without the TreeSet modification in place (like maybe the unmodifiable set change).

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659741#comment-16659741 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

bq. I think the right thing would be to fix the unit tests to not rely on the order?

Agree unit tests should not rely on order, but it looks like the fix may be misplaced here. If I understand this change properly, a HashSet was changed to a TreeSet not because it was incorrect from an API semantic point of view but because tests were expecting a certain order. IMHO that's not a good change unless the API docs explicitly said it would preserve order. TreeSet is notoriously problematic from a performance point of view relative to HashSet.

The getUsers method returns a Collection and no order should be implied there. If tests want to simplify their assertions then they can dump the collection into a temporary tree set for comparisons, but we shouldn't force the implementation to pay the performance penalty all the time so unit tests can do easy collection comparisons.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
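The comparison-time ordering suggested above might look like this hedged JUnit helper sketch (names are hypothetical, not the actual test code):

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical test-side pattern (not the actual tests): impose an order at
// comparison time so the production collection can remain an unordered
// HashSet.
public class AclOrderTestSketch {
  static void assertSameUsers(Collection<String> actual, String... expected) {
    assertEquals(new TreeSet<>(Arrays.asList(expected)),
        new TreeSet<>(actual));
  }
}
{code}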
[jira] [Commented] (HADOOP-15836) Review of AccessControlList
[ https://issues.apache.org/jira/browse/HADOOP-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659085#comment-16659085 ]

Jason Lowe commented on HADOOP-15836:
-------------------------------------

This broke tests in other projects, see YARN-8928 and MAPREDUCE-7155.

> Review of AccessControlList
> ---------------------------
>
>                 Key: HADOOP-15836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15836
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, security
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15836.1.patch
>
> * Improve unit tests (expected / actual were backwards)
> * Unit test expected elements to be in order but the class's return
> Collections were unordered
> * Formatting cleanup
> * Removed superfluous white space
> * Remove use of LinkedList
> * Removed superfluous code
> * Use {{unmodifiable}} Collections where JavaDoc states that caller must not
> manipulate the data structure

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 3.1.2
                  3.0.4
                  2.9.2
                  3.2.0
                  2.10.0
          Status: Resolved  (was: Patch Available)

Thanks for the review, Kihwal! I committed this to trunk, branch-3.2, branch-3.1, branch-3.0, branch-2, and branch-2.9.

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Status: Patch Available  (was: Open)

Attached a patch that removes the JNI setting of the remaining field per Ben's analysis above and cleans up the naming re: objects vs. classes in the JNI function arguments.

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha2, 2.9.0
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
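For illustration of the class-versus-instance distinction the cleanup addresses, a hedged Java-side sketch (NativeInitSketch is hypothetical, not the Hadoop source):

{code:java}
// Hypothetical illustration, not the Hadoop source: a static native method
// hands the JNI side a jclass, so there is no instance whose fields it can
// set; an instance native method hands it a jobject. Writing an instance
// field through the jclass handle is what corrupts memory.
public class NativeInitSketch {
  private int remaining;

  // JNI signature: (JNIEnv*, jclass) -- no instance, so the native side
  // must not try to write instance fields such as "remaining" here.
  private static native void init();

  // JNI signature: (JNIEnv*, jobject) -- instance fields are reachable here.
  private native void initInstance();

  void reset() {
    init();
    remaining = 0; // Java-side reset makes a native field write unnecessary
  }
}
{code}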
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Attachment: HADOOP-15859.001.patch

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
>         Attachments: HADOOP-15859.001.patch
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15859) ZStandardDecompressor.c mistakes a class for an instance
[ https://issues.apache.org/jira/browse/HADOOP-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15859:
--------------------------------
    Affects Version/s: 2.9.0
                       3.0.0-alpha2
     Target Version/s: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2

> ZStandardDecompressor.c mistakes a class for an instance
> ---------------------------------------------------------
>
>                 Key: HADOOP-15859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15859
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Ben Lau
>            Assignee: Jason Lowe
>            Priority: Blocker
>
> As a follow up to HADOOP-15820, I was doing more testing on ZSTD compression
> and still encountered segfaults in the JVM in HBase after that fix.
> I took a deeper look and realized there is still another bug, which looks
> like it's that we are actually [calling
> setInt()|https://github.com/apache/hadoop/blob/f13e231025333ebf80b30bbdce1296cef554943b/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.c#L148]
> on the "remaining" variable on the ZStandardDecompressor class itself
> (instead of an instance of that class) because the Java stub for the native C
> init() function [is marked
> static|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L253],
> leading to memory corruption and a crash during GC later.
> Initially I thought we would fix this by changing the Java init() method to
> be non-static, but it looks like the "remaining" setInt() call is actually
> unnecessary anyway, because in ZStandardDecompressor.java's reset() we [set
> "remaining" to 0 right after calling the JNI init()
> call|https://github.com/apache/hadoop/blob/a0a276162147e843a5a4e028abdca5b66f5118da/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java#L216].
> So ZStandardDecompressor.java init() doesn't have to be changed to an
> instance method, we can leave it as static, but remove the JNI init() call's
> "remaining" setInt() call altogether.
> Furthermore we should probably clean up the class/instance distinction in the
> C file because that's what led to this confusion. There are some other
> methods where the distinction is incorrect or ambiguous, we should fix them
> to prevent this from happening again.
> I talked to [~jlowe] who further pointed out the ZStandardCompressor also has
> similar problems and needs to be fixed too.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645035#comment-16645035 ]

Jason Lowe commented on HADOOP-15820:
-------------------------------------

Thanks, [~jojochuang]! Sorry for missing that commit.

> ZStandardDecompressor native code sets an integer field as a long
> ------------------------------------------------------------------
>
>                 Key: HADOOP-15820
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15820
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
>         Attachments: HADOOP-15820.001.patch
>
> Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in
> ZStandardDecompressor.c sets the {{remaining}} field as a long when it
> actually is an integer.
> Kudos to Ben Lau from our HBase team for discovering this issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642522#comment-16642522 ]

Jason Lowe commented on HADOOP-15822:
-------------------------------------

Looked into the unit test failures.
* TestNameNodeMetadataConsistency failure is an existing issue tracked by HDFS-11439
* TestBalancer test has been failing in other precommit builds, filed HDFS-13975
* TestStandbyCheckpoints does not look related and does not reproduce locally
* TestHAAppend is an inode create timeout that does not look related and does not reproduce locally
* TestDirectoryScanner is a timeout that does not look related and does not reproduce locally
* TestTimelineReaderWebServicesHBaseStorage has been failing in nightly builds, filed YARN-8856

> zstd compressor can fail with a small output buffer
> ----------------------------------------------------
>
>                 Key: HADOOP-15822
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15822
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>
>         Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine
> with the latest zstd library (1.3.5). Compression can fail to successfully
> finalize the stream when a small output buffer is used resulting in a failed
> to init error, and decompression with a direct buffer can fail with an
> invalid src size error.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642104#comment-16642104 ]

Jason Lowe commented on HADOOP-15822:
-------------------------------------

bq. do you think it's related? Or is it something different, maybe MR-specific?

I do not think it is related. The MapOutput buffer code is miscalculating how much buffer space is remaining before it forces a spill. In this failure case the buffer involved is not dealing with compressed data, so it should not matter what codec is being used. Have you tried reproducing it with lz4 or no codec at all?

I'll dig a bit into the Jenkins test failures to see if they are somehow related.

> zstd compressor can fail with a small output buffer
> ----------------------------------------------------
>
>                 Key: HADOOP-15822
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15822
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>
>         Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine
> with the latest zstd library (1.3.5). Compression can fail to successfully
> finalize the stream when a small output buffer is used resulting in a failed
> to init error, and decompression with a direct buffer can fail with an
> invalid src size error.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640436#comment-16640436 ] Jason Lowe commented on HADOOP-15820: - Thanks for moving this to a Blocker, [~leftnoteasy]. This issue can be particularly nasty since it corrupts the JVM process memory, which can result in a difficult-to-debug JVM crash much later in the process lifetime. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640310#comment-16640310 ] Jason Lowe commented on HADOOP-15822: - Minor fix to move the libzstd addition in the Dockerfile to its proper lexicographical place. > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Attachment: HADOOP-15822.002.patch > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Status: Patch Available (was: Open) The compression flushing failure has to do with how the JNI wrapper code was invoking the zstd library. When using a small output buffer, sometimes flushStream or endStream needs to be called successively to finish flushing everything, but the JNI code would always invoke the compressStream method on a null input buffer before invoking the flush/end call. Older versions of zstd apparently were OK with this, but the new ones are not. This patch skips calling compressStream if there is nothing in the input buffer to compress, so the zstd library will see a contiguous sequence of end stream calls towards the end of compression when using small output buffers. The decompress direct test failure is a bug in the interface between the Java layer and the JNI layer. The function takes a buffer pointer, a buffer length, and a buffer offset as arguments, but the Java layer was using remaining() instead of limit() to send down the size of the buffer. Occasionally during the test remaining() can be smaller than position(), and the zstd library rightfully complains that we are asking it to use a buffer past the end of the reported length. In addition, the test would sometimes fail to flip the output buffer, which would break the test when that occurred. These tests also were not running during precommit because the zstandard libraries were missing from the build environment, so this patch adds the libzstd package to the build environment Dockerfile. > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0, 2.9.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
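To make the remaining()-versus-limit() bug above concrete, here is a minimal, self-contained Java sketch of the direct-buffer values involved. The buffer size and the framing of the native contract are illustrative assumptions, not the actual Hadoop JNI signature:
{code:java}
import java.nio.ByteBuffer;

public class DirectBufferLengthSketch {
  public static void main(String[] args) {
    // Simulate a direct buffer whose position has advanced, as happens
    // mid-stream during decompression.
    ByteBuffer src = ByteBuffer.allocateDirect(4096);
    src.position(100);

    // The native side is handed the buffer's address, an offset, and a
    // reported size. Sending remaining() as the size under-reports the
    // usable region whenever position() > 0; limit() is the stable
    // bound of the buffer contents, per the explanation above.
    int wrongLen = src.remaining(); // 3996 -- shrinks as position grows
    int rightLen = src.limit();     // 4096 -- the actual content bound
    System.out.println("remaining=" + wrongLen + ", limit=" + rightLen);
  }
}
{code}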
[jira] [Updated] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15822: Attachment: HADOOP-15822.001.patch > zstd compressor can fail with a small output buffer > --- > > Key: HADOOP-15822 > URL: https://issues.apache.org/jira/browse/HADOOP-15822 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15822.001.patch > > > TestZStandardCompressorDecompressor fails a couple of tests on my machine > with the latest zstd library (1.3.5). Compression can fail to successfully > finalize the stream when a small output buffer is used resulting in a failed > to init error, and decompression with a direct buffer can fail with an > invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
[ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640275#comment-16640275 ] Jason Lowe commented on HADOOP-15822: - Sample test failures: {noformat} [INFO] Running org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor [ERROR] Tests run: 19, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.758 s <<< FAILURE! - in org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor [ERROR] testCompressingWithOneByteOutputBuffer(org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor) Time elapsed: 0.108 s <<< ERROR! java.lang.InternalError: Context should be init first at org.apache.hadoop.io.compress.zstd.ZStandardCompressor.deflateBytesDirect(Native Method) at org.apache.hadoop.io.compress.zstd.ZStandardCompressor.compress(ZStandardCompressor.java:216) at org.apache.hadoop.io.compress.CompressorStream.compress(CompressorStream.java:81) at org.apache.hadoop.io.compress.CompressorStream.finish(CompressorStream.java:92) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.testCompressingWithOneByteOutputBuffer(TestZStandardCompressorDecompressor.java:300) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) [ERROR] testZStandardDirectCompressDecompress(org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor) Time elapsed: 0.014 s <<< ERROR! 
java.lang.InternalError: Src size is incorrect at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native Method) at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateDirect(ZStandardDecompressor.java:264) at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor$ZStandardDirectDecompressor.decompress(ZStandardDecompressor.java:307) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.compressDecompressLoop(TestZStandardCompressorDecompressor.java:416) at org.apache.hadoop.io.compress.zstd.TestZStandardCompressorDecompressor.testZStandardDirectCompressDecompress(TestZStandardCompressorDecompressor.java:385) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at
[jira] [Created] (HADOOP-15822) zstd compressor can fail with a small output buffer
Jason Lowe created HADOOP-15822: --- Summary: zstd compressor can fail with a small output buffer Key: HADOOP-15822 URL: https://issues.apache.org/jira/browse/HADOOP-15822 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.9.0 Reporter: Jason Lowe Assignee: Jason Lowe TestZStandardCompressorDecompressor fails a couple of tests on my machine with the latest zstd library (1.3.5). Compression can fail to successfully finalize the stream when a small output buffer is used resulting in a failed to init error, and decompression with a direct buffer can fail with an invalid src size error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.2 3.0.4 2.9.2 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks again to [~benlau] for identifying the issue and to [~kihwal] and [~Jim_Brennan] for reviews! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Description: Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. Kudos to Ben Lau from our HBase team for discovering this issue. was:Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. Thanks for the reviews, [~kihwal] and [~Jim_Brennan]! Committing this. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. > Kudos to Ben Lau from our HBase team for discovering this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Status: Patch Available (was: Open) Attaching a patch that changes the setting from a long field to an int field. Oddly this was done correctly in the Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_inflateBytesDirect function but was wrong in the init function. > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0-alpha2, 2.9.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
[ https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15820: Attachment: HADOOP-15820.001.patch > ZStandardDecompressor native code sets an integer field as a long > - > > Key: HADOOP-15820 > URL: https://issues.apache.org/jira/browse/HADOOP-15820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15820.001.patch > > > Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in > ZStandardDecompressor.c sets the {{remaining}} field as a long when it > actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long
Jason Lowe created HADOOP-15820: --- Summary: ZStandardDecompressor native code sets an integer field as a long Key: HADOOP-15820 URL: https://issues.apache.org/jira/browse/HADOOP-15820 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0-alpha2, 2.9.0 Reporter: Jason Lowe Assignee: Jason Lowe Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in ZStandardDecompressor.c sets the {{remaining}} field as a long when it actually is an integer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15755: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.2 3.0.4 2.9.2 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks [~dineshchitlangia] and [~ljain]! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2 > > Attachments: HADOOP-15755.001.patch, HADOOP-15755.002.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619703#comment-16619703 ] Jason Lowe commented on HADOOP-15755: - Thanks for updating the patch! +1 lgtm. Committing this. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Dinesh Chitlangia >Priority: Major > Attachments: HADOOP-15755.001.patch, HADOOP-15755.002.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15755) StringUtils#createStartupShutdownMessage throws NPE when args is null
[ https://issues.apache.org/jira/browse/HADOOP-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614840#comment-16614840 ] Jason Lowe commented on HADOOP-15755: - Thanks for the report and patch! Fix looks fine. It could use Collections.emptyList but that's not a must-fix. Would you mind adding a unit test? It's trivial in this case since the test just needs to invoke the method with a null args parameter. That way if someone later refactors the method a test will verify this doesn't regress. > StringUtils#createStartupShutdownMessage throws NPE when args is null > - > > Key: HADOOP-15755 > URL: https://issues.apache.org/jira/browse/HADOOP-15755 > Project: Hadoop Common > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HADOOP-15755.001.patch > > > StringUtils#createStartupShutdownMessage uses > {code:java} > Arrays.asList(args) > {code} > which throws NPE when args is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
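For illustration, a minimal sketch of the null-safe shape being suggested above; the helper name is hypothetical, since the real fix lives inside createStartupShutdownMessage:
{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NullSafeArgsSketch {
  // Hypothetical helper: Arrays.asList(null) throws NullPointerException,
  // so fall back to an immutable empty list when no args were passed.
  static List<String> toList(String[] args) {
    return args == null ? Collections.<String>emptyList() : Arrays.asList(args);
  }

  public static void main(String[] unused) {
    System.out.println(toList(null));                     // []
    System.out.println(toList(new String[] {"-format"})); // [-format]
  }
}
{code}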
[jira] [Commented] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609359#comment-16609359 ] Jason Lowe commented on HADOOP-15738: - I added you as a contributor to the HADOOP and MAPREDUCE projects. > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved HADOOP-15738. - Resolution: Duplicate > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609314#comment-16609314 ] Jason Lowe commented on HADOOP-15738: - A "Fixed" resolution should only be used for cases where a patch has been committed. Marking this as a duplicate of MAPREDUCE-7137. Note that in the future JIRAs can be moved between projects rather than refiling them from scratch. See the "Move" action under the "More" tab at the top of the JIRA. > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Reopened] (HADOOP-15738) MRAppBenchmark.benchmark1() fails with NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened HADOOP-15738: - > MRAppBenchmark.benchmark1() fails with NullPointerException > --- > > Key: HADOOP-15738 > URL: https://issues.apache.org/jira/browse/HADOOP-15738 > Project: Hadoop Common > Issue Type: Bug > Components: test >Reporter: Oleksandr Shevchenko >Priority: Minor > > MRAppBenchmark.benchmark1() fails with NullPointerException: > 1. We do not set any queue for this test. As the result we got the following > exception: > {noformat} > 2018-09-10 17:04:23,486 ERROR [Thread-0] rm.RMCommunicator > (RMCommunicator.java:register(177)) - Exception while registering > java.lang.NullPointerException > at org.apache.avro.util.Utf8$2.toUtf8(Utf8.java:123) > at org.apache.avro.util.Utf8.getBytesFor(Utf8.java:172) > at org.apache.avro.util.Utf8.(Utf8.java:39) > at > org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent.(JobQueueChangeEvent.java:35) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.setQueueName(JobImpl.java:1167) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:174) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:301) > at org.apache.hadoop.mapreduce.v2.app.MRApp.submit(MRApp.java:285) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.run(MRAppBenchmark.java:72) > at > org.apache.hadoop.mapreduce.v2.app.MRAppBenchmark.benchmark1(MRAppBenchmark.java:194) > {noformat} > 2. We override createSchedulerProxy method and do not set application > priority that was added later by MAPREDUCE-6515. We got the following error: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleJobPriorityChange(RMContainerAllocator.java:1025) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:880) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:286) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:280) > at java.lang.Thread.run(Thread.java:748) > {noformat} > In both cases, the job never will be run and the test stuck and will not be > finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15722) regression: Hadoop 2.7.7 release breaks spark submit
[ https://issues.apache.org/jira/browse/HADOOP-15722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607592#comment-16607592 ] Jason Lowe commented on HADOOP-15722: - I haven't been able to reproduce the issue yet, but looking closer at the logs I think it's related to variable expansion. Another aspect of restricted parsing is that restricted resources are unable to access system properties or environment variables from the config, since those could potentially contain secrets. Looking at the following log snippets for the good and bad runs, the user.name system property is not getting expanded in the bad run because the conf resource is untrusted: Log excerpt from the session with hadoop 2.7.3: {noformat} 18/09/06 08:12:04 INFO SessionState: Created HDFS directory: /tmp/hive-admin/user_b/799640f8-3d34-4cb7-90fe-5368c22881d5 {noformat} Log excerpt from the session with hadoop 2.7.7: {noformat} 18/09/06 07:23:09 INFO SessionState: Created HDFS directory: /tmp/hive-${user.name}/user_b {noformat} [~yumwang] would you mind running with the following patch to Hadoop 2.7.7's Configuration to see if this fixes the issue or at least gets significantly farther? That would help validate my theory as to what's going on here. The patch keeps XML directives restricted for untrusted sources but re-enables system property access. {noformat} diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java index 5ce3e65..4df8491 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java @@ -905,7 +905,7 @@ public synchronized void reloadConfiguration() { private synchronized void addResourceObject(Resource resource) { resources.add(resource); // add to resources -restrictSystemProps |= resource.isParserRestricted(); +restrictSystemProps = false; reloadConfiguration(); } {noformat} If it is indeed the issue, then we may need to reconsider the restriction on system properties. Choices include: - Removing the property expansion restriction completely so all system and env properties are available, and it would be up to admins to sanitize these when starting proxy servers - Allowing system properties but restricting environment variables, if we feel env variables are more common for passing secrets - Using a whitelist for system properties > regression: Hadoop 2.7.7 release breaks spark submit > > > Key: HADOOP-15722 > URL: https://issues.apache.org/jira/browse/HADOOP-15722 > Project: Hadoop Common > Issue Type: Bug > Components: build, conf, security >Affects Versions: 2.7.7 >Reporter: Steve Loughran >Priority: Major > > SPARK-25330 highlights that upgrading spark to hadoop 2.7.7 is causing a > regression in client setup, with things only working when > {{Configuration.getRestrictParserDefault(Object resource)}} = false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
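As a quick illustration of the expansion behavior in question, a sketch of how Configuration normally substitutes system properties when a value is read back; the property name is made up for the example:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ExpansionSketch {
  public static void main(String[] args) {
    // No default resources; set a value containing a variable reference.
    Configuration conf = new Configuration(false);
    conf.set("example.scratch.dir", "/tmp/hive-${user.name}");

    // With unrestricted parsing, get() expands ${user.name} from the JVM
    // system properties (e.g. /tmp/hive-admin). The bad run above shows
    // what it looks like when expansion is suppressed for an untrusted
    // resource: the literal ${user.name} survives into the path.
    System.out.println(conf.get("example.scratch.dir"));
  }
}
{code}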
[jira] [Commented] (HADOOP-15722) regression: Hadoop 2.7.7 release breaks spark submit
[ https://issues.apache.org/jira/browse/HADOOP-15722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607336#comment-16607336 ] Jason Lowe commented on HADOOP-15722: - The getRestrictParserDefault method was added to address CVE-2017-15713 and shipped as part of 2.7.5. The idea behind the fix is to restrict the parsing of XML entities from a configuration Resource when that Resource may not be trusted. Resources that are parsed from the classpath are trusted, but resources that come from file streams as a proxy user are not. When parsing configs outside of the classpath as a proxy user, the contents are likely coming from conf data provided by a cluster user, and we would need to restrict certain XML entities in those cases. Failing to do so could expose the contents of local files on the server to the cluster user, which is the crux of the CVE. I'll try to work through the repro steps listed in SPARK-25330 to see if I can reproduce the issue locally. If successful, it should be relatively straightforward to see where the suspect conf is coming from and why it breaks when parsing of that conf is restricted. Note that restricted parsing doesn't mean the contents are not parsed at all, but rather that the parser won't honor certain requested directives in the XML stream. > regression: Hadoop 2.7.7 release breaks spark submit > > > Key: HADOOP-15722 > URL: https://issues.apache.org/jira/browse/HADOOP-15722 > Project: Hadoop Common > Issue Type: Bug > Components: build, conf, security >Affects Versions: 2.7.7 >Reporter: Steve Loughran >Priority: Major > > SPARK-25330 highlights that upgrading spark to hadoop 2.7.7 is causing a > regression in client setup, with things only working when > {{Configuration.getRestrictParserDefault(Object resource)}} = false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15614) TestGroupsCaching.testExceptionOnBackgroundRefreshHandled reliably fails
[ https://issues.apache.org/jira/browse/HADOOP-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546917#comment-16546917 ] Jason Lowe commented on HADOOP-15614: - It fails reliably when run in isolation on this line: {noformat} assertEquals(startingRequestCount, FakeGroupMapping.getRequestCount()); {noformat} but it also sporadically fails on this last code line below when run with the other tests: {noformat} // Now sleep for a short time and re-check the request count. It should have // increased, but the exception means the cache will not have updated Thread.sleep(50); FakeGroupMapping.setThrowException(false); assertEquals(startingRequestCount + 1, FakeGroupMapping.getRequestCount()); assertEquals(groups.getGroups("me").size(), 2); {noformat} The 50msec sleep screams racy test to me. > TestGroupsCaching.testExceptionOnBackgroundRefreshHandled reliably fails > > > Key: HADOOP-15614 > URL: https://issues.apache.org/jira/browse/HADOOP-15614 > Project: Hadoop Common > Issue Type: Bug >Reporter: Kihwal Lee >Priority: Major > > When {{testExceptionOnBackgroundRefreshHandled}} is run individually, it > reliably fails. It seems like a fundamental bug in the test or groups caching. > A similar issue was dealt with in HADOOP-13375. [~cheersyang], do you have > any insight into this? > This test case was added in HADOOP-13263. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
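If the race theory holds, one conventional deflaking approach is to poll for the expected state rather than sleeping a fixed 50 msec. A self-contained sketch assuming the Supplier-based GenericTestUtils.waitFor overload; the counter here stands in for FakeGroupMapping's request count, and the timeouts are illustrative:
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.test.GenericTestUtils;

public class PollInsteadOfSleepSketch {
  public static void main(String[] args) throws Exception {
    final AtomicInteger requestCount = new AtomicInteger();

    // Background refresh that finishes at some unpredictable time.
    new Thread(() -> {
      try { Thread.sleep(200); } catch (InterruptedException ignored) { }
      requestCount.incrementAndGet();
    }).start();

    // Poll for the condition instead of hoping a fixed sleep was enough.
    GenericTestUtils.waitFor(() -> requestCount.get() == 1,
        10 /* check interval msec */, 5000 /* timeout msec */);
    System.out.println("request count = " + requestCount.get());
  }
}
{code}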
[jira] [Commented] (HADOOP-15528) Deprecate ContainerLaunch#link by using FileUtil#SymLink
[ https://issues.apache.org/jira/browse/HADOOP-15528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537672#comment-16537672 ] Jason Lowe commented on HADOOP-15528: - It may be useful to have a config to disable this, given that it could cause difficulties, but it'd also be nice if we could avoid users shooting themselves in the foot with this config. If we know using this config with some container executors makes no sense, then it'd be nice to either fail fast on NM startup, warn it's being ignored, or otherwise do something smarter than just failing every container execution in a difficult-to-debug manner. > Deprecate ContainerLaunch#link by using FileUtil#SymLink > > > Key: HADOOP-15528 > URL: https://issues.apache.org/jira/browse/HADOOP-15528 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: HADOOP-15528-HADOOP-15461.v1.patch, > HADOOP-15528-HADOOP-15461.v2.patch, HADOOP-15528-HADOOP-15461.v3.patch > > > {{ContainerLaunch}} currently uses its own utility to create links (including > winutils). > This should be deprecated and rely on {{FileUtil#SymLink}} which is already > multi-platform and pure Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15121) Encounter NullPointerException when using DecayRpcScheduler
[ https://issues.apache.org/jira/browse/HADOOP-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15121: Fix Version/s: 3.1.0 3.0.1 2.8.5 2.9.2 2.10.0 Thanks, [~Tao Jie]! I recently ran across this in 2.8 and saw it still wasn't fixed in branch-2.8, so I committed this to branch-2, branch-2.9, and branch-2.8. I also saw the Fix Version field in the JIRA wasn't set when this was committed, so I updated that as well to reflect the 3.x versions. > Encounter NullPointerException when using DecayRpcScheduler > --- > > Key: HADOOP-15121 > URL: https://issues.apache.org/jira/browse/HADOOP-15121 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.2 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Fix For: 3.1.0, 2.10.0, 3.0.1, 2.9.2, 2.8.5 > > Attachments: HADOOP-15121.001.patch, HADOOP-15121.002.patch, > HADOOP-15121.003.patch, HADOOP-15121.004.patch, HADOOP-15121.005.patch, > HADOOP-15121.006.patch, HADOOP-15121.007.patch, HADOOP-15121.008.patch > > > I set ipc.8020.scheduler.impl to org.apache.hadoop.ipc.DecayRpcScheduler, but > got excetion in namenode: > {code} > 2017-12-15 15:26:34,662 ERROR impl.MetricsSourceAdapter > (MetricsSourceAdapter.java:getMetrics(202)) - Error getting metrics from > source DecayRpcSchedulerMetrics2.ipc.8020 > java.lang.NullPointerException > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getMetrics(DecayRpcScheduler.java:781) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:199) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:182) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:155) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) > at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:66) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:100) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:233) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.registerMetrics2Source(DecayRpcScheduler.java:709) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.(DecayRpcScheduler.java:685) > at > org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getInstance(DecayRpcScheduler.java:693) > at > org.apache.hadoop.ipc.DecayRpcScheduler.(DecayRpcScheduler.java:236) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.CallQueueManager.createScheduler(CallQueueManager.java:102) > at > org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:76) > at org.apache.hadoop.ipc.Server.(Server.java:2612) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958) > at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:374) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:349) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:415) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:755) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:697) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:905) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:884) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1610) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1678) > {code} > It seems that
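The stack trace above shows MetricsProxy registering itself with the metrics system from inside its own constructor, so getMetrics() can run before the scheduler being proxied has been hooked up. A self-contained sketch of that shape; the class layout is an assumption for illustration, not the actual Hadoop code:
{code:java}
import java.lang.ref.WeakReference;

public class ProxyInitOrderSketch {
  interface Source { String getMetrics(); }

  static class MetricsProxy implements Source {
    private WeakReference<Source> delegate = new WeakReference<Source>(null);

    MetricsProxy() {
      // Registration from the constructor can trigger a metrics fetch
      // before setDelegate() has ever been called.
      System.out.println("register-time fetch: " + getMetrics());
    }

    void setDelegate(Source s) { delegate = new WeakReference<Source>(s); }

    @Override
    public String getMetrics() {
      Source s = delegate.get();
      // Without this null check, dereferencing s here is the NPE in the
      // stack trace; with it, early fetches degrade gracefully.
      return s == null ? "(no scheduler yet)" : s.getMetrics();
    }
  }

  public static void main(String[] args) {
    MetricsProxy proxy = new MetricsProxy();
    proxy.setDelegate(() -> "scheduler metrics");
    System.out.println("post-init fetch: " + proxy.getMetrics());
  }
}
{code}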
[jira] [Commented] (HADOOP-15528) Deprecate ContainerLaunch#link by using FileUtil#SymLink
[ https://issues.apache.org/jira/browse/HADOOP-15528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535409#comment-16535409 ] Jason Lowe commented on HADOOP-15528: - Sorry for the delay in replying, as I recently got back from an extended vacation and am catching up on things. bq. However, the new behavior is the symlink operation is executed by NM itself, which is executed as a child process under NM itself, it shares the same execution environment as NM. This cannot work in a secure environment. Well at least the one we have today on Linux with the native container executor. In that secure environment the container is running as the user and therefore has access to things that the NM user does not. The container working directory is one of those things. Normally the NM user has no need or reason to be able to see the contents of the container working directory nor be able to modify it. > Deprecate ContainerLaunch#link by using FileUtil#SymLink > > > Key: HADOOP-15528 > URL: https://issues.apache.org/jira/browse/HADOOP-15528 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: HADOOP-15528-HADOOP-15461.v1.patch, > HADOOP-15528-HADOOP-15461.v2.patch, HADOOP-15528-HADOOP-15461.v3.patch > > > {{ContainerLaunch}} currently uses its own utility to create links (including > winutils). > This should be deprecated and rely on {{FileUtil#SymLink}} which is already > multi-platform and pure Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15571) Multiple FileContexts created with the same configuration object should be allowed to have different umask
[ https://issues.apache.org/jira/browse/HADOOP-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535342#comment-16535342 ] Jason Lowe commented on HADOOP-15571: - branch-3 should not exist; branch-3.0 is what you want. Please cherry-pick this to branch-3.0 and update the fix versions to include 3.0.4. > Multiple FileContexts created with the same configuration object should be > allowed to have different umask > -- > > Key: HADOOP-15571 > URL: https://issues.apache.org/jira/browse/HADOOP-15571 > Project: Hadoop Common > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: HADOOP-15571.1.txt, HADOOP-15571.txt > > > Ran into a super hard-to-debug issue due to this. [Edit: Turns out the same > issue as YARN-5749 that [~Tao Yang] ran into] > h4. Issue > Configuration conf = new Configuration(); > fc1 = FileContext.getFileContext(uri1, conf); > fc2 = FileContext.getFileContext(uri2, conf); > fc.setUMask(umask_for_fc1); // Screws up umask for fc2 also! > This was not the case before HADOOP-13440. > h4. Symptoms: > h5. Scenario I ran into > When trying to localize a HDFS directory (hdfs:///my/dir/1.txt), NodeManager > tries to replicate the directory structure on the local file-system > ($yarn-local-dirs/filecache/my/dir/1.txt). > Now depending on whether NM has ever done a log-aggregation (completely > unrelated code that sets umask to be 137 for its own files on HDFS), the > directories /my and /my/dir on local-fs may have different permissions. In > the specific case where NM did log-aggregation, /my/dir was created with 137 > umask and so localization of 1.txt completely failed due to absent directory > executable permissions! > h5. Previous scenarios: > We ran into this before in test-cases and instead of fixing the root-cause, > we just fixed the test-cases: YARN-5679 / YARN-5749 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
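For reference, a short self-contained sketch of the failure mode described in the report. The URIs are illustrative, and the final comment describes the pre-fix behavior, where setUMask wrote through to the shared Configuration:
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.permission.FsPermission;

public class SharedUmaskSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileContext fc1 = FileContext.getFileContext(URI.create("file:///"), conf);
    FileContext fc2 = FileContext.getFileContext(URI.create("file:///"), conf);

    fc1.setUMask(new FsPermission((short) 0137));
    // Before the fix this printed 0137: fc1's setUMask stored the umask
    // in the shared Configuration object, so fc2 silently picked it up.
    System.out.println("fc2 umask = " + fc2.getUMask());
  }
}
{code}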
[jira] [Updated] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation
[ https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15450: Target Version/s: 2.8.4, 3.1.1, 2.9.2, 3.0.3 Fix Version/s: (was: 2.8.4) > Avoid fsync storm triggered by DiskChecker and handle disk full situation > - > > Key: HADOOP-15450 > URL: https://issues.apache.org/jira/browse/HADOOP-15450 > Project: Hadoop Common > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Arpit Agarwal >Priority: Blocker > > Fix disk checker issues reported by [~kihwal] in HADOOP-13738: > # When space is low, the os returns ENOSPC. Instead simply stop writing, the > drive is marked bad and replication happens. This make cluster-wide space > problem worse. If the number of "failed" drives exceeds the DFIP limit, the > datanode shuts down. > # There are non-hdfs users of DiskChecker, who use it proactively, not just > on failures. This was fine before, but now it incurs heavy I/O due to > introduction of fsync() in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450026#comment-16450026 ] Jason Lowe commented on HADOOP-15385: - The failures are unrelated and pass for me locally on branch-2 with or without the patch applied. The ASF warnings are for hs_err pid files, so apparently the JVM was crashing on the system at least twice during the tests. That could easily explain why the test was failing with timeouts and other errors if the launched distcp didn't complete properly. I'm not sure that's so much a problem with the test as it is with the setup of the Jenkins host. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. 
> {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15398) StagingTestBase uses methods not available in Mockito 1.8.5
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449975#comment-16449975 ] Jason Lowe commented on HADOOP-15398: - I discovered HADOOP-12427 which is essentially proposing the same thing. It looks like there's a chance this update could break some existing tests, so we really should verify no test starts breaking as a result of this change. [~arshadmohammad] can you look into this? I'll try to find some time to run tests on my end as well. > StagingTestBase uses methods not available in Mockito 1.8.5 > --- > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: HADOOP-15398.001.patch > > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
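For reference, the 1.8.5-compatible replacement for {{getArgumentAt(int, Class)}} is an explicit cast over {{InvocationOnMock#getArguments()}}, which exists in both Mockito versions. A minimal sketch, not the actual StagingTestBase code; the String argument type is illustrative:
{code:java}
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;

// Illustrative Answer showing the Mockito 1.8.5-compatible idiom.
public class FirstArgAnswer implements Answer<String> {
  @Override
  public String answer(InvocationOnMock invocation) {
    // Mockito 2.x:   invocation.getArgumentAt(0, String.class)
    // Mockito 1.8.5: index into getArguments() and cast explicitly.
    String arg = (String) invocation.getArguments()[0];
    return arg;
  }
}
{code}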
[jira] [Updated] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
[ https://issues.apache.org/jira/browse/HADOOP-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15406: Assignee: Jason Lowe Target Version/s: 3.2.0, 3.1.1, 3.0.3 Status: Patch Available (was: Open) > hadoop-nfs dependencies for mockito and junit are not test scope > > > Key: HADOOP-15406 > URL: https://issues.apache.org/jira/browse/HADOOP-15406 > Project: Hadoop Common > Issue Type: Bug > Components: nfs >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Major > Attachments: HADOOP-15406.001.patch > > > hadoop-nfs asks for mockito-all and junit for its unit tests but it does not > mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
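For context, restricting a dependency to test compilation and execution is a one-line scope change in the module pom. A sketch of what the fix presumably looks like; version elements are omitted on the assumption they are managed by the parent pom:
{code:xml}
<!-- hadoop-nfs/pom.xml (sketch): limit test-only dependencies to test scope -->
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
  <scope>test</scope>
</dependency>
{code}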
[jira] [Updated] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
[ https://issues.apache.org/jira/browse/HADOOP-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15406: Attachment: HADOOP-15406.001.patch > hadoop-nfs dependencies for mockito and junit are not test scope > > > Key: HADOOP-15406 > URL: https://issues.apache.org/jira/browse/HADOOP-15406 > Project: Hadoop Common > Issue Type: Bug > Components: nfs >Reporter: Jason Lowe >Priority: Major > Attachments: HADOOP-15406.001.patch > > > hadoop-nfs asks for mockito-all and junit for its unit tests but it does not > mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12427) [JDK8] Upgrade Mockito version to 1.10.19
[ https://issues.apache.org/jira/browse/HADOOP-12427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448931#comment-16448931 ] Jason Lowe commented on HADOOP-12427: - Ran across this as part of analyzing HADOOP-15398. Was there anything that kept this from going in? HADOOP-15398 proposes the same fix for its transient compile issue -- upgrading from 1.8.5 to 1.10.19. > [JDK8] Upgrade Mockito version to 1.10.19 > - > > Key: HADOOP-12427 > URL: https://issues.apache.org/jira/browse/HADOOP-12427 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: HADOOP-12427.v0.patch > > > The current version is 1.8.5 - inserted in 2011. > JDK 8 has been supported since 1.10.0. > https://github.com/mockito/mockito/blob/master/doc/release-notes/official.md > "Compatible with JDK8 with exception of defender methods, JDK8 support will > improve in 2.0" > http://mockito.org/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15398) StagingTestBase uses methods not available in Mockito 1.8.5
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15398: --- Assignee: Mohammad Arshad Summary: StagingTestBase uses methods not available in Mockito 1.8.5 (was: Compilation error in trunk in hadoop-aws ) Thanks for the patch! +1 lgtm. I'll commit this tomorrow if there are no objections. > StagingTestBase uses methods not available in Mockito 1.8.5 > --- > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: HADOOP-15398.001.patch > > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15406) hadoop-nfs dependencies for mockito and junit are not test scope
Jason Lowe created HADOOP-15406: --- Summary: hadoop-nfs dependencies for mockito and junit are not test scope Key: HADOOP-15406 URL: https://issues.apache.org/jira/browse/HADOOP-15406 Project: Hadoop Common Issue Type: Bug Components: nfs Reporter: Jason Lowe hadoop-nfs asks for mockito-all and junit for its unit tests but it does not mark the dependency as being required only for tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15403) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/HADOOP-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448602#comment-16448602 ] Jason Lowe commented on HADOOP-15403: - bq. would a change in config be ok? A change in the default value for a config is arguably the same thing as a code change that changes the default behavior from the perspective of a user. To be clear I'm not saying we can't ever change the default behavior, but we need to be careful about the ramifications. If we do, it needs to be marked as an incompatible change and have a corresponding release note that clearly explains the potential for silent data loss relative to the old behavior and what users can do to restore the old behavior. Given the behavior for non-recursive has been this way for quite a long time, either users aren't running into this very often or they've set the value to recursive. That leads me to suggest adding the ability to ignore directories but _not_ making it the default. Then we don't have a backward incompatibility, and the Hive case you're trying to support can still work once the config is updated (or Hive can run the job with that setting automatically if it makes sense for that use case). > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: HADOOP-15403 > URL: https://issues.apache.org/jira/browse/HADOOP-15403 > Project: Hadoop Common > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
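To make the opt-in suggestion concrete, here is a sketch of a check the listing code could apply; the config key name and helper are hypothetical, and the default preserves today's failing behavior:
{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical helper: decide whether a FileStatus belongs in the input list.
// The config key is made up for illustration; false keeps the old behavior.
public class InputFilterSketch {
  static boolean keepAsInput(JobConf job, FileStatus stat, boolean recursive) {
    if (!stat.isDirectory()) {
      return true;   // plain files are always input candidates
    }
    if (recursive) {
      return false;  // directories go to dirsNeedingRecursiveCalls instead
    }
    // Opt-in: silently skip sub-directories rather than failing in getSplits().
    return !job.getBoolean(
        "mapreduce.input.fileinputformat.input.dir.nonrecursive.ignore.subdirs",
        false);
  }
}
{code}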
[jira] [Commented] (HADOOP-15403) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/HADOOP-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448264#comment-16448264 ] Jason Lowe commented on HADOOP-15403: - Does this have backward compatibility ramifications? The default for mapreduce.input.fileinputformat.input.dir.recursive is false, so unless users changed it the jobs are failing today if the input contains directories. If we change the behavior to ignore directories, that could lead to silent data loss if the job tried to consume an input location that now suddenly contains some directories. In short: is it OK to assume the users will be aware of and agree with the new behavior? Is there any way for users to revert to the old behavior if they do not want any inputs to be silently ignored? > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: HADOOP-15403 > URL: https://issues.apache.org/jira/browse/HADOOP-15403 > Project: Hadoop Common > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15385: Status: Patch Available (was: Open) Here's a patch that uses a unique path for each test suite that won't collide with the hadoop.tmp.dir set during the unit test runs. This is something we would need to do for parallel unit tests anyway. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
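For context, the usual shape of such a fix is to derive each suite's directory from the test class name instead of sharing target/tmp. A minimal sketch, not the attached patch; test.build.data is the conventional Hadoop test-data property:
{code:java}
import java.io.File;

// Sketch: per-suite test root that cannot collide with hadoop.tmp.dir,
// which the unit-test runs point at target/tmp.
public class TestDirSketch {
  static File testRootFor(Class<?> suite) {
    File root = new File(
        System.getProperty("test.build.data", "target/test/data"),
        suite.getSimpleName());
    root.mkdirs();   // best effort; tests fail later if creation failed
    return root;
  }
}
{code}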
[jira] [Updated] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15385: Attachment: HADOOP-15385-branch-2.001.patch > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > Attachments: HADOOP-15385-branch-2.001.patch > > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15385) Many tests are failing in hadoop-distcp project in branch-2.8
[ https://issues.apache.org/jira/browse/HADOOP-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15385: --- Assignee: Jason Lowe Affects Version/s: (was: 2.8.3) 2.8.2 Target Version/s: 2.10.0, 2.9.1, 2.8.4 (was: 2.9.1, 2.8.4) Thanks for the analysis, [~tasanuma0829]! I took a quick look at this, and the failures are an unfortunate collision between the test directory and the staging directory. Both of these end up using 'target/tmp' as a directory, so the staging directory used during the launch of the distcp job ends up showing up in the test directory and confuses the test when it checks how many files are in its test directory. I can put up a patch shortly. > Many tests are failing in hadoop-distcp project in branch-2.8 > - > > Key: HADOOP-15385 > URL: https://issues.apache.org/jira/browse/HADOOP-15385 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 2.8.2 >Reporter: Rushabh S Shah >Assignee: Jason Lowe >Priority: Blocker > > Many tests are failing in hadoop-distcp project in branch-2.8 > Below are the failing tests. > {noformat} > Failed tests: > > TestDistCpViewFs.testUpdateGlobTargetMissingSingleLevel:326->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingMultiLevel:346->checkResult:428 > expected:<4> but was:<5> > TestDistCpViewFs.testGlobTargetMissingSingleLevel:306->checkResult:428 > expected:<2> but was:<3> > TestDistCpViewFs.testUpdateGlobTargetMissingMultiLevel:367->checkResult:428 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > TestIntegration.testUpdateGlobTargetMissingSingleLevel:431->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingMultiLevel:454->checkResult:577 > expected:<4> but was:<5> > TestIntegration.testGlobTargetMissingSingleLevel:408->checkResult:577 > expected:<2> but was:<3> > TestIntegration.testUpdateGlobTargetMissingMultiLevel:478->checkResult:577 > expected:<6> but was:<8> > Tests run: 258, Failures: 16, Errors: 0, Skipped: 0 > {noformat} > {noformat} > rushabhs$ pwd > /Users/rushabhs/hadoop/apacheHadoop/hadoop/hadoop-tools/hadoop-distcp > rushabhs$ git branch > branch-2 > branch-2.7 > * branch-2.8 > branch-2.9 > branch-3.0 > rushabhs$ git log --oneline | head -n3 > c4ea1c8bb73 HADOOP-14970. MiniHadoopClusterManager doesn't respect lack of > format option. Contributed by Erik Krogen > 1548205a845 YARN-8147. TestClientRMService#testGetApplications sporadically > fails. Contributed by Jason Lowe > c01b425ba31 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom > leveldb logger. Contributed by Jason Lowe. 
> {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Moved] (HADOOP-15398) Compilation error in trunk in hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe moved HDFS-13472 to HADOOP-15398: Target Version/s: 3.2.0, 3.1.1 (was: 3.2.0) Key: HADOOP-15398 (was: HDFS-13472) Project: Hadoop Common (was: Hadoop HDFS) > Compilation error in trunk in hadoop-aws > - > > Key: HADOOP-15398 > URL: https://issues.apache.org/jira/browse/HADOOP-15398 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > > *Problem:* hadoop trunk compilation is failing > *Root Cause:* > compilation error is coming from > {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. Compilation > error is "The method getArgumentAt(int, Class) is > undefined for the type InvocationOnMock". > StagingTestBase is using getArgumentAt(int, Class) method > which is not available in mockito-all 1.8.5 version. getArgumentAt(int, > Class) method is available only from version 2.0.0-beta > *Expectations:* > Either mockito-all version to be upgraded or test case to be written only > with available functions in 1.8.5. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution
[ https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15357: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.3 2.9.2 3.1.1 3.2.0 2.10.0 Status: Resolved (was: Patch Available) Thanks to [~Jim_Brennan] for the contribution and to [~lmccay] for additional review! I committed this to trunk, branch-3.1, branch-3.0, branch-2, and branch-2.9. > Configuration.getPropsWithPrefix no longer does variable substitution > - > > Key: HADOOP-15357 > URL: https://issues.apache.org/jira/browse/HADOOP-15357 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3 > > Attachments: HADOOP-15357.001.patch, HADOOP-15357.002.patch, > HADOOP-15357.003.patch > > > Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the > Configuration.get() method to get the value of the variables. After > [HADOOP-13556], it now uses props.getProperty(). > The difference is that Configuration.get() does deprecation handling and more > importantly variable substitution on the value. So if a property has a > variable specified with ${variable_name}, it will no longer be expanded when > retrieved via getPropsWithPrefix(). > Was this change in behavior intentional? I am using this function in the fix > for [MAPREDUCE-7069], but we do want variable expansion to happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
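For reference, the behavioral difference comes down to which accessor collects the values. A sketch of the substituting variant, not the committed patch:
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Sketch: gather prefixed properties via Configuration.get(), which applies
// deprecation handling and ${var} substitution; Properties.getProperty()
// returns the raw, unexpanded value.
public class PrefixPropsSketch {
  static Map<String, String> propsWithPrefix(Configuration conf, String prefix) {
    Map<String, String> result = new HashMap<>();
    for (Map.Entry<String, String> entry : conf) { // Configuration is Iterable
      String name = entry.getKey();
      if (name.startsWith(prefix)) {
        // conf.get(name) expands ${variable}; entry.getValue() would not.
        result.put(name.substring(prefix.length()), conf.get(name));
      }
    }
    return result;
  }
}
{code}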
[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution
[ https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15357: Affects Version/s: 2.9.0 3.0.0 Target Version/s: 3.2.0, 3.1.1, 2.9.2, 3.0.3 +1 the latest patch looks good to me as well. I'll commit this later today if there are no objections. > Configuration.getPropsWithPrefix no longer does variable substitution > - > > Key: HADOOP-15357 > URL: https://issues.apache.org/jira/browse/HADOOP-15357 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: HADOOP-15357.001.patch, HADOOP-15357.002.patch, > HADOOP-15357.003.patch > > > Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the > Configuration.get() method to get the value of the variables. After > [HADOOP-13556], it now uses props.getProperty(). > The difference is that Configuration.get() does deprecation handling and more > importantly variable substitution on the value. So if a property has a > variable specified with ${variable_name}, it will no longer be expanded when > retrieved via getPropsWithPrefix(). > Was this change in behavior intentional? I am using this function in the fix > for [MAPREDUCE-7069], but we do want variable expansion to happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15372) Race conditions and possible leaks in the Shell class
[ https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15372: --- Assignee: Eric Badger > Race conditions and possible leaks in the Shell class > - > > Key: HADOOP-15372 > URL: https://issues.apache.org/jira/browse/HADOOP-15372 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 3.2.0 >Reporter: Miklos Szegedi >Assignee: Eric Badger >Priority: Minor > > YARN-5641 introduced some cleanup code in the Shell class. It has a race > condition. {{Shell. > runCommand()}} can be called while/after {{Shell.getAllShells()}} returned > all the shells to be cleaned up. This new thread can avoid the clean up, so > that the process held by it can be leaked causing leaked localized files/etc. > I see another issue as well. {{Shell.runCommand()}} has a finally block with > a {{ > process.destroy();}} to clean up. However, the try catch block does not cover > all instructions after the process is started, so for example we can exit the > thread and leak the process, if {{ > timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an > exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15372) Race conditions and possible leaks in the Shell class
[ https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430747#comment-16430747 ] Jason Lowe commented on HADOOP-15372: - Thanks for the report, Miklos! bq. runCommand()}} can be called while/after Shell.getAllShells() returned all the shells to be cleaned up. This new thread can avoid the clean up, so that the process held by it can be leaked causing leaked localized files/etc. Yes, there's a small bug where Shell#runCommand should be synchronizing on CHILD_SHELLS as it starts the subprocess. That way it either won't be started before destroyAllShells is called or it will be part of the list once it's been started. However I would argue it's outside the scope of the getAllShells and destroyAllShells APIs to prevent all future shells from being launched, as there may be a use case where someone wants to clean up all current shells but still launch future ones. Its job is to kill all active ones, which client code outside of Shell cannot do. In the specific case of localizing, it looks like we need a second destroy pass after awaiting the shutdown of the executor to catch any shell that was trying to launch just as we destroyed the active ones. {quote}I see another issue as well. [...] the try catch block does not cover all instructions after the process is started, so for example we can exit the thread and leak the process {quote} Yes, that appears to be an issue as well. [~ebadger] do you have some cycles to look into this? > Race conditions and possible leaks in the Shell class > - > > Key: HADOOP-15372 > URL: https://issues.apache.org/jira/browse/HADOOP-15372 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 3.2.0 >Reporter: Miklos Szegedi >Priority: Minor > > YARN-5641 introduced some cleanup code in the Shell class. It has a race > condition. {{Shell. > runCommand()}} can be called while/after {{Shell.getAllShells()}} returned > all the shells to be cleaned up. This new thread can avoid the clean up, so > that the process held by it can be leaked causing leaked localized files/etc. > I see another issue as well. {{Shell.runCommand()}} has a finally block with > a {{ > process.destroy();}} to clean up. However, the try catch block does not cover > all instructions after the process is started, so for example we can exit the > thread and leak the process, if {{ > timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an > exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
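To illustrate the two fixes discussed above in one place, here is a heavily simplified sketch; all names are illustrative and the real Shell class has far more machinery:
{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of both fixes: (1) register the shell under the same lock that
// guards the cleanup list, so a launch cannot slip in between getAllShells()
// and the kill; (2) keep everything after process start inside try/finally
// so the process cannot leak if later setup throws.
public class ShellSketch {
  private static final Set<ShellSketch> CHILD_SHELLS =
      Collections.synchronizedSet(new HashSet<ShellSketch>());

  private Process process;

  public void runCommand(String... cmd) throws IOException {
    synchronized (CHILD_SHELLS) {
      process = new ProcessBuilder(cmd).start();
      CHILD_SHELLS.add(this);       // registered before the lock is released
    }
    try {
      // schedule timeouts, pump stdout/stderr, wait for exit ...
      process.waitFor();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      process.destroy();            // runs even if scheduling/reading throws
      CHILD_SHELLS.remove(this);
    }
  }

  public static void destroyAllShells() {
    synchronized (CHILD_SHELLS) {
      for (ShellSketch shell : new HashSet<>(CHILD_SHELLS)) {
        shell.process.destroy();
      }
    }
  }
}
{code}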
[jira] [Reopened] (HADOOP-13500) Concurrency issues when using Configuration iterator
[ https://issues.apache.org/jira/browse/HADOOP-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened HADOOP-13500: - This is not a duplicate of HADOOP-13556. That JIRA only changed the getPropsWithPrefix method, which was not involved in the error reported by this JIRA or TEZ-3413. AFAICT iterating a shared Configuration object is still unsafe. > Concurrency issues when using Configuration iterator > > > Key: HADOOP-13500 > URL: https://issues.apache.org/jira/browse/HADOOP-13500 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Major > > It is possible to encounter a ConcurrentModificationException while trying to > iterate a Configuration object. The iterator method tries to walk the > underlying Properties object without proper synchronization, so another thread > simultaneously calling the set method can trigger it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
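Until proper synchronization lands, a common workaround is to iterate a private copy; since the underlying Properties is a synchronized Hashtable, taking the snapshot itself is consistent. A minimal sketch:
{code:java}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Workaround sketch: iterate a private snapshot so a concurrent set() on the
// shared Configuration cannot throw ConcurrentModificationException here.
public class ConfIterSketch {
  static void dumpProps(Configuration shared) {
    Configuration snapshot = new Configuration(shared); // copy constructor
    for (Map.Entry<String, String> entry : snapshot) {
      System.out.println(entry.getKey() + "=" + entry.getValue());
    }
  }
}
{code}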
[jira] [Commented] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.2
[ https://issues.apache.org/jira/browse/HADOOP-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411600#comment-16411600 ] Jason Lowe commented on HADOOP-15336: - Could you elaborate in the description on the nature of the NPE? A sample stacktrace would be immensely helpful here. > NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication > between 2.7 and 3.2 > - > > Key: HADOOP-15336 > URL: https://issues.apache.org/jira/browse/HADOOP-15336 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0, 3.2.0 >Reporter: Sherwood Zheng >Assignee: Sherwood Zheng >Priority: Major > Labels: backward-incompatible, common > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391522#comment-16391522 ] Jason Lowe commented on HADOOP-15206: - skipBytes is decremented because of the read() call. The skip() call is not guaranteed to be able to skip, and the workaround in that case is to try to read(). If the read() is successful then we were able to skip one more byte and need to account for that in the total number of bytes trying to be skipped. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 2.7.6, 3.0.2 > > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
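For readers following the exchange, the pattern in question is the standard skip-with-read fallback; a simplified sketch of the loop (names are illustrative, not the CBZip2InputStream code):
{code:java}
import java.io.IOException;
import java.io.InputStream;

// Simplified sketch: skip() may legally skip zero bytes, so a one-byte read()
// is the fallback, and a successful read counts as one more byte skipped.
public class SkipSketch {
  static void skipFully(InputStream in, long skipBytes) throws IOException {
    while (skipBytes > 0) {
      long skipped = in.skip(skipBytes);
      if (skipped == 0) {
        if (in.read() == -1) {
          break;            // end of stream; nothing left to skip
        }
        skipped = 1;        // the read() consumed one byte
      }
      skipBytes -= skipped; // decremented for read() exactly as for skip()
    }
  }
}
{code}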
[jira] [Updated] (HADOOP-15284) Docker launch fails when user private filecache directory is missing
[ https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15284: Affects Version/s: 3.1.0 Summary: Docker launch fails when user private filecache directory is missing (was: Could not determine real path of mount) ContainerLocalizer, which is run for every user-specific localization (i.e.: PRIVATE and APPLICATION visibility), creates both the usercache/_user_/filecache and usercache/_user_/appcache directories whenever it runs (see ContainerLocalizer#initDirs). If this directory is missing, then I'm wondering if this is a case where _nothing_ was localized for this user, not just PRIVATE but also no APPLICATION visibility resources (i.e.: only public resources or no resources at all). The only reason this would have worked before YARN-7815 is because the container executor creates the container work directory, which exists under the usercache/_user_ directory, and that's what it used to mount before the changes in YARN-7815. > Docker launch fails when user private filecache directory is missing > > > Key: HADOOP-15284 > URL: https://issues.apache.org/jira/browse/HADOOP-15284 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Eric Yang >Priority: Major > > Docker container is failing to launch in trunk. The root cause is: > {code} > [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: > [2018-03-02 23:26:09.196]Exception from container-launch. > Container id: container_1520032931921_0001_01_20 > Exit code: 29 > Exception message: image: hadoop/centos:latest is trusted in hadoop registry. > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Invalid docker mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache', > realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache > Error constructing docker command, docker error code=12, error > message='Invalid docker mount' > Shell output: main : command provided 4 > main : run as user is hbase > main : requested yarn user is hbase > Creating script paths... > Creating local dirs... > [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 > 23:26:09.240] > [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29. > [2018-03-02 23:26:39.278]Could not find > nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid > in any of the directories > [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down > now... > {code} > The filecache cannot be mounted because it doesn't exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15284) Could not determine real path of mount
[ https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386208#comment-16386208 ] Jason Lowe commented on HADOOP-15284: - Looks like this was caused by YARN-7815. The user's directory that was mounted before is always going to be there because the container executor creates the underlying container directory, but the user's filecache directory for resources with PRIVATE visibility may not be there. One straightforward fix is to have the container executor ensure the user's filecache directory is present when launching Docker containers, but there may be cleaner alternatives. > Could not determine real path of mount > -- > > Key: HADOOP-15284 > URL: https://issues.apache.org/jira/browse/HADOOP-15284 > Project: Hadoop Common > Issue Type: Bug >Reporter: Eric Yang >Priority: Major > > Docker container is failing to launch in trunk. The root cause is: > {code} > [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: > [2018-03-02 23:26:09.196]Exception from container-launch. > Container id: container_1520032931921_0001_01_20 > Exit code: 29 > Exception message: image: hadoop/centos:latest is trusted in hadoop registry. > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Could not determine real path of mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache' > Invalid docker mount > '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache', > realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache > Error constructing docker command, docker error code=12, error > message='Invalid docker mount' > Shell output: main : command provided 4 > main : run as user is hbase > main : requested yarn user is hbase > Creating script paths... > Creating local dirs... > [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 > 23:26:09.240] > [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29. > [2018-03-02 23:26:39.278]Could not find > nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid > in any of the directories > [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down > now... > {code} > The filecache cannot be mounted because it doesn't exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
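A sketch of the straightforward fix mentioned above, shown at the Java layer rather than in the native container-executor; the path layout mirrors the log output and the helper name is made up:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: ensure the user's private filecache directory exists before it is
// offered to Docker as a bind mount; realpath resolution fails otherwise.
public class FilecacheSketch {
  static Path ensureUserFilecache(String localDir, String user)
      throws IOException {
    Path filecache = Paths.get(localDir, "usercache", user, "filecache");
    Files.createDirectories(filecache); // no-op if it already exists
    return filecache;
  }
}
{code}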
[jira] [Commented] (HADOOP-15279) increase maven heap size recommendations
[ https://issues.apache.org/jira/browse/HADOOP-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382687#comment-16382687 ] Jason Lowe commented on HADOOP-15279: - Thanks, Allen! +1 lgtm. > increase maven heap size recommendations > > > Key: HADOOP-15279 > URL: https://issues.apache.org/jira/browse/HADOOP-15279 > Project: Hadoop Common > Issue Type: Improvement > Components: build, test >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Allen Wittenauer >Priority: Minor > Attachments: HADOOP-15279.00.patch > > > 1G is just a bit too low for JDK8+surefire 2.20+hdfs unit tests running in > parallel. Bump it up a bit more. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15206: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.2 2.7.6 2.8.4 2.9.1 2.10.0 3.1.0 Status: Resolved (was: Patch Available) Thanks, [~tanakahda]! I committed this to trunk, branch-3.1, branch-3.0, branch-2, branch-2.9, branch-2.8, and branch-2.7. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 2.7.6, 3.0.2 > > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367851#comment-16367851 ] Jason Lowe commented on HADOOP-15206: - The TestMRJobs failure is unrelated and tracked by MAPREDUCE-7053. Committing this. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366237#comment-16366237 ] Jason Lowe commented on HADOOP-15206: - bq. Since this is the first time to apply a patch to the community, I apologize for having bothered you. No worries whatsoever. It is very common to go back and forth on a number of patches before anything is committed, so this is simply development as usual from my perspective. I deeply appreciate your contribution on this subtle and tricky issue! +1 for the latest patch, pending Jenkins. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch, > HADOOP-15206.008.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365954#comment-16365954 ] Jason Lowe commented on HADOOP-15206: - bq. Deleted comments in the code Sorry, I didn't mean the entire comment needs to be deleted. I think the comments were very helpful to explain why the logic is there, but I just didn't see the need to call out the specific JIRA number. That is something trivially obtained from git. Speaking of comments, when they are reinstated I noticed that this comment is slightly incorrect: {code} // HADOOP-15206: When we're in BYBLOCK mode and the start position // is >=0 and < HEADER_LEN + SUB_HEADER_LEN, we should also skip // to right after the BZip2 header to avoid duplicated records skipPos = HEADER_LEN + SUB_HEADER_LEN + 1 - this.startingPos; {code} "Skip to right after the BZip2 header" may lead someone to think there's an off-by-one bug in the code. We need to skip to right after the start of the first bz2 block (which occurs right after the bz2 header). Nit: skipPos is not really a position but rather the number of bytes being skipped, so it looks incorrect when the code calls updateReportedByteCount on what appears to be a position rather than a byte delta. Something like numSkipped or numBytesSkipped would be a less confusing name. It would be nice to fix the checkstyle warning about line length on the comment. The unit test failures appear to be unrelated, and they pass for me locally with the patch applied. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch, HADOOP-15206.007.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HADOOP-15206: --- Assignee: Aki Tanaka > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Assignee: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15206: Status: Patch Available (was: Open) Thanks for updating the patch! Looks good overall, just a few nits. I think we're close, so moving this to Patch Available so the QA bot can comment on this as well. Why are we only skipping one byte at a time instead of trying to skip the rest of the way in one call? The code can track the remaining bytes in skipBytes, decrement that by the number of bytes skipped in the loop, then loop while skipBytes > 0. There is trailing whitespace on a couple of lines which it would be nice to clean up. I expect the QA bot to flag this in its whitespace check. I'm not sure it's necessary to call out the JIRA in the comments. That's what {{git blame}} is for. ;) Otherwise the code would be littered with JIRA numbers in every bugfix change. "steam is on BZip2 header" should be "a split is before the first BZip2 block" > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0, 2.8.3 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch, HADOOP-15206.006.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
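The loop the review asks for could be shaped like this. It is a sketch rather than the committed code; it simply wraps java.io.InputStream.skip, which per its javadoc may skip fewer bytes than requested.

{code:java}
import java.io.IOException;
import java.io.InputStream;

final class SkipUtil {
  // Skip exactly skipBytes bytes, looping because InputStream.skip() is
  // allowed to skip fewer bytes than asked.
  static void skipFully(InputStream in, long skipBytes) throws IOException {
    while (skipBytes > 0) {
      long skipped = in.skip(skipBytes);
      if (skipped <= 0) {
        // skip() made no progress; force progress with a single-byte read
        if (in.read() < 0) {
          break; // end of stream
        }
        skipped = 1;
      }
      skipBytes -= skipped;
    }
  }
}
{code}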
[jira] [Commented] (HADOOP-15227) add mapreduce.outputcommitter.factory.scheme.s3a to core-default
[ https://issues.apache.org/jira/browse/HADOOP-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364275#comment-16364275 ] Jason Lowe commented on HADOOP-15227: - Yeah, mapred-default and mapred-site aren't loaded until the JobConf class is loaded. A common mistake is for code to create a plain Configuration early in {{main}} and try to look up mapred properties (or even hdfs or yarn properties) expecting to get the default if they are not set by the user. The easy fix is to create a JobConf instead of a Configuration if the code knows it wants to do mapred stuff. > add mapreduce.outputcommitter.factory.scheme.s3a to core-default > > > Key: HADOOP-15227 > URL: https://issues.apache.org/jira/browse/HADOOP-15227 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > > Need to add this property to core-default.xml. It's documented as being > there, but it isn't. > {code} > > mapreduce.outputcommitter.factory.scheme.s3a > org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory > > The committer factory to use when writing data to S3A filesystems. > > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
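A small illustration of that pitfall and the suggested fix; the class names are the standard Hadoop ones, while the property used for the lookup is just an example chosen from mapred-default.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

public class ConfLookup {
  public static void main(String[] args) {
    // Pitfall: a plain Configuration only loads core-default/core-site,
    // so a mapred property lookup here returns null unless the user set
    // it explicitly -- mapred-default.xml has not been registered yet.
    Configuration conf = new Configuration();
    System.out.println(conf.get("mapreduce.task.timeout")); // likely null

    // Fix: constructing a JobConf loads the JobConf class, which
    // registers mapred-default.xml and mapred-site.xml as default
    // resources, so the defaults become visible.
    JobConf jobConf = new JobConf();
    System.out.println(jobConf.get("mapreduce.task.timeout")); // default value
  }
}
{code}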
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363203#comment-16363203 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! I believe the latest patch will break CONTINUOUS mode since it will no longer strip the bzip2 file header in that case. I don't think it will be OK to remove calling readStreamHeader when reset() is called. We're resetting the codec state to start afresh, and that means potentially reading a new file header (e.g.: concatenated bzip2 files). My thinking is that we need to read the header, but we should not report the byte position being updated when doing so while we're in BLOCK mode (i.e.: split processing). I think we need to revert the stream header reading logic to the original behavior. Instead we can put a small change in the BZip2InputStream constructor to handle the special case of small splits that can start at or before the first bz2 block. If the read mode is BLOCK and 0 < startingPos <= HEADER_LEN + SUB_HEADER_LEN then we skip bytes until we get to the HEADER_LEN + SUB_HEADER_LEN + 1 offset in the stream. The bufferedIn.skip method will be useful here, but it needs to be called in a loop in case the skip fails to skip everything in one call (per the javadoc). > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch, > HADOOP-15206.005.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
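A rough sketch of that constructor special case, for illustration only: READ_MODE.BYBLOCK, HEADER_LEN, SUB_HEADER_LEN, startingPos, and bufferedIn are names from the code under review, and the exact bounds are as described in the comment above.

{code:java}
// Illustrative sketch, not the committed patch: when processing splits
// (BYBLOCK mode) and the split starts at or inside the stream header,
// skip forward to one byte past the start of the first bz2 block. The
// skip is looped because bufferedIn.skip() may not skip everything in
// one call (per its javadoc).
if (readMode == READ_MODE.BYBLOCK
    && startingPos > 0 && startingPos <= HEADER_LEN + SUB_HEADER_LEN) {
  long remaining = HEADER_LEN + SUB_HEADER_LEN + 1 - startingPos;
  while (remaining > 0) {
    long skipped = bufferedIn.skip(remaining);
    if (skipped <= 0) {
      break; // end of stream reached before the first block
    }
    remaining -= skipped;
  }
}
{code}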
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15227) add mapreduce.outputcommitter.factory.scheme.s3a to core-default
[ https://issues.apache.org/jira/browse/HADOOP-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363017#comment-16363017 ] Jason Lowe commented on HADOOP-15227: - Does this go in core-default or mapred-default? The property name implies it would not belong in core-default, and it currently has the proper value in mapred-default. So maybe the documentation is what needs to be corrected instead? > add mapreduce.outputcommitter.factory.scheme.s3a to core-default > > > Key: HADOOP-15227 > URL: https://issues.apache.org/jira/browse/HADOOP-15227 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > > Need to add this property to core-default.xml. It's documented as being > there, but it isn't. > {code} > > mapreduce.outputcommitter.factory.scheme.s3a > org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory > > The committer factory to use when writing data to S3A filesystems. > > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361615#comment-16361615 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! bq. In the current implementation, read only "BZ" header when the read mode is CONTINUOUS. Do you think we should keep this? Yes, because it's not important to read the header when the codec is in BLOCK mode. IIUC the main difference between CONTINUOUS and BLOCK mode is that BLOCK mode will be used when processing splits and CONTINUOUS mode is used when we're simply trying to decompress the data in one big chunk (i.e.: no splits). BLOCK mode always will scan for the start of the bz2 block, so it will automatically skip a bz2 file header while searching for the start of the first bz2 block from the specified start offset. Given the splittable codec is always scanning for the block and doesn't really care what bytes are being skipped, I'm now thinking we can go back to a much simpler implementation. I think the code can check if we're in BLOCK mode to know whether we are processing splits or not. If we are in BLOCK mode we avoid advertising the byte position if start offset is zero just as the previous patches. In BLOCK mode we should also skip to file offset HEADER_LEN + SUB_HEADER_LEN + 1 if the start position is >=0 and < HEADER_LEN + SUB_HEADER_LEN. That will put us one byte past the start of the first bz2 block, and BLOCK mode will automatically scan forward to the next block. This proposal is very similar to what was implemented in patch 003. I think we just need to make it only do the position adjustment if we're in BLOCK mode. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch, HADOOP-15206.004.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
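The position-reporting half of that proposal could be sketched as follows. The names follow the thread; that readStreamHeader returns the number of bytes it consumed is an assumption made purely for illustration.

{code:java}
// Illustrative sketch: always read the stream header, but when we are in
// BYBLOCK (split-processing) mode with a start offset of zero, do not
// advertise the consumed header bytes. Keeping the reported position at 0
// ensures the first split still owns the first record.
int headerBytes = readStreamHeader(); // assumed to return bytes consumed
if (!(readMode == READ_MODE.BYBLOCK && startingPos == 0)) {
  updateReportedByteCount(headerBytes);
}
{code}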
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358919#comment-16358919 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! It seems the basic problem is that split 0, the first split, is _always_ responsible for the first record even if that record is technically past the byte offset of the end of the split. That's because all other splits will unconditionally throw away the first (potentially partial) record under the assumption the previous split is responsible for it. Therefore we need to do two things to avoid drops and duplicates: * If the first split ends before the start of the first bz2 block then we need to avoid advertising the updated byte position until we have started to consume the first bz2 block. This avoids the dropped record. * If subsequent splits start before the first bz2 block begins then we need to make sure any split that starts before the first block is artificially pushed past that first block. This avoids the duplicates. I'm wondering if it gets cleaner if we move this logic into readStreamHeader() and always call it. That method can check the starting position and do one of the following: * check for and read the full header if it is at starting position 0 * do nothing if start pos is past the full header + 1 * verify the bytes being skipped are the expected header bytes if start pos between 0 and full_header+1. If they are not the expected bytes then we reset the buffered input (just like starting pos 0 logic does today if header is not there). In the constructor we should be able to avoid updating the reported position if starting position is 0 (so we will always read into the first bz2 block), otherwise we advertise after reading any header so subsequent splits always start at least one byte after the start of the first bz2 block. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch, HADOOP-15206.003.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
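An outline of that three-case readStreamHeader might look like this. It is a sketch only: checkAndReadFullHeader and verifyAndSkipHeaderTail are hypothetical helpers standing in for the logic described above, not methods in the actual code.

{code:java}
// Illustrative outline of the three cases described above.
private void readStreamHeader() throws IOException {
  final long fullHeader = HEADER_LEN + SUB_HEADER_LEN;
  if (startingPos == 0) {
    // Case 1: at the very start -- check for and read the full header,
    // resetting the buffered input if it is not there (as today).
    checkAndReadFullHeader();
  } else if (startingPos > fullHeader + 1) {
    // Case 2: well past the header -- nothing to do.
  } else {
    // Case 3: between 0 and fullHeader + 1 -- verify the bytes being
    // skipped are the expected header bytes; if not, reset the buffered
    // input just like the starting-pos-0 logic does today.
    verifyAndSkipHeaderTail();
  }
}
{code}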
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354606#comment-16354606 ] Jason Lowe commented on HADOOP-15206: - Thanks for updating the patch! {quote}Because 4 is a position of the first bz2 block marker, and an input stream will start reading the first bz2 block if the start position of the input stream is 4. {quote} Ah, right. Thanks for the explanation. {quote}So, if the input stream tries to read from position 1-4, it will drop the first BZ2 block even though the block marker position is 4. {quote} This doesn't just drop the first bzip2 block, it drops the entire split. This goes back to my previous comment about the code assuming splits that start between bytes 1-4 are always tiny. Splits do not have to be equally sized, so theoretically there could be just two splits where the first split is a two-byte split starting at offset 0 and the other split is the rest of the file. I believe this change would cause all records to be dropped in that scenario. To fix that I think we only need to report a position that is one byte beyond the start of the first bzip2 block rather than at the end of the entire split (i.e.: header_len + 1 rather than end + 1). The logic regarding the header seems backwards. If the header is stripped then that means there was a header present, yet the logic is only adding up bytes for a header length if it was *not* stripped which is the case when the header is not there. I'm wondering how it's working since I think the header is always there in the unit tests. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch, > HADOOP-15206.002.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
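Concretely, the suggested correction amounts to something like the following sketch against the patch code quoted in the message below; the field names come from that snippet, and the exact assignment is illustrative rather than the final fix.

{code:java}
// Illustrative: instead of pushing a tiny split's reported position to
// the end of the split (end + 1), which can silently drop a large split
// that happens to start inside the header, report a position one byte
// past the start of the first bz2 block.
long firstBlockStart = HEADER_LEN + SUB_HEADER_LEN; // block begins after the header
this.compressedStreamPosition = firstBlockStart + 1;
{code}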
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354074#comment-16354074 ] Jason Lowe commented on HADOOP-15206: - Thanks for the patch! {code:java} if (this.startingPos > 0 && this.startingPos <= 4) { this.startingPos = end + 1; this.compressedStreamPosition = end + 1; } {code} The code above makes the following assumptions, which I believe may not hold in some cases: * The bzip2 file header is always present at starting pos 0. I think it should be checking isHeaderStripped/isSubHeaderStripped. * If the split starts after byte 0 but before byte 5 then it must also end on or before byte 5. (Splits are not required to be equally sized.) If the bzip2 file header is four bytes, why is the condition {{<= 4}} instead of {{< 4}}? Should this code leverage HEADER_LEN and SUB_HEADER_LEN here? Nit: "emptry" s/b "empty" > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch, HADOOP-15206.001.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
[ https://issues.apache.org/jira/browse/HADOOP-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351044#comment-16351044 ] Jason Lowe commented on HADOOP-15206: - I found a bit of time to look into this, so I'm dumping my notes here. I'm not sure when I'll get some more time to work on it, so if someone feels brave enough to step in feel free. Here's how I believe records get dropped with very small split sizes: # There's only one bz2 block in the file # The split size is smaller than 4 bytes # First split starts to read the data. It consumes the 'BZh9' magic header then updates the reported byte position of the stream to be 4 # At this point the first split reader is beyond the end of the split before it ever read a single record, so it ends up returning with no records. # The second split starts in the middle of the 'BZh9' magic header and scans forward to find the start of a bz2 block and starts processing the split # Since this is not the first split, it throws away the first record with the assumption the previous split is responsible for it # Second split reader proceeds to consume all remaining data, since byte position is not updated until the next bz2 block and there's only one block # End result is first record is lost since first split never consumed it. I think we can fix this scenario by not advertising a new byte position after reading the 'BZh9' header and only updating the byte position when we read the bz2 block header following the current bz2 block. I didn't get as much time to look into the duplicated record scenario, but I suspect multiple splits end up discovering the beginning of the bz2 block and think it is their block to consume. Not sure yet how we can easily distinguish which split is the one, true split that is responsible for consuming the bz2 block given we're hiding the true byte offset from the upper layers most of the time. > BZip2 drops and duplicates records when input split size is small > - > > Key: HADOOP-15206 > URL: https://issues.apache.org/jira/browse/HADOOP-15206 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.3, 3.0.0 >Reporter: Aki Tanaka >Priority: Major > Attachments: HADOOP-15206-test.patch > > > BZip2 can drop and duplicate record when input split file is small. I > confirmed that this issue happens when the input split size is between 1byte > and 4bytes. > I am seeing the following 2 problem behaviors. > > 1. Drop record: > BZip2 skips the first record in the input file when the input split size is > small > > Set the split size to 3 and tested to load 100 records (0, 1, 2..99) > {code:java} > 2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(317)) - > splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 > count=99{code} > > The input format read only 99 records but not 100 records > > 2. 
Duplicate Record: > 2 input splits has same BZip2 records when the input split size is small > > Set the split size to 1 and tested to load 100 records (0, 1, 2..99) > > {code:java} > 2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file > /work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 > count=99 > 2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat > (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 > at position 8 > {code} > > I experienced this error when I execute Spark (SparkSQL) job under the > following conditions: > * The file size of the input files are small (around 1KB) > * Hadoop cluster has many slave nodes (able to launch many executor tasks) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15170: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) Thanks, [~ajayydv]! I committed this to trunk. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Fix For: 3.1.0 > > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch, HADOOP-15170.004.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350699#comment-16350699 ] Jason Lowe commented on HADOOP-15170: - Thanks for updating the patch! +1 lgtm. Committing this. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch, HADOOP-15170.004.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15200) Missing DistCpOptions constructor breaks downstream DistCp projects in 3.0
[ https://issues.apache.org/jira/browse/HADOOP-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15200: Target Version/s: 3.1.0, 3.0.1 > Missing DistCpOptions constructor breaks downstream DistCp projects in 3.0 > -- > > Key: HADOOP-15200 > URL: https://issues.apache.org/jira/browse/HADOOP-15200 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Affects Versions: 3.0.0 >Reporter: Kuhu Shukla >Priority: Critical > > Post HADOOP-14267, the constructor for DistCpOptions was removed and will > break any project using it for java based implementation/usage of DistCp. > This JIRA would track next steps required to reconcile/fix this > incompatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347207#comment-16347207 ] Jason Lowe commented on HADOOP-15170: - Thanks for updating the patch! I tested this out on a manually created tarball with some symlinks, and the link targets are being mishandled. For example: {noformat} $ mkdir testdir $ cd testdir $ ln -s a b $ ln -s /tmp/foo c $ ls -l total 0 lrwxrwxrwx. 1 nobody nobody 1 Jan 31 10:40 b -> a lrwxrwxrwx. 1 nobody nobody 8 Jan 31 10:40 c -> /tmp/foo $ cd .. $ tar zcf testdir.tgz testdir {noformat} When I unpack this tarball to a destination directory of "output" with unTarUsingJava, the symlinks are all relative to the top-level output directory, which is incorrect: {noformat} $ ls -l output/testdir total 0 lrwxrwxrwx. 1 nobody nobody 8 Jan 31 10:41 b -> output/a lrwxrwxrwx. 1 nobody nobody 14 Jan 31 10:41 c -> output/tmp/foo {noformat} The fix is to just take the symlink target as-is rather than trying to make it relative to something else. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
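That fix amounts to roughly the following inside the untar entry loop. This is a sketch under assumed names: entry stands for the org.apache.commons.compress TarArchiveEntry being unpacked and outputDir for the extraction root directory, neither of which is shown in the thread.

{code:java}
// Inside the tar-entry loop (uses java.nio.file.Files / Paths and
// org.apache.commons.compress.archivers.tar.TarArchiveEntry):
// create the link at its location in the output tree, but take the
// recorded target verbatim. Resolving the target against the output
// directory is what turned "b -> a" into "b -> output/a" above.
if (entry.isSymbolicLink()) {
  Files.createSymbolicLink(
      Paths.get(outputDir.getPath(), entry.getName()), // where the link lives
      Paths.get(entry.getLinkName()));                 // target taken as-is
}
{code}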
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341831#comment-16341831 ] Jason Lowe commented on HADOOP-15170: - Thanks for the patch! Apologies for the delay. The VisibleForTesting import looks shady (pun intended). We should be pulling this in from the normal location. Actually even better would be to simply make this package-private rather than public. Then I would argue the visibility marker isn't necessary. The createSymbolicLinkUsingJava does not seem worth it given it's private and it's less typing to call the Files method directly. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve sequential read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HADOOP-15027: Fix Version/s: (was: 3.0.1) (was: 2.9.1) (was: 2.10.0) I reverted this from branch-3.0, branch-2 and branch-2.9 since it broke the builds there: {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-aliyun: Compilation failure: Compilation failure: [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[46,30] cannot find symbol [ERROR] symbol: class BlockingThreadPoolExecutorService [ERROR] location: package org.apache.hadoop.util [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[53,30] cannot find symbol [ERROR] symbol: class SemaphoredDelegatingExecutor [ERROR] location: package org.apache.hadoop.util [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[332,30] cannot find symbol [ERROR] symbol: variable BlockingThreadPoolExecutorService [ERROR] location: class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem [ERROR] /home/jlowe/hadoop/apache/hadoop/hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSFileSystem.java:[549,13] cannot find symbol [ERROR] symbol: class SemaphoredDelegatingExecutor [ERROR] location: class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem {noformat} If this needs to go into those releases, please revisit the patches for those branches. Looks like this patch depends upon HADOOP-15039 which only went into 3.1.0. > AliyunOSS: Support multi-thread pre-read to improve sequential read from > Hadoop to Aliyun OSS performance > - > > Key: HADOOP-15027 > URL: https://issues.apache.org/jira/browse/HADOOP-15027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/oss >Affects Versions: 3.0.0 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Fix For: 3.1.0 > > Attachments: HADOOP-15027.001.patch, HADOOP-15027.002.patch, > HADOOP-15027.003.patch, HADOOP-15027.004.patch, HADOOP-15027.005.patch, > HADOOP-15027.006.patch, HADOOP-15027.007.patch, HADOOP-15027.008.patch, > HADOOP-15027.009.patch, HADOOP-15027.010.patch, HADOOP-15027.011.patch, > HADOOP-15027.012.patch, HADOOP-15027.013.patch, HADOOP-15027.014.patch > > > Currently, AliyunOSSInputStream uses single thread to read data from > AliyunOSS, so we can do some refactoring by using multi-thread pre-read to > improve read performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org