[jira] [Resolved] (HADOOP-11832) spnego authentication logs only log in debug mode so it's difficult to debug auth issues
[ https://issues.apache.org/jira/browse/HADOOP-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer resolved HADOOP-11832.
    Resolution: Won't Fix

spnego authentication logs only log in debug mode so it's difficult to debug auth issues

    Key: HADOOP-11832
    URL: https://issues.apache.org/jira/browse/HADOOP-11832
    Project: Hadoop Common
    Issue Type: Bug
    Components: security
    Affects Versions: 2.5.2
    Reporter: Anu Engineer
    Assignee: Anu Engineer
    Attachments: hadoop-11832.001.patch

The following logs should be at info level so that auth failures can be debugged more easily.

{code}
2015-03-18 06:49:40,397 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call filter org.apache.hadoop.hdfs.web.AuthFilter
2015-03-18 06:49:40,397 DEBUG server.AuthenticationFilter (AuthenticationFilter.java:doFilter(505)) - Request [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=hrt_qa] triggering authentication
2015-03-18 06:49:40,397 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - RESPONSE /webhdfs/v1/ 401
2015-03-18 06:49:40,549 DEBUG BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1499)) - BLOCK* neededReplications = 1357 pendingReplications = 0
2015-03-18 06:49:40,634 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - EOF
2015-03-18 06:49:40,639 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - REQUEST /webhdfs/v1/ on org.mortbay.jetty.HttpConnection@33c174b5
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - sessionManager=org.mortbay.jetty.servlet.HashSessionManager@a072d8c
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - session=null
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - servlet=com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - chain=NoCacheFilter-NoCacheFilter-safety-org.apache.hadoop.hdfs.web.AuthFilter-com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - servlet holder=com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call filter NoCacheFilter
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call filter NoCacheFilter
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call filter safety
2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call filter org.apache.hadoop.hdfs.web.AuthFilter
2015-03-18 06:49:40,640 DEBUG server.AuthenticationFilter (AuthenticationFilter.java:doFilter(505)) - Request [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=hrt_qa] triggering authentication
2015-03-18 06:49:40,642 DEBUG server.AuthenticationFilter (AuthenticationFilter.java:doFilter(517)) - Request [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=hrt_qa] user [hrt_qa] authenticated
2015-03-18 06:49:40,642 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call servlet com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
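Since the ticket was resolved as Won't Fix, the practical way to see these messages while chasing an auth failure is to raise only the relevant loggers to DEBUG instead of the whole daemon. A minimal log4j.properties fragment (a sketch assuming the stock log4j 1.x setup; logger names are taken from the excerpt above):

```properties
# Raise only the SPNEGO-related loggers; leave the root logger at INFO.
log4j.logger.org.apache.hadoop.security.authentication.server=DEBUG
# Jetty request/filter tracing, as seen in the "mortbay.log" lines above.
log4j.logger.org.mortbay.log=DEBUG
```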
[jira] [Commented] (HADOOP-11656) Classpath isolation for downstream clients
[ https://issues.apache.org/jira/browse/HADOOP-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503814#comment-14503814 ] Sangjin Lee commented on HADOOP-11656:

Thanks [~busbey] for the proposal! I have some preliminary high-level comments, in no particular order:

If we're going the route of shading for clients, IMO there is less incentive to use a different mechanism on the framework side; what would be a reason not to consider shading on the framework side if we're shading for the client? I think it would be great to provide the same type of solution for both the client side and the framework side, which would simplify things a lot for users. Also, note that the build side of things would bring those two aspects together anyway (see below).

The name hadoop-client-reactor is rather awkward, as "reactor" has a specific meaning in programming, and this is not that.

bq. Classes labeled InterfaceAudience.Public might inadvertently leak references to third-party libraries. Doing so will substantially complicate isolating things in the client-api.

IMO this is the most significant risk with shading; i.e. no shaded types can ever leak to users. Granted, these issues do exist in other solutions including OSGi, but there might be ways to solve them in those solutions. With shading it is a hard failure. There might be other known issues with shading. I think Cloudera should have experience with shaded libraries, and could provide more details in terms of issues with shading?

bq. Unfortunately, it doesn't provide much upgrade help for applications that rely on the classes found in the fallback case.

Could you please elaborate on this point? Do you mean things will break if user code relied on a Hadoop dependency implicitly (without bringing their own copy) and Hadoop upgraded it to an incompatible version? Note that this type of issue may exist with the OSGi approach as well. If OSGi exported that particular dependency, then the user would start relying on that dependency implicitly too unless he/she brings the dependency. And in that case, if Hadoop upgraded that dependency, the user code will break in the same manner. If Hadoop does not intend to support that use case, OSGi does allow the possibility of not exporting these dependencies, in which case the user code will simply break right from the beginning until the user fixes it by bringing the dependency.

bq. Downstream user-provided code must not be required to be an OSGi bundle.

+1. To me this is the only viable approach. The user code (and its dependencies) needs to be converted into a fat bundle dynamically at runtime. You might want to look at what Apache Geronimo did with regard to this. The only caveat is what the underlying system bundles (Hadoop+system) should export. If we're going to use OSGi, I think we should only export the actual public APIs and types the user code can couple to. The implication of that decision is that things will fail miserably if any of the implicit dependencies is missing from the user code, and we'd spend a lot of time tracking down missing dependencies for users. Trust me, this is a non-trivial support cost.

I haven't thought through this completely, but we do need to think about the impact on user builds. To create their app (e.g. an MR app), what maven artifacts would they need to depend on? Note that users usually have a single project for their client as well as the code that's executed on the cluster. Do we anticipate any changes users are required to make (e.g. cleaning up their 3rd-party dependencies)? Although in theory everyone should have a clean pom, sadly the reality is very different, and we need to be able to tell users what is needed before they can start leveraging this.
Classpath isolation for downstream clients

    Key: HADOOP-11656
    URL: https://issues.apache.org/jira/browse/HADOOP-11656
    Project: Hadoop Common
    Issue Type: New Feature
    Reporter: Sean Busbey
    Assignee: Sean Busbey
    Labels: classloading, classpath, dependencies, scripts, shell
    Attachments: HADOOP-11656_proposal.md

Currently, Hadoop exposes downstream clients to a variety of third party libraries. As our code base grows and matures we increase the set of libraries we rely on. At the same time, as our user base grows we increase the likelihood that some downstream project will run into a conflict while attempting to use a different version of some library we depend on. This has already happened with e.g. Guava several times for HBase, Accumulo, and Spark (and I'm sure others). While YARN-286 and MAPREDUCE-1700 provided an initial effort, they default to off and they don't do anything to help dependency conflicts on the driver
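For readers unfamiliar with the shading approach under discussion: the maven-shade-plugin rewrites a dependency's packages into a private namespace at build time, so a downstream project's Guava cannot collide with Hadoop's copy. A hedged sketch (the relocated package name here is illustrative, not whatever the proposal settles on):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Rewrite Guava classes (and all references to them) into a
               Hadoop-private package so clients can bring their own Guava. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The leak risk Sangjin calls out follows directly from this: if any InterfaceAudience.Public API ever returns a relocated type, user code compiled against plain Guava cannot use it.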
[jira] [Commented] (HADOOP-11832) spnego authentication logs only log in debug mode so it's difficult to debug auth issues
[ https://issues.apache.org/jira/browse/HADOOP-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503907#comment-14503907 ] Jitendra Nath Pandey commented on HADOOP-11832:

I am +1. [~aw], are you ok to commit it?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11103) Clean up RemoteException
[ https://issues.apache.org/jira/browse/HADOOP-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503909#comment-14503909 ] Hadoop QA commented on HADOOP-11103:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726640/HADOOP-11103.2.patch against trunk revision f967fd2.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6131//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6131//console

This message is automatically generated.

Clean up RemoteException

    Key: HADOOP-11103
    URL: https://issues.apache.org/jira/browse/HADOOP-11103
    Project: Hadoop Common
    Issue Type: Improvement
    Components: ipc
    Reporter: Sean Busbey
    Assignee: Sean Busbey
    Priority: Trivial
    Attachments: HADOOP-11103.1.patch, HADOOP-11103.2.patch

RemoteException has a number of undocumented behaviors:
* o.a.h.ipc.RemoteException has no javadocs on getClassName. Reading the source, the String returned is the classname of the wrapped remote exception.
* RemoteException(String, String) is equivalent to calling RemoteException(String, String, null)
* Constructors allow null for all arguments
* Some of the test code doesn't check for correct error codes to correspond with the wrapped exception type
* methods don't document when they might return null

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
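The bullets amount to a small, documentable contract. A self-contained sketch of that contract (this is a stand-in class for illustration, not the real {{o.a.h.ipc.RemoteException}}; field and method names mirror the ticket's description):

```java
// Minimal stand-in illustrating the contract HADOOP-11103 wants documented.
class RemoteExceptionSketch extends Exception {
    private final String className;  // class name of the wrapped remote exception; may be null
    private final String errorCode;  // RPC error code; may be null

    /** Equivalent to {@code RemoteExceptionSketch(className, msg, null)}. */
    RemoteExceptionSketch(String className, String msg) {
        this(className, msg, null);
    }

    /** All arguments tolerate null, as the ticket notes. */
    RemoteExceptionSketch(String className, String msg, String errorCode) {
        super(msg);
        this.className = className;
        this.errorCode = errorCode;
    }

    /** @return the class name of the wrapped remote exception, or null. */
    String getClassName() { return className; }

    /** @return the RPC error code, or null when none was supplied. */
    String getErrorCode() { return errorCode; }
}
```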
[jira] [Commented] (HADOOP-11850) Typos in hadoop-common java docs
[ https://issues.apache.org/jira/browse/HADOOP-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503658#comment-14503658 ] Jakob Homan commented on HADOOP-11850:

{noformat}--- a/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AltKerberosAuthenticationHandler.java
+++ b/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AltKerberosAuthenticationHandler.java
@@ -25,7 +25,7 @@
 * The {@link AltKerberosAuthenticationHandler} behaves exactly the same way as
 * the {@link KerberosAuthenticationHandler}, except that it allows for an
 * alternative form of authentication for browsers while still using Kerberos
- * for Java access. This is an abstract class that should be subclassed
+ * for Java access. This is an abstract class that should be subclasses{noformat}

"subclassed" here is correct, as it's being used as a verb.

{noformat}+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BufferedFSInputStream.java
@@ -27,7 +27,7 @@
 /**
- * A class optimizes reading from FSInputStream by bufferring
+ * A class optimizes reading from FSInputStream by buffering
 */{noformat}

Should be {{a class that optimizes}} as well as your correction.

{noformat}+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Ls.java
@@ -254,7 +254,7 @@ private int maxLength(int n, Object value) {
 }
 /**
- * Initialise the comparator to be used for sorting files. If multiple options
+ * Initialize the comparator to be used for sorting files. If multiple options{noformat}

"Initialise" is a valid British/Australian/Canadian/NewZealish spelling variant. I believe we had a discussion many years ago and agreed to let variants in wherever the original author preferred them.

Otherwise, looks good.
Typos in hadoop-common java docs

    Key: HADOOP-11850
    URL: https://issues.apache.org/jira/browse/HADOOP-11850
    Project: Hadoop Common
    Issue Type: Bug
    Affects Versions: 2.6.0
    Reporter: surendra singh lilhore
    Assignee: surendra singh lilhore
    Priority: Minor
    Attachments: HADOOP-11850.patch

This jira will fix the typos in the hadoop-common project

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11852) Disable symlinks in trunk
[ https://issues.apache.org/jira/browse/HADOOP-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-11852: - Issue Type: Sub-task (was: Bug) Parent: HADOOP-10019 Disable symlinks in trunk - Key: HADOOP-11852 URL: https://issues.apache.org/jira/browse/HADOOP-11852 Project: Hadoop Common Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hadoop-11852.001.patch In HADOOP-10020 and HADOOP-10162 we disabled symlinks in branch-2. Since there's currently no plan to finish this work, let's disable it in trunk too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11852) Disable symlinks in trunk
[ https://issues.apache.org/jira/browse/HADOOP-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504115#comment-14504115 ] Hadoop QA commented on HADOOP-11852:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726733/hadoop-11852.001.patch against trunk revision 44872b7.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6132//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6132//console

This message is automatically generated.

Disable symlinks in trunk

    Key: HADOOP-11852
    URL: https://issues.apache.org/jira/browse/HADOOP-11852
    Project: Hadoop Common
    Issue Type: Sub-task
    Affects Versions: 3.0.0
    Reporter: Andrew Wang
    Assignee: Andrew Wang
    Attachments: hadoop-11852.001.patch

In HADOOP-10020 and HADOOP-10162 we disabled symlinks in branch-2. Since there's currently no plan to finish this work, let's disable it in trunk too.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503701#comment-14503701 ] Chris Nauroth commented on HADOOP-11812:

Gera, once again you are correct, so here is my unconditional +1. :-)

Implement listLocatedStatus for ViewFileSystem to speed up split calculation

    Key: HADOOP-11812
    URL: https://issues.apache.org/jira/browse/HADOOP-11812
    Project: Hadoop Common
    Issue Type: Improvement
    Components: fs
    Affects Versions: 2.7.0
    Reporter: Gera Shegalov
    Assignee: Gera Shegalov
    Priority: Blocker
    Labels: performance
    Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch

ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes several times the RPC overhead and added latency.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11843) Make setting up the build environment easier
[ https://issues.apache.org/jira/browse/HADOOP-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503944#comment-14503944 ] Chris Nauroth commented on HADOOP-11843:

After some digging, I found that Docker has a known issue with poor performance of volume mounts on Mac/boot2docker: https://github.com/boot2docker/boot2docker/issues/593 This issue has been resolved as a duplicate, and they point to a new feature for utilizing NFS mounts as the proposed solution: https://github.com/boot2docker/boot2docker/issues/64 This issue is still open.

As it stands, I don't believe this is usable on Mac in its current form. I just retested to check the performance boost reported by Allen when using the Oracle JDK, but I still don't think the performance is anywhere near acceptable. That's unfortunate, since a lot of developers like Macs.

Something else I can try is real Docker (not boot2docker) running in a Linux VM hosted by VirtualBox. I do get acceptable disk performance from my VirtualBox VMs, so maybe one extra layer of indirection from Docker in there won't hurt.

Make setting up the build environment easier

    Key: HADOOP-11843
    URL: https://issues.apache.org/jira/browse/HADOOP-11843
    Project: Hadoop Common
    Issue Type: New Feature
    Reporter: Niels Basjes
    Assignee: Niels Basjes
    Attachments: HADOOP-11843-2015-04-17-1612.patch, HADOOP-11843-2015-04-17-2226.patch, HADOOP-11843-2015-04-17-2308.patch, HADOOP-11843-2015-04-19-2206.patch, HADOOP-11843-2015-04-19-2232.patch

( As discussed with [~aw] ) In AVRO-1537 a docker based solution was created to setup all the tools for doing a full build. This enables much easier reproduction of any issues and getting up and running for new developers. This issue is to 'copy/port' that setup into the hadoop project in preparation for the bug squash.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11850) Typos in hadoop-common java docs
[ https://issues.apache.org/jira/browse/HADOOP-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HADOOP-11850:
    Status: Open (was: Patch Available)

Typos in hadoop-common java docs

    Key: HADOOP-11850
    URL: https://issues.apache.org/jira/browse/HADOOP-11850
    Project: Hadoop Common
    Issue Type: Bug
    Affects Versions: 2.6.0
    Reporter: surendra singh lilhore
    Assignee: surendra singh lilhore
    Priority: Minor
    Attachments: HADOOP-11850.patch

This jira will fix the typos in the hadoop-common project

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11851) s3n to swallow IOEs on inner stream close
[ https://issues.apache.org/jira/browse/HADOOP-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503893#comment-14503893 ] Takenori Sato commented on HADOOP-11851:

Isn't this a duplicate of HADOOP-11730?

s3n to swallow IOEs on inner stream close

    Key: HADOOP-11851
    URL: https://issues.apache.org/jira/browse/HADOOP-11851
    Project: Hadoop Common
    Issue Type: Improvement
    Components: fs/s3
    Affects Versions: 2.6.0
    Reporter: Steve Loughran
    Assignee: Anu Engineer
    Priority: Minor

We've seen a situation where some work was failing from (recurrent) connection reset exceptions. Irrespective of the root cause, these were surfacing not in the read operations, but when the input stream was being closed, including during a seek(). These exceptions could be caught and logged at warn level, rather than triggering immediate failures. It shouldn't matter to the next GET whether the last stream closed prematurely, as long as the new one works.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
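The behavior the ticket proposes is the classic quiet-close pattern: treat a failure while closing the inner stream as a warning, not a hard error, since the next GET opens a fresh stream anyway. A self-contained sketch (names are illustrative, not the actual s3n code):

```java
import java.io.Closeable;
import java.io.IOException;

final class QuietClose {
    private QuietClose() {}

    /** Close the inner stream, downgrading any IOException to a warning. */
    static void closeQuietly(Closeable inner) {
        if (inner == null) {
            return; // nothing to close
        }
        try {
            inner.close();
        } catch (IOException e) {
            // A premature close of the old stream doesn't affect the next
            // GET, so log and continue instead of failing the task.
            System.err.println("WARN: ignoring failure closing inner stream: " + e);
        }
    }
}
```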
[jira] [Commented] (HADOOP-11832) spnego authentication logs only log in debug mode so it's difficult to debug auth issues
[ https://issues.apache.org/jira/browse/HADOOP-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503918#comment-14503918 ] Haohui Mai commented on HADOOP-11832:

I agree with [~aw]; it is way too verbose to bring it to info.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11852) Disable symlinks in trunk
[ https://issues.apache.org/jira/browse/HADOOP-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-11852: - Status: Patch Available (was: Open) Disable symlinks in trunk - Key: HADOOP-11852 URL: https://issues.apache.org/jira/browse/HADOOP-11852 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hadoop-11852.001.patch In HADOOP-10020 and HADOOP-10162 we disabled symlinks in branch-2. Since there's currently no plan to finish this work, let's disable it in trunk too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11852) Disable symlinks in trunk
Andrew Wang created HADOOP-11852: Summary: Disable symlinks in trunk Key: HADOOP-11852 URL: https://issues.apache.org/jira/browse/HADOOP-11852 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang In HADOOP-10020 and HADOOP-10162 we disabled symlinks in branch-2. Since there's currently no plan to finish this work, let's disable it in trunk too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11743) maven doesn't clean all the site files
[ https://issues.apache.org/jira/browse/HADOOP-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504104#comment-14504104 ] ramtin commented on HADOOP-11743:

[~aw], please review my patch and suggest any modifications that are required.

maven doesn't clean all the site files

    Key: HADOOP-11743
    URL: https://issues.apache.org/jira/browse/HADOOP-11743
    Project: Hadoop Common
    Issue Type: Bug
    Components: documentation
    Affects Versions: 3.0.0
    Reporter: Allen Wittenauer
    Assignee: ramtin
    Priority: Minor
    Attachments: HADOOP-11743.001.patch, HADOOP-11743.002.patch

After building the site files, performing a mvn clean -Preleasedocs doesn't actually clean everything up, as git complains about untracked files.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-11851) s3n to swallow IOEs on inner stream close
[ https://issues.apache.org/jira/browse/HADOOP-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer reassigned HADOOP-11851:
    Assignee: Anu Engineer

s3n to swallow IOEs on inner stream close

    Key: HADOOP-11851
    URL: https://issues.apache.org/jira/browse/HADOOP-11851
    Project: Hadoop Common
    Issue Type: Improvement
    Components: fs/s3
    Affects Versions: 2.6.0
    Reporter: Steve Loughran
    Assignee: Anu Engineer
    Priority: Minor

We've seen a situation where some work was failing from (recurrent) connection reset exceptions. Irrespective of the root cause, these were surfacing not in the read operations, but when the input stream was being closed, including during a seek(). These exceptions could be caught and logged at warn level, rather than triggering immediate failures. It shouldn't matter to the next GET whether the last stream closed prematurely, as long as the new one works.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503668#comment-14503668 ] Gera Shegalov commented on HADOOP-11812:

Chris, the Jenkins run with 005 is the last Hadoop QA comment, and it is green: https://issues.apache.org/jira/browse/HADOOP-11812?focusedCommentId=14502526&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14502526

Implement listLocatedStatus for ViewFileSystem to speed up split calculation

    Key: HADOOP-11812
    URL: https://issues.apache.org/jira/browse/HADOOP-11812
    Project: Hadoop Common
    Issue Type: Improvement
    Components: fs
    Affects Versions: 2.7.0
    Reporter: Gera Shegalov
    Assignee: Gera Shegalov
    Priority: Blocker
    Labels: performance
    Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch

ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes several times the RPC overhead and added latency.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11827) Speed-up distcp buildListing() using threadpool
[ https://issues.apache.org/jira/browse/HADOOP-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503884#comment-14503884 ] Ravi Prakash commented on HADOOP-11827: --- Hi Zoran! Thanks for all your work Here's a preliminary review. I haven't reviewed test code at all # Please remove todo(zoran). No FNF shouldn't lead to a retry. We should keep the behavior same (even though I think it should ideally exit with an exception because someone is likely modifying the tree during the distcp) # listStatuse - listStatus ? # Could you please document the max 40 thread limit? # Any reason you don't want to default numListstatusThreads to 1 (instead of 0)? # Do you really need {{protected SimpleCopyListing(Configuration configuration, Credentials credentials, int numListstatusThreads)}} ? Could you just set configuration it in configuration? # Could you please document FileStatusProcessor ? # I think its useful to have a wrapper ProducerConsumer implementation IF there isn't a way to trivially accomplish it. We should either move it into the org.apache.hadoop.tools.util package or use the trivial implementation # We can probably make maybePrintStats more efficient if we chose a number which is a power of 2 rather than 10 # dirCnt isn't used . getChildren too. # {{new FileStatusProcessor(sourcePathRoot.getFileSystem(getConf(; }} may not be the right thing to do. If two sources (from different file systems are used), would this cause an error? # ProducerConsumer.take() calls LinkedBlockingQueue.take() which claims to block. Should the javadoc say non-blocking? These were my questions so far. 
I'll keep reviewing, but in the meantime we can multithread progress on this issue ;-) Speed-up distcp buildListing() using threadpool --- Key: HADOOP-11827 URL: https://issues.apache.org/jira/browse/HADOOP-11827 Project: Hadoop Common Issue Type: Improvement Components: tools/distcp Affects Versions: 2.7.0, 2.7.1 Reporter: Zoran Dimitrijevic Assignee: Zoran Dimitrijevic Attachments: HADOOP-11827-02.patch, HADOOP-11827.patch Original Estimate: 24h Remaining Estimate: 24h For very large source trees on S3, distcp takes a long time to build the file listing (client code, before starting mappers). For a dataset I used (1.5M files, 50K dirs) it was taking 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
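Review point 7 above concerns the ProducerConsumer wrapper. As a rough sketch of what such a wrapper amounts to (illustrative names only, not the patch's actual code), it is essentially a pair of blocking queues, one for work items and one for results:

```java
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch (not the patch code): a producer-consumer wrapper that
// pairs a work queue with a result queue, as discussed for buildListing().
public class ProducerConsumerSketch {
    private final LinkedBlockingQueue<String> work = new LinkedBlockingQueue<>();
    private final LinkedBlockingQueue<Integer> results = new LinkedBlockingQueue<>();

    public void addWork(String path) throws InterruptedException {
        work.put(path);
    }

    // A worker thread would loop on this; take() blocks until work arrives.
    public void runOneWorker() throws InterruptedException {
        String path = work.take();
        results.put(path.length()); // stand-in for a real listStatus() call
    }

    // Blocks until a result is available, matching LinkedBlockingQueue.take().
    public int takeResult() throws InterruptedException {
        return results.take();
    }

    public static void main(String[] args) throws InterruptedException {
        ProducerConsumerSketch pc = new ProducerConsumerSketch();
        pc.addWork("/user/data");
        Thread worker = new Thread(() -> {
            try { pc.runOneWorker(); } catch (InterruptedException ignored) { }
        });
        worker.start();
        System.out.println(pc.takeResult()); // prints 10
        worker.join();
    }
}
```

Whether this deserves its own class in org.apache.hadoop.tools.util or a trivial inline implementation is exactly the judgment call raised in the review.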
[jira] [Updated] (HADOOP-11743) maven doesn't clean all the site files
[ https://issues.apache.org/jira/browse/HADOOP-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramtin updated HADOOP-11743: Attachment: HADOOP-11743.002.patch maven doesn't clean all the site files -- Key: HADOOP-11743 URL: https://issues.apache.org/jira/browse/HADOOP-11743 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: ramtin Priority: Minor Attachments: HADOOP-11743.001.patch, HADOOP-11743.002.patch After building the site files, performing a mvn clean -Preleasedocs doesn't actually clean everything up as git complains about untracked files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11802) DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
[ https://issues.apache.org/jira/browse/HADOOP-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504148#comment-14504148 ] Colin Patrick McCabe commented on HADOOP-11802: --- Hi Eric, Good catch. I think the issue here is that there is a lot of buffering in the domain socket, so it's difficult to get the DataNode to fail when doing its write on the socket. In my experience, the write will succeed even when the other end has already shut down the socket. This buffering can be tuned by configuring SO_RCVBUF, but even the smallest value still buffers enough that the unit test will pass under every condition. This buffering is not a problem, since in the event of a communication failure the client will close the socket, triggering the DataNode to free the resources. However, it does make unit testing by injecting faults on the client side more difficult. The solution to this problem is to inject the failure directly on the DataNode side. The latest patch does this. I have confirmed that it fails without the fix applied. DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm - Key: HADOOP-11802 URL: https://issues.apache.org/jira/browse/HADOOP-11802 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.7.0 Reporter: Eric Payne Assignee: Colin Patrick McCabe Attachments: HADOOP-11802.001.patch, HADOOP-11802.002.patch In {{DataXceiver#requestShortCircuitShm}}, we attempt to recover from some errors by closing the {{DomainSocket}}. However, this violates the invariant that the domain socket should never be closed while it is being managed by the {{DomainSocketWatcher}}. Instead, we should call {{shutdown}} on the {{DomainSocket}}. When this bug hits, it terminates the {{DomainSocketWatcher}} thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
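The close-versus-shutdown distinction at the heart of this issue can be illustrated with plain java.net sockets (an analogy only; {{DomainSocket}} is Hadoop-internal and has its own shutdown/close methods): shutdown ends communication while the descriptor stays valid for a watcher thread, whereas close releases the descriptor out from under it.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Analogy using java.net sockets: shutdown ends the conversation but keeps the
// descriptor valid, while close() would invalidate it for any watcher thread.
public class ShutdownVsClose {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("localhost", server.getLocalPort());
             Socket accepted = server.accept()) {
            // Error-recovery path: signal EOF to the peer...
            accepted.shutdownOutput();
            // ...but the socket itself is still open, so a watcher polling it
            // (the DomainSocketWatcher role) does not see a closed descriptor.
            System.out.println(accepted.isClosed());         // prints false
            System.out.println(accepted.isOutputShutdown()); // prints true
        }
    }
}
```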
[jira] [Updated] (HADOOP-11802) DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
[ https://issues.apache.org/jira/browse/HADOOP-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-11802: -- Attachment: HADOOP-11802.003.patch DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm - Key: HADOOP-11802 URL: https://issues.apache.org/jira/browse/HADOOP-11802 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.7.0 Reporter: Eric Payne Assignee: Colin Patrick McCabe Attachments: HADOOP-11802.001.patch, HADOOP-11802.002.patch, HADOOP-11802.003.patch In {{DataXceiver#requestShortCircuitShm}}, we attempt to recover from some errors by closing the {{DomainSocket}}. However, this violates the invariant that the domain socket should never be closed when it is being managed by the {{DomainSocketWatcher}}. Instead, we should call {{shutdown}} on the {{DomainSocket}}. When this bug hits, it terminates the {{DomainSocketWatcher}} thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11827) Speed-up distcp buildListing() using threadpool
[ https://issues.apache.org/jira/browse/HADOOP-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504391#comment-14504391 ] Zoran Dimitrijevic commented on HADOOP-11827: - Sorry Ravi. Thanks for the comments Ravi! Speed-up distcp buildListing() using threadpool --- Key: HADOOP-11827 URL: https://issues.apache.org/jira/browse/HADOOP-11827 Project: Hadoop Common Issue Type: Improvement Components: tools/distcp Affects Versions: 2.7.0, 2.7.1 Reporter: Zoran Dimitrijevic Assignee: Zoran Dimitrijevic Attachments: HADOOP-11827-02.patch, HADOOP-11827-03.patch, HADOOP-11827.patch Original Estimate: 24h Remaining Estimate: 24h For very large source trees on s3 distcp is taking long time to build file listing (client code, before starting mappers). For a dataset I used (1.5M files, 50K dirs) it was taking 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11827) Speed-up distcp buildListing() using threadpool
[ https://issues.apache.org/jira/browse/HADOOP-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoran Dimitrijevic updated HADOOP-11827: Attachment: HADOOP-11827-03.patch Speed-up distcp buildListing() using threadpool --- Key: HADOOP-11827 URL: https://issues.apache.org/jira/browse/HADOOP-11827 Project: Hadoop Common Issue Type: Improvement Components: tools/distcp Affects Versions: 2.7.0, 2.7.1 Reporter: Zoran Dimitrijevic Assignee: Zoran Dimitrijevic Attachments: HADOOP-11827-02.patch, HADOOP-11827-03.patch, HADOOP-11827.patch Original Estimate: 24h Remaining Estimate: 24h For very large source trees on s3 distcp is taking long time to build file listing (client code, before starting mappers). For a dataset I used (1.5M files, 50K dirs) it was taking 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11827) Speed-up distcp buildListing() using threadpool
[ https://issues.apache.org/jira/browse/HADOOP-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504359#comment-14504359 ] Zoran Dimitrijevic commented on HADOOP-11827: - Thanks for the comments Allen. I've addressed most of them: 1, 2, 3: done. 4: in order to prefer flags over properties, I needed a value to know whether the flag was set or not; 0 seemed easier than yet another bool. 5: I added it so that I can have minimal changes in the unit test (rerun tests for various numbers of threads using org.junit.runners.Parameterized). 6: done. 7: agreed. I wanted to keep the multithreaded logic outside of SimpleCopyListing.java, but if you think it's overkill, I can refactor. But it'll be uglier, and if we need this again, we won't have the wrapper. 8: considering how much code is invoked for each of these simple maybePrintStats calls, I don't think it's worth doing. But I don't have strong opinions; I just think we should print some progress, since this stage can take on the order of tens of minutes. 9: removed. 10: in the current code we use the same file system instance, so I don't think it's a problem. I use one per thread since we have a small number of threads and these run listStatus in parallel. 11: changed the docs - both are blocking, but one can be interrupted by exceptions, and then the user must handle it. Please suggest better names and I'll refactor, or maybe just keep one. Speed-up distcp buildListing() using threadpool --- Key: HADOOP-11827 URL: https://issues.apache.org/jira/browse/HADOOP-11827 Project: Hadoop Common Issue Type: Improvement Components: tools/distcp Affects Versions: 2.7.0, 2.7.1 Reporter: Zoran Dimitrijevic Assignee: Zoran Dimitrijevic Attachments: HADOOP-11827-02.patch, HADOOP-11827-03.patch, HADOOP-11827.patch Original Estimate: 24h Remaining Estimate: 24h For very large source trees on S3, distcp takes a long time to build the file listing (client code, before starting mappers).
For a dataset I used (1.5M files, 50K dirs) it was taking 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
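Point 8 above (making maybePrintStats cheaper with a power-of-two interval) comes down to replacing a division-based modulo check with a bitwise AND. A small illustrative sketch, with hypothetical names:

```java
// Illustrative only: with a power-of-two interval, the "print stats every N
// files" check becomes a bitwise AND instead of a division-based modulo.
public class StatsInterval {
    static final int INTERVAL = 1024; // power of two (vs. a round number like 1000)

    static boolean shouldPrint(long fileCount) {
        // Equivalent to fileCount % 1024 == 0, but avoids the division.
        return (fileCount & (INTERVAL - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldPrint(2048)); // prints true
        System.out.println(shouldPrint(2049)); // prints false
    }
}
```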
[jira] [Commented] (HADOOP-11743) maven doesn't clean all the site files
[ https://issues.apache.org/jira/browse/HADOOP-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504364#comment-14504364 ] Hadoop QA commented on HADOOP-11743: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726741/HADOOP-11743.002.patch against trunk revision 44872b7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6133//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6133//console This message is automatically generated. 
maven doesn't clean all the site files -- Key: HADOOP-11743 URL: https://issues.apache.org/jira/browse/HADOOP-11743 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: ramtin Priority: Minor Attachments: HADOOP-11743.001.patch, HADOOP-11743.002.patch After building the site files, performing a mvn clean -Preleasedocs doesn't actually clean everything up as git complains about untracked files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11827) Speed-up distcp buildListing() using threadpool
[ https://issues.apache.org/jira/browse/HADOOP-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504378#comment-14504378 ] Hadoop QA commented on HADOOP-11827: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726791/HADOOP-11827-03.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6135//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6135//console This message is automatically generated. Speed-up distcp buildListing() using threadpool --- Key: HADOOP-11827 URL: https://issues.apache.org/jira/browse/HADOOP-11827 Project: Hadoop Common Issue Type: Improvement Components: tools/distcp Affects Versions: 2.7.0, 2.7.1 Reporter: Zoran Dimitrijevic Assignee: Zoran Dimitrijevic Attachments: HADOOP-11827-02.patch, HADOOP-11827-03.patch, HADOOP-11827.patch Original Estimate: 24h Remaining Estimate: 24h For very large source trees on s3 distcp is taking long time to build file listing (client code, before starting mappers). 
For a dataset I used (1.5M files, 50K dirs) it was taking 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11802) DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
[ https://issues.apache.org/jira/browse/HADOOP-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504377#comment-14504377 ] Hadoop QA commented on HADOOP-11802: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726751/HADOOP-11802.003.patch against trunk revision 44872b7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6134//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6134//console This message is automatically generated. DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm - Key: HADOOP-11802 URL: https://issues.apache.org/jira/browse/HADOOP-11802 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.7.0 Reporter: Eric Payne Assignee: Colin Patrick McCabe Attachments: HADOOP-11802.001.patch, HADOOP-11802.002.patch, HADOOP-11802.003.patch In {{DataXceiver#requestShortCircuitShm}}, we attempt to recover from some errors by closing the {{DomainSocket}}. 
However, this violates the invariant that the domain socket should never be closed when it is being managed by the {{DomainSocketWatcher}}. Instead, we should call {{shutdown}} on the {{DomainSocket}}. When this bug hits, it terminates the {{DomainSocketWatcher}} thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11851) s3n to swallow IOEs on inner stream close
Steve Loughran created HADOOP-11851: --- Summary: s3n to swallow IOEs on inner stream close Key: HADOOP-11851 URL: https://issues.apache.org/jira/browse/HADOOP-11851 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 2.6.0 Reporter: Steve Loughran Priority: Minor We've seen a situation where some work was failing from (recurrent) connection reset exceptions. Irrespective of the root cause, these were surfacing not in the read operations, but when the input stream was being closed, including during a seek(). These exceptions could be caught and logged at WARN, rather than triggering immediate failures. It shouldn't matter to the next GET whether the last stream closed prematurely, as long as the new one works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
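A minimal sketch of the proposed swallow-and-warn behavior, assuming a helper like the one below wraps the inner stream close (names are hypothetical, not the actual NativeS3FsInputStream code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Logger;

// Sketch of the proposed behavior (illustrative names, not the s3n code):
// failures while closing the old inner stream are logged at WARN rather than
// propagated, since the next GET opens a fresh stream anyway.
public class QuietClose {
    private static final Logger LOG = Logger.getLogger("s3n");

    static void closeQuietly(InputStream in) {
        try {
            if (in != null) {
                in.close();
            }
        } catch (IOException e) {
            // e.g. "Connection reset" surfacing from the HTTP client on close
            LOG.warning("Ignoring failure while closing inner stream: " + e);
        }
    }

    public static void main(String[] args) {
        InputStream broken = new InputStream() {
            @Override public int read() { return -1; }
            @Override public void close() throws IOException {
                throw new IOException("Connection reset");
            }
        };
        closeQuietly(broken); // logs a warning instead of failing the seek()
        System.out.println("seek can proceed");
    }
}
```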
[jira] [Commented] (HADOOP-11851) s3n to swallow IOEs on inner stream close
[ https://issues.apache.org/jira/browse/HADOOP-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503532#comment-14503532 ] Steve Loughran commented on HADOOP-11851: - Inner stack {code} Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at sun.security.ssl.InputRecord.readFully(InputRecord.java:442) at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554) at sun.security.ssl.InputRecord.read(InputRecord.java:509) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:934) at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:891) at sun.security.ssl.AppInputStream.read(AppInputStream.java:102) at org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:204) at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:182) at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:204) at org.apache.http.impl.io.ContentLengthInputStream.close(ContentLengthInputStream.java:108) at org.apache.http.conn.BasicManagedEntity.streamClosed(BasicManagedEntity.java:164) at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:237) at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:186) at org.apache.http.util.EntityUtils.consume(EntityUtils.java:87) at org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.releaseConnection(HttpMethodReleaseInputStream.java:102) at org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.close(HttpMethodReleaseInputStream.java:194) at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.closeInnerStream(NativeS3FileSystem.java:176) at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.updateInnerStream(NativeS3FileSystem.java:192) at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:207) at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62) {code} s3n to swallow IOEs on inner stream close - Key: HADOOP-11851 URL: https://issues.apache.org/jira/browse/HADOOP-11851 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 2.6.0 Reporter: Steve Loughran Priority: Minor We've seen a situation where some work was failing from (recurrent) connection reset exceptions. Irrespective of the root cause, these were surfacing not in the read operations, but when the input stream was being closed, including during a seek(). These exceptions could be caught and logged at WARN, rather than triggering immediate failures. It shouldn't matter to the next GET whether the last stream closed prematurely, as long as the new one works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11850) Typos in hadoop-common java docs
[ https://issues.apache.org/jira/browse/HADOOP-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503544#comment-14503544 ] Hadoop QA commented on HADOOP-11850: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726638/HADOOP-11850.patch against trunk revision f967fd2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6130//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6130//console This message is automatically generated. Typos in hadoop-common java docs Key: HADOOP-11850 URL: https://issues.apache.org/jira/browse/HADOOP-11850 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Priority: Minor Attachments: HADOOP-11850.patch This jira will fix the typo in hdfs-common project -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11766) Generic token authentication support for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-11766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HADOOP-11766: --- Attachment: HADOOP-11766-V1.patch Uploaded a rough patch illustrating the overall ideas: 1. Defined a generic token interface named {{AuthToken}}, abstracting common token attributes; 2. Implemented {{JwtAuthToken}} and {{CloudFoundryOAuth2Token}}, with corresponding decoders and validators, for checking signature, expiration, audiences and scope. The token decoder and validators are pluggable and configurable; 3. Provided a new {{AuthTokenAuthenticationHandler}} for the Hadoop Web UI, REST and WebHDFS that can support the JWT token and the CloudFoundry OAuth2 token. Generic token authentication support for Hadoop --- Key: HADOOP-11766 URL: https://issues.apache.org/jira/browse/HADOOP-11766 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11766-V1.patch As a major goal of the Rhino project, we proposed the *TokenAuth* effort in HADOOP-9392, which is to provide a common token authentication framework integrating multiple authentication mechanisms by adding a new {{AuthenticationMethod}} in lieu of {{KERBEROS}} and {{SIMPLE}}. To minimize the required changes and risk, we thought of another approach to achieve the general goals based on Kerberos, as Kerberos itself supports a pre-authentication framework in both spec and implementation; this was discussed in HADOOP-10959 as *TokenPreauth*. For both approaches, we built workable prototypes covering both the command line console and the Hadoop web UI. As HADOOP-9392 is rather lengthy and heavy, and HADOOP-10959 is mostly focused on the concrete implementation approach based on Kerberos, we open this issue for more general and updated discussion of the requirements, use cases, and concerns for generic token authentication support in Hadoop.
We distinguish this token from existing Hadoop tokens, as the token in this discussion is mainly for the initial and primary authentication. We will refine our existing code in HADOOP-9392 and HADOOP-10959 and break it down into smaller patches based on the latest trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
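For illustration only, a generic token interface of the kind described in point 1 might be shaped roughly like this (all attribute names below are guesses, not the patch's actual {{AuthToken}} API):

```java
import java.util.List;

// Purely illustrative sketch of a generic token abstraction like the one the
// patch describes; attribute names here are guesses, not the patch's API.
interface AuthTokenSketch {
    String getSubject();          // who the token was issued to
    String getIssuer();           // who issued/signed it
    List<String> getAudiences();  // services allowed to accept it
    long getExpirationTime();     // millis since epoch
}

public class TokenDemo {
    public static void main(String[] args) {
        AuthTokenSketch t = new AuthTokenSketch() {
            public String getSubject() { return "alice"; }
            public String getIssuer() { return "example-idp"; }
            public List<String> getAudiences() { return List.of("webhdfs"); }
            public long getExpirationTime() { return Long.MAX_VALUE; }
        };
        // A pluggable validator would check signature, expiration, audience
        // and scope; here we only sketch the expiration/audience checks.
        boolean valid = t.getExpirationTime() > System.currentTimeMillis()
                && t.getAudiences().contains("webhdfs");
        System.out.println(valid); // prints true
    }
}
```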
[jira] [Updated] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HADOOP-11847: --- Attachment: HADOOP-11847-v1.patch Uploaded the patch. Summary of changes: * Allow reading only the least required inputs when decoding, using null to indicate a unit that should not be read; * Refine the tests to allow erasing parity units; * Refactor some code, incorporating changes from other issues. Will open a similar issue separately for erasure coders, as this one is already rather large. Pending review. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11847-v1.patch This is to enhance the raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. This is something we planned to do but were just reminded of by [~zhz]'s question raised in HDFS-7678, also copied here: bq. Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question will be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
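The input convention this patch enables can be sketched as follows (illustrative only; the real call is {{RawErasureDecoder#decode}}, which also takes the erased indexes and output buffers): for the (6+3) example above with unit #2 erased, the caller passes a 9-slot inputs array where a null slot means that unit is not read.

```java
// Sketch of the input convention (illustrative; not a real decoder call):
// for a (6+3) schema with unit #2 erased, the caller builds a 9-slot inputs
// array where null means "not read / not available".
public class DecodeInputs {
    static final int NUM_DATA = 6, NUM_PARITY = 3;

    static byte[][] buildInputs(byte[][] available, int[] readIndexes) {
        byte[][] inputs = new byte[NUM_DATA + NUM_PARITY][];
        for (int i = 0; i < readIndexes.length; i++) {
            inputs[readIndexes[i]] = available[i]; // slots left null are skipped
        }
        return inputs;
    }

    public static void main(String[] args) {
        byte[][] chunks = { {0}, {1}, {3}, {4}, {5}, {8} };
        // Blocks 0, 1, 3, 4, 5 plus parity block 8 are read; #2 is erased,
        // and #6/#7 are simply not fetched, saving network and disk I/O.
        byte[][] inputs = buildInputs(chunks, new int[] {0, 1, 3, 4, 5, 8});
        System.out.println(inputs[2] == null); // prints true: erased slot
        System.out.println(inputs[8] != null); // prints true: parity that was read
    }
}
```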
[jira] [Commented] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502467#comment-14502467 ] Allen Wittenauer commented on HADOOP-11746: --- bq. I found that surprising, as compared to say -0 since in those cases the QA bot can't judge the suitability of the patch. This is actually how the old test-patch.sh works as well. I tend to agree with the -1 because if the base universe is broken, it really can't judge how well the patch is going to work either. The +1's are going to be false flags. bq. The rewrite is a great improvement. Any idea what else you want to cover before pushing? Thanks! Not really. Playing around with HADOOP-11843 (which upgraded shellcheck) and realizing I hadn't exercised the site tests yet popped up the problems fixed in -19. At this point, I think all the subsystems have been thoroughly worked (at least by me) so any outstanding issues will likely be edge case bugs and/or issues with the Jenkins build environment. [There are one or two more optimizations I could make in site tests, but since those run so quickly anyway, they aren't a big concern.] rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated HADOOP-11812: --- Attachment: HADOOP-11812.005.patch Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes several x of RPC overhead and added latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502467#comment-14502467 ] Allen Wittenauer edited comment on HADOOP-11746 at 4/20/15 7:51 AM: bq. I found that surprising, as compared to say -0 since in those cases the QA bot can't judge the suitability of the patch. This is actually how the old test-patch.sh works as well. I tend to agree with the -1 because if the base universe is broken, it really can't judge how well the patch is going to work either. The +1's are going to be false flags. bq. The rewrite is a great improvement. Any idea what else you want to cover before pushing? Thanks! Not really planning on more changes. Playing around with HADOOP-11843 (which upgraded shellcheck) and realizing I hadn't exercised the site tests yet popped up the problems fixed in -19. At this point, I think all the subsystems have been thoroughly worked (at least by me) so any outstanding issues will likely be edge case bugs and/or issues with the Jenkins build environment. [There are one or two more optimizations I could make in site tests, but since those run so quickly anyway, they aren't a big concern.] was (Author: aw): bq. I found that surprising, as compared to say -0 since in those cases the QA bot can't judge the suitability of the patch. This is actually how the old test-patch.sh works as well. I tend to agree with the -1 because if the base universe is broken, it really can't judge how well the patch is going to work either. The +1's are going to be false flags. bq. The rewrite is a great improvement. Any idea what else you want to cover before pushing? Thanks! Not really. Playing around with HADOOP-11843 (which upgraded shellcheck) and realizing I hadn't exercised the site tests yet popped up the problems fixed in -19. 
At this point, I think all the subsystems have been thoroughly worked (at least by me) so any outstanding issues will likely be edge case bugs and/or issues with the Jenkins build environment. [There are one or two more optimizations I could make in site tests, but since those run so quickly anyway, they aren't a big concern.] rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11848) Incorrect arguments to sizeof in DomainSocket.c
[ https://issues.apache.org/jira/browse/HADOOP-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HADOOP-11848: -- Target Version/s: (was: 2.6.0) Incorrect arguments to sizeof in DomainSocket.c --- Key: HADOOP-11848 URL: https://issues.apache.org/jira/browse/HADOOP-11848 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: Malcolm Kavalsky Assignee: Malcolm Kavalsky Original Estimate: 24h Remaining Estimate: 24h The length of the buffer to be zeroed, computed with sizeof, should use the structure itself rather than the address of the structure. DomainSocket.c line 156. Replace current: memset(&addr, 0, sizeof(&addr)); With: memset(&addr, 0, sizeof(addr)); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HADOOP-11847: --- Description: This is to enhance raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]'s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question would be obvious. was: This is to enhance raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. This is something planned to do but just got reminded by [~zhz]'s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question would be obvious. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11847-v1.patch This is to enhance raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. 
When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]'s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11743) maven doesn't clean all the site files
[ https://issues.apache.org/jira/browse/HADOOP-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502527#comment-14502527 ] Allen Wittenauer commented on HADOOP-11743: --- Pretty much all files should be cleaned as part of mvn clean *except* for the release directory that the releasedocs profile generates. Those should only be cleaned as part of the releasedocs profile since recreating them requires network activity. maven doesn't clean all the site files -- Key: HADOOP-11743 URL: https://issues.apache.org/jira/browse/HADOOP-11743 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: ramtin Priority: Minor Attachments: HADOOP-11743.001.patch After building the site files, performing a mvn clean -Preleasedocs doesn't actually clean everything up as git complains about untracked files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502526#comment-14502526 ] Hadoop QA commented on HADOOP-11812: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726530/HADOOP-11812.005.patch against trunk revision 5c97db0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6127//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6127//console This message is automatically generated. Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes a several-fold increase in RPC overhead and added latency. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11743) maven doesn't clean all the site files
[ https://issues.apache.org/jira/browse/HADOOP-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502538#comment-14502538 ] Hadoop QA commented on HADOOP-11743: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726514/HADOOP-11743.001.patch against trunk revision 8511d80. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6126//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6126//console This message is automatically generated. 
maven doesn't clean all the site files -- Key: HADOOP-11743 URL: https://issues.apache.org/jira/browse/HADOOP-11743 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: ramtin Priority: Minor Attachments: HADOOP-11743.001.patch After building the site files, performing a mvn clean -Preleasedocs doesn't actually clean everything up as git complains about untracked files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11090) [Umbrella] Support Java 8 in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502647#comment-14502647 ] Steve Loughran commented on HADOOP-11090: - 2.6 doesn't work on a secure JDK8 cluster. Have you seen any problems with the 2.7.0 RC0? [Umbrella] Support Java 8 in Hadoop --- Key: HADOOP-11090 URL: https://issues.apache.org/jira/browse/HADOOP-11090 Project: Hadoop Common Issue Type: New Feature Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Java 8 is coming quickly to various clusters. Making sure Hadoop seamlessly works with Java 8 is important for the Apache community. This JIRA is to track the issues/experiences encountered during Java 8 migration. If you find a potential bug, please create a separate JIRA either as a sub-task or linked into this JIRA. If you find a useful Hadoop or JVM configuration tuning, you can create a JIRA as well. Or you can add a comment here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502530#comment-14502530 ] Gera Shegalov commented on HADOOP-11812: [~cnauroth], please take a look at the 005 version with a test. Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes a several-fold increase in RPC overhead and added latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502561#comment-14502561 ] Allen Wittenauer commented on HADOOP-11746: --- bq. One thing I'm still confused by is why the original test-patch exit'd with 0 during some failure situations. Is this a bug in the original or was there a reason? I kept that behavior but it still seems wrong... I asked this question way above, but haven't gotten an answer. This still feels wrong to me and I suspect these should really be exiting with 1. rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11848) Incorrect arguments to sizeof in DomainSocket.c
Malcolm Kavalsky created HADOOP-11848: - Summary: Incorrect arguments to sizeof in DomainSocket.c Key: HADOOP-11848 URL: https://issues.apache.org/jira/browse/HADOOP-11848 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: Malcolm Kavalsky Assignee: Malcolm Kavalsky Fix For: 2.6.0 The length of the buffer to be zeroed using sizeof should use the structure itself rather than the address of the structure. DomainSocket.c line 156. Replace current: memset(addr,0,sizeof,(addr)); With: memset(addr, 0, sizeof(addr)); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11848) Incorrect arguments to sizeof in DomainSocket.c
[ https://issues.apache.org/jira/browse/HADOOP-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HADOOP-11848: -- Fix Version/s: (was: 2.6.0) Incorrect arguments to sizeof in DomainSocket.c --- Key: HADOOP-11848 URL: https://issues.apache.org/jira/browse/HADOOP-11848 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: Malcolm Kavalsky Assignee: Malcolm Kavalsky Original Estimate: 24h Remaining Estimate: 24h The length of the buffer to be zeroed using sizeof should use the structure itself rather than the address of the structure. DomainSocket.c line 156. Replace current: memset(addr,0,sizeof,(addr)); With: memset(addr, 0, sizeof(addr)); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-10282) Create a FairCallQueue: a multi-level call queue which schedules incoming calls and multiplexes outgoing calls
[ https://issues.apache.org/jira/browse/HADOOP-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502890#comment-14502890 ] Harsh J commented on HADOOP-10282: -- This patch appears to be in the 2.6.0 release onwards, I think. Can the fix version on this and other sub-task JIRAs of the parent feature be set appropriately (especially since there's no feature branch this work is going into)? Create a FairCallQueue: a multi-level call queue which schedules incoming calls and multiplexes outgoing calls -- Key: HADOOP-10282 URL: https://issues.apache.org/jira/browse/HADOOP-10282 Project: Hadoop Common Issue Type: Sub-task Reporter: Chris Li Assignee: Chris Li Attachments: HADOOP-10282.patch, HADOOP-10282.patch, HADOOP-10282.patch, HADOOP-10282.patch The FairCallQueue ensures quality of service by altering the order of RPC calls internally. It consists of three parts: 1. a Scheduler (`HistoryRpcScheduler` is provided) which provides a priority number from 0 to N (0 being highest priority) 2. a Multi-level queue (residing in `FairCallQueue`) which provides a way to keep calls in priority order internally 3. a Multiplexer (`WeightedRoundRobinMultiplexer` is provided) which provides logic to control which queue to take from Currently the Mux and Scheduler are not pluggable, but they probably should be (up for discussion). This is how it is used: // Production 1. Call is created and given to the CallQueueManager 2. CallQueueManager requests a `put(T call)` into the `FairCallQueue` which implements `BlockingQueue` 3. `FairCallQueue` asks its scheduler for a scheduling decision, which is an integer e.g. 12 4. `FairCallQueue` inserts Call into the 12th queue: `queues.get(12).put(call)` // Consumption 1. CallQueueManager requests `take()` or `poll()` on FairCallQueue 2. `FairCallQueue` asks its multiplexer for which queue to draw from, which will also be an integer e.g. 2 3. 
`FairCallQueue` draws from this queue if it has an available call (or tries other queues if it is empty) Additional information is available in the linked JIRAs regarding the Scheduler and Multiplexer's roles. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
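The production/consumption flow described above can be sketched in plain Java. This is an illustrative toy, not the actual Hadoop implementation: the class below (ToyFairCallQueue, with the multiplexer inlined) is a simplified stand-in for FairCallQueue, HistoryRpcScheduler, and WeightedRoundRobinMultiplexer, and the scheduling decision is passed in by the caller rather than computed from call history:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy multi-level call queue: on offer(), a scheduling decision picks the
// priority level; on poll(), a weighted round-robin multiplexer picks which
// level to drain, falling through to other levels when one is empty.
class ToyFairCallQueue<T> {
    private final List<BlockingQueue<T>> queues = new ArrayList<>();
    private final int[] weights;   // draws each level gets per round-robin turn
    private int currentLevel = 0;
    private int drawsLeft;

    ToyFairCallQueue(int levels, int capacityPerLevel, int[] weights) {
        for (int i = 0; i < levels; i++) {
            queues.add(new ArrayBlockingQueue<>(capacityPerLevel));
        }
        this.weights = weights;
        this.drawsLeft = weights[0];
    }

    // Stand-in for the Scheduler: the priority decision is made by the caller.
    boolean offer(T call, int priority) {
        return queues.get(priority).offer(call);
    }

    // Stand-in for the Multiplexer: weighted round-robin across levels.
    T poll() {
        for (int scanned = 0; scanned < queues.size(); scanned++) {
            if (drawsLeft == 0) {
                advance();
            }
            T call = queues.get(currentLevel).poll();
            if (call != null) {
                drawsLeft--;
                return call;
            }
            advance();   // current level is empty: try the next one
        }
        return null;     // all levels empty
    }

    private void advance() {
        currentLevel = (currentLevel + 1) % queues.size();
        drawsLeft = weights[currentLevel];
    }
}
```

With two levels weighted {2, 1}, two calls are served from the high-priority level for every one from the low-priority level as long as both have work, which is the fairness property the Scheduler/Multiplexer split is after.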
[jira] [Updated] (HADOOP-11848) Incorrect arguments to sizeof in DomainSocket.c
[ https://issues.apache.org/jira/browse/HADOOP-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HADOOP-11848: --- Component/s: native Incorrect arguments to sizeof in DomainSocket.c --- Key: HADOOP-11848 URL: https://issues.apache.org/jira/browse/HADOOP-11848 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 2.6.0 Reporter: Malcolm Kavalsky Assignee: Malcolm Kavalsky Original Estimate: 24h Remaining Estimate: 24h The length of the buffer to be zeroed using sizeof should use the structure itself rather than the address of the structure. DomainSocket.c line 156. Replace current: memset(addr,0,sizeof,(addr)); With: memset(addr, 0, sizeof(addr)); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11843) Make setting up the build environment easier
[ https://issues.apache.org/jira/browse/HADOOP-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503306#comment-14503306 ] Darrell Taylor commented on HADOOP-11843: - Some food for thought. I just wanted to run {code} dev-support/test-patch.sh /path/to/my.patch {code} as recommended on the wiki (http://wiki.apache.org/hadoop/HowToContribute). But I have my patch outside of the source directory that is 'mounted' by the container (i.e. ~/hadoop), so I can't access the .patch file from within the container. I could copy the file into the ~/hadoop directory I guess, but that feels a little bit dirty. Any recommendations on what to do here? Make setting up the build environment easier Key: HADOOP-11843 URL: https://issues.apache.org/jira/browse/HADOOP-11843 Project: Hadoop Common Issue Type: New Feature Reporter: Niels Basjes Assignee: Niels Basjes Attachments: HADOOP-11843-2015-04-17-1612.patch, HADOOP-11843-2015-04-17-2226.patch, HADOOP-11843-2015-04-17-2308.patch, HADOOP-11843-2015-04-19-2206.patch, HADOOP-11843-2015-04-19-2232.patch ( As discussed with [~aw] ) In AVRO-1537 a docker based solution was created to setup all the tools for doing a full build. This enables much easier reproduction of any issues and getting up and running for new developers. This issue is to 'copy/port' that setup into the hadoop project in preparation for the bug squash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-11746: -- Attachment: HADOOP-11746-20.patch Yeah I think it was probably a mistake. Here's -20 which changes all of those to return 1, thus failing the jenkins build. I guess we'll see what happens. rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch, HADOOP-11746-20.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503068#comment-14503068 ] Sean Busbey commented on HADOOP-11746: -- probably an oversight? exiting with 0 in those conditions would prevent the jenkins job from registering as a failure, but I'm not sure why that would be desirable. rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11813) releasedocmaker.py should use today's date instead of unreleased
[ https://issues.apache.org/jira/browse/HADOOP-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503326#comment-14503326 ] Hadoop QA commented on HADOOP-11813: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726606/HADOOP-11813.patch against trunk revision c17cd4f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6128//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6128//console This message is automatically generated. 
releasedocmaker.py should use today's date instead of unreleased Key: HADOOP-11813 URL: https://issues.apache.org/jira/browse/HADOOP-11813 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Darrell Taylor Priority: Minor Labels: newbie Attachments: HADOOP-11813.patch After discussing with a few folks, it'd be more convenient if releasedocmaker used the current date rather than unreleased when processing a version that JIRA hasn't declared released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11813) releasedocmaker.py should use today's date instead of unreleased
[ https://issues.apache.org/jira/browse/HADOOP-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Darrell Taylor updated HADOOP-11813: Status: Patch Available (was: Open) * --usetoday option added * Small fix to escape asterisks; these were breaking the tables * pom.xml updated to use the new argument releasedocmaker.py should use today's date instead of unreleased Key: HADOOP-11813 URL: https://issues.apache.org/jira/browse/HADOOP-11813 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Darrell Taylor Priority: Minor Labels: newbie After discussing with a few folks, it'd be more convenient if releasedocmaker used the current date rather than unreleased when processing a version that JIRA hasn't declared released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11813) releasedocmaker.py should use today's date instead of unreleased
[ https://issues.apache.org/jira/browse/HADOOP-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Darrell Taylor updated HADOOP-11813: Attachment: HADOOP-11813.patch releasedocmaker.py should use today's date instead of unreleased Key: HADOOP-11813 URL: https://issues.apache.org/jira/browse/HADOOP-11813 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Darrell Taylor Priority: Minor Labels: newbie Attachments: HADOOP-11813.patch After discussing with a few folks, it'd be more convenient if releasedocmaker used the current date rather than unreleased when processing a version that JIRA hasn't declared released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503395#comment-14503395 ] Chris Nauroth commented on HADOOP-11812: [~jira.shegalov], thanks for adding the new test in v005. It looks good. However, comparing to v004, it also looks like Laurent's earlier feedback on the {{fixFileStatus}} method got dropped. Was that intentional? Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes a several-fold increase in RPC overhead and added latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
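The speed-up at stake in this JIRA comes from collapsing the per-file block-location lookups into the directory listing itself. A toy Java model of the RPC counts (illustrative only; the real calls are FileSystem#listStatus, FileSystem#getFileBlockLocations, and the batched FileSystem#listLocatedStatus from MAPREDUCE-1981, and the directory contents here are made up):

```java
import java.util.List;
import java.util.Map;

// Toy model of split-calculation cost: a plain listing needs one RPC for the
// directory plus one RPC per file for block locations, while a located
// listing returns statuses and locations together.
class LocatedListingDemo {
    // Pretend directory: file name -> hosts holding its blocks.
    static final Map<String, List<String>> DIR = Map.of(
            "part-0", List.of("host1", "host2"),
            "part-1", List.of("host2", "host3"),
            "part-2", List.of("host1", "host3"));

    // listStatus, then getFileBlockLocations per file: 1 + N RPCs.
    static int rpcsWithPlainListing() {
        int rpcs = 1;                 // the listStatus call itself
        for (String file : DIR.keySet()) {
            rpcs++;                   // one getFileBlockLocations per file
        }
        return rpcs;
    }

    // listLocatedStatus: locations ride along with the listing.
    static int rpcsWithLocatedListing() {
        return 1;
    }
}
```

In this toy model a directory of N files costs N+1 RPCs versus 1, which is roughly why delegating listLocatedStatus through ViewFileSystem matters for job-submission latency.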
[jira] [Commented] (HADOOP-11746) rewrite test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503394#comment-14503394 ] Hadoop QA commented on HADOOP-11746: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726614/HADOOP-11746-20.patch against trunk revision f967fd2. {color:red}-1 @author{color}. The patch appears to contain 13 @author tags which the Hadoop community has agreed to not allow in code contributions. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/6129//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/6129//console This message is automatically generated. 
rewrite test-patch.sh - Key: HADOOP-11746 URL: https://issues.apache.org/jira/browse/HADOOP-11746 Project: Hadoop Common Issue Type: Test Components: build, test Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-11746-00.patch, HADOOP-11746-01.patch, HADOOP-11746-02.patch, HADOOP-11746-03.patch, HADOOP-11746-04.patch, HADOOP-11746-05.patch, HADOOP-11746-06.patch, HADOOP-11746-07.patch, HADOOP-11746-09.patch, HADOOP-11746-10.patch, HADOOP-11746-11.patch, HADOOP-11746-12.patch, HADOOP-11746-13.patch, HADOOP-11746-14.patch, HADOOP-11746-15.patch, HADOOP-11746-16.patch, HADOOP-11746-17.patch, HADOOP-11746-18.patch, HADOOP-11746-19.patch, HADOOP-11746-20.patch This code is bad and you should feel bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11849) There should be a JUnit test runner which automatically initializes the mini cluster
[ https://issues.apache.org/jira/browse/HADOOP-11849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Rabe updated HADOOP-11849: --- Description: Setting up the mini cluster is not a big deal, but when writing a larger number of integration tests, one quickly has a lot of code duplication. Creating abstract test classes which set up the cluster is a feasible solution, but then there is the problem with the order of *BeforeClass*, especially when a lengthy MR job is to be run exactly once and several tests then check the output data. It would be nice if the following were possible: {code:java} @RunWith(MiniCluster.class) public class SomeLengthyIntegrationTest { @Conf private static Configuration _conf; @FS private static FileSystem _fs; @BeforeClass public static void runMRJob() throws Exception { final Job job = Job.getInstance(_conf); // ... set up the job assertTrue("Job was successful", job.waitForCompletion(true)); } @Test public void testSomething() { // a test } // more tests } {code} Maybe there could be more annotation-driven configuration options. A nice example for such an approach is the *SpringJUnit4ClassRunner* from the Spring framework. was: Setting up the mini cluster is not a big deal, but when writing a larger number of integration tests, one quickly has a lot of code duplication. Creating abstract test classes which set up the cluster is a feasible solution, but then there is the problem with the order of *BeforeClass*, especially when a lengthy MR job is to be run exactly once and several tests then check the output data. It would be nice if the following were possible: {code:java} @RunWith(MiniCluster.class) public class SomeLengthyIntegrationTest { @Conf private static Configuration _conf; @FS private static FileSystem _fs; @BeforeClass public static void runMRJob() throws Exception { final Job job = Job.getInstance(_conf); // ... 
set up the job assertTrue("Job was successful", job.waitForCompletion(true)); } @Test public void testSomething() { // a test } // more tests } Maybe there could be more annotation-driven configuration options. A nice example for such an approach is the *SpringJUnit4ClassRunner* from the Spring framework. There should be a JUnit test runner which automatically initializes the mini cluster Key: HADOOP-11849 URL: https://issues.apache.org/jira/browse/HADOOP-11849 Project: Hadoop Common Issue Type: Improvement Components: test Reporter: Jens Rabe Priority: Minor Setting up the mini cluster is not a big deal, but when writing a larger number of integration tests, one quickly has a lot of code duplication. Creating abstract test classes which set up the cluster is a feasible solution, but then there is the problem with the order of *BeforeClass*, especially when a lengthy MR job is to be run exactly once and several tests then check the output data. It would be nice if the following were possible: {code:java} @RunWith(MiniCluster.class) public class SomeLengthyIntegrationTest { @Conf private static Configuration _conf; @FS private static FileSystem _fs; @BeforeClass public static void runMRJob() throws Exception { final Job job = Job.getInstance(_conf); // ... set up the job assertTrue("Job was successful", job.waitForCompletion(true)); } @Test public void testSomething() { // a test } // more tests } {code} Maybe there could be more annotation-driven configuration options. A nice example for such an approach is the *SpringJUnit4ClassRunner* from the Spring framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11849) There should be a JUnit test runner which automatically initializes the mini cluster
Jens Rabe created HADOOP-11849: -- Summary: There should be a JUnit test runner which automatically initializes the mini cluster Key: HADOOP-11849 URL: https://issues.apache.org/jira/browse/HADOOP-11849 Project: Hadoop Common Issue Type: Improvement Components: test Reporter: Jens Rabe Priority: Minor Setting up the mini cluster is not a big deal, but when writing a larger number of integration tests, one quickly has a lot of code duplication. Creating abstract test classes which set up the cluster is a feasible solution, but then there is the problem with the order of *BeforeClass*, especially when a lengthy MR job is to be run exactly once and several tests then check the output data. It would be nice if the following were possible: {code:java} @RunWith(MiniCluster.class) public class SomeLengthyIntegrationTest { @Conf private static Configuration _conf; @FS private static FileSystem _fs; @BeforeClass public static void runMRJob() throws Exception { final Job job = Job.getInstance(_conf); // ... set up the job assertTrue("Job was successful", job.waitForCompletion(true)); } @Test public void testSomething() { // a test } // more tests } Maybe there could be more annotation-driven configuration options. A nice example for such an approach is the *SpringJUnit4ClassRunner* from the Spring framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11844) SLS docs point to invalid rumen link
[ https://issues.apache.org/jira/browse/HADOOP-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502392#comment-14502392 ] J.Andreina commented on HADOOP-11844: - Thanks [~aw] for raising this issue. Will soon update the patch. SLS docs point to invalid rumen link Key: HADOOP-11844 URL: https://issues.apache.org/jira/browse/HADOOP-11844 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Allen Wittenauer Assignee: J.Andreina Priority: Trivial Labels: newbie SchedulerLoadSimulator, at least on 2.6.0, points to an invalid link to rumen. Need to verify and potentially fix this link in newer releases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11850) Typos in hadoop-common java docs
surendra singh lilhore created HADOOP-11850: --- Summary: Typos in hadoop-common java docs Key: HADOOP-11850 URL: https://issues.apache.org/jira/browse/HADOOP-11850 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Priority: Minor This jira will fix the typos in the hadoop-common project -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503465#comment-14503465 ] Chris Nauroth commented on HADOOP-11812: Thanks, that makes sense now. I missed the javac warning from the v004 Jenkins run. I am +1 for v005 pending a Jenkins run on that version. Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes several-x RPC overhead and added latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
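For background, MAPREDUCE-1981 let split calculation fetch block locations together with the directory listing, rather than issuing one extra RPC per file. A toy model of the RPC counts involved; this is plain Java with made-up method bodies, not the Hadoop FileSystem API, and the location strings are placeholders:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy namenode: counts "RPCs" so the two listing strategies can be compared.
public class LocatedListingDemo {
    int rpcCount = 0;
    private final Map<String, String> locations = new HashMap<>();

    LocatedListingDemo(String... files) {
        for (String f : files) locations.put(f, "host-for-" + f);
    }

    // Plain listing: one RPC, returns file names only.
    List<String> listStatus() {
        rpcCount++;
        return new ArrayList<>(locations.keySet());
    }

    // Block locations for one file: one RPC per call.
    String getFileBlockLocations(String file) {
        rpcCount++;
        return locations.get(file);
    }

    // Located listing: one RPC, returns names with locations attached.
    Map<String, String> listLocatedStatus() {
        rpcCount++;
        return new HashMap<>(locations);
    }
}
```

With N files, the `listStatus` path costs N+1 RPCs while `listLocatedStatus` costs 1, which is the "several-x" overhead a ViewFileSystem that falls back to the plain path would pay.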
[jira] [Commented] (HADOOP-11812) Implement listLocatedStatus for ViewFileSystem to speed up split calculation
[ https://issues.apache.org/jira/browse/HADOOP-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503458#comment-14503458 ] Gera Shegalov commented on HADOOP-11812: [~cnauroth], yes, unfortunately I had to undo [~laurentgo]'s suggestion because it required that unchecked cast. The method does not quite do what is expected by this pattern for the file case: in-class is the same as out-class. In the end, I was basically choosing between two alternatives: the one from 003 and the one from 004. They both required just one cast, but 004's is unchecked with a new warning. Implement listLocatedStatus for ViewFileSystem to speed up split calculation Key: HADOOP-11812 URL: https://issues.apache.org/jira/browse/HADOOP-11812 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Labels: performance Attachments: HADOOP-11812.001.patch, HADOOP-11812.002.patch, HADOOP-11812.003.patch, HADOOP-11812.004.patch, HADOOP-11812.005.patch ViewFileSystem is currently not taking advantage of MAPREDUCE-1981. This causes several-x RPC overhead and added latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
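The checked-versus-unchecked distinction discussed in this thread comes from Java's type erasure. A self-contained illustration, unrelated to the actual patch code:

```java
import java.util.List;

public class CastDemo {

    // Checked cast: List<?> is a reifiable type, so the JVM can verify the
    // cast at runtime and javac emits no warning.
    static List<?> checkedCast(Object o) {
        return (List<?>) o;
    }

    // Unchecked cast: the element type String is erased at runtime, so the
    // compiler cannot verify it; without @SuppressWarnings this line produces
    // an "unchecked cast" javac warning, the kind of new warning that showed
    // up in a Jenkins run here.
    @SuppressWarnings("unchecked")
    static List<String> uncheckedCast(Object o) {
        return (List<String>) o;
    }
}
```

An unchecked cast also succeeds at runtime even if the element type is wrong; the failure surfaces later as a ClassCastException at the use site, which is why reviewers generally prefer the checked variant when both need only one cast.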
[jira] [Updated] (HADOOP-11103) Clean up RemoteException
[ https://issues.apache.org/jira/browse/HADOOP-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HADOOP-11103: - Status: Open (was: Patch Available)

Clean up RemoteException

Key: HADOOP-11103
URL: https://issues.apache.org/jira/browse/HADOOP-11103
Project: Hadoop Common
Issue Type: Improvement
Components: ipc
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Trivial
Attachments: HADOOP-11103.1.patch

RemoteException has a number of undocumented behaviors:
* o.a.h.ipc.RemoteException has no javadocs on getClassName. Reading the source, the String returned is the class name of the wrapped remote exception.
* RemoteException(String, String) is equivalent to calling RemoteException(String, String, null).
* Constructors allow null for all arguments.
* Some of the test code doesn't check that error codes correspond with the wrapped exception type.
* Methods don't document when they might return null.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
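The javadoc being requested could make those behaviors explicit. A sketch on a stand-in class, not the real o.a.h.ipc.RemoteException, showing how the constructor equivalence, null-tolerance, and getClassName semantics might be documented:

```java
/**
 * A sketch (stand-in, not org.apache.hadoop.ipc.RemoteException) of the
 * javadoc this issue asks for, making the listed behaviors explicit.
 */
public class RemoteExceptionSketch extends Exception {

    private final String className;

    /**
     * Equivalent to {@code new RemoteExceptionSketch(className, msg, null)}.
     * All arguments may be null.
     */
    public RemoteExceptionSketch(String className, String msg) {
        this(className, msg, null);
    }

    /** All arguments may be null. */
    public RemoteExceptionSketch(String className, String msg, Throwable cause) {
        super(msg, cause);
        this.className = className;
    }

    /**
     * Returns the class name of the wrapped remote exception, i.e. the
     * exception type raised on the server side; may be null.
     */
    public String getClassName() {
        return className;
    }
}
```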
[jira] [Updated] (HADOOP-11850) Typos in hadoop-common java docs
[ https://issues.apache.org/jira/browse/HADOOP-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HADOOP-11850: Attachment: HADOOP-11850.patch Typos in hadoop-common java docs Key: HADOOP-11850 URL: https://issues.apache.org/jira/browse/HADOOP-11850 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Priority: Minor Attachments: HADOOP-11850.patch This jira will fix the typos in the hadoop-common project -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11103) Clean up RemoteException
[ https://issues.apache.org/jira/browse/HADOOP-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HADOOP-11103: - Status: Patch Available (was: Open)

Clean up RemoteException

Key: HADOOP-11103
URL: https://issues.apache.org/jira/browse/HADOOP-11103
Project: Hadoop Common
Issue Type: Improvement
Components: ipc
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Trivial
Attachments: HADOOP-11103.1.patch, HADOOP-11103.2.patch

RemoteException has a number of undocumented behaviors:
* o.a.h.ipc.RemoteException has no javadocs on getClassName. Reading the source, the String returned is the class name of the wrapped remote exception.
* RemoteException(String, String) is equivalent to calling RemoteException(String, String, null).
* Constructors allow null for all arguments.
* Some of the test code doesn't check that error codes correspond with the wrapped exception type.
* Methods don't document when they might return null.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11090) [Umbrella] Support Java 8 in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502405#comment-14502405 ] Alexey Tomin commented on HADOOP-11090: --- The upcoming release of Java 7 update 80 (April 2015) marks the last public release in Oracle's JDK 7 family. Hadoop is NOT JDK8 ready :( [Umbrella] Support Java 8 in Hadoop --- Key: HADOOP-11090 URL: https://issues.apache.org/jira/browse/HADOOP-11090 Project: Hadoop Common Issue Type: New Feature Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Java 8 is coming quickly to various clusters. Making sure Hadoop seamlessly works with Java 8 is important for the Apache community. This JIRA is to track the issues/experiences encountered during Java 8 migration. If you find a potential bug, please create a separate JIRA either as a sub-task or linked into this JIRA. If you find a useful Hadoop or JVM configuration tuning, you can create a JIRA as well, or just add a comment here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)