[jira] [Commented] (HADOOP-10103) update commons-lang to 2.6
[ https://issues.apache.org/jira/browse/HADOOP-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827488#comment-13827488 ] Steve Loughran commented on HADOOP-10103: - There's a trick to getting jenkins to run the tests for you: submit the same patch to HDFS, YARN, MAPREDUCE: see HADOOP-10101 I'm explicitly doing that where there are code changes, otherwise just running the tests locally update commons-lang to 2.6 -- Key: HADOOP-10103 URL: https://issues.apache.org/jira/browse/HADOOP-10103 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 2.3.0 Reporter: Steve Loughran Assignee: Akira AJISAKA Priority: Minor Attachments: HADOOP-10103.patch update commons-lang from 2.5 to 2.6 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated HADOOP-9867: -- Attachment: HADOOP-9867.patch Attaching a patch with the test mentioned by Jason. Reading one more record if the split ends between the delimiter bytes. Please review. org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2, 0.23.9, 2.2.0 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Priority: Critical Attachments: HADOOP-9867.patch Having defined a recorddelimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a recordseparator. Starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first recorddelimiter is ignored (due to the presumption that this record is already handled by the previous maptask). Since the record delimiter is multibyte the seek only got the last byte of the delimiter into scope and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!!) -- This message was sent by Atlassian JIRA (v6.1#6144)
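To make the failure mode described above concrete, here is a minimal standalone Java sketch — not Hadoop code; the sample data, delimiter, and split offsets are invented for illustration — of how seeking to start - 1 with a two-byte delimiter causes a whole record to be skipped:
{code}
// Standalone illustration of the HADOOP-9867 bug; the data, delimiter,
// and split offsets are made up for this demo.
public class MultibyteDelimiterSkipDemo {
  public static void main(String[] args) {
    String data = "recordA||recordB||recordC";
    String delim = "||";
    // The previous split ended exactly after "recordA||", so this split
    // starts at offset 9 with skipFirstLine == true.
    int start = "recordA||".length();
    // LineRecordReader-style logic: seek to start - 1 and discard text up to
    // the first delimiter, presuming the previous map task consumed it.
    int seekPos = start - 1;                 // lands on the delimiter's LAST byte
    int next = data.indexOf(delim, seekPos); // a lone "|" is not a full match,
                                             // so this finds the "||" after recordB
    int firstRecordStart = next + delim.length();
    // Prints "recordC": recordB has been silently dropped.
    System.out.println(data.substring(firstRecordStart));
  }
}
{code}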
[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated HADOOP-9867: -- Attachment: HADOOP-9867.patch Updated to fix a possible NPE org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2, 0.23.9, 2.2.0 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Priority: Critical Attachments: HADOOP-9867.patch, HADOOP-9867.patch Having defined a recorddelimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a recordseparator. Starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first recorddelimiter is ignored (due to the presumption that this record is already handled by the previous maptask). Since the record delimiter is multibyte the seek only got the last byte of the delimiter into scope and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!!) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated HADOOP-9867: -- Assignee: Vinay Status: Patch Available (was: Open) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.2.0, 0.23.9, 0.20.2 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Assignee: Vinay Priority: Critical Attachments: HADOOP-9867.patch, HADOOP-9867.patch Having defined a recorddelimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a recordseparator. Starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first recorddelimiter is ignored (due to the presumption that this record is already handled by the previous maptask). Since the record delimiter is multibyte the seek only got the last byte of the delimiter into scope and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!!) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop
[ https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827539#comment-13827539 ] Hudson commented on HADOOP-10047: - SUCCESS: Integrated in Hadoop-Yarn-trunk #397 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/397/]) HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Revert HADOOP-10047, wrong patch. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V.
(acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Add a directbuffer Decompressor API to hadoop - Key: HADOOP-10047 URL: https://issues.apache.org/jira/browse/HADOOP-10047 Project: Hadoop Common Issue Type: New Feature Components: io Affects Versions: 2.3.0 Reporter: Gopal V Assignee: Gopal V Labels: compression Fix For: 2.3.0 Attachments: DirectCompressor.html, DirectDecompressor.html, HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, HADOOP-10047-with-tests.patch, decompress-benchmark.tgz With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform all I/O operations without copying data into byte[] buffers or other buffers which wrap over them. This is a proposal for adding a DirectDecompressor interface to the io.compress, to indicate codecs which want to surface the direct buffer layer upwards. The implementation should work with direct heap/mmap buffers and cannot assume .array() availability. -- This message was sent by Atlassian JIRA (v6.1#6144)
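For readers following the API discussion: the description above sketches the contract, and a minimal Java rendering of such an interface might look like the following (a sketch of the proposal, not necessarily the exact signatures committed to org.apache.hadoop.io.compress):
{code}
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch of the proposed direct-buffer decompression contract; the committed
// DirectDecompressor/DirectDecompressionCodec may differ in detail.
public interface DirectDecompressorSketch {
  /**
   * Decompress everything readable in src into dst without staging the data
   * through byte[] copies. Implementations must not call src.array() or
   * dst.array(): direct and mmap'd buffers need not expose a backing array.
   */
  void decompress(ByteBuffer src, ByteBuffer dst) throws IOException;
}
{code}
A codec that can supply such a decompressor would advertise it via a companion codec interface, which is how callers discover whether the zero-copy path is available.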
[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827567#comment-13827567 ] Hadoop QA commented on HADOOP-9867: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614864/HADOOP-9867.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapred.TestJobCleanup {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3302//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3302//console This message is automatically generated. org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2, 0.23.9, 2.2.0 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Assignee: Vinay Priority: Critical Attachments: HADOOP-9867.patch, HADOOP-9867.patch Having defined a recorddelimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a recordseparator. Starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first recorddelimiter is ignored (due to the presumption that this record is already handled by the previous maptask). Since the record delimiter is multibyte the seek only got the last byte of the delimiter into scope and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!!) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop
[ https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827628#comment-13827628 ] Hudson commented on HADOOP-10047: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1588 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1588/]) HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Revert HADOOP-10047, wrong patch. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V.
(acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Add a directbuffer Decompressor API to hadoop - Key: HADOOP-10047 URL: https://issues.apache.org/jira/browse/HADOOP-10047 Project: Hadoop Common Issue Type: New Feature Components: io Affects Versions: 2.3.0 Reporter: Gopal V Assignee: Gopal V Labels: compression Fix For: 2.3.0 Attachments: DirectCompressor.html, DirectDecompressor.html, HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, HADOOP-10047-with-tests.patch, decompress-benchmark.tgz With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform all I/O operations without copying data into byte[] buffers or other buffers which wrap over them. This is a proposal for adding a DirectDecompressor interface to the io.compress, to indicate codecs which want to surface the direct buffer layer upwards. The implementation should work with direct heap/mmap buffers and cannot assume .array() availability. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop
[ https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827640#comment-13827640 ] Hudson commented on HADOOP-10047: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1614 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1614/]) HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Revert HADOOP-10047, wrong patch. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by Gopal V.
(acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java Add a directbuffer Decompressor API to hadoop - Key: HADOOP-10047 URL: https://issues.apache.org/jira/browse/HADOOP-10047 Project: Hadoop Common Issue Type: New Feature Components: io Affects Versions: 2.3.0 Reporter: Gopal V Assignee: Gopal V Labels: compression Fix For: 2.3.0 Attachments: DirectCompressor.html, DirectDecompressor.html, HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, HADOOP-10047-with-tests.patch, decompress-benchmark.tgz With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform all I/O operations without copying data into byte[] buffers or other buffers which wrap over them. This is a proposal for adding a DirectDecompressor interface to the io.compress, to indicate codecs which want to surface the direct buffer layer upwards. The implementation should work with direct heap/mmap buffers and cannot assume .array() availability. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10103) update commons-lang to 2.6
[ https://issues.apache.org/jira/browse/HADOOP-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827731#comment-13827731 ] Akira AJISAKA commented on HADOOP-10103: Thank you for sharing! If there is a need to change the code, I'll use the trick. I'm running the tests locally. update commons-lang to 2.6 -- Key: HADOOP-10103 URL: https://issues.apache.org/jira/browse/HADOOP-10103 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 2.3.0 Reporter: Steve Loughran Assignee: Akira AJISAKA Priority: Minor Attachments: HADOOP-10103.patch update commons-lang from 2.5 to 2.6 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827795#comment-13827795 ] Jason Lowe commented on HADOOP-9867: Thanks for the patch, Vinay. I think this approach can work when the input is uncompressed, however I don't think it will work for block-compressed inputs. Block codecs often report the file position as being the start of the codec block and then it teleports to the byte position of the next block once the first byte of the next block is consumed. See HADOOP-9622 for a similar issue with the default delimiter and how it's being addressed. Also getFilePosition() for a compressed input is returning a compressed stream offset, so if we try to do math on that with an uncompressed delimiter length we're mixing different units. Since LineRecordReader::getFilePosition() can mean different things for different inputs, I think a better approach would be to change LineReader (not LineRecordReader) so the reported file position for multi-byte custom delimiters is the file position after the record but not including its delimiter. Either that or wait for HADOOP-9622 to be committed and update the SplitLineReader interface from the HADOOP-9622 patch so the uncompressed input reader would indicate an additional record needs to be read if the split ends mid-delimiter. org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2, 0.23.9, 2.2.0 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Assignee: Vinay Priority: Critical Attachments: HADOOP-9867.patch, HADOOP-9867.patch Having defined a recorddelimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a recordseparator. Starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first recorddelimiter is ignored (due to the presumption that this record is already handled by the previous maptask). Since the record delimiter is multibyte the seek only got the last byte of the delimiter into scope and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!!) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value
[ https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827892#comment-13827892 ] Jonathan Eagles commented on HADOOP-10111: -- Looks promising for reducing datanode startup time, Kihwal. Couple of minor things. - Be consistent with the long literal _this(path, interval, -1)_ vs _this(path, interval, -1L)_ - Currently the tests don't test the new functionality of the initial value. Is this a better fit here or in HDFS-5498? Allow DU to be initialized with an initial value Key: HADOOP-10111 URL: https://issues.apache.org/jira/browse/HADOOP-10111 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10111.patch When a DU object is created, the du command runs right away. If the target directory contains a huge number of files and directories, its constructor may not return for many seconds. It will be nice if it can be told to delay the initial scan and use a specified initial used value. -- This message was sent by Atlassian JIRA (v6.1#6144)
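A rough sketch of the overload being discussed — the constructor shape and the -1L "no initial value" sentinel are inferred from the review comment above, not taken from the attached patch:
{code}
import java.io.File;
import java.io.IOException;

// Illustrative sketch of DU with an optional initial value; names and the
// -1L sentinel are assumptions based on the review comment above.
class DUSketch {
  private final File path;
  private final long refreshIntervalMs;
  private long usedBytes; // periodically refreshed by a background "du"

  DUSketch(File path, long intervalMs) throws IOException {
    this(path, intervalMs, -1L); // preserve old behavior: scan immediately
  }

  DUSketch(File path, long intervalMs, long initialUsed) throws IOException {
    this.path = path;
    this.refreshIntervalMs = intervalMs;
    if (initialUsed < 0) {
      this.usedBytes = runDu(); // potentially slow on huge directory trees
    } else {
      this.usedBytes = initialUsed; // defer the first scan to the refresh thread
    }
  }

  private long runDu() throws IOException {
    // stand-in for exec'ing "du -sk <path>" and parsing the output
    return 0L;
  }
}
{code}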
[jira] [Updated] (HADOOP-10118) FsShell never interpret --
[ https://issues.apache.org/jira/browse/HADOOP-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HADOOP-10118: Summary: FsShell never interpret -- (was: CommandFormat never parse --) FsShell never interpret -- Key: HADOOP-10118 URL: https://issues.apache.org/jira/browse/HADOOP-10118 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 3.0.0 Reporter: Kousuke Saruta We cannot use the "--" option to stop option processing for the args following it. CommandFormat#parse is implemented as follows.
{code}
public void parse(List<String> args) {
  ...
  } else if (arg.equals("--")) { // force end of option processing
    args.remove(pos);
    break;
  }
  ...
{code}
But FsShell is called through ToolRunner, and ToolRunner uses GenericOptionsParser. GenericOptionsParser uses GnuParser, which discards "--" when parsing args. -- This message was sent by Atlassian JIRA (v6.1#6144)
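The last sentence is easy to check against commons-cli directly. A small demo — assuming commons-cli on the classpath; the class name and argv values are invented for the demo — showing GnuParser dropping the "--" token before FsShell's CommandFormat could ever see it:
{code}
import java.util.Arrays;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.GnuParser;
import org.apache.commons.cli.Options;

// Demo of the behavior described above: GnuParser (which
// GenericOptionsParser delegates to) swallows the "--" token.
public class DoubleDashDropDemo {
  public static void main(String[] args) throws Exception {
    String[] argv = {"--", "-fileNameStartingWithDash"};
    CommandLine cl = new GnuParser().parse(new Options(), argv, true);
    // Prints [-fileNameStartingWithDash]: "--" was consumed by the parser,
    // so the CommandFormat#parse branch above is unreachable.
    System.out.println(Arrays.asList(cl.getArgs()));
  }
}
{code}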
[jira] [Updated] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10075: -- Status: Patch Available (was: Open) Update jetty dependency to version 9 Key: HADOOP-10075 URL: https://issues.apache.org/jira/browse/HADOOP-10075 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Rati Attachments: HADOOP-10075.patch Jetty6 is no longer maintained. Update the dependency to jetty9. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10075: -- Assignee: Robert Rati Update jetty dependency to version 9 Key: HADOOP-10075 URL: https://issues.apache.org/jira/browse/HADOOP-10075 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Rati Assignee: Robert Rati Attachments: HADOOP-10075.patch Jetty6 is no longer maintained. Update the dependency to jetty9. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827937#comment-13827937 ] Colin Patrick McCabe commented on HADOOP-10075: --- be sure to hit submit patch so that you will get a jenkins run on this. Update jetty dependency to version 9 Key: HADOOP-10075 URL: https://issues.apache.org/jira/browse/HADOOP-10075 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Rati Assignee: Robert Rati Attachments: HADOOP-10075.patch Jetty6 is no longer maintained. Update the dependency to jetty9. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827946#comment-13827946 ] Hadoop QA commented on HADOOP-10075: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610564/HADOOP-10075.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3303//console This message is automatically generated. Update jetty dependency to version 9 Key: HADOOP-10075 URL: https://issues.apache.org/jira/browse/HADOOP-10075 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Rati Assignee: Robert Rati Attachments: HADOOP-10075.patch Jetty6 is no longer maintained. Update the dependency to jetty9. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827956#comment-13827956 ] Colin Patrick McCabe commented on HADOOP-10075: --- Thanks for looking at this. I think you will need to re-generate the patch, since it failed to apply on jenkins.
{code}
--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestSSLHttpServer.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestSSLHttpServer.java
@@ -76,6 +76,7 @@ public void setup() throws Exception {
   conf.setInt(HttpServer.HTTP_MAX_THREADS, 10);
   conf.addResource(CONFIG_SITE_XML);
+  conf.addResource(conf.get("hadoop.ssl.server.conf", "ssl-server.xml"));
   server = createServer("test", conf);
   server.addServlet("echo", "/echo", TestHttpServer.EchoServlet.class);
   server.start();
{code}
Why do we need this addition?
{code}
-    InetAddress.getByName(server.getConnectors()[0].getHost());
-    int port = server.getConnectors()[0].getPort();
+    InetAddress.getByName(((ServerConnector) server.getConnectors()[0]).getHost());
+    int port = ((ServerConnector) server.getConnectors()[0]).getPort();
{code}
I see a lot of new typecasts like this. Is it possible to avoid these? If not, could we have an accessor function that makes this easier to read? Thanks. Update jetty dependency to version 9 Key: HADOOP-10075 URL: https://issues.apache.org/jira/browse/HADOOP-10075 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Rati Assignee: Robert Rati Attachments: HADOOP-10075.patch Jetty6 is no longer maintained. Update the dependency to jetty9. -- This message was sent by Atlassian JIRA (v6.1#6144)
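On the typecast question, one possible shape for such an accessor — the helper name and placement are suggestions sketched here, not code from the attached patch:
{code}
import org.eclipse.jetty.server.Connector;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

// Possible accessor for the repeated (ServerConnector) casts; a sketch of
// the review suggestion above, not code from the attached patch.
final class JettyConnectorUtil {
  private JettyConnectorUtil() {}

  static ServerConnector firstConnector(Server server) {
    Connector[] connectors = server.getConnectors();
    // In Jetty 9 the network listener is a ServerConnector; fail loudly
    // rather than letting a ClassCastException surface at each call site.
    if (connectors.length == 0 || !(connectors[0] instanceof ServerConnector)) {
      throw new IllegalStateException("expected a ServerConnector");
    }
    return (ServerConnector) connectors[0];
  }
}
{code}
Call sites would then read JettyConnectorUtil.firstConnector(server).getPort() instead of repeating the cast.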
[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap
[ https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828048#comment-13828048 ] Tsz Wo (Nicholas), SZE commented on HADOOP-9478: After this change, I somehow get NoClassDefFoundError: org/apache/commons/collections/map/UnmodifiableMap when I run any test under trunk/hadoop-hdfs-project/hadoop-hdfs. Running tests under project root (i.e. trunk/) is fine. I wonder if it is a problem in my local environment. Do you get the same thing?
{noformat}
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 22, Failures: 0, Errors: 20, Skipped: 2, Time elapsed: 0.161 sec FAILURE! - in org.apache.hadoop.hdfs.TestFileCreation
testServerDefaults(org.apache.hadoop.hdfs.TestFileCreation) Time elapsed: 0.016 sec ERROR!
java.lang.NoClassDefFoundError: org/apache/commons/collections/map/UnmodifiableMap
  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
  at org.apache.hadoop.conf.Configuration$DeprecationContext.<init>(Configuration.java:394)
  at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:432)
  at org.apache.hadoop.hdfs.TestFileCreation.testServerDefaults(TestFileCreation.java:149)
{noformat}
Fix race conditions during the initialization of Configuration related to deprecatedKeyMap -- Key: HADOOP-9478 URL: https://issues.apache.org/jira/browse/HADOOP-9478 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.0.0-alpha Environment: OS: CentOS release 6.3 (Final) JDK: java version 1.6.0_27 Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode) Hadoop: hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0 Security: Kerberos Reporter: Dongyong Wang Assignee: Colin Patrick McCabe Fix For: 2.2.1 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, hadoop-9478-1.patch, hadoop-9478-2.patch When we launch a client application which uses Kerberos security, the FileSystem can't be created because of the exception 'java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil'. I checked the exception stack trace; it may be caused by the unsafe get operation on the deprecatedKeyMap used by org.apache.hadoop.conf.Configuration.
So I wrote a simple test case: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.hdfs.HdfsConfiguration; public class HTest { public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); conf.addResource("core-site.xml"); conf.addResource("hdfs-site.xml"); FileSystem fileSystem = FileSystem.get(conf); System.out.println(fileSystem); System.exit(0); } } Then I launched this test case many times, and the following exception was thrown: Exception in thread "TGT Renewer for XXX" java.lang.ExceptionInInitializerError at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719) at org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77) at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ArrayIndexOutOfBoundsException: 16 at java.util.HashMap.getEntry(HashMap.java:345) at java.util.HashMap.containsKey(HashMap.java:335) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785) at org.apache.hadoop.conf.Configuration.get(Configuration.java:712) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047) at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76) ... 4 more Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider at
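The ArrayIndexOutOfBoundsException inside HashMap.getEntry is the classic signature of one thread reading a HashMap while another thread resizes it. A sketch of the copy-on-write pattern that eliminates this class of race — the committed HADOOP-9478 fix differs in detail, and all names here are illustrative:
{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Copy-on-write sketch: readers only ever see a fully built, immutable
// snapshot, so containsKey() can never observe a map mid-resize.
class DeprecationMapHolder {
  private static final AtomicReference<Map<String, String>> SNAPSHOT =
      new AtomicReference<Map<String, String>>(
          Collections.<String, String>emptyMap());

  // Writers copy the current map, mutate the copy, then publish atomically.
  static void addDeprecation(String oldKey, String newKey) {
    while (true) {
      Map<String, String> current = SNAPSHOT.get();
      Map<String, String> next = new HashMap<String, String>(current);
      next.put(oldKey, newKey);
      if (SNAPSHOT.compareAndSet(current, Collections.unmodifiableMap(next))) {
        return;
      }
    }
  }

  // Readers never lock and never see partial state.
  static boolean isDeprecated(String key) {
    return SNAPSHOT.get().containsKey(key);
  }
}
{code}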
[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap
[ https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828113#comment-13828113 ] Andrew Wang commented on HADOOP-9478: - Hey Nicholas, I've been running trunk tests the last few weeks without seeing this. It might be your local environment like you suspect. Fix race conditions during the initialization of Configuration related to deprecatedKeyMap -- Key: HADOOP-9478 URL: https://issues.apache.org/jira/browse/HADOOP-9478 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.0.0-alpha Environment: OS: CentOS release 6.3 (Final) JDK: java version 1.6.0_27 Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode) Hadoop: hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0 Security: Kerberos Reporter: Dongyong Wang Assignee: Colin Patrick McCabe Fix For: 2.2.1 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, hadoop-9478-1.patch, hadoop-9478-2.patch When we launch a client application which uses Kerberos security, the FileSystem can't be created because of the exception 'java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil'. I checked the exception stack trace; it may be caused by the unsafe get operation on the deprecatedKeyMap used by org.apache.hadoop.conf.Configuration. So I wrote a simple test case: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.hdfs.HdfsConfiguration; public class HTest { public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); conf.addResource("core-site.xml"); conf.addResource("hdfs-site.xml"); FileSystem fileSystem = FileSystem.get(conf); System.out.println(fileSystem); System.exit(0); } } Then I launched this test case many times, and the following exception was thrown: Exception in thread "TGT Renewer for XXX" java.lang.ExceptionInInitializerError at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719) at org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77) at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ArrayIndexOutOfBoundsException: 16 at java.util.HashMap.getEntry(HashMap.java:345) at java.util.HashMap.containsKey(HashMap.java:335) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785) at org.apache.hadoop.conf.Configuration.get(Configuration.java:712) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047) at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76) ...
4 more Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162) at HTest.main(HTest.java:11) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at
[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap
[ https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828164#comment-13828164 ] Colin Patrick McCabe commented on HADOOP-9478: -- I have not seen that. I think it's your local environment. The class you are referring to is part of {{org.apache.commons.collections}} and should be provided by {{commons-collections-3.2.1.jar}}. If that jar is not in your {{CLASSPATH}}, you need to figure out why. Note that we also used {{org.apache.commons.collections}} in hadoop-common prior to this change, in {{FileUtil}}. Fix race conditions during the initialization of Configuration related to deprecatedKeyMap -- Key: HADOOP-9478 URL: https://issues.apache.org/jira/browse/HADOOP-9478 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.0.0-alpha Environment: OS: CentOS release 6.3 (Final) JDK: java version 1.6.0_27 Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode) Hadoop: hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0 Security: Kerberos Reporter: Dongyong Wang Assignee: Colin Patrick McCabe Fix For: 2.2.1 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, hadoop-9478-1.patch, hadoop-9478-2.patch When we launch a client application which uses Kerberos security, the FileSystem can't be created because of the exception 'java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil'. I checked the exception stack trace; it may be caused by the unsafe get operation on the deprecatedKeyMap used by org.apache.hadoop.conf.Configuration. So I wrote a simple test case: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.hdfs.HdfsConfiguration; public class HTest { public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); conf.addResource("core-site.xml"); conf.addResource("hdfs-site.xml"); FileSystem fileSystem = FileSystem.get(conf); System.out.println(fileSystem); System.exit(0); } } Then I launched this test case many times, and the following exception was thrown: Exception in thread "TGT Renewer for XXX" java.lang.ExceptionInInitializerError at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719) at org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77) at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ArrayIndexOutOfBoundsException: 16 at java.util.HashMap.getEntry(HashMap.java:345) at java.util.HashMap.containsKey(HashMap.java:335) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785) at org.apache.hadoop.conf.Configuration.get(Configuration.java:712) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047) at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76) ...
4 more Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162) at HTest.main(HTest.java:11) Caused by:
[jira] [Created] (HADOOP-10120) Additional sliding window metrics
Andrew Wang created HADOOP-10120: Summary: Additional sliding window metrics Key: HADOOP-10120 URL: https://issues.apache.org/jira/browse/HADOOP-10120 Project: Hadoop Common Issue Type: New Feature Components: metrics Affects Versions: 2.2.0 Reporter: Andrew Wang Assignee: Andrew Wang For HDFS-5350 we'd like to report the last few fsimage transfer times as a health metric. This would mean (for example) a sliding window of the last 10 transfer times, when it was last updated, and the total count. It'd be nice to have a metrics class that did this. It'd also be interesting to have some kind of time-based sliding window for statistics like counts and averages. This would let us answer questions like "how many RPCs happened in the last 10s? minute? 5 minutes? 10 minutes?". Commutative metrics like counts and averages are easy to aggregate in this fashion. -- This message was sent by Atlassian JIRA (v6.1#6144)
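A bare-bones sketch of the first idea — a bounded window of the last N observations plus bookkeeping for the total count and last update time; illustrative only, not the eventual metrics implementation:
{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding window of the last N samples, e.g. the last 10 fsimage transfer
// times; a sketch of the idea above, not the final metrics class.
class SlidingWindowMetric {
  private final int capacity;
  private final Deque<Long> samples = new ArrayDeque<Long>();
  private long totalCount;
  private long lastUpdatedMillis;

  SlidingWindowMetric(int capacity) {
    this.capacity = capacity;
  }

  synchronized void add(long value) {
    if (samples.size() == capacity) {
      samples.removeFirst(); // evict the oldest sample
    }
    samples.addLast(value);
    totalCount++;
    lastUpdatedMillis = System.currentTimeMillis();
  }

  synchronized double windowAverage() {
    if (samples.isEmpty()) {
      return 0.0;
    }
    long sum = 0;
    for (long s : samples) {
      sum += s;
    }
    return (double) sum / samples.size();
  }
}
{code}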
[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap
[ https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828247#comment-13828247 ] Bikas Saha commented on HADOOP-9478: We noticed that the changes in this jira caused client-side deployments of Tez to have errors. Tez is designed to have a client-side install. So we package Tez and its dependencies, upload that onto HDFS, and those jars are used to run Tez jobs. Tez brings in mapreduce-client-core.jar as a dependency for InputFormats etc. When we build Tez against trunk, the mapreduce-client-core.jar that we bring in uses DeprecationDelta, added in that jar. However, the Configuration in the cluster comes from the cluster-deployed jars for hadoop-common, and that does not have DeprecationDelta. So the execution fails. This basically means that if someone compiles MR from trunk and runs MR against a cluster deployed with 2.2 then MR will not work. Fix race conditions during the initialization of Configuration related to deprecatedKeyMap -- Key: HADOOP-9478 URL: https://issues.apache.org/jira/browse/HADOOP-9478 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.0.0-alpha Environment: OS: CentOS release 6.3 (Final) JDK: java version 1.6.0_27 Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode) Hadoop: hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0 Security: Kerberos Reporter: Dongyong Wang Assignee: Colin Patrick McCabe Fix For: 2.2.1 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, hadoop-9478-1.patch, hadoop-9478-2.patch When we launch a client application which uses Kerberos security, the FileSystem can't be created because of the exception 'java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil'. I checked the exception stack trace; it may be caused by the unsafe get operation on the deprecatedKeyMap used by org.apache.hadoop.conf.Configuration.
So I wrote a simple test case: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.hdfs.HdfsConfiguration; public class HTest { public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); conf.addResource("core-site.xml"); conf.addResource("hdfs-site.xml"); FileSystem fileSystem = FileSystem.get(conf); System.out.println(fileSystem); System.exit(0); } } Then I launched this test case many times, and the following exception was thrown: Exception in thread "TGT Renewer for XXX" java.lang.ExceptionInInitializerError at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719) at org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77) at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ArrayIndexOutOfBoundsException: 16 at java.util.HashMap.getEntry(HashMap.java:345) at java.util.HashMap.containsKey(HashMap.java:335) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785) at org.apache.hadoop.conf.Configuration.get(Configuration.java:712) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047) at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76) ... 4 more Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296) at
[jira] [Updated] (HADOOP-10111) Allow DU to be initialized with an initial value
[ https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10111: Attachment: HADOOP-10111.patch The new patch addresses the review comments. A test case is added. Allow DU to be initialized with an initial value Key: HADOOP-10111 URL: https://issues.apache.org/jira/browse/HADOOP-10111 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10111.patch, HADOOP-10111.patch When a DU object is created, the du command runs right away. If the target directory contains a huge number of files and directories, its constructor may not return for many seconds. It will be nice if it can be told to delay the initial scan and use a specified initial used value. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value
[ https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828275#comment-13828275 ] Jonathan Eagles commented on HADOOP-10111: -- +1. pending results from Hadoop QA Allow DU to be initialized with an initial value Key: HADOOP-10111 URL: https://issues.apache.org/jira/browse/HADOOP-10111 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10111.patch, HADOOP-10111.patch When a DU object is created, the du command runs right away. If the target directory contains a huge number of files and directories, its constructor may not return for many seconds. It will be nice if it can be told to delay the initial scan and use a specified initial used value. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap
[ https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828313#comment-13828313 ] Colin Patrick McCabe commented on HADOOP-9478: -- We have never supported mixing and matching jars from trunk with jars from other branches. For example, you can't compile the trunk version of HDFS and run it against the branch-2.1 version of common. It may happen to work sometimes, but it will never be a supported configuration. I don't see why Tez would be any different here. If you do want to mix and match in the Tez project, I suggest using Maven-shade to include the hadoop-common jar inside the client-side Tez jar. Fix race conditions during the initialization of Configuration related to deprecatedKeyMap -- Key: HADOOP-9478 URL: https://issues.apache.org/jira/browse/HADOOP-9478 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.0.0-alpha Environment: OS: CentOS release 6.3 (Final) JDK: java version 1.6.0_27 Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode) Hadoop: hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0 Security: Kerberos Reporter: Dongyong Wang Assignee: Colin Patrick McCabe Fix For: 2.2.1 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, hadoop-9478-1.patch, hadoop-9478-2.patch When we launch a client application that uses Kerberos security, the FileSystem can't be created because of the exception 'java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil'. Checking the exception stack trace, it may be caused by an unsafe get operation on the deprecatedKeyMap used by org.apache.hadoop.conf.Configuration. So I wrote a simple test case:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class HTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.addResource("core-site.xml");
    conf.addResource("hdfs-site.xml");
    FileSystem fileSystem = FileSystem.get(conf);
    System.out.println(fileSystem);
    System.exit(0);
  }
}
{code}
Launching this test case many times eventually throws the following exception:
{noformat}
Exception in thread "TGT Renewer for XXX" java.lang.ExceptionInInitializerError
  at org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453)
  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
  at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
  at HTest.main(HTest.java:11)
Caused by:
{noformat}
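[Editor's note] The ArrayIndexOutOfBoundsException inside HashMap.getEntry in the trace above is the classic symptom of reading a plain HashMap while another thread is still inserting into it. The sketch below is not Hadoop's actual fix, just an illustration of the hazard and one safe alternative.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only, not Hadoop's code. Reading a plain HashMap while a
// second thread is populating it can observe the table mid-resize and throw
// ArrayIndexOutOfBoundsException from HashMap.getEntry -- the exact frame
// in the stack trace above. One remedy is a concurrent map populated in a
// static initializer, so it is safely published before any reader runs.
public class DeprecationMapSketch {
  private static final Map<String, String> deprecatedKeyMap =
      new ConcurrentHashMap<String, String>();

  static {
    // Populated once, before the class is visible to other threads.
    // The mapping shown is a real Hadoop deprecation used as an example.
    deprecatedKeyMap.put("fs.default.name", "fs.defaultFS");
  }

  public static boolean isDeprecated(String key) {
    return deprecatedKeyMap.containsKey(key);
  }
}
{code}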
[jira] [Commented] (HADOOP-10112) har file listing doesn't work with wild card
[ https://issues.apache.org/jira/browse/HADOOP-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828318#comment-13828318 ] Chris Nauroth commented on HADOOP-10112: It looks like this was fixed by HADOOP-9981, which optimized the new {{Globber}} code. If I revert that patch from branch-2, then I see the bug. After restoring that patch, the bug goes away and I see the listing of the har contents as I would expect. HADOOP-9981 was committed to trunk and branch-2, but not branch-2.2, because it was a fix in the new {{Globber}} code, which doesn't exist in branch-2.2. har file listing doesn't work with wild card - Key: HADOOP-10112 URL: https://issues.apache.org/jira/browse/HADOOP-10112 Project: Hadoop Common Issue Type: Bug Components: tools Affects Versions: 2.2.1 Reporter: Brandon Li
{noformat}
[test@test001 root]$ hdfs dfs -ls har:///tmp/filename.har/*
-ls: Can not create a Path from an empty string
Usage: hadoop fs [generic options]
  -ls [-d] [-h] [-R] [path ...]
{noformat}
It works without *. -- This message was sent by Atlassian JIRA (v6.1#6144)
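[Editor's note] The shell's -ls expands the wildcard through FileSystem glob matching, so the failure can be reproduced without the shell. A small driver along these lines (the har path is taken from the report; everything else is a plain use of the public FileSystem API) exercises the same {{Globber}} code path:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Lists a har archive through a glob pattern, mirroring
// "hdfs dfs -ls har:///tmp/filename.har/*" from the report.
public class HarGlobCheck {
  public static void main(String[] args) throws Exception {
    Path pattern = new Path("har:///tmp/filename.har/*");
    FileSystem fs = pattern.getFileSystem(new Configuration());
    FileStatus[] matches = fs.globStatus(pattern);
    if (matches == null) {
      System.out.println("no matches");
      return;
    }
    for (FileStatus status : matches) {
      System.out.println(status.getPath());
    }
  }
}
{code}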
[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value
[ https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828322#comment-13828322 ] Hadoop QA commented on HADOOP-10111: {color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12615022/HADOOP-10111.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3304//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3304//console
This message is automatically generated. Allow DU to be initialized with an initial value Key: HADOOP-10111 URL: https://issues.apache.org/jira/browse/HADOOP-10111 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10111.patch, HADOOP-10111.patch When a DU object is created, the du command runs right away. If the target directory contains a huge number of files and directories, its constructor may not return for many seconds. It would be nice if it could be told to delay the initial scan and use a specified initial used value. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10087) UserGroupInformation.getGroupNames() fails to return primary group first when JniBasedUnixGroupsMappingWithFallback is used
[ https://issues.apache.org/jira/browse/HADOOP-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828434#comment-13828434 ] Yu Gao commented on HADOOP-10087: - Hi Colin, thanks for the patch. It solves the reported problem. UserGroupInformation.getGroupNames() fails to return primary group first when JniBasedUnixGroupsMappingWithFallback is used --- Key: HADOOP-10087 URL: https://issues.apache.org/jira/browse/HADOOP-10087 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.1.0-beta, 2.2.0 Environment: SUSE Linux Enterprise Server 11 (x86_64) Reporter: Yu Gao Assignee: Colin Patrick McCabe Labels: security Attachments: HADOOP-10087.001.patch When JniBasedUnixGroupsMappingWithFallback is used as the group mapping resolution provider, UserGroupInformation.getGroupNames() fails to return the primary group first in the list as documented. -- This message was sent by Atlassian JIRA (v6.1#6144)
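[Editor's note] A quick way to observe the reported behavior is to print the group list for the current user; per the documented contract, the primary group should be element zero, which on Linux can be cross-checked against the output of {{id -gn}}. A minimal check using the public UserGroupInformation API:
{code:java}
import org.apache.hadoop.security.UserGroupInformation;

// Prints the current user's groups. Per the documented contract, the
// primary group should appear first; compare against "id -gn" on Linux.
public class PrimaryGroupCheck {
  public static void main(String[] args) throws Exception {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    String[] groups = ugi.getGroupNames();
    System.out.println("first group returned: "
        + (groups.length > 0 ? groups[0] : "<none>"));
    for (String group : groups) {
      System.out.println("  " + group);
    }
  }
}
{code}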
[jira] [Commented] (HADOOP-9622) bzip2 codec can drop records when reading data in splits
[ https://issues.apache.org/jira/browse/HADOOP-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828448#comment-13828448 ] Vinay commented on HADOOP-9622: --- Thanks Jason for the patch for this tricky issue. Patch looks good to me. One small nit: there are already two test classes named TestLineRecordReader, in the mapred and mapreduce.lib.input packages of the hadoop-mapreduce-client-jobclient project. It would be better to move the included tests into those classes instead of creating new ones. bzip2 codec can drop records when reading data in splits Key: HADOOP-9622 URL: https://issues.apache.org/jira/browse/HADOOP-9622 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, HADOOP-9622.patch, blockEndingInCR.txt.bz2, blockEndingInCRThenLF.txt.bz2 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when reading them in splits based on where record delimiters occur relative to compression block boundaries. Thanks to [~knoguchi] for discovering this problem while working on PIG-3251. -- This message was sent by Atlassian JIRA (v6.1#6144)
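[Editor's note] One way to surface this class of bug, independent of the attached test case, is to count the records in a .bz2 input once with a single split and once with several splits; the two totals must match. The sketch below uses the old mapred API; the input file path and split counts are arbitrary assumptions.
{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

// Counts records in a .bz2 file for a given number of splits. If the codec
// drops records at compression-block boundaries, the multi-split total
// will come up short of the single-split total.
public class Bzip2SplitCount {
  static long countRecords(String file, int numSplits) throws Exception {
    JobConf conf = new JobConf();
    TextInputFormat format = new TextInputFormat();
    format.configure(conf);
    FileInputFormat.setInputPaths(conf, new Path(file));
    long total = 0;
    for (InputSplit split : format.getSplits(conf, numSplits)) {
      RecordReader<LongWritable, Text> reader =
          format.getRecordReader(split, conf, Reporter.NULL);
      LongWritable key = reader.createKey();
      Text value = reader.createValue();
      while (reader.next(key, value)) {
        total++;
      }
      reader.close();
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("1 split : " + countRecords(args[0], 1));
    System.out.println("4 splits: " + countRecords(args[0], 4));
  }
}
{code}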
[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
[ https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828453#comment-13828453 ] Vinay commented on HADOOP-9867: --- Thanks Jason, I prefer waiting for HADOOP-9622 to be committed. Meanwhile, I will try to update SplitLineReader offline. org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well -- Key: HADOOP-9867 URL: https://issues.apache.org/jira/browse/HADOOP-9867 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2, 0.23.9, 2.2.0 Environment: CDH3U2 Redhat linux 5.7 Reporter: Kris Geusebroek Assignee: Vinay Priority: Critical Attachments: HADOOP-9867.patch, HADOOP-9867.patch Having defined a record delimiter of multiple bytes in a new InputFileFormat sometimes has the effect of skipping records from the input. This happens when the input splits are split off just after a record separator. The starting point for the next split would be non-zero and skipFirstLine would be true. A seek into the file is done to start - 1 and the text until the first record delimiter is ignored (due to the presumption that this record was already handled by the previous map task). Since the record delimiter is multibyte, the seek only got the last byte of the delimiter into scope, and it's not recognized as a full delimiter. So the text is skipped until the next delimiter (ignoring a full record!) -- This message was sent by Atlassian JIRA (v6.1#6144)
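[Editor's note] The description above reduces to an off-by-(n-1) problem: seeking to start - 1 only works when the delimiter is a single byte. The standalone sketch below (plain Java, no Hadoop classes; the record contents are made up) replays the failure with the two-byte delimiter "\r\n":
{code:java}
// Standalone illustration of the failure mode described above. With a
// two-byte delimiter, a split that starts at start - 1 can land on the
// final byte of a delimiter; skipping "up to the next full delimiter"
// then silently discards the entire next record.
public class MultibyteDelimiterDemo {
  public static void main(String[] args) {
    String data = "rec1\r\nrec2\r\nrec3";
    String delim = "\r\n";
    // Suppose the previous split ended exactly after "rec1\r\n" (offset 6).
    int start = 6;
    // The reader seeks to start - 1 (offset 5, the lone '\n' byte) and
    // skips up to and including the next full occurrence of the delimiter.
    int seekPos = start - 1;
    int next = data.indexOf(delim, seekPos); // finds the delimiter AFTER rec2
    int resume = next + delim.length();
    System.out.println("resumes at: \"" + data.substring(resume) + "\"");
    // Prints "rec3" -- rec2 was skipped, because the single '\n' at
    // offset 5 was not recognized as the tail of a complete delimiter.
  }
}
{code}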