[jira] [Commented] (HADOOP-10103) update commons-lang to 2.6

2013-11-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827488#comment-13827488
 ] 

Steve Loughran commented on HADOOP-10103:
-

There's a trick to getting Jenkins to run the tests for you: submit the same 
patch to HDFS, YARN, and MAPREDUCE; see HADOOP-10101.

I'm explicitly doing that where there are code changes; otherwise I'm just 
running the tests locally.

 update commons-lang to 2.6
 --

 Key: HADOOP-10103
 URL: https://issues.apache.org/jira/browse/HADOOP-10103
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 2.3.0
Reporter: Steve Loughran
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HADOOP-10103.patch


 update commons-lang from 2.5 to 2.6



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HADOOP-9867:
--

Attachment: HADOOP-9867.patch

Attaching a patch with the test mentioned by Jason.

The patch reads one more record if the split ends between the delimiter bytes.

Please review.
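
For illustration only, a minimal sketch of the check that idea implies (the 
class and method names below are not from the attached patch):

{code}
// Illustrative only: decide whether a reader whose split ends at splitEnd
// must read one additional record because the split boundary falls strictly
// inside a multi-byte record delimiter starting at delimiterStart.
public final class SplitDelimiterCheck {
  private SplitDelimiterCheck() {}

  public static boolean needsExtraRecord(long splitEnd, long delimiterStart,
      int delimiterLength) {
    return splitEnd > delimiterStart
        && splitEnd < delimiterStart + delimiterLength;
  }

  public static void main(String[] args) {
    // Split ends at byte 100; a 3-byte delimiter starts at byte 99, so the
    // boundary is inside the delimiter and the next reader would miss it.
    System.out.println(needsExtraRecord(100L, 99L, 3));  // true
    // Delimiter ends before the boundary: no extra record needed.
    System.out.println(needsExtraRecord(100L, 96L, 3));  // false
  }
}
{code}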

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Priority: Critical
 Attachments: HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the 
 delimiter into scope and it's not recognized as a full delimiter. So the text 
 is skipped until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HADOOP-9867:
--

Attachment: HADOOP-9867.patch

Updated the patch to fix a possible NPE.

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the 
 delimiter into scope and it's not recognized as a full delimiter. So the text 
 is skipped until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HADOOP-9867:
--

Assignee: Vinay
  Status: Patch Available  (was: Open)

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.2.0, 0.23.9, 0.20.2
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the 
 delimiter into scope and it's not recognized as a full delimiter. So the text 
 is skipped until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop

2013-11-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827539#comment-13827539
 ] 

Hudson commented on HADOOP-10047:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #397 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/397/])
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
Revert HADOOP-10047, wrong patch. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java


 Add a directbuffer Decompressor API to hadoop
 -

 Key: HADOOP-10047
 URL: https://issues.apache.org/jira/browse/HADOOP-10047
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io
Affects Versions: 2.3.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: compression
 Fix For: 2.3.0

 Attachments: DirectCompressor.html, DirectDecompressor.html, 
 HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, 
 HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, 
 HADOOP-10047-with-tests.patch, decompress-benchmark.tgz


 With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform 
 all I/O operations without copying data into byte[] buffers or other buffers 
 which wrap over them.
 This is a proposal for adding a DirectDecompressor interface to the 
 io.compress, to indicate codecs which want to surface the direct buffer layer 
 upwards.
 The implementation should work with direct heap/mmap buffers and cannot 
 assume .array() availability.
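
For illustration only, a rough sketch of the kind of interface being proposed 
(the name and signature here are assumptions, not necessarily what was 
committed):

{code}
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch: the caller hands in a (possibly direct or memory-mapped) source
// buffer and a destination buffer, and no byte[] copies are made in between.
public interface DirectBufferDecompressor {
  void decompress(ByteBuffer src, ByteBuffer dst) throws IOException;
}
{code}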



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827567#comment-13827567
 ] 

Hadoop QA commented on HADOOP-9867:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12614864/HADOOP-9867.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapred.TestJobCleanup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3302//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3302//console

This message is automatically generated.

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the 
 delimiter into scope and it's not recognized as a full delimiter. So the text 
 is skipped until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop

2013-11-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827628#comment-13827628
 ] 

Hudson commented on HADOOP-10047:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1588 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1588/])
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
Revert HADOOP-10047, wrong patch. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java


 Add a directbuffer Decompressor API to hadoop
 -

 Key: HADOOP-10047
 URL: https://issues.apache.org/jira/browse/HADOOP-10047
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io
Affects Versions: 2.3.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: compression
 Fix For: 2.3.0

 Attachments: DirectCompressor.html, DirectDecompressor.html, 
 HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, 
 HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, 
 HADOOP-10047-with-tests.patch, decompress-benchmark.tgz


 With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform 
 all I/O operations without copying data into byte[] buffers or other buffers 
 which wrap over them.
 This is a proposal for adding a DirectDecompressor interface to the 
 io.compress, to indicate codecs which want to surface the direct buffer layer 
 upwards.
 The implementation should work with direct heap/mmap buffers and cannot 
 assume .array() availability.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10047) Add a directbuffer Decompressor API to hadoop

2013-11-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827640#comment-13827640
 ] 

Hudson commented on HADOOP-10047:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1614 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1614/])
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543542)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressionCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibFactory.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
Revert HADOOP-10047, wrong patch. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543538)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java
HADOOP-10047. Add a direct-buffer based apis for compression. Contributed by 
Gopal V. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543456)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DirectDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibCompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zlib/ZlibDecompressor.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zlib/TestZlibCompressorDecompressor.java


 Add a directbuffer Decompressor API to hadoop
 -

 Key: HADOOP-10047
 URL: https://issues.apache.org/jira/browse/HADOOP-10047
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io
Affects Versions: 2.3.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: compression
 Fix For: 2.3.0

 Attachments: DirectCompressor.html, DirectDecompressor.html, 
 HADOOP-10047-WIP.patch, HADOOP-10047-final.patch, 
 HADOOP-10047-redo-WIP.patch, HADOOP-10047-trunk.patch, 
 HADOOP-10047-with-tests.patch, decompress-benchmark.tgz


 With the Zero-Copy reads in HDFS (HDFS-5260), it becomes important to perform 
 all I/O operations without copying data into byte[] buffers or other buffers 
 which wrap over them.
 This is a proposal for adding a DirectDecompressor interface to the 
 io.compress, to indicate codecs which want to surface the direct buffer layer 
 upwards.
 The implementation should work with direct heap/mmap buffers and cannot 
 assume .array() availability.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10103) update commons-lang to 2.6

2013-11-20 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827731#comment-13827731
 ] 

Akira AJISAKA commented on HADOOP-10103:


Thank you for sharing!
If there is a need to change the code, I'll use the trick. I'm running the 
tests locally.

 update commons-lang to 2.6
 --

 Key: HADOOP-10103
 URL: https://issues.apache.org/jira/browse/HADOOP-10103
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 2.3.0
Reporter: Steve Loughran
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HADOOP-10103.patch


 update commons-lang from 2.5 to 2.6



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827795#comment-13827795
 ] 

Jason Lowe commented on HADOOP-9867:


Thanks for the patch, Vinay.  I think this approach can work when the input is 
uncompressed; however, I don't think it will work for block-compressed inputs.  
Block codecs often report the file position as being the start of the codec 
block, and then it teleports to the byte position of the next block once the 
first byte of the next block is consumed.  See HADOOP-9622 for a similar issue 
with the default delimiter and how it's being addressed.  Also, 
getFilePosition() for a compressed input returns a compressed stream offset, 
so if we try to do math on that with an uncompressed delimiter length we're 
mixing different units.

Since LineRecordReader::getFilePosition() can mean different things for 
different inputs, I think a better approach would be to change LineReader (not 
LineRecordReader) so the reported file position for multi-byte custom 
delimiters is the file position after the record but not including its 
delimiter.  Either that, or wait for HADOOP-9622 to be committed and update 
the SplitLineReader interface from the HADOOP-9622 patch so the uncompressed 
input reader would indicate that an additional record needs to be read if the 
split ends mid-delimiter.
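
For illustration only, a tiny sketch of the first suggestion (reporting the 
position after the record but excluding its delimiter); the names here are 
hypothetical, not from any patch:

{code}
// Illustrative only: if the reported position excludes the trailing custom
// delimiter, a record whose delimiter straddles the split boundary is still
// reported as inside the split, so the reader that owns that split emits it.
public final class PositionReporting {
  private PositionReporting() {}

  // Position to report after reading a record of recordLength bytes at
  // startPos; the delimiter bytes are intentionally not counted here.
  public static long reportedPosition(long startPos, int recordLength) {
    return startPos + recordLength;
  }

  public static void main(String[] args) {
    long splitEnd = 100L;
    // Record body spans bytes 90..98 and its 3-byte delimiter spans 99..101:
    // the reported position (99) is still < splitEnd, so this reader keeps
    // the record instead of leaving it to the next split.
    System.out.println(reportedPosition(90L, 9) < splitEnd);  // true
  }
}
{code}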

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the 
 delimiter into scope and it's not recognized as a full delimiter. So the text 
 is skipped until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value

2013-11-20 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827892#comment-13827892
 ] 

Jonathan Eagles commented on HADOOP-10111:
--

Looks promising for reducing datanode startup time, Kihwal. 
Couple of minor things.
  - Be consistent with the long literal _this(path, interval, -1)_ vs 
_this(path, interval, -1L)_
  - Currently the tests don't test the new functionality of the initial value. 
Is this a better fit here or in HDFS-5498?
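
For what it's worth, a rough sketch of the general shape being discussed (this 
is not the attached patch; the class name and signature below are assumptions):

{code}
// Illustrative only: let a disk-usage checker start from a caller-supplied
// value instead of running the expensive "du" inside the constructor.
public class CachedDiskUsage {
  private volatile long used;
  private final boolean initialValueProvided;

  public CachedDiskUsage(long initialUsed) {
    // e.g. pass -1L to indicate "no initial value; scan right away"
    this.used = initialUsed;
    this.initialValueProvided = initialUsed >= 0L;
  }

  public long getUsed() {
    return used;
  }

  public boolean needsInitialScan() {
    // Only run "du" immediately when no usable initial value was supplied.
    return !initialValueProvided;
  }
}
{code}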

 Allow DU to be initialized with an initial value
 

 Key: HADOOP-10111
 URL: https://issues.apache.org/jira/browse/HADOOP-10111
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Kihwal Lee
Assignee: Kihwal Lee
 Attachments: HADOOP-10111.patch


 When a DU object is created, the du command runs right away. If the target 
 directory contains a huge number of files and directories, its constructor 
 may not return for many seconds.  It will be nice if it can be told to delay 
 the initial scan and use a specified initial used value.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10118) FsShell never interpret --

2013-11-20 Thread Kousuke Saruta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated HADOOP-10118:


Summary: FsShell never interpret --  (was: CommandFormat never parse --)

 FsShell never interpret --
 

 Key: HADOOP-10118
 URL: https://issues.apache.org/jira/browse/HADOOP-10118
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 3.0.0
Reporter: Kousuke Saruta

 We cannot use the -- option to skip the args that follow it.
 CommandFormat#parse is implemented as follows.
 {code}
 public void parse(List<String> args) {
 ...
   } else if (arg.equals("--")) { // force end of option processing
     args.remove(pos);
     break;
   }
 ...
 {code}
 But FsShell is called through ToolRunner, and ToolRunner uses 
 GenericOptionsParser. GenericOptionsParser uses GnuParser, which discards 
 "--" when parsing args.
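
For illustration, a small standalone check of the reported behaviour (this 
assumes the commons-cli GnuParser used by GenericOptionsParser; the class 
below is just a demo, not part of any patch):

{code}
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.GnuParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

// Demo: GnuParser treats "--" as the end of option processing and does not
// pass the "--" token itself through in getArgs(), so a tool invoked via
// ToolRunner/GenericOptionsParser never sees it.
public class DoubleDashDemo {
  public static void main(String[] args) throws ParseException {
    CommandLine cmd = new GnuParser().parse(new Options(),
        new String[] { "--", "-rm", "/path" });
    for (String remaining : cmd.getArgs()) {
      System.out.println(remaining);  // prints "-rm" and "/path", no "--"
    }
  }
}
{code}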



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10075) Update jetty dependency to version 9

2013-11-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10075:
--

Status: Patch Available  (was: Open)

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10075) Update jetty dependency to version 9

2013-11-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10075:
--

Assignee: Robert Rati

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2013-11-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827937#comment-13827937
 ] 

Colin Patrick McCabe commented on HADOOP-10075:
---

Be sure to hit Submit Patch so that you get a Jenkins run on this.

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2013-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827946#comment-13827946
 ] 

Hadoop QA commented on HADOOP-10075:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12610564/HADOOP-10075.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3303//console

This message is automatically generated.

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2013-11-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827956#comment-13827956
 ] 

Colin Patrick McCabe commented on HADOOP-10075:
---

Thanks for looking at this.  I think you will need to re-generate the patch, 
since it failed to apply on jenkins.

{code}
--- 
a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestSSLHttpServer.java
+++ 
b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestSSLHttpServer.java
@@ -76,6 +76,7 @@ public void setup() throws Exception {
 
 conf.setInt(HttpServer.HTTP_MAX_THREADS, 10);
 conf.addResource(CONFIG_SITE_XML);
+conf.addResource(conf.get("hadoop.ssl.server.conf", "ssl-server.xml"));
 server = createServer("test", conf);
 server.addServlet("echo", "/echo", TestHttpServer.EchoServlet.class);
 server.start();
{code}

Why do we need this addition?

{code}
-InetAddress.getByName(server.getConnectors()[0].getHost());
-  int port = server.getConnectors()[0].getPort();
+
InetAddress.getByName(((ServerConnector)server.getConnectors()[0]).getHost());
+  int port = ((ServerConnector)server.getConnectors()[0]).getPort();
{code}

I see a lot of new typecasts like this.  Is it possible to avoid these?  If 
not, could we have an accessor function that makes this easier to read?  Thanks.
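
For example, the accessor could look something like the sketch below (purely 
illustrative; the names are not from the patch):

{code}
import org.eclipse.jetty.server.Connector;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

// Sketch: keep the Jetty 9 ServerConnector cast in one helper so the test
// code reads the same as it did with Jetty 6.
final class ConnectorAddress {
  private ConnectorAddress() {}

  static String host(Server server, int index) {
    return serverConnector(server, index).getHost();
  }

  static int port(Server server, int index) {
    return serverConnector(server, index).getPort();
  }

  private static ServerConnector serverConnector(Server server, int index) {
    Connector c = server.getConnectors()[index];
    return (ServerConnector) c;  // single place that knows about the cast
  }
}
{code}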

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap

2013-11-20 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828048#comment-13828048
 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-9478:


After this change, I somehow get NoClassDefFoundError: 
org/apache/commons/collections/map/UnmodifiableMap when I run any test under 
trunk/hadoop-hdfs-project/hadoop-hdfs.  Running tests under project root (i.e. 
trunk/) is fine.  I wonder if it is a problem in my local environment.  Do you 
get the same thing?
{noformat}
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 22, Failures: 0, Errors: 20, Skipped: 2, Time elapsed: 0.161 sec 
<<< FAILURE! - in org.apache.hadoop.hdfs.TestFileCreation
testServerDefaults(org.apache.hadoop.hdfs.TestFileCreation)  Time elapsed: 
0.016 sec  <<< ERROR!
java.lang.NoClassDefFoundError: 
org/apache/commons/collections/map/UnmodifiableMap
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at 
org.apache.hadoop.conf.Configuration$DeprecationContext.<init>(Configuration.java:394)
at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:432)
at 
org.apache.hadoop.hdfs.TestFileCreation.testServerDefaults(TestFileCreation.java:149)
{noformat}


 Fix race conditions during the initialization of Configuration related to 
 deprecatedKeyMap
 --

 Key: HADOOP-9478
 URL: https://issues.apache.org/jira/browse/HADOOP-9478
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 2.0.0-alpha
 Environment: OS:
 CentOS release 6.3 (Final)
 JDK:
 java version "1.6.0_27"
 Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
 Hadoop:
 hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0
 Security:
 Kerberos
Reporter: Dongyong Wang
Assignee: Colin Patrick McCabe
 Fix For: 2.2.1

 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, 
 HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, 
 hadoop-9478-1.patch, hadoop-9478-2.patch


 When we launch a client application which uses kerberos security, the 
 FileSystem can't be created because of the exception 
 'java.lang.NoClassDefFoundError: Could not initialize class 
 org.apache.hadoop.security.SecurityUtil'.
 I checked the exception stack trace; it may be caused by the unsafe get 
 operation on the deprecatedKeyMap used by 
 org.apache.hadoop.conf.Configuration.
 So I wrote a simple test case:
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 public class HTest {
 public static void main(String[] args) throws Exception {
 Configuration conf = new Configuration();
 conf.addResource("core-site.xml");
 conf.addResource("hdfs-site.xml");
 FileSystem fileSystem = FileSystem.get(conf);
 System.out.println(fileSystem);
 System.exit(0);
 }
 }
 Then I launch this test case many times, and the following exception is thrown:
 Exception in thread "TGT Renewer for XXX" 
 java.lang.ExceptionInInitializerError
  at 
 org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at 
 org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at 
 org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at 
 org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at 
 org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
 Exception in thread "main" java.io.IOException: Couldn't create proxy 
 provider class 
 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at 
 

[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap

2013-11-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828113#comment-13828113
 ] 

Andrew Wang commented on HADOOP-9478:
-

Hey Nicholas, I've been running trunk tests the last few weeks without seeing 
this. It might be your local environment like you suspect.

 Fix race conditions during the initialization of Configuration related to 
 deprecatedKeyMap
 --

 Key: HADOOP-9478
 URL: https://issues.apache.org/jira/browse/HADOOP-9478
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 2.0.0-alpha
 Environment: OS:
 CentOS release 6.3 (Final)
 JDK:
 java version "1.6.0_27"
 Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
 Hadoop:
 hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0
 Security:
 Kerberos
Reporter: Dongyong Wang
Assignee: Colin Patrick McCabe
 Fix For: 2.2.1

 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, 
 HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, 
 hadoop-9478-1.patch, hadoop-9478-2.patch


 When we launch a client application which uses kerberos security, the 
 FileSystem can't be created because of the exception 
 'java.lang.NoClassDefFoundError: Could not initialize class 
 org.apache.hadoop.security.SecurityUtil'.
 I checked the exception stack trace; it may be caused by the unsafe get 
 operation on the deprecatedKeyMap used by 
 org.apache.hadoop.conf.Configuration.
 So I wrote a simple test case:
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 public class HTest {
 public static void main(String[] args) throws Exception {
 Configuration conf = new Configuration();
 conf.addResource("core-site.xml");
 conf.addResource("hdfs-site.xml");
 FileSystem fileSystem = FileSystem.get(conf);
 System.out.println(fileSystem);
 System.exit(0);
 }
 }
 Then I launch this test case many times, and the following exception is thrown:
 Exception in thread "TGT Renewer for XXX" 
 java.lang.ExceptionInInitializerError
  at 
 org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at 
 org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at 
 org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at 
 org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at 
 org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
 Exception in thread "main" java.io.IOException: Couldn't create proxy 
 provider class 
 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453)
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
  at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
  at HTest.main(HTest.java:11)
 Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
  at 

[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap

2013-11-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828164#comment-13828164
 ] 

Colin Patrick McCabe commented on HADOOP-9478:
--

I have not seen that.  I think it's your local environment.

The class you are referring to is part of {{org.apache.commons.collections}} 
and should be provided by {{commons-collections-3.2.1.jar}}.  If that jar is 
not in your {{CLASSPATH}}, you need to figure out why.  Note that we also used  
{{org.apache.commons.collections}} in hadoop-common prior to this change, in 
{{FileUtil}}.
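
For what it's worth, one quick way to check from the failing environment which 
jar (if any) the class is being loaded from:

{code}
// Diagnostic only: prints the location the class was loaded from, or throws
// ClassNotFoundException if it is not on the classpath at all.
public class FindUnmodifiableMap {
  public static void main(String[] args) throws ClassNotFoundException {
    Class<?> c =
        Class.forName("org.apache.commons.collections.map.UnmodifiableMap");
    System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
  }
}
{code}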

 Fix race conditions during the initialization of Configuration related to 
 deprecatedKeyMap
 --

 Key: HADOOP-9478
 URL: https://issues.apache.org/jira/browse/HADOOP-9478
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 2.0.0-alpha
 Environment: OS:
 CentOS release 6.3 (Final)
 JDK:
 java version "1.6.0_27"
 Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
 Hadoop:
 hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0
 Security:
 Kerberos
Reporter: Dongyong Wang
Assignee: Colin Patrick McCabe
 Fix For: 2.2.1

 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, 
 HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, 
 hadoop-9478-1.patch, hadoop-9478-2.patch


 When we launch a client application which uses kerberos security, the 
 FileSystem can't be created because of the exception 
 'java.lang.NoClassDefFoundError: Could not initialize class 
 org.apache.hadoop.security.SecurityUtil'.
 I checked the exception stack trace; it may be caused by the unsafe get 
 operation on the deprecatedKeyMap used by 
 org.apache.hadoop.conf.Configuration.
 So I wrote a simple test case:
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 public class HTest {
 public static void main(String[] args) throws Exception {
 Configuration conf = new Configuration();
 conf.addResource("core-site.xml");
 conf.addResource("hdfs-site.xml");
 FileSystem fileSystem = FileSystem.get(conf);
 System.out.println(fileSystem);
 System.exit(0);
 }
 }
 Then I launch this test case many times, and the following exception is thrown:
 Exception in thread "TGT Renewer for XXX" 
 java.lang.ExceptionInInitializerError
  at 
 org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at 
 org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at 
 org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at 
 org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at 
 org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
 Exception in thread "main" java.io.IOException: Couldn't create proxy 
 provider class 
 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453)
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
  at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
  at HTest.main(HTest.java:11)
 Caused by: java.lang.reflect.InvocationTargetException
  at 

[jira] [Created] (HADOOP-10120) Additional sliding window metrics

2013-11-20 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-10120:


 Summary: Additional sliding window metrics
 Key: HADOOP-10120
 URL: https://issues.apache.org/jira/browse/HADOOP-10120
 Project: Hadoop Common
  Issue Type: New Feature
  Components: metrics
Affects Versions: 2.2.0
Reporter: Andrew Wang
Assignee: Andrew Wang


For HDFS-5350 we'd like to report the last few fsimage transfer times as a 
health metric. This would mean (for example) a sliding window of the last 10 
transfer times, when it was last updated, and the total count. It'd be nice to 
have a metrics class that did this.

It'd also be interesting to have some kind of time-based sliding window for 
statistics like counts and averages. This would let us answer questions like 
"how many RPCs happened in the last 10s? minute? 5 minutes? 10 minutes?". 
Commutative metrics like counts and averages are easy to aggregate in this 
fashion.
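
For illustration, a very rough sketch of the first idea (keep only the last N 
observed values plus a running total); this is not a proposal for the final 
metrics class, just the general shape:

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: remember the last N values (e.g. the last 10 fsimage
// transfer times), the total number of observations, and when the window was
// last updated.
public class LastNValues {
  private final int capacity;
  private final Deque<Long> window = new ArrayDeque<Long>();
  private long totalCount;
  private long lastUpdatedMillis;

  public LastNValues(int capacity) {
    this.capacity = capacity;
  }

  public synchronized void add(long value) {
    if (window.size() == capacity) {
      window.removeFirst();  // drop the oldest value
    }
    window.addLast(value);
    totalCount++;
    lastUpdatedMillis = System.currentTimeMillis();
  }

  public synchronized long getTotalCount() {
    return totalCount;
  }

  public synchronized long getLastUpdatedMillis() {
    return lastUpdatedMillis;
  }
}
{code}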



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap

2013-11-20 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828247#comment-13828247
 ] 

Bikas Saha commented on HADOOP-9478:


We noticed that the changes in this jira caused client-side deployment of Tez 
to have errors.
Tez is designed to have a client-side install, so we package Tez and its 
dependencies, upload them onto HDFS, and those jars are used to run Tez jobs. 
Tez brings in mapreduce-client-core.jar as a dependency for InputFormats etc.
When we build Tez against trunk, the mapreduce-client-core.jar that we bring 
in uses the DeprecationDelta added by this change. However, the Configuration 
in the cluster comes from the cluster-deployed jars for hadoop-common, and 
that does not have DeprecationDelta. So the execution fails.
This basically means that if someone compiles MR from trunk and runs it 
against a cluster deployed with 2.2, then MR will not work.

 Fix race conditions during the initialization of Configuration related to 
 deprecatedKeyMap
 --

 Key: HADOOP-9478
 URL: https://issues.apache.org/jira/browse/HADOOP-9478
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 2.0.0-alpha
 Environment: OS:
 CentOS release 6.3 (Final)
 JDK:
 java version "1.6.0_27"
 Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
 Hadoop:
 hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0
 Security:
 Kerberos
Reporter: Dongyong Wang
Assignee: Colin Patrick McCabe
 Fix For: 2.2.1

 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, 
 HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, 
 hadoop-9478-1.patch, hadoop-9478-2.patch


 When we launch a client application which uses kerberos security, the 
 FileSystem can't be created because of the exception 
 'java.lang.NoClassDefFoundError: Could not initialize class 
 org.apache.hadoop.security.SecurityUtil'.
 I checked the exception stack trace; it may be caused by the unsafe get 
 operation on the deprecatedKeyMap used by 
 org.apache.hadoop.conf.Configuration.
 So I wrote a simple test case:
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 public class HTest {
 public static void main(String[] args) throws Exception {
 Configuration conf = new Configuration();
 conf.addResource("core-site.xml");
 conf.addResource("hdfs-site.xml");
 FileSystem fileSystem = FileSystem.get(conf);
 System.out.println(fileSystem);
 System.exit(0);
 }
 }
 Then I launch this test case many times, and the following exception is thrown:
 Exception in thread "TGT Renewer for XXX" 
 java.lang.ExceptionInInitializerError
  at 
 org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at 
 org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at 
 org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at 
 org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at 
 org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
 Exception in thread "main" java.io.IOException: Couldn't create proxy 
 provider class 
 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453)
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
  at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
  at 

[jira] [Updated] (HADOOP-10111) Allow DU to be initialized with an initial value

2013-11-20 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HADOOP-10111:


Attachment: HADOOP-10111.patch

The new patch addresses the review comments. A test case is added.

 Allow DU to be initialized with an initial value
 

 Key: HADOOP-10111
 URL: https://issues.apache.org/jira/browse/HADOOP-10111
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Kihwal Lee
Assignee: Kihwal Lee
 Attachments: HADOOP-10111.patch, HADOOP-10111.patch


 When a DU object is created, the du command runs right away. If the target 
 directory contains a huge number of files and directories, its constructor 
 may not return for many seconds.  It will be nice if it can be told to delay 
 the initial scan and use a specified initial used value.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value

2013-11-20 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828275#comment-13828275
 ] 

Jonathan Eagles commented on HADOOP-10111:
--

+1. pending results from Hadoop QA

 Allow DU to be initialized with an initial value
 

 Key: HADOOP-10111
 URL: https://issues.apache.org/jira/browse/HADOOP-10111
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Kihwal Lee
Assignee: Kihwal Lee
 Attachments: HADOOP-10111.patch, HADOOP-10111.patch


 When a DU object is created, the du command runs right away. If the target 
 directory contains a huge number of files and directories, its constructor 
 may not return for many seconds.  It will be nice if it can be told to delay 
 the initial scan and use a specified initial used value.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9478) Fix race conditions during the initialization of Configuration related to deprecatedKeyMap

2013-11-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828313#comment-13828313
 ] 

Colin Patrick McCabe commented on HADOOP-9478:
--

We have never supported mixing and matching jars from trunk with jars from 
other branches.  For example, you can't compile the trunk version of HDFS and 
run it against the branch-2.1 version of common.  It may happen to work 
sometimes, but it will never be a supported configuration.  I don't see why Tez 
would be any different here.

If you do want to mix and match in the Tez project, I suggest using Maven-shade 
to include the hadoop-common jar inside the client-side Tez jar.

 Fix race conditions during the initialization of Configuration related to 
 deprecatedKeyMap
 --

 Key: HADOOP-9478
 URL: https://issues.apache.org/jira/browse/HADOOP-9478
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 2.0.0-alpha
 Environment: OS:
 CentOS release 6.3 (Final)
 JDK:
 java version "1.6.0_27"
 Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
 Hadoop:
 hadoop-2.0.0-cdh4.1.3/hadoop-2.0.0-cdh4.2.0
 Security:
 Kerberos
Reporter: Dongyong Wang
Assignee: Colin Patrick McCabe
 Fix For: 2.2.1

 Attachments: HADOOP-9478.001.patch, HADOOP-9478.002.patch, 
 HADOOP-9478.003.patch, HADOOP-9478.004.patch, HADOOP-9478.005.patch, 
 hadoop-9478-1.patch, hadoop-9478-2.patch


 When we launch a client application that uses Kerberos security, the
 FileSystem cannot be created because of the exception
 'java.lang.NoClassDefFoundError: Could not initialize class
 org.apache.hadoop.security.SecurityUtil'.
 Checking the exception stack trace, it appears to be caused by an unsafe get
 operation on the deprecatedKeyMap used by
 org.apache.hadoop.conf.Configuration.
 So I wrote a simple test case:
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 public class HTest {
   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     conf.addResource("core-site.xml");
     conf.addResource("hdfs-site.xml");
     FileSystem fileSystem = FileSystem.get(conf);
     System.out.println(fileSystem);
     System.exit(0);
   }
 }
 Then I launched this test case many times, and the following exception was thrown:
 Exception in thread "TGT Renewer for XXX" 
 java.lang.ExceptionInInitializerError
  at 
 org.apache.hadoop.security.UserGroupInformation.getTGT(UserGroupInformation.java:719)
  at 
 org.apache.hadoop.security.UserGroupInformation.access$1100(UserGroupInformation.java:77)
  at 
 org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:746)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
  at java.util.HashMap.getEntry(HashMap.java:345)
  at java.util.HashMap.containsKey(HashMap.java:335)
  at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1989)
  at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1867)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
  at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
  at 
 org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:731)
  at 
 org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1047)
  at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:76)
  ... 4 more
 Exception in thread "main" java.io.IOException: Couldn't create proxy 
 provider class 
 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:453)
  at 
 org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:133)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
  at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
  at HTest.main(HTest.java:11)
 Caused by: 

[jira] [Commented] (HADOOP-10112) har file listing doesn't work with wild card

2013-11-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828318#comment-13828318
 ] 

Chris Nauroth commented on HADOOP-10112:


It looks like this was fixed by HADOOP-9981, which optimized the new 
{{Globber}} code.  If I revert that patch from branch-2, then I see the bug.  
After restoring that patch, the bug goes away and I see the listing of the har 
contents as I would expect.

HADOOP-9981 was committed to trunk and branch-2, but not branch-2.2, because it 
was a fix in the new {{Globber}} code, which doesn't exist in branch-2.2.

 har file listing  doesn't work with wild card
 -

 Key: HADOOP-10112
 URL: https://issues.apache.org/jira/browse/HADOOP-10112
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 2.2.1
Reporter: Brandon Li

 [test@test001 root]$ hdfs dfs -ls har:///tmp/filename.har/*
 -ls: Can not create a Path from an empty string
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 It works without *.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10111) Allow DU to be initialized with an initial value

2013-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828322#comment-13828322
 ] 

Hadoop QA commented on HADOOP-10111:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615022/HADOOP-10111.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3304//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3304//console

This message is automatically generated.

 Allow DU to be initialized with an initial value
 

 Key: HADOOP-10111
 URL: https://issues.apache.org/jira/browse/HADOOP-10111
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Kihwal Lee
Assignee: Kihwal Lee
 Attachments: HADOOP-10111.patch, HADOOP-10111.patch


 When a DU object is created, the du command runs right away. If the target
 directory contains a huge number of files and directories, its constructor
 may not return for many seconds. It would be nice if it could be told to
 delay the initial scan and use a specified initial used value.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10087) UserGroupInformation.getGroupNames() fails to return primary group first when JniBasedUnixGroupsMappingWithFallback is used

2013-11-20 Thread Yu Gao (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828434#comment-13828434
 ] 

Yu Gao commented on HADOOP-10087:
-

Hi Colin, thanks for the patch. It solves the reported problem.
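
To double-check the behavior, a quick manual test along these lines (assuming a Unix login and the Hadoop client jars on the classpath; the class name is made up) compares the first element of getGroupNames() against the primary group reported by the OS:

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.security.UserGroupInformation;

// Manual check: the first entry of getGroupNames() should be the primary group.
public class PrimaryGroupCheck {
  public static void main(String[] args) throws Exception {
    String[] groups = UserGroupInformation.getCurrentUser().getGroupNames();

    // Ask the OS for the primary group of the current user.
    Process p = new ProcessBuilder("id", "-gn").start();
    BufferedReader r =
        new BufferedReader(new InputStreamReader(p.getInputStream()));
    String primary = r.readLine();

    System.out.println("primary group per 'id -gn': " + primary);
    System.out.println("getGroupNames()[0]:         "
        + (groups.length > 0 ? groups[0] : "<none>"));
  }
}
{code}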

 UserGroupInformation.getGroupNames() fails to return primary group first when 
 JniBasedUnixGroupsMappingWithFallback is used
 ---

 Key: HADOOP-10087
 URL: https://issues.apache.org/jira/browse/HADOOP-10087
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.1.0-beta, 2.2.0
 Environment: SUSE Linux Enterprise Server 11 (x86_64)
Reporter: Yu Gao
Assignee: Colin Patrick McCabe
  Labels: security
 Attachments: HADOOP-10087.001.patch


 When JniBasedUnixGroupsMappingWithFallback is used as the group mapping 
 resolution provider, UserGroupInformation.getGroupNames() fails to return the 
 primary group first in the list as documented.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9622) bzip2 codec can drop records when reading data in splits

2013-11-20 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828448#comment-13828448
 ] 

Vinay commented on HADOOP-9622:
---

Thanks Jason for the patch for this tricky issue.
Patch looks good to me.

One small nit: there are already two TestLineRecordReader test classes, in the 
mapred and mapreduce.lib.input packages of the hadoop-mapreduce-client-jobclient 
project. It would be better to move the included tests into those classes 
instead of creating new ones.

 bzip2 codec can drop records when reading data in splits
 

 Key: HADOOP-9622
 URL: https://issues.apache.org/jira/browse/HADOOP-9622
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.4-alpha, 0.23.8
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, 
 HADOOP-9622.patch, blockEndingInCR.txt.bz2, blockEndingInCRThenLF.txt.bz2


 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when 
 reading them in splits based on where record delimiters occur relative to 
 compression block boundaries.
 Thanks to [~knoguchi] for discovering this problem while working on PIG-3251.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-11-20 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828453#comment-13828453
 ] 

Vinay commented on HADOOP-9867:
---

Thanks Jason, I prefer waiting for HADOOP-9622 to be committed. 
Meanwhile I will try to update SplitLineReader offline. 
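
To make the failure mode described below concrete, here is a toy, self-contained demonstration (not LineRecordReader itself; the data and offsets are made up) of how a split boundary that lands just after a two-byte delimiter causes the next split to drop a whole record when it seeks back one byte and then scans for the next full delimiter:

{code:java}
import java.nio.charset.StandardCharsets;

// Toy demo of the dropped-record scenario with a multi-byte delimiter.
public class MultiByteDelimiterDemo {
  public static void main(String[] args) {
    byte[] delim = "\r\n".getBytes(StandardCharsets.UTF_8);         // 2-byte delimiter
    byte[] data  = "rec1\r\nrec2\r\nrec3".getBytes(StandardCharsets.UTF_8);

    int splitStart = 6;            // split 2 begins right after the first delimiter
    int seekPos = splitStart - 1;  // reader seeks one byte back: only '\n' is visible

    // From seekPos the reader discards everything up to and including the first
    // *complete* delimiter it can find, which is the one after "rec2"...
    int afterNextDelim = indexOfDelim(data, delim, seekPos) + delim.length;

    // ...so "rec2" is never returned by this split, and the previous split
    // already stopped before it: a full record is lost.
    String skipped = new String(data, seekPos, afterNextDelim - seekPos,
        StandardCharsets.UTF_8);
    System.out.println("discarded while syncing split 2: ["
        + skipped.replace("\r", "\\r").replace("\n", "\\n") + "]");
  }

  // Index of the first complete delimiter starting at or after 'from', or -1.
  private static int indexOfDelim(byte[] data, byte[] delim, int from) {
    outer:
    for (int i = from; i <= data.length - delim.length; i++) {
      for (int j = 0; j < delim.length; j++) {
        if (data[i + j] != delim[j]) {
          continue outer;
        }
      }
      return i;
    }
    return -1;
  }
}
{code}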

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a record delimiter of multiple bytes in a new InputFileFormat
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a record
 separator. The starting point for the next split would be non-zero and
 skipFirstLine would be true. A seek into the file is done to start - 1 and
 the text until the first record delimiter is ignored (due to the presumption
 that this record was already handled by the previous map task). Since the
 record delimiter is multibyte, the seek only brings the last byte of the
 delimiter into scope, so it is not recognized as a full delimiter, and the
 text is skipped until the next delimiter (ignoring a full record!).



--
This message was sent by Atlassian JIRA
(v6.1#6144)