[jira] [Commented] (HADOOP-8455) Address user name format on domain joined Windows machines

2012-07-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412582#comment-13412582
 ] 

Owen O'Malley commented on HADOOP-8455:
---

This is already possible by using the auth_to_local mapping. The cluster 
operator can define arbitrary mappings between long (FOO@DOMAIN) and short 
names (DOMAIN\FOO).

See http://hortonworks.com/blog/fine-tune-your-apache-hadoop-security-settings/
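For illustration, a minimal sketch of one such mapping (the property name and 
the HadoopKerberosName helper are real; the MYDOMAIN realm, the rule, and the 
principal are made-up examples):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.HadoopKerberosName;

public class AuthToLocalSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Strip the realm from principals in MYDOMAIN; fall back to the default rule.
    conf.set("hadoop.security.auth_to_local",
        "RULE:[1:$1@$0](.*@MYDOMAIN)s/@MYDOMAIN//\nDEFAULT");
    HadoopKerberosName.setConfiguration(conf);
    // Maps the long name alex@MYDOMAIN to the short name alex.
    System.out.println(new HadoopKerberosName("alex@MYDOMAIN").getShortName());
  }
}
{code}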


 Address user name format on domain joined Windows machines
 --

 Key: HADOOP-8455
 URL: https://issues.apache.org/jira/browse/HADOOP-8455
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Ivan Mitic
Priority: Minor

 For a domain-joined Windows machine, the user name alone is not a unique 
 identifier; user name plus domain name is needed to uniquely identify the 
 user. For example, we can have both ‘Win1\Alex’ and ‘Redmond\Alex’ on a 
 computer named Win1 that is joined to the Redmond domain. To avoid ambiguity, 
 ‘whoami’ on Windows and the new ‘winutils’ created in 
 [Hadoop-8235|https://issues.apache.org/jira/browse/HADOOP-8235] both return 
 [domain]\[username] as the username. In Hadoop, we only use the user name 
 right now. This may lead to inconsistency, and to production bugs, if users 
 of the same name exist on the machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library

2012-07-12 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412705#comment-13412705
 ] 

Niels Basjes commented on HADOOP-6817:
--

To me this seems NOT to be a duplicate of HADOOP-8582.
To me this issue is essentially "a problem with Gzip in a specific situation". 
HADOOP-8582 effectively says "let's make the error message clearer until we fix 
the real problem".

So I propose we keep this open as an unsolved 'non-duplicate' bug.


 SequenceFile.Reader can't read gzip format compressed sequence file which 
 produce by a mapreduce job without native compression library
 ---

 Key: HADOOP-6817
 URL: https://issues.apache.org/jira/browse/HADOOP-6817
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
 Environment: Cluster:CentOS 5,jdk1.6.0_20
 Client:Mac SnowLeopard,jdk1.6.0_20
Reporter: Wenjun Huang

 A Hadoop job outputs a gzip-compressed sequence file (whether record 
 compressed or block compressed). The client program uses SequenceFile.Reader 
 to read this sequence file; when reading, the client program shows the 
 following exceptions:
 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load 
 native-hadoop library for your platform... using builtin-java classes where 
 applicable
 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new 
 decompressor
 Exception in thread "main" java.io.EOFException
   at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207)
   at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197)
   at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180)
   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412)
   at 
 com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28)
 I studied the code in the org.apache.hadoop.io.SequenceFile.Reader.init method 
 and read:
   // Initialize... *not* if this we are constructing a temporary Reader
   if (!tempReader) {
     valBuffer = new DataInputBuffer();
     if (decompress) {
       valDecompressor = CodecPool.getDecompressor(codec);
       valInFilter = codec.createInputStream(valBuffer, valDecompressor);
       valIn = new DataInputStream(valInFilter);
     } else {
       valIn = valBuffer;
     }
   }
 The problem seems to be caused by valBuffer = new DataInputBuffer();, because 
 GzipCodec.createInputStream creates an instance of GzipInputStream whose 
 constructor creates an instance of the ResetableGZIPInputStream class. When 
 ResetableGZIPInputStream's constructor calls its base class 
 java.util.zip.GZIPInputStream's constructor, it tries to read the still-empty 
 valBuffer, gets no content, and so throws an EOFException.
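A minimal standalone reproduction of the failure mode described above (plain 
JDK, no Hadoop involved): GZIPInputStream reads the gzip header in its 
constructor, so wrapping an empty stream fails immediately.

{code}
import java.io.ByteArrayInputStream;
import java.util.zip.GZIPInputStream;

public class EmptyGzipDemo {
  public static void main(String[] args) throws Exception {
    // The constructor eagerly reads the 10-byte gzip header; an empty
    // stream therefore throws java.io.EOFException before any read().
    new GZIPInputStream(new ByteArrayInputStream(new byte[0]));
  }
}
{code}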

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412710#comment-13412710
 ] 

Hudson commented on HADOOP-8521:


Integrated in Hadoop-Hdfs-trunk #1101 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/])
HADOOP-8521. Port StreamInputFormat to new Map Reduce API (madhukara phatak 
via bobby) (Revision 1360238)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360238
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamBaseRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamUtil.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamXmlRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamBaseRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamInputFormat.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamXmlRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce/TestStreamXmlRecordReader.java


 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Fix For: 3.0.0

 Attachments: HADOOP-8521-1.patch, HADOOP-8521-2.patch, 
 HADOOP-8521-3.patch, HADOOP-8521.patch


 As of now, Hadoop Streaming uses the old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412712#comment-13412712
 ] 

Hudson commented on HADOOP-8541:


Integrated in Hadoop-Hdfs-trunk #1101 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/])
HADOOP-8541. Better high-percentile latency metrics. Contributed by Andrew 
Wang. (Revision 1360501)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360501
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MetricsRegistry.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/Quantile.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/lib/TestMutableMetrics.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java


 Better high-percentile latency metrics
 --

 Key: HADOOP-8541
 URL: https://issues.apache.org/jira/browse/HADOOP-8541
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.0.1-alpha

 Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch, 
 hadoop-8541-3.patch, hadoop-8541-4.patch, hadoop-8541-5.patch, 
 hadoop-8541-6.patch


 Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make 
 better high-percentile latency metrics a part of hadoop-common.
 I've already got a working implementation of [1], an efficient algorithm for 
 estimating quantiles on a stream of values. It allows you to specify 
 arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with 
 tight error bounds. This estimator can be snapshotted and reset periodically 
 to get a feel for how these percentiles are changing over time.
 I propose creating a new MutableQuantiles class that does this. [1] isn't 
 completely without overhead (~1MB memory for reasonably sized windows), which 
 is why I hesitate to add it to the existing MutableStat class.
 [1] Cormode, Korn, Muthukrishnan, and Srivastava. Effective Computation of 
 Biased Quantiles over Data Streams in ICDE 2005.
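As a sketch of how a consumer might use this (assuming the newQuantiles hook 
this patch adds to MetricsRegistry; the metric names and the 60-second rollover 
interval below are illustrative):

{code}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableQuantiles;

public class LatencyQuantilesSketch {
  private final MetricsRegistry registry = new MetricsRegistry("demo");
  // Snapshots and resets the estimator every 60 seconds.
  private final MutableQuantiles rpcLatency = registry.newQuantiles(
      "rpcLatency60s", "RPC latency", "ops", "latencyMicros", 60);

  public void recordCall(long micros) {
    rpcLatency.add(micros); // feeds the streaming quantile estimator
  }
}
{code}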

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412715#comment-13412715
 ] 

Hudson commented on HADOOP-8585:


Integrated in Hadoop-Hdfs-trunk #1101 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/])
HADOOP-8585. Fix initialization circularity between UserGroupInformation 
and HadoopConfiguration. Contributed by Colin Patrick McCabe. (Revision 1360498)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360498
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java


 Fix initialization circularity between UserGroupInformation and 
 HadoopConfiguration
 ---

 Key: HADOOP-8585
 URL: https://issues.apache.org/jira/browse/HADOOP-8585
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HDFS-3632.001.patch


 Fix findbugs warning about initialization circularity between 
 UserGroupInformation and UserGroupInformation#HadoopConfiguration.
 From the findbugs text:
 {code}
 Initialization circularity between 
 org.apache.hadoop.security.UserGroupInformation and 
 org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
   
 Bug type IC_INIT_CIRCULARITY (click for details)
 In class org.apache.hadoop.security.UserGroupInformation
 In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
 At UserGroupInformation.java:[lines 76-1395]
 {code}
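To see why findbugs flags this pattern, here is a hypothetical minimal example 
of initialization circularity between an outer class and its nested class 
(names are made up; this is not the actual UserGroupInformation code):

{code}
public class Outer {
  // Outer's static init triggers Inner's static init...
  static final Inner FIRST = new Inner();
  static final Integer DEFAULT_SIZE = Integer.valueOf(16);

  static class Inner {
    // ...which reads back into Outer before Outer's init has finished,
    // so SIZE observes null instead of 16.
    static final Integer SIZE = Outer.DEFAULT_SIZE;
  }

  public static void main(String[] args) {
    System.out.println(Inner.SIZE); // prints null
  }
}
{code}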

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412717#comment-13412717
 ] 

Hudson commented on HADOOP-8587:


Integrated in Hadoop-Hdfs-trunk #1101 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/])
HADOOP-8587. HarFileSystem access of harMetaCache isn't threadsafe. 
Contributed by Eli Collins (Revision 1360448)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360448
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java


 HarFileSystem access of harMetaCache isn't threadsafe
 -

 Key: HADOOP-8587
 URL: https://issues.apache.org/jira/browse/HADOOP-8587
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 1.2.0, 2.0.1-alpha

 Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt


 HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit 
 to Todd for pointing this out.
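The usual remedy for this pattern, sketched below with made-up types (not 
necessarily the committed patch), is to replace the plain static map with a 
concurrent one:

{code}
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class HarMetaCacheSketch {
  static class CachedMetadata { /* parsed _index/_masterindex data */ }

  // Safe for concurrent get/put from many HarFileSystem instances,
  // unlike an unsynchronized HashMap.
  private static final Map<URI, CachedMetadata> CACHE =
      new ConcurrentHashMap<URI, CachedMetadata>();

  static CachedMetadata get(URI archive) {
    return CACHE.get(archive);
  }

  static void put(URI archive, CachedMetadata meta) {
    CACHE.put(archive, meta);
  }
}
{code}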

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8423) MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412742#comment-13412742
 ] 

Hudson commented on HADOOP-8423:


Integrated in Hadoop-Hdfs-0.23-Build #311 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/311/])
svn merge -c 1359866 FIXES: HADOOP-8423. MapFile.Reader.get() crashes jvm 
or throws EOFException on Snappy or LZO block-compressed data. Contributed by 
Todd Lipcon. (Revision 1360264)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360264
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BlockDecompressorStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java


 MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO 
 block-compressed data
 --

 Key: HADOOP-8423
 URL: https://issues.apache.org/jira/browse/HADOOP-8423
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
 Environment: Linux 2.6.32.23-0.3-default #1 SMP 2010-10-07 14:57:45 
 +0200 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Jason B
Assignee: Todd Lipcon
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8423-branch-1.patch, HADOOP-8423-branch-1.patch, 
 MapFileCodecTest.java, hadoop-8423.txt


 I am using Cloudera distribution cdh3u1.
 When trying to check native codecs for better decompression
 performance, such as Snappy or LZO, I ran into issues with random
 access using the MapFile.Reader.get(key, value) method.
 The first call of MapFile.Reader.get() works, but a second call fails.
 Also, I am getting different exceptions depending on the number of entries
 in a map file.
 With LzoCodec and a 10-record file, the JVM gets aborted.
 At the same time, the DefaultCodec works fine for all cases, as well as
 record compression for the native codecs.
 I created a simple test program (attached) that creates map files
 locally with sizes of 10 and 100 records for three codecs: Default,
 Snappy, and LZO.
 (The test requires the corresponding native libraries to be available.)
 The summary of problems is given below:
 Map Size: 100
 Compression: RECORD
 ==
 DefaultCodec:  OK
 SnappyCodec: OK
 LzoCodec: OK
 Map Size: 10
 Compression: RECORD
 ==
 DefaultCodec:  OK
 SnappyCodec: OK
 LzoCodec: OK
 Map Size: 100
 Compression: BLOCK
 ==
 DefaultCodec:  OK
 SnappyCodec: java.io.EOFException  at
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
 LzoCodec: java.io.EOFException at
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
 Map Size: 10
 Compression: BLOCK
 ==
 DefaultCodec:  OK
 SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError
 at 
 org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native
 Method)
 LzoCodec:
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x2b068ffcbc00, pid=6385, tid=47304763508496
 #
 # JRE version: 6.0_21-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64 
 )
 # Problematic frame:
 # C  [liblzo2.so.2+0x13c00]  lzo1x_decompress+0x1a0
 #
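A sketch approximating the attached test program's access pattern (the 
constructor overload and codec setup here are illustrative assumptions, not a 
copy of MapFileCodecTest.java):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class MapFileRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    String dir = "/tmp/mapfile-snappy-test";
    CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);

    // Write 100 block-compressed entries (MapFile keys must be in sorted order).
    MapFile.Writer writer = new MapFile.Writer(conf, fs, dir,
        Text.class, Text.class, SequenceFile.CompressionType.BLOCK, codec, null);
    for (int i = 0; i < 100; i++) {
      writer.append(new Text(String.format("key%03d", i)), new Text("value" + i));
    }
    writer.close();

    // Random access: the first get() succeeds; the second used to fail
    // with EOFException (or crash the JVM) before this fix.
    MapFile.Reader reader = new MapFile.Reader(fs, dir, conf);
    Text value = new Text();
    reader.get(new Text("key042"), value);
    reader.get(new Text("key007"), value);
    reader.close();
  }
}
{code}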

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain

2012-07-12 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HADOOP-7836:


Attachment: HADOOP-7836.patch

Actually, the test is flawed.  It's using the address the RPC server is 
reporting to set the service.  However, it is the client's responsibility to 
set the token service, since the server has no way to know exactly what 
hostname/IP the client used.

Eli, please see if this fixes the issue for you.
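A hedged sketch of the client-side responsibility described above, using the 
existing SecurityUtil/NetUtils helpers (the host and port parameters are 
placeholders):

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class TokenServiceSketch {
  // The client sets the token service from the address *it* used to
  // connect; the server cannot know which hostname/IP that was.
  static void attachService(Token<? extends TokenIdentifier> token,
                            String hostUsedByClient, int port) {
    InetSocketAddress addr = NetUtils.createSocketAddr(hostUsedByClient, port);
    SecurityUtil.setTokenService(token, addr);
  }
}
{code}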

 TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname 
 localhost.localdomain
 

 Key: HADOOP-7836
 URL: https://issues.apache.org/jira/browse/HADOOP-7836
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 1.1.0
Reporter: Eli Collins
Priority: Minor
 Attachments: HADOOP-7836.patch, hadoop-7836.txt


 TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some 
 hosts.
 null expected:<localhost[]> but was:<localhost[.localdomain]>
 junit.framework.ComparisonFailure: null expected:<localhost[]> but 
 was:<localhost[.localdomain]>
 null expected:<[localhost]> but was:<[eli-thinkpad]>
 junit.framework.ComparisonFailure: null expected:<[localhost]> but 
 was:<[eli-thinkpad]>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-3886) Error in javadoc of Reporter, Mapper and Progressable

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412802#comment-13412802
 ] 

Hudson commented on HADOOP-3886:


Integrated in Hadoop-Mapreduce-trunk #1134 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/])
HADOOP-3886. Error in javadoc of Reporter, Mapper and Progressable. 
Contributed by Jingguo Yao. (harsh) (Revision 1360222)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360222
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Progressable.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Mapper.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Reporter.java


 Error in javadoc of Reporter, Mapper and Progressable
 -

 Key: HADOOP-3886
 URL: https://issues.apache.org/jira/browse/HADOOP-3886
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.23.0
Reporter: brien colwell
Assignee: Jingguo Yao
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-3886.patch, HADOOP-3886.patch


 The javadoc for Reporter says:
 "In scenarios where the application takes an insignificant amount of time to 
 process individual key/value pairs"
 Shouldn't this read /significant/ instead of insignificant?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412801#comment-13412801
 ] 

Hudson commented on HADOOP-8521:


Integrated in Hadoop-Mapreduce-trunk #1134 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/])
HADOOP-8521. Port StreamInputFormat to new Map Reduce API (madhukara phatak 
via bobby) (Revision 1360238)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360238
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamBaseRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamUtil.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamXmlRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamBaseRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamInputFormat.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamXmlRecordReader.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce
* 
/hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce/TestStreamXmlRecordReader.java


 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Fix For: 3.0.0

 Attachments: HADOOP-8521-1.patch, HADOOP-8521-2.patch, 
 HADOOP-8521-3.patch, HADOOP-8521.patch


 As of now, Hadoop Streaming uses the old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412803#comment-13412803
 ] 

Hudson commented on HADOOP-8541:


Integrated in Hadoop-Mapreduce-trunk #1134 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/])
HADOOP-8541. Better high-percentile latency metrics. Contributed by Andrew 
Wang. (Revision 1360501)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360501
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MetricsRegistry.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/Quantile.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/lib/TestMutableMetrics.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java


 Better high-percentile latency metrics
 --

 Key: HADOOP-8541
 URL: https://issues.apache.org/jira/browse/HADOOP-8541
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.0.1-alpha

 Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch, 
 hadoop-8541-3.patch, hadoop-8541-4.patch, hadoop-8541-5.patch, 
 hadoop-8541-6.patch


 Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make 
 better high-percentile latency metrics a part of hadoop-common.
 I've already got a working implementation of [1], an efficient algorithm for 
 estimating quantiles on a stream of values. It allows you to specify 
 arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with 
 tight error bounds. This estimator can be snapshotted and reset periodically 
 to get a feel for how these percentiles are changing over time.
 I propose creating a new MutableQuantiles class that does this. [1] isn't 
 completely without overhead (~1MB memory for reasonably sized windows), which 
 is why I hesitate to add it to the existing MutableStat class.
 [1] Cormode, Korn, Muthukrishnan, and Srivastava. Effective Computation of 
 Biased Quantiles over Data Streams in ICDE 2005.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412808#comment-13412808
 ] 

Hudson commented on HADOOP-8587:


Integrated in Hadoop-Mapreduce-trunk #1134 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/])
HADOOP-8587. HarFileSystem access of harMetaCache isn't threadsafe. 
Contributed by Eli Collins (Revision 1360448)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360448
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java


 HarFileSystem access of harMetaCache isn't threadsafe
 -

 Key: HADOOP-8587
 URL: https://issues.apache.org/jira/browse/HADOOP-8587
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 1.2.0, 2.0.1-alpha

 Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt


 HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit 
 to Todd for pointing this out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration

2012-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412806#comment-13412806
 ] 

Hudson commented on HADOOP-8585:


Integrated in Hadoop-Mapreduce-trunk #1134 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/])
HADOOP-8585. Fix initialization circularity between UserGroupInformation 
and HadoopConfiguration. Contributed by Colin Patrick McCabe. (Revision 1360498)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360498
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java


 Fix initialization circularity between UserGroupInformation and 
 HadoopConfiguration
 ---

 Key: HADOOP-8585
 URL: https://issues.apache.org/jira/browse/HADOOP-8585
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HDFS-3632.001.patch


 Fix findbugs warning about initialization circularity between 
 UserGroupInformation and UserGroupInformation#HadoopConfiguration.
 From the findbugs text:
 {code}
 Initialization circularity between 
 org.apache.hadoop.security.UserGroupInformation and 
 org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
   
 Bug type IC_INIT_CIRCULARITY (click for details)
 In class org.apache.hadoop.security.UserGroupInformation
 In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
 At UserGroupInformation.java:[lines 76-1395]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8551) fs -mkdir creates parent directories without the -p option

2012-07-12 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HADOOP-8551:


Attachment: HADOOP-8551.patch

This should fix it, but I still need to write tests.

 fs -mkdir creates parent directories without the -p option
 --

 Key: HADOOP-8551
 URL: https://issues.apache.org/jira/browse/HADOOP-8551
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Daryn Sharp
 Attachments: HADOOP-8551.patch


 hadoop fs -mkdir foo/bar will work even if the parent directory foo is not 
 present.  When foo is not present, it should only work if -p is given.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8550) hadoop fs -touchz automatically created parent directories

2012-07-12 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HADOOP-8550:


Attachment: HADOOP-8550.patch

This should fix it, but I still need to write tests.

 hadoop fs -touchz automatically created parent directories
 --

 Key: HADOOP-8550
 URL: https://issues.apache.org/jira/browse/HADOOP-8550
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
 Attachments: HADOOP-8550.patch


 Recently many of the FsShell commands were updated to be more POSIX 
 compliant.  touchz appears to have been missed, or has regressed.  If it has 
 regressed, then the target version should be 0.23.3.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class

2012-07-12 Thread Cristina L. Abad (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412866#comment-13412866
 ] 

Cristina L. Abad commented on HADOOP-7753:
--

In NativeIO.c it might be worth referencing Red Hat BZ 554735 (see 
https://bugzilla.redhat.com/show_bug.cgi?id=554735). We were doing some tests 
on RHEL 5.4 and that bug led to exceptions (and decreased performance). We 
fixed it by compiling/using the 64-bit native libraries. We are now seeing a 
significant increase in performance: an average of 24% improvement in running 
time across 10 runs of the Sort benchmark (10-node cluster, 35GB being sorted).


 Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
 

 Key: HADOOP-7753
 URL: https://issues.apache.org/jira/browse/HADOOP-7753
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: io, native, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.23.0

 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, 
 hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, 
 hadoop-7753.txt


 This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also 
 implements a ReadaheadPool class for future use from HDFS and MapReduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe

2012-07-12 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8587:


Fix Version/s: 0.23.3

 HarFileSystem access of harMetaCache isn't threadsafe
 -

 Key: HADOOP-8587
 URL: https://issues.apache.org/jira/browse/HADOOP-8587
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 1.2.0, 0.23.3, 2.0.1-alpha

 Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt


 HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit 
 to Todd for pointing this out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration

2012-07-12 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412935#comment-13412935
 ] 

Aaron T. Myers commented on HADOOP-8585:


bq. Wouldn't it have been easier to just suppress the warning as a false alarm?

That would certainly work, but it doesn't seem much easier to me. I also don't 
see how the method used by this patch could possibly cause any problems, as it 
changes the behavior back to what it was before the recently-committed 
HDFS-3568, i.e. each invocation of newLoginContext will create a new 
HadoopConfiguration object.

 Fix initialization circularity between UserGroupInformation and 
 HadoopConfiguration
 ---

 Key: HADOOP-8585
 URL: https://issues.apache.org/jira/browse/HADOOP-8585
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HDFS-3632.001.patch


 Fix findbugs warning about initialization circularity between 
 UserGroupInformation and UserGroupInformation#HadoopConfiguration.
 From the findbugs text:
 {code}
 Initialization circularity between 
 org.apache.hadoop.security.UserGroupInformation and 
 org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
   
 Bug type IC_INIT_CIRCULARITY (click for details)
 In class org.apache.hadoop.security.UserGroupInformation
 In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration
 At UserGroupInformation.java:[lines 76-1395]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8582) Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.

2012-07-12 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413007#comment-13413007
 ] 

Harsh J commented on HADOOP-8582:
-

Daryn,

It's sorta the latter. To be clearer, the reason is this, from HADOOP-538:

{quote}
Arun:

Context: gzip is just the zlib algo + extra headers. 
java.util.zip.GZIP{Input|Output}Stream, and hence the existing GzipCodec, won't 
work with SequenceFile due to the fact that java.util.zip.GZIP{Input|Output}Streams 
will try to read/write gzip headers in their constructors, which won't work in 
SequenceFiles since we typically read data from disk into buffers; these 
buffers are empty on startup/after-reset and cause the 
java.util.zip.GZIP{Input|Output}Streams to fail.
{quote}
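A quick plain-JDK illustration of the constructor behavior Arun describes; 
both directions touch the stream before any user data moves:

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipHeaderDemo {
  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    new GZIPOutputStream(buf);      // constructor writes the gzip header
    System.out.println(buf.size()); // prints 10: header already emitted

    // Conversely, the input constructor reads the header immediately, so an
    // empty buffer (as after a SequenceFile reset) throws EOFException here.
    new GZIPInputStream(new ByteArrayInputStream(new byte[0]));
  }
}
{code}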

 Improve error reporting for GZIP-compressed SequenceFiles with missing native 
 libraries.
 

 Key: HADOOP-8582
 URL: https://issues.apache.org/jira/browse/HADOOP-8582
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Paul Wilkinson
Priority: Minor
 Attachments: HADOOP-8582-1.diff


 At present it is not possible to write or read block-compressed SequenceFiles 
 using the GZIP codec without the native libraries being available.
 The SequenceFile.Writer code checks for the availability of native libraries 
 and throws a useful exception, but the SequenceFile.Reader doesn't do the 
 same:
 {noformat}
 Exception in thread "main" java.io.EOFException
   at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249)
   at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239)
   at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:67)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:95)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:104)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183)
   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1493)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1480)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1475)
   at test.SequenceReader.read(SequenceReader.java:23)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library

2012-07-12 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413009#comment-13413009
 ] 

Harsh J commented on HADOOP-6817:
-

Hi Niels,

I'm happy to reopen this, but the reason this does not work is explained at 
HADOOP-538.

I will add that as a link as well.

Do you still wish to reopen?

 SequenceFile.Reader can't read gzip format compressed sequence file which 
 produce by a mapreduce job without native compression library
 ---

 Key: HADOOP-6817
 URL: https://issues.apache.org/jira/browse/HADOOP-6817
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
 Environment: Cluster:CentOS 5,jdk1.6.0_20
 Client:Mac SnowLeopard,jdk1.6.0_20
Reporter: Wenjun Huang

 A Hadoop job outputs a gzip-compressed sequence file (whether record 
 compressed or block compressed). The client program uses SequenceFile.Reader 
 to read this sequence file; when reading, the client program shows the 
 following exceptions:
 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load 
 native-hadoop library for your platform... using builtin-java classes where 
 applicable
 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new 
 decompressor
 Exception in thread "main" java.io.EOFException
   at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207)
   at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197)
   at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58)
   at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180)
   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412)
   at 
 com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28)
 I studied the code in the org.apache.hadoop.io.SequenceFile.Reader.init method 
 and read:
   // Initialize... *not* if this we are constructing a temporary Reader
   if (!tempReader) {
     valBuffer = new DataInputBuffer();
     if (decompress) {
       valDecompressor = CodecPool.getDecompressor(codec);
       valInFilter = codec.createInputStream(valBuffer, valDecompressor);
       valIn = new DataInputStream(valInFilter);
     } else {
       valIn = valBuffer;
     }
   }
 The problem seems to be caused by valBuffer = new DataInputBuffer();, because 
 GzipCodec.createInputStream creates an instance of GzipInputStream whose 
 constructor creates an instance of the ResetableGZIPInputStream class. When 
 ResetableGZIPInputStream's constructor calls its base class 
 java.util.zip.GZIPInputStream's constructor, it tries to read the still-empty 
 valBuffer, gets no content, and so throws an EOFException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Moved] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout

2012-07-12 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon moved HDFS-3635 to HADOOP-8591:
---

 Target Version/s:   (was: 2.0.1-alpha)
Affects Version/s: (was: 2.0.0-alpha)
   2.0.0-alpha
   Issue Type: Bug  (was: Improvement)
  Key: HADOOP-8591  (was: HDFS-3635)
  Project: Hadoop Common  (was: Hadoop HDFS)

 TestZKFailoverController.testOneOfEverything timesout
 -

 Key: HADOOP-8591
 URL: https://issues.apache.org/jira/browse/HADOOP-8591
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins

 Looks like the TestZKFailoverController timeout needs to be bumped.
 {noformat}
 java.lang.Exception: test timed out after 3 milliseconds
   at java.lang.Object.wait(Native Method)
   at 
 org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460)
   at 
 org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648)
   at 
 org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334)
   at 
 org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590)
   at 
 org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575)
 {noformat}
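If the fix is simply to raise the per-test timeout, a hypothetical sketch of 
the JUnit 4 mechanism involved (the 60-second value is illustrative, not the 
committed number):

{code}
import org.junit.Test;

public class TimeoutSketch {
  // JUnit 4 enforces the timeout attribute per test method; the fix is to
  // raise it so a slow graceful failover can complete. Value is made up.
  @Test(timeout = 60000)
  public void testOneOfEverything() throws Exception {
    Thread.sleep(100); // stand-in for the real ZKFC failover scenario
  }
}
{code}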

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout

2012-07-12 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-8591:


Component/s: test
 ha
 auto-failover

 TestZKFailoverController.testOneOfEverything timesout
 -

 Key: HADOOP-8591
 URL: https://issues.apache.org/jira/browse/HADOOP-8591
 Project: Hadoop Common
  Issue Type: Bug
  Components: auto-failover, ha, test
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins

 Looks like the TestZKFailoverController timeout needs to be bumped.
 {noformat}
 java.lang.Exception: test timed out after 3 milliseconds
   at java.lang.Object.wait(Native Method)
   at 
 org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460)
   at 
 org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648)
   at 
 org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334)
   at 
 org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590)
   at 
 org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain

2012-07-12 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413041#comment-13413041
 ] 

Eli Collins commented on HADOOP-7836:
-

+1  looks good

 TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname 
 localhost.localdomain
 

 Key: HADOOP-7836
 URL: https://issues.apache.org/jira/browse/HADOOP-7836
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 1.1.0
Reporter: Eli Collins
Priority: Minor
 Attachments: HADOOP-7836.patch, hadoop-7836.txt


 TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some 
 hosts.
 null expected:<localhost[]> but was:<localhost[.localdomain]>
 junit.framework.ComparisonFailure: null expected:<localhost[]> but 
 was:<localhost[.localdomain]>
 null expected:<[localhost]> but was:<[eli-thinkpad]>
 junit.framework.ComparisonFailure: null expected:<[localhost]> but 
 was:<[eli-thinkpad]>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain

2012-07-12 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7836.
-

  Resolution: Fixed
   Fix Version/s: 1.2.0
Target Version/s:   (was: 1.1.0)

I've committed this, thanks Daryn!

Do we need a jira for the same test forward ported to trunk?

 TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname 
 localhost.localdomain
 

 Key: HADOOP-7836
 URL: https://issues.apache.org/jira/browse/HADOOP-7836
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 1.1.0
Reporter: Eli Collins
Priority: Minor
 Fix For: 1.2.0

 Attachments: HADOOP-7836.patch, hadoop-7836.txt


 TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some 
 hosts.
 null expected:<localhost[]> but was:<localhost[.localdomain]>
 junit.framework.ComparisonFailure: null expected:<localhost[]> but 
 was:<localhost[.localdomain]>
 null expected:<[localhost]> but was:<[eli-thinkpad]>
 junit.framework.ComparisonFailure: null expected:<[localhost]> but 
 was:<[eli-thinkpad]>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout

2012-07-12 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413049#comment-13413049
 ] 

Eli Collins commented on HADOOP-8591:
-

Yup, thanks.

 TestZKFailoverController.testOneOfEverything timesout
 -

 Key: HADOOP-8591
 URL: https://issues.apache.org/jira/browse/HADOOP-8591
 Project: Hadoop Common
  Issue Type: Bug
  Components: auto-failover, ha, test
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins

 Looks like the TestZKFailoverController timeout needs to be bumped.
 {noformat}
 java.lang.Exception: test timed out after 3 milliseconds
   at java.lang.Object.wait(Native Method)
   at 
 org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460)
   at 
 org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648)
   at 
 org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593)
   at 
 org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334)
   at 
 org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590)
   at 
 org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file, which produce by a mapreduce job, without native compression library

2012-07-12 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-6817:


Summary: SequenceFile.Reader can't read gzip format compressed sequence 
file, which produce by a mapreduce job, without native compression library  
(was: SequenceFile.Reader can't read gzip format compressed sequence file which 
produce by a mapreduce job without native compression library)

 SequenceFile.Reader can't read gzip format compressed sequence file, which 
 produce by a mapreduce job, without native compression library
 -

 Key: HADOOP-6817
 URL: https://issues.apache.org/jira/browse/HADOOP-6817
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
 Environment: Cluster:CentOS 5,jdk1.6.0_20
 Client:Mac SnowLeopard,jdk1.6.0_20
Reporter: Wenjun Huang

 A Hadoop job outputs a gzip-compressed sequence file (whether record 
 compressed or block compressed). The client program uses SequenceFile.Reader 
 to read this sequence file; when reading, the client program shows the 
 following exceptions:
 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load 
 native-hadoop library for your platform... using builtin-java classes where 
 applicable
 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new 
 decompressor
 Exception in thread "main" java.io.EOFException
   at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207)
   at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197)
   at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136)
   at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
   at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:68)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180)
   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412)
   at 
 com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28)
 I studied the code in the org.apache.hadoop.io.SequenceFile.Reader.init 
 method and read:
 {code}
 // Initialize... *not* if this we are constructing a temporary Reader
 if (!tempReader) {
   valBuffer = new DataInputBuffer();
   if (decompress) {
     valDecompressor = CodecPool.getDecompressor(codec);
     valInFilter = codec.createInputStream(valBuffer, valDecompressor);
     valIn = new DataInputStream(valInFilter);
   } else {
     valIn = valBuffer;
   }
 }
 {code}
 The problem seems to be caused by {{valBuffer = new DataInputBuffer();}}, 
 because GzipCodec.createInputStream creates an instance of GzipInputStream, 
 whose constructor creates an instance of the ResetableGZIPInputStream class. 
 When ResetableGZIPInputStream's constructor calls its base class 
 java.util.zip.GZIPInputStream's constructor, it tries to read the 
 still-empty valBuffer, gets no content, and so throws an EOFException.
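
A minimal standalone sketch of that failure mode (the class name here is 
illustrative): constructing a GZIPInputStream over an empty stream fails in 
the constructor, because GZIPInputStream eagerly reads the gzip header.

{code}
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.util.zip.GZIPInputStream;

public class EmptyGzipSketch {
  public static void main(String[] args) throws Exception {
    try {
      // Mirrors what happens when GzipCodec.createInputStream wraps the
      // still-empty valBuffer: the GZIPInputStream constructor reads the
      // gzip header immediately and hits end-of-stream.
      new GZIPInputStream(new ByteArrayInputStream(new byte[0]));
    } catch (EOFException e) {
      System.out.println("EOFException from the constructor, as described above");
    }
  }
}
{code}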

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain

2012-07-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413105#comment-13413105
 ] 

Daryn Sharp commented on HADOOP-7836:
-

I haven't checked, but if trunk is indeed missing these tests -- and they 
haven't simply moved -- then yes, we need to port them to trunk.

 TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname 
 localhost.localdomain
 

 Key: HADOOP-7836
 URL: https://issues.apache.org/jira/browse/HADOOP-7836
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 1.1.0
Reporter: Eli Collins
Priority: Minor
 Fix For: 1.2.0

 Attachments: HADOOP-7836.patch, hadoop-7836.txt


 TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some 
 hosts.
 null expected:<localhost[]> but was:<localhost[.localdomain]>
 junit.framework.ComparisonFailure: null expected:<localhost[]> but 
 was:<localhost[.localdomain]>
 null expected:<[localhost]> but was:<[eli-thinkpad]>
 junit.framework.ComparisonFailure: null expected:<[localhost]> but 
 was:<[eli-thinkpad]>
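
Where the unexpected name comes from can be checked directly; a small sketch 
(the printed values depend entirely on the host's /etc/hosts):

{code}
import java.net.InetAddress;

public class HostnameSketch {
  public static void main(String[] args) throws Exception {
    // On hosts whose /etc/hosts maps 127.0.0.1 to localhost.localdomain
    // first, the canonical name differs from the plain "localhost" the
    // test expects, producing the comparison failures above.
    InetAddress addr = InetAddress.getLocalHost();
    System.out.println("hostname:  " + addr.getHostName());
    System.out.println("canonical: " + addr.getCanonicalHostName());
  }
}
{code}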

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8592) Hadoop-auth should use o.a.h.util.Time methods instead of System#currentTimeMillis

2012-07-12 Thread Eli Collins (JIRA)
Eli Collins created HADOOP-8592:
---

 Summary: Hadoop-auth should use o.a.h.util.Time methods instead of 
System#currentTimeMillis
 Key: HADOOP-8592
 URL: https://issues.apache.org/jira/browse/HADOOP-8592
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Priority: Minor


HDFS-3641 moved HDFS' Time methods to common so they can be used by MR (and 
eventually others). We should replace uses of System#currentTimeMillis in 
hadoop-auth with Time#now (or Time#monotonicNow when computing intervals, 
e.g. to sleep).
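
A minimal sketch of the intended substitution, assuming the o.a.h.util.Time 
API introduced by HDFS-3641 (Time.now() for wall-clock stamps, 
Time.monotonicNow() for intervals):

{code}
import org.apache.hadoop.util.Time;

public class TimeUsageSketch {
  public static void main(String[] args) throws Exception {
    // Wall-clock timestamp; replaces System.currentTimeMillis().
    long issuedAt = Time.now();

    // Intervals should use the monotonic clock so they are immune to
    // wall-clock adjustments (e.g. NTP stepping the clock).
    long start = Time.monotonicNow();
    Thread.sleep(100);
    long elapsedMs = Time.monotonicNow() - start;

    System.out.println("issuedAt=" + issuedAt + " elapsedMs=" + elapsedMs);
  }
}
{code}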

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.

2012-07-12 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413127#comment-13413127
 ] 

Alejandro Abdelnur commented on HADOOP-8552:


+1

 Conflict: Same security.log.file for multiple users. 
 -

 Key: HADOOP-8552
 URL: https://issues.apache.org/jira/browse/HADOOP-8552
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch


 In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. 
 In the presence of multiple users, this can lead to a conflict. 
 Adding the username to the log file name would avoid this scenario.
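
One way to do that, as a sketch (the exact value below is an assumption, not 
quoted from the attached patches):

{code}
# In log4j.properties: make the audit log per-user by expanding the JVM's
# user.name system property instead of using a fixed file name.
hadoop.security.log.file=SecurityAuth-${user.name}.audit
{code}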

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class

2012-07-12 Thread Giridharan Kesavan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413138#comment-13413138
 ] 

Giridharan Kesavan commented on HADOOP-7753:


All the Hadoop releases ship both 32- and 64-bit libraries. Users can decide 
whether the 32- or 64-bit library is appropriate for their environment. So 
the answer is: we compile the 32- and 64-bit libs by setting CFLAGS and 
CXXFLAGS with the appropriate JDK -- for more info, refer to the Hadoop 
release wiki. 

 Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
 

 Key: HADOOP-7753
 URL: https://issues.apache.org/jira/browse/HADOOP-7753
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: io, native, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.23.0

 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, 
 hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, 
 hadoop-7753.txt


 This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also 
 implements a ReadaheadPool class for future use from HDFS and MapReduce.
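
A rough sketch of how a consumer might drive the new ReadaheadPool while 
streaming through a file. The method shape used here (readaheadStream taking 
an identifier, a position, a readahead length, and the previous request) is 
recalled from the committed class and should be checked against the source:

{code}
import java.io.FileDescriptor;
import java.io.FileInputStream;

import org.apache.hadoop.io.ReadaheadPool;
import org.apache.hadoop.io.ReadaheadPool.ReadaheadRequest;

public class ReadaheadSketch {
  public static void main(String[] args) throws Exception {
    FileInputStream in = new FileInputStream(args[0]);
    FileDescriptor fd = in.getFD();

    ReadaheadPool pool = ReadaheadPool.getInstance();
    ReadaheadRequest last = null;
    long pos = 0;
    byte[] buf = new byte[64 * 1024];
    int n;
    while ((n = in.read(buf)) > 0) {
      pos += n;
      // Ask the pool to fadvise(WILLNEED) ahead of the current position;
      // the call degrades to a no-op when native support is missing.
      last = pool.readaheadStream("sketch", fd, pos, 4 * 1024 * 1024,
          Long.MAX_VALUE, last);
    }
    in.close();
  }
}
{code}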

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class

2012-07-12 Thread Cristina L. Abad (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413165#comment-13413165
 ] 

Cristina L. Abad commented on HADOOP-7753:
--

Yes, I know that. The issue is that some people may be using the 32-bit 
libraries on 64-bit architectures. This actually works fine, except if you 
turn on these features while running on a system with the bug I mentioned. 
Anyway, this is a minor problem; I just thought it would be a good idea to 
mention it here in case someone runs into the problem and doesn't know 
what's going on.


 Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
 

 Key: HADOOP-7753
 URL: https://issues.apache.org/jira/browse/HADOOP-7753
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: io, native, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.23.0

 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, 
 hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, 
 hadoop-7753.txt


 This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also 
 implements a ReadaheadPool class for future use from HDFS and MapReduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.

2012-07-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413213#comment-13413213
 ] 

Devaraj Das commented on HADOOP-8552:
-

Hi Karthik, is this on the client or on the server side? (Guessing it's on 
the client.. please confirm.) In general, the audit log stuff doesn't make 
sense on the client side. It's meant to be used on the server side only (and 
in deployments I know about, the security audit logging is turned off on the 
client side). 
Your patch will work, though. But I'll note that it might introduce 
compatibility issues due to the change of the log file name (if someone is 
collecting logs based on file names, etc.).

 Conflict: Same security.log.file for multiple users. 
 -

 Key: HADOOP-8552
 URL: https://issues.apache.org/jira/browse/HADOOP-8552
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch


 In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. 
 In the presence of multiple users, this can lead to a conflict. 
 Adding the username to the log file name would avoid this scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8453) Add unit tests for winutils

2012-07-12 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413303#comment-13413303
 ] 

Chuan Liu commented on HADOOP-8453:
---

Hi Bikas, I think we can still check this in as a separate, standalone test 
suite for 'winutils', and create a new JIRA for adding 'winutils'-related 
tests to TestShell.

 Add unit tests for winutils
 ---

 Key: HADOOP-8453
 URL: https://issues.apache.org/jira/browse/HADOOP-8453
 Project: Hadoop Common
  Issue Type: Task
  Components: test
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-8453-branch-1-win-2.patch, 
 HADOOP-8453-branch-1-win.patch


 In [Hadoop-8235|https://issues.apache.org/jira/browse/HADOOP-8235], we 
 created a Windows console program, named ‘winutils’, to emulate some Linux 
 command line utilities used by Hadoop. However, no tests were provided in 
 the original patch. As this code is quite complicated, and its complexity 
 may even grow in the future, we think unit tests are necessary to ensure 
 code quality, as well as smooth future development.
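
A sketch of what a standalone test could look like, assuming it shells out 
through the existing o.a.h.util.Shell helper; the specific winutils 
subcommand and the assertion here are illustrative, not from the attached 
patches:

{code}
import static org.junit.Assert.assertTrue;

import org.apache.hadoop.util.Shell;
import org.junit.Test;

public class TestWinutilsSketch {
  // Illustrative only: assumes winutils.exe is on the PATH and that
  // "ls" is one of the emulated commands described above.
  @Test
  public void testLsProducesOutput() throws Exception {
    String out = Shell.execCommand("winutils", "ls", "C:\\");
    assertTrue(out.length() > 0);
  }
}
{code}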

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.

2012-07-12 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413317#comment-13413317
 ] 

Karthik Kambatla commented on HADOOP-8552:
--

Devaraj, thanks for the feedback.

It is on both the client and the server side. By server side, I mean the 
jobtracker/namenode. Thanks for pointing out the potential compatibility 
issue; I agree we need to note the incompatibility of the log file name 
change.

 Conflict: Same security.log.file for multiple users. 
 -

 Key: HADOOP-8552
 URL: https://issues.apache.org/jira/browse/HADOOP-8552
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch


 In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. 
 In the presence of multiple users, this can lead to a conflict. 
 Adding the username to the log file name would avoid this scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8499) Lower min.user.id to 500 for the tests

2012-07-12 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HADOOP-8499:


 Priority: Minor  (was: Major)
 Target Version/s: 2.0.1-alpha
Affects Version/s: 2.0.0-alpha
  Summary: Lower min.user.id to 500 for the tests  (was: fix mvn 
compile -Pnative on CentOS / RHEL / Fedora / SuSE / etc)

ATM, reasonable to lower the min id to 500 for the tests?

 Lower min.user.id to 500 for the tests
 --

 Key: HADOOP-8499
 URL: https://issues.apache.org/jira/browse/HADOOP-8499
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-8499.002.patch


 On Linux platforms where user IDs start at 500 rather than 1000, the build 
 is currently broken.  This includes CentOS, RHEL, Fedora, SuSE, and probably 
 most other Linux platforms.  It does happen to work on Debian and Ubuntu, 
 which explains why Jenkins hasn't caught it yet.
 Other users will see something like this:
 {code}
 [INFO] Requested user cmccabe has id 500, which is below the minimum allowed 
 1000
 [INFO] FAIL: test-container-executor
 [INFO] 
 [INFO] 1 of 1 test failed
 [INFO] Please report to mapreduce-...@hadoop.apache.org
 [INFO] 
 [INFO] make[1]: *** [check-TESTS] Error 1
 [INFO] make[1]: Leaving directory 
 `/home/cmccabe/hadoop4/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn
 -server/hadoop-yarn-server-nodemanager/target/native/container-executor'
 {code}
 And then the build fails.  Since native unit tests are currently unskippable 
 (HADOOP-8480), this makes the project unbuildable.
 The easy solution to this is to relax the constraint for the unit test.  
 Since the unit test already writes its own configuration file, we just need 
 to change it there.
 In general, I believe that it would make sense to change this to 500 across 
 the board.  I'm not aware of any Linuxes that create system users with IDs 
 higher than or equal to 500.  System user IDs tend to be below 200.
 However, if we do nothing else, we should at least fix the build by relaxing 
 the constraint for unit tests.
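
Since the test generates its own configuration file, the relaxed constraint 
is a one-line change there; a sketch of the relevant line (assuming the 
min.user.id key used by container-executor.cfg):

{code}
# container-executor configuration written by the test: allow uids down to
# 500 so test-container-executor passes on CentOS/RHEL/Fedora/SuSE.
min.user.id=500
{code}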

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.

2012-07-12 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413429#comment-13413429
 ] 

Sanjay Radia commented on HADOOP-8457:
--

The patch adds a public method FileStatus#isOwnedByUser(UserGroupInformation ugi).
* It does not make sense to expose a low-level structure like UGI through a 
fundamental class like FileStatus. Make the parameter String user.
* I hate the idea of adding a public method when getOwner().equals(user) is 
good enough -- but FileStatus is subclassed, and the method is useful for 
RawLocalFileSystem's FileStatus. Is there another way to solve the problem in 
a simple way without adding such a method?
** e.g. put it in a util? May not work, since the new code applies to the 
FileStatus of RawLocalFileSystem on a Windows filesystem, and not to the 
FileStatus of HDFS where the client is running on a Windows box.

 Address file ownership issue for users in Administrators group on Windows.
 --

 Key: HADOOP-8457
 URL: https://issues.apache.org/jira/browse/HADOOP-8457
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Ivan Mitic
Priority: Minor
 Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, 
 HADOOP-8457-branch-1-win_Admins.patch


 On Linux, the initial file owners are the creators. (I think this is true in 
 general. If there are exceptions, please let me know.) On Windows, a file 
 created by a user in the Administrators group has the initial owner 
 ‘Administrators’, i.e. the Administrators group is the initial owner of 
 the file. This leads to an exception when we check file ownership in the 
 SecureIOUtils.checkStat() method. As a result, this method is disabled 
 right now. We need to address this problem and enable the method on 
 Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.

2012-07-12 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413436#comment-13413436
 ] 

Ivan Mitic commented on HADOOP-8457:


Thanks for the comments, Sanjay. 

bq. Does not make sense to expose a low level structure like ugi through a 
fundamental class like FileStatus. Make the parameter String user.
I saw that FileSystem already had an API that accepts a UGI 
({{FileSystem#closeAllForUGI()}}); that is why I thought this was fine. On the 
other hand, I didn't want to expose this on the FileSystem, as it would require 
a caller to query the FileSystem twice if it wants to check both ownership 
and permissions.

bq. Is there another way to solve the problem in a simple way without adding 
such a method? 
Exposing it through the API makes it easier to specialize through 
RawLocalFileStatus, as you noted. Yes, we could expose this functionality as a 
util function. However, it would only do the Administrators group check if the 
{{FileSystem}} is {{instanceof LocalFileSystem}} (and on Windows). Do you 
believe this would be more appropriate?
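
For concreteness, a hypothetical sketch of the shape being debated -- a 
FileStatus-level check that a Windows-aware subclass overrides. Every class 
and method name here is illustrative, not taken from the attached patches:

{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical base behaviour: plain owner-name equality.
class OwnedFileStatus extends FileStatus {
  public boolean isOwnedByUser(UserGroupInformation ugi) {
    return getOwner().equals(ugi.getShortUserName());
  }
}

// Hypothetical Windows specialization: a file owned by "Administrators"
// counts as owned by any user who is a member of that group.
class WindowsLocalFileStatus extends OwnedFileStatus {
  @Override
  public boolean isOwnedByUser(UserGroupInformation ugi) {
    if (super.isOwnedByUser(ugi)) {
      return true;
    }
    if ("Administrators".equals(getOwner())) {
      for (String group : ugi.getGroupNames()) {
        if ("Administrators".equals(group)) {
          return true;
        }
      }
    }
    return false;
  }
}
{code}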


 Address file ownership issue for users in Administrators group on Windows.
 --

 Key: HADOOP-8457
 URL: https://issues.apache.org/jira/browse/HADOOP-8457
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Ivan Mitic
Priority: Minor
 Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, 
 HADOOP-8457-branch-1-win_Admins.patch


 On Linux, the initial file owners are the creators. (I think this is true in 
 general. If there are exceptions, please let me know.) On Windows, a file 
 created by a user in the Administrators group has the initial owner 
 ‘Administrators’, i.e. the Administrators group is the initial owner of 
 the file. This leads to an exception when we check file ownership in the 
 SecureIOUtils.checkStat() method. As a result, this method is disabled 
 right now. We need to address this problem and enable the method on 
 Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8487) Many HDFS tests use a test path intended for local file system tests

2012-07-12 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413443#comment-13413443
 ] 

Ivan Mitic commented on HADOOP-8487:


Thanks for reviewing, Daryn.

bq. In FileSystemTestHelper, does the final keyword need to be removed?
We need this because we want to be able to override the value in tests 
({{TestFSMainOperationsWebHdfs}} is one example). This path is not always used 
in the context of the local file system, and this causes problems on Windows, 
as paths like {{c:/some/path/build/test/data}} are not valid DFS paths 
(because of the colon).

bq. In TestFSMainOperationsLocalFileSystem, are any changes actually needed? 
Ie. why override just to call super?
I think I saw this test failing on Windows because the super {{tearDown()}} 
was not called, causing subsequent tests to fail (could it be related to the 
junit version?). I will try to repro the problem and report back.

bq. In all of test class changes, please default to build/test/data. Hardcoding 
/tmp may cause multiple test runs to collide.
This is actually the test fix. Similar to comment #1, {{build.test.data}} is a 
local path, and given that it is used in HDFS tests, it fails the valid-DFS-path 
check. IOW, these tests should not write to the local file system and, if I 
understood your worry correctly, should not collide with other tests. We also 
have the test name embedded in the path, so it should be easy to spot such 
cases if they exist. Does that make sense?
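
A sketch of the distinction being made, with illustrative values: the system 
property points at a local directory, which is fine for local-FS tests but 
not a legal DFS path on Windows (the property name follows the usual Hadoop 
test convention):

{code}
import org.apache.hadoop.fs.Path;

public class TestPathSketch {
  public static void main(String[] args) {
    // Local-FS tests: rooted in the build tree via a system property.
    String local = System.getProperty("test.build.data", "build/test/data");
    System.out.println("local test root: " + local);

    // HDFS tests: need an absolute slash-rooted path. On Windows the
    // local root above resolves to something like
    // c:/some/path/build/test/data, and the drive colon makes it an
    // invalid DFS path -- hence the /tmp-style roots in the patch.
    Path dfsRoot = new Path("/tmp/TestSomething");
    System.out.println("dfs test root:   " + dfsRoot);
  }
}
{code}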

 Many HDFS tests use a test path intended for local file system tests
 

 Key: HADOOP-8487
 URL: https://issues.apache.org/jira/browse/HADOOP-8487
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: HADOOP-8487-branch-1-win(2).patch, 
 HADOOP-8487-branch-1-win(3).patch, HADOOP-8487-branch-1-win.alternate.patch, 
 HADOOP-8487-branch-1-win.patch


 Many tests use a test path intended for local tests, set up by the build 
 environment. In some cases the tests fail on platforms such as Windows 
 because the path contains a 'c:'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8582) Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.

2012-07-12 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413489#comment-13413489
 ] 

Harsh J commented on HADOOP-8582:
-

Daryn, are you good with the patch's general approach, given the above?

Paul, will you be sending an updated patch soon? If not, let me know, and I'm 
happy to tweak on your behalf and add in those changes.

 Improve error reporting for GZIP-compressed SequenceFiles with missing native 
 libraries.
 

 Key: HADOOP-8582
 URL: https://issues.apache.org/jira/browse/HADOOP-8582
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Paul Wilkinson
Priority: Minor
 Attachments: HADOOP-8582-1.diff


 At present it is not possible to write or read block-compressed SequenceFiles 
 using the GZIP codec without the native libraries being available.
 The SequenceFile.Writer code checks for the availability of native libraries 
 and throws a useful exception, but the SequenceFile.Reader doesn't do the 
 same:
 {noformat}
 Exception in thread main java.io.EOFException
   at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249)
   at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239)
   at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142)
   at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
   at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:67)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:95)
   at 
 org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:104)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173)
   at 
 org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183)
   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1493)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1480)
   at 
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1475)
   at test.SequenceReader.read(SequenceReader.java:23)
 {noformat}
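
The writer-side guard can be mirrored on the reader; a sketch of the general 
approach (the exact condition and message in the attached diff may differ):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;
import org.apache.hadoop.util.NativeCodeLoader;

public class ReaderGuardSketch {
  // Mirrors the SequenceFile.Writer check: fail fast with a clear message
  // instead of letting GZIPInputStream die with a bare EOFException.
  static void checkCodecSupport(CompressionCodec codec, Configuration conf) {
    if (codec instanceof GzipCodec
        && !NativeCodeLoader.isNativeCodeLoaded()
        && !ZlibFactory.isNativeZlibLoaded(conf)) {
      throw new IllegalArgumentException(
          "SequenceFile doesn't work with GzipCodec without native-hadoop code!");
    }
  }
}
{code}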

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.

2012-07-12 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413503#comment-13413503
 ] 

Bikas Saha commented on HADOOP-8457:


IMO, calling FileStatus.isOwnedBy() seems like a natural API to use. I agree 
it would be nice to have something other than UGI to represent user/group 
information, but unfortunately there does not seem to be any such abstraction. 
I would ideally like to see FileStatus.getOwner().equals(ownerObj), where 
ownerObj is an object representing the owner that encapsulates users/groups 
etc. Currently ownerObj is simply a string name, and that has worked because 
of the simple 1-owner, 1-group Unix model.

 Address file ownership issue for users in Administrators group on Windows.
 --

 Key: HADOOP-8457
 URL: https://issues.apache.org/jira/browse/HADOOP-8457
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Ivan Mitic
Priority: Minor
 Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, 
 HADOOP-8457-branch-1-win_Admins.patch


 On Linux, the initial file owners are the creators. (I think this is true in 
 general. If there are exceptions, please let me know.) On Windows, a file 
 created by a user in the Administrators group has the initial owner 
 ‘Administrators’, i.e. the Administrators group is the initial owner of 
 the file. This leads to an exception when we check file ownership in the 
 SecureIOUtils.checkStat() method. As a result, this method is disabled 
 right now. We need to address this problem and enable the method on 
 Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira