[jira] [Commented] (HADOOP-8455) Address user name format on domain joined Windows machines
[ https://issues.apache.org/jira/browse/HADOOP-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412582#comment-13412582 ] Owen O'Malley commented on HADOOP-8455: --- This is already possible, by using the auth_to_local mapping. The cluster operator can define arbitrary mappings between long (FOO@DOMAIN) and short names (DOMAIN\FOO). See http://hortonworks.com/blog/fine-tune-your-apache-hadoop-security-settings/ Address user name format on domain joined Windows machines -- Key: HADOOP-8455 URL: https://issues.apache.org/jira/browse/HADOOP-8455 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 1.1.0, 0.24.0 Reporter: Chuan Liu Assignee: Ivan Mitic Priority: Minor For a domain joined Windows machine, the user name alone is not a unique identifier. The user name plus the domain name is needed in order to uniquely identify the user. For example, we can have both ‘Win1\Alex’ and ‘Redmond\Alex’ on a computer named Win1 that is joined to the Redmond domain. In order to avoid ambiguity, ‘whoami’ on Windows and the new ‘winutils’ created in [Hadoop-8235|https://issues.apache.org/jira/browse/HADOOP-8235] both return [domain]\[username] as the username. In Hadoop, we only use the user name right now. This may lead to inconsistencies, and to production bugs if users with the same name exist on the machine. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
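For readers unfamiliar with the mechanism Owen refers to: auth_to_local rules live under the hadoop.security.auth_to_local property in core-site.xml. A minimal sketch of such a mapping follows; the EXAMPLE.COM realm and the single rule are placeholders for illustration, not taken from the linked post or from any HADOOP-8455 patch.
{code}
<!-- core-site.xml (illustrative only; EXAMPLE.COM is a placeholder realm) -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@EXAMPLE\.COM//
    DEFAULT
  </value>
</property>
{code}
The rule rebuilds the long name user@EXAMPLE.COM from the principal, matches it against the realm, and strips the realm suffix to produce the short name, which is the kind of arbitrary long-to-short mapping described above.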
[jira] [Commented] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library
[ https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412705#comment-13412705 ] Niels Basjes commented on HADOOP-6817: -- To me this seems NOT to be a duplicate of HADOOP-8582 . To me this issue is essentially: Problem with Gzip in specific situation. HADOOP-8582 effectively says Lets make the error message clear until we fix the real problem. So I propose we keep this open as an unsolved 'non-duplicate' bug SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library --- Key: HADOOP-6817 URL: https://issues.apache.org/jira/browse/HADOOP-6817 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2 Environment: Cluster:CentOS 5,jdk1.6.0_20 Client:Mac SnowLeopard,jdk1.6.0_20 Reporter: Wenjun Huang An hadoop job output a gzip compressed sequence file(whether record compressed or block compressed).The client program use SequenceFile.Reader to read this sequence file,when reading the client program shows the following exceptions: 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor Exception in thread main java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412) at com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28) I studied the code in org.apache.hadoop.io.SequenceFile.Reader.init method and read: // Initialize... *not* if this we are constructing a temporary Reader if (!tempReader) { valBuffer = new DataInputBuffer(); if (decompress) { valDecompressor = CodecPool.getDecompressor(codec); valInFilter = codec.createInputStream(valBuffer, valDecompressor); valIn = new DataInputStream(valInFilter); } else { valIn = valBuffer; } the problem seems to be caused by valBuffer = new DataInputBuffer(); ,because GzipCodec.createInputStream creates an instance of GzipInputStream whose constructor creates an instance of ResetableGZIPInputStream class.When ResetableGZIPInputStream's constructor calls it base class java.util.zip.GZIPInputStream's constructor ,it trys to read the empty valBuffer = new DataInputBuffer(); and get no content,so it throws an EOFException. -- This message is automatically generated by JIRA. 
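The failure mode described above, java.util.zip.GZIPInputStream reading the gzip header inside its constructor, can be reproduced with the plain JDK and no Hadoop classes at all. A minimal sketch follows; the class name is made up for illustration.
{code}
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.util.zip.GZIPInputStream;

public class GzipEagerHeaderRead {
  public static void main(String[] args) throws Exception {
    try {
      // The constructor immediately tries to read the 10-byte gzip header,
      // so an empty stream fails before any explicit read() is issued,
      // just as it does when SequenceFile.Reader hands it an empty DataInputBuffer.
      new GZIPInputStream(new ByteArrayInputStream(new byte[0]));
    } catch (EOFException e) {
      System.out.println("EOFException thrown from the constructor: " + e);
    }
  }
}
{code}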
[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412710#comment-13412710 ] Hudson commented on HADOOP-8521: Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HADOOP-8521. Port StreamInputFormat to new Map Reduce API (madhukara phatak via bobby) (Revision 1360238) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360238 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamBaseRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamUtil.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamXmlRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamBaseRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamInputFormat.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamXmlRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce/TestStreamXmlRecordReader.java Port StreamInputFormat to new Map Reduce API Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: madhukara phatak Assignee: madhukara phatak Fix For: 3.0.0 Attachments: HADOOP-8521-1.patch, HADOOP-8521-2.patch, HADOOP-8521-3.patch, HADOOP-8521.patch As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to the new M/R API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics
[ https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412712#comment-13412712 ] Hudson commented on HADOOP-8541: Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HADOOP-8541. Better high-percentile latency metrics. Contributed by Andrew Wang. (Revision 1360501) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360501 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MetricsRegistry.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/Quantile.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/lib/TestMutableMetrics.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java Better high-percentile latency metrics -- Key: HADOOP-8541 URL: https://issues.apache.org/jira/browse/HADOOP-8541 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.0.1-alpha Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch, hadoop-8541-3.patch, hadoop-8541-4.patch, hadoop-8541-5.patch, hadoop-8541-6.patch Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make better high-percentile latency metrics a part of hadoop-common. I've already got a working implementation of [1], an efficient algorithm for estimating quantiles on a stream of values. It allows you to specify arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with tight error bounds. This estimator can be snapshotted and reset periodically to get a feel for how these percentiles are changing over time. I propose creating a new MutableQuantiles class that does this. [1] isn't completely without overhead (~1MB memory for reasonably sized windows), which is why I hesitate to add it to the existing MutableStat class. [1] Cormode, Korn, Muthukrishnan, and Srivastava. Effective Computation of Biased Quantiles over Data Streams in ICDE 2005. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
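To make the memory/accuracy trade-off concrete, the sketch below shows the naive alternative that a streaming estimator avoids: buffer every sample and sort at snapshot time. It is deliberately simplistic, is not the Cormode et al. algorithm, and is not the MutableQuantiles API; all names are illustrative.
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Naive percentile tracker: unbounded buffering plus a sort on snapshot.
public class NaiveLatencyPercentiles {
  private final List<Long> samples = new ArrayList<Long>();

  public synchronized void add(long latencyMillis) {
    samples.add(latencyMillis);
  }

  // Snapshot a percentile (e.g. 0.99) and reset, mirroring the periodic
  // snapshot-and-reset behaviour described in the issue.
  public synchronized long snapshotAndReset(double quantile) {
    if (samples.isEmpty()) {
      return 0L;
    }
    List<Long> sorted = new ArrayList<Long>(samples);
    Collections.sort(sorted);
    samples.clear();
    int idx = (int) Math.min(sorted.size() - 1, Math.floor(quantile * sorted.size()));
    return sorted.get(idx);
  }
}
{code}
A bounded-error streaming estimator keeps roughly the same snapshot/reset interface while holding only a compressed summary of the samples, which is what makes per-quantile tracking cheap enough to run inside server metrics.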
[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412715#comment-13412715 ] Hudson commented on HADOOP-8585: Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HADOOP-8585. Fix initialization circularity between UserGroupInformation and HadoopConfiguration. Contributed by Colin Patrick McCabe. (Revision 1360498) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360498 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java Fix initialization circularity between UserGroupInformation and HadoopConfiguration --- Key: HADOOP-8585 URL: https://issues.apache.org/jira/browse/HADOOP-8585 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3632.001.patch Fix findbugs warning about initialization circularity between UserGroupInformation and UserGroupInformation#HadoopConfiguration. From the findbugs text: {code} Initialization circularity between org.apache.hadoop.security.UserGroupInformation and org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration Bug type IC_INIT_CIRCULARITY (click for details) In class org.apache.hadoop.security.UserGroupInformation In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration At UserGroupInformation.java:[lines 76-1395] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
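For readers unfamiliar with the findbugs pattern: IC_INIT_CIRCULARITY flags two classes whose static initializers depend on each other, so whichever class happens to load first can observe the other in a half-initialized state. A tiny self-contained illustration, unrelated to the actual UserGroupInformation code, follows.
{code}
public class InitCircularityDemo {
  static class A {
    // A's static initializer touches B, whose static initializer touches A.
    static int VALUE = B.VALUE + 1;
  }

  static class B {
    // If A is loaded first, B initializes during A's initialization and
    // reads A.VALUE while it still holds its default value of 0.
    static int VALUE = A.VALUE + 1;
  }

  public static void main(String[] args) {
    // Loading order determines the outcome; here A loads first.
    System.out.println("A=" + A.VALUE + " B=" + B.VALUE); // prints A=2 B=1
  }
}
{code}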
[jira] [Commented] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe
[ https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412717#comment-13412717 ] Hudson commented on HADOOP-8587: Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HADOOP-8587. HarFileSystem access of harMetaCache isn't threadsafe. Contributed by Eli Collins (Revision 1360448) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360448 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java HarFileSystem access of harMetaCache isn't threadsafe - Key: HADOOP-8587 URL: https://issues.apache.org/jira/browse/HADOOP-8587 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 1.2.0, 2.0.1-alpha Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit to Todd for pointing this out. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
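The patch itself is not reproduced here, but the shape of the bug and one common remedy can be sketched: a static HashMap shared by every HarFileSystem instance is unsafe under concurrent puts, whereas a ConcurrentHashMap (or synchronizing each access) keeps the shared cache consistent. The class and field names below are illustrative, not the actual HarFileSystem code.
{code}
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: a shared metadata cache keyed by archive URI.
public class HarMetaCacheSketch {
  private static final Map<URI, String> META_CACHE = new ConcurrentHashMap<URI, String>();

  public static String getOrLoad(URI archive) {
    // computeIfAbsent avoids the check-then-act race of
    // "if (!map.containsKey(k)) map.put(k, load(k))".
    return META_CACHE.computeIfAbsent(archive, HarMetaCacheSketch::loadMetadata);
  }

  private static String loadMetadata(URI archive) {
    // Stand-in for reading the archive's _index/_masterindex files.
    return "metadata-for-" + archive;
  }
}
{code}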
[jira] [Commented] (HADOOP-8423) MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data
[ https://issues.apache.org/jira/browse/HADOOP-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412742#comment-13412742 ] Hudson commented on HADOOP-8423: Integrated in Hadoop-Hdfs-0.23-Build #311 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/311/]) svn merge -c 1359866 FIXES: HADOOP-8423. MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data. Contributed by Todd Lipcon. (Revision 1360264) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360264 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BlockDecompressorStream.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data -- Key: HADOOP-8423 URL: https://issues.apache.org/jira/browse/HADOOP-8423 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2 Environment: Linux 2.6.32.23-0.3-default #1 SMP 2010-10-07 14:57:45 +0200 x86_64 x86_64 x86_64 GNU/Linux Reporter: Jason B Assignee: Todd Lipcon Fix For: 2.0.1-alpha Attachments: HADOOP-8423-branch-1.patch, HADOOP-8423-branch-1.patch, MapFileCodecTest.java, hadoop-8423.txt I am using Cloudera distribution cdh3u1. When trying to check native codecs for better decompression performance such as Snappy or LZO, I ran into issues with random access using MapFile.Reader.get(key, value) method. First call of MapFile.Reader.get() works but a second call fails. Also I am getting different exceptions depending on number of entries in a map file. With LzoCodec and 10 record file, jvm gets aborted. At the same time the DefaultCodec works fine for all cases, as well as record compression for the native codecs. I created a simple test program (attached) that creates map files locally with sizes of 10 and 100 records for three codecs: Default, Snappy, and LZO. (The test requires corresponding native library available) The summary of problems are given below: Map Size: 100 Compression: RECORD == DefaultCodec: OK SnappyCodec: OK LzoCodec: OK Map Size: 10 Compression: RECORD == DefaultCodec: OK SnappyCodec: OK LzoCodec: OK Map Size: 100 Compression: BLOCK DefaultCodec: OK SnappyCodec: java.io.EOFException at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114) LzoCodec: java.io.EOFException at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114) Map Size: 10 Compression: BLOCK == DefaultCodec: OK SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method) LzoCodec: # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x2b068ffcbc00, pid=6385, tid=47304763508496 # # JRE version: 6.0_21-b07 # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64 ) # Problematic frame: # C [liblzo2.so.2+0x13c00] lzo1x_decompress+0x1a0 # -- This message is automatically generated by JIRA. 
[jira] [Updated] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain
[ https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HADOOP-7836: Attachment: HADOOP-7836.patch Actually, the test is flawed. It's using the address the rpc server is reporting to set the service. However, it is the client's responsibility to set the token service since the server has no way to know exactly what hostname/ip the client used. Eli, please see if this fixes the issue for you. TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain Key: HADOOP-7836 URL: https://issues.apache.org/jira/browse/HADOOP-7836 Project: Hadoop Common Issue Type: Bug Components: ipc, test Affects Versions: 1.1.0 Reporter: Eli Collins Priority: Minor Attachments: HADOOP-7836.patch, hadoop-7836.txt TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some hosts. null expected:localhost[] but was:localhost[.localdomain] junit.framework.ComparisonFailure: null expected:localhost[] but was:localhost[.localdomain] null expected:[localhost] but was:[eli-thinkpad] junit.framework.ComparisonFailure: null expected:[localhost] but was:[eli-thinkpad] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
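To illustrate Daryn's point that only the client knows which hostname or IP it actually used, and therefore must set the token service itself, here is a hedged sketch against the SecurityUtil and Token classes; the surrounding method is invented for illustration and is not the test code from the attached patch.
{code}
import java.net.InetSocketAddress;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.token.Token;

public class ClientSetsTokenService {
  // The client resolved this address itself, so only the client can
  // record which hostname/IP was used to reach the server.
  public static void tagToken(Token<?> token, InetSocketAddress addrUsedByClient) {
    Text service = SecurityUtil.buildTokenService(addrUsedByClient);
    token.setService(service);
  }
}
{code}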
[jira] [Commented] (HADOOP-3886) Error in javadoc of Reporter, Mapper and Progressable
[ https://issues.apache.org/jira/browse/HADOOP-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412802#comment-13412802 ] Hudson commented on HADOOP-3886: Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HADOOP-3886. Error in javadoc of Reporter, Mapper and Progressable. Contributed by Jingguo Yao. (harsh) (Revision 1360222) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360222 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Progressable.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Mapper.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Reporter.java Error in javadoc of Reporter, Mapper and Progressable - Key: HADOOP-3886 URL: https://issues.apache.org/jira/browse/HADOOP-3886 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 0.23.0 Reporter: brien colwell Assignee: Jingguo Yao Priority: Minor Fix For: 2.0.1-alpha Attachments: HADOOP-3886.patch, HADOOP-3886.patch The javadoc for Reporter says: In scenarios where the application takes an insignificant amount of time to process individual key/value pairs Shouldn't this read /significant/ instead of insignificant? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412801#comment-13412801 ] Hudson commented on HADOOP-8521: Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HADOOP-8521. Port StreamInputFormat to new Map Reduce API (madhukara phatak via bobby) (Revision 1360238) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360238 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamBaseRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamUtil.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/StreamXmlRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamBaseRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamInputFormat.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/mapreduce/StreamXmlRecordReader.java * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce * /hadoop/common/trunk/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming/mapreduce/TestStreamXmlRecordReader.java Port StreamInputFormat to new Map Reduce API Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: madhukara phatak Assignee: madhukara phatak Fix For: 3.0.0 Attachments: HADOOP-8521-1.patch, HADOOP-8521-2.patch, HADOOP-8521-3.patch, HADOOP-8521.patch As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to the new M/R API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics
[ https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412803#comment-13412803 ] Hudson commented on HADOOP-8541: Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HADOOP-8541. Better high-percentile latency metrics. Contributed by Andrew Wang. (Revision 1360501) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360501 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MetricsRegistry.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/Quantile.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/lib/TestMutableMetrics.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java Better high-percentile latency metrics -- Key: HADOOP-8541 URL: https://issues.apache.org/jira/browse/HADOOP-8541 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.0.1-alpha Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch, hadoop-8541-3.patch, hadoop-8541-4.patch, hadoop-8541-5.patch, hadoop-8541-6.patch Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make better high-percentile latency metrics a part of hadoop-common. I've already got a working implementation of [1], an efficient algorithm for estimating quantiles on a stream of values. It allows you to specify arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with tight error bounds. This estimator can be snapshotted and reset periodically to get a feel for how these percentiles are changing over time. I propose creating a new MutableQuantiles class that does this. [1] isn't completely without overhead (~1MB memory for reasonably sized windows), which is why I hesitate to add it to the existing MutableStat class. [1] Cormode, Korn, Muthukrishnan, and Srivastava. Effective Computation of Biased Quantiles over Data Streams in ICDE 2005. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe
[ https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412808#comment-13412808 ] Hudson commented on HADOOP-8587: Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HADOOP-8587. HarFileSystem access of harMetaCache isn't threadsafe. Contributed by Eli Collins (Revision 1360448) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360448 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java HarFileSystem access of harMetaCache isn't threadsafe - Key: HADOOP-8587 URL: https://issues.apache.org/jira/browse/HADOOP-8587 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 1.2.0, 2.0.1-alpha Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit to Todd for pointing this out. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412806#comment-13412806 ] Hudson commented on HADOOP-8585: Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HADOOP-8585. Fix initialization circularity between UserGroupInformation and HadoopConfiguration. Contributed by Colin Patrick McCabe. (Revision 1360498) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360498 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java Fix initialization circularity between UserGroupInformation and HadoopConfiguration --- Key: HADOOP-8585 URL: https://issues.apache.org/jira/browse/HADOOP-8585 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3632.001.patch Fix findbugs warning about initialization circularity between UserGroupInformation and UserGroupInformation#HadoopConfiguration. From the findbugs text: {code} Initialization circularity between org.apache.hadoop.security.UserGroupInformation and org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration Bug type IC_INIT_CIRCULARITY (click for details) In class org.apache.hadoop.security.UserGroupInformation In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration At UserGroupInformation.java:[lines 76-1395] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8551) fs -mkdir creates parent directories without the -p option
[ https://issues.apache.org/jira/browse/HADOOP-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HADOOP-8551: Attachment: HADOOP-8551.patch This should fix it, but need to write tests. fs -mkdir creates parent directories without the -p option -- Key: HADOOP-8551 URL: https://issues.apache.org/jira/browse/HADOOP-8551 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.3, 2.0.1-alpha, 3.0.0 Reporter: Robert Joseph Evans Assignee: Daryn Sharp Attachments: HADOOP-8551.patch hadoop fs -mkdir foo/bar will work even if bar is not present. It should only work if -p is given and foo is not present. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8550) hadoop fs -touchz automatically created parent directories
[ https://issues.apache.org/jira/browse/HADOOP-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HADOOP-8550: Attachment: HADOOP-8550.patch This should fix it, but need to write tests. hadoop fs -touchz automatically created parent directories -- Key: HADOOP-8550 URL: https://issues.apache.org/jira/browse/HADOOP-8550 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.3, 2.0.1-alpha, 3.0.0 Reporter: Robert Joseph Evans Attachments: HADOOP-8550.patch Recently many of the fsShell commands were updated to be more POSIX compliant. touchz appears to have been missed, or has regressed. If it has regressed then the target version should be 0.23.3. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
[ https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412866#comment-13412866 ] Cristina L. Abad commented on HADOOP-7753: -- In NativeIO.c it might be worth referencing RedHat BZ 554735 (see https://bugzilla.redhat.com/show_bug.cgi?id=554735). We were doing some tests on RHEL5.4 and that bug led to exceptions (and decreased performance). We fixed it by compiling/using the 64-bit native libraries. We are now seeing a significant increase in performance: an average of 24% improvement in running time across 10 runs of the Sort benchmark (10-node cluster, 35GB being sorted). Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class Key: HADOOP-7753 URL: https://issues.apache.org/jira/browse/HADOOP-7753 Project: Hadoop Common Issue Type: Sub-task Components: io, native, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also implements a ReadaheadPool class for future use from HDFS and MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8587) HarFileSystem access of harMetaCache isn't threadsafe
[ https://issues.apache.org/jira/browse/HADOOP-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HADOOP-8587: Fix Version/s: 0.23.3 HarFileSystem access of harMetaCache isn't threadsafe - Key: HADOOP-8587 URL: https://issues.apache.org/jira/browse/HADOOP-8587 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 1.2.0, 0.23.3, 2.0.1-alpha Attachments: hadoop-8587-b1.txt, hadoop-8587.txt, hadoop-8587.txt HarFileSystem's use of the static harMetaCache map is not threadsafe. Credit to Todd for pointing this out. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8585) Fix initialization circularity between UserGroupInformation and HadoopConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412935#comment-13412935 ] Aaron T. Myers commented on HADOOP-8585: bq. Wouldn't it have been easier to just suppress the warning as a false alarm? That would certainly work, but it doesn't seem much easier to me. I also don't see how the method used by this patch could possibly cause any problems, as it changes the behavior back to what it was before the recently-committed HDFS-3568, i.e. each invocation of newLoginContext will create a new HadoopConfiguration object. Fix initialization circularity between UserGroupInformation and HadoopConfiguration --- Key: HADOOP-8585 URL: https://issues.apache.org/jira/browse/HADOOP-8585 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3632.001.patch Fix findbugs warning about initialization circularity between UserGroupInformation and UserGroupInformation#HadoopConfiguration. From the findbugs text: {code} Initialization circularity between org.apache.hadoop.security.UserGroupInformation and org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration Bug type IC_INIT_CIRCULARITY (click for details) In class org.apache.hadoop.security.UserGroupInformation In class org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration At UserGroupInformation.java:[lines 76-1395] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8582) Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.
[ https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413007#comment-13413007 ] Harsh J commented on HADOOP-8582: - Daryn, Its sorta the latter. To be clearer, the reason is this, from HADOOP-538: {quote} Arun: Context: gzip is just zlib algo + extra headers. java.util.zip.GZIP{Input|Output}Stream and hence existing GzipCodec won't work with SequenceFile due the fact that java.util.zip.GZIP{Input|Output}Streams will try to read/write gzip headers in the constructors which won't work in SequenceFiles since we typically read data from disk onto buffers, these buffers are empty on startup/after-reset and cause the java.util.zip.GZIP{Input|Output}Streams to fail. {quote} Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries. Key: HADOOP-8582 URL: https://issues.apache.org/jira/browse/HADOOP-8582 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.0.0-alpha Reporter: Paul Wilkinson Priority: Minor Attachments: HADOOP-8582-1.diff At present it is not possible to write or read block-compressed SequenceFiles using the GZIP codec without the native libraries being available. The SequenceFile.Writer code checks for the availability of native libraries and throws a useful exception, but the SequenceFile.Reader doesn't do the same: {noformat} Exception in thread main java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:67) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:95) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:104) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1493) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1480) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1475) at test.SequenceReader.read(SequenceReader.java:23) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
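The improvement being requested is essentially a friendlier failure mode: detect up front that native gzip/zlib support is missing and say so, rather than letting GZIPInputStream die with a bare EOFException. A hedged sketch of such a pre-flight check follows; ZlibFactory and NativeCodeLoader are existing Hadoop utility classes, but the wrapper method and message are illustrative, not the HADOOP-8582 patch.
{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;
import org.apache.hadoop.util.NativeCodeLoader;

public class GzipSeqFileGuard {
  // Fail fast with an actionable message instead of an EOFException
  // thrown from deep inside GZIPInputStream's constructor.
  public static void checkNativeGzipSupport(Configuration conf) throws IOException {
    if (!NativeCodeLoader.isNativeCodeLoaded() || !ZlibFactory.isNativeZlibLoaded(conf)) {
      throw new IOException("SequenceFiles compressed with GzipCodec require the "
          + "native Hadoop libraries, which were not found on java.library.path.");
    }
  }
}
{code}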
[jira] [Commented] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library
[ https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413009#comment-13413009 ] Harsh J commented on HADOOP-6817: - Hi Niels, Am happy to reopen this, but the reason for this not to work is explained at HADOOP-538. I will add that as a link as well. Do you still wish to reopen? SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library --- Key: HADOOP-6817 URL: https://issues.apache.org/jira/browse/HADOOP-6817 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2 Environment: Cluster:CentOS 5,jdk1.6.0_20 Client:Mac SnowLeopard,jdk1.6.0_20 Reporter: Wenjun Huang An hadoop job output a gzip compressed sequence file(whether record compressed or block compressed).The client program use SequenceFile.Reader to read this sequence file,when reading the client program shows the following exceptions: 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor Exception in thread main java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412) at com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28) I studied the code in org.apache.hadoop.io.SequenceFile.Reader.init method and read: // Initialize... *not* if this we are constructing a temporary Reader if (!tempReader) { valBuffer = new DataInputBuffer(); if (decompress) { valDecompressor = CodecPool.getDecompressor(codec); valInFilter = codec.createInputStream(valBuffer, valDecompressor); valIn = new DataInputStream(valInFilter); } else { valIn = valBuffer; } the problem seems to be caused by valBuffer = new DataInputBuffer(); ,because GzipCodec.createInputStream creates an instance of GzipInputStream whose constructor creates an instance of ResetableGZIPInputStream class.When ResetableGZIPInputStream's constructor calls it base class java.util.zip.GZIPInputStream's constructor ,it trys to read the empty valBuffer = new DataInputBuffer(); and get no content,so it throws an EOFException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout
[ https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon moved HDFS-3635 to HADOOP-8591: --- Target Version/s: (was: 2.0.1-alpha) Affects Version/s: (was: 2.0.0-alpha) 2.0.0-alpha Issue Type: Bug (was: Improvement) Key: HADOOP-8591 (was: HDFS-3635) Project: Hadoop Common (was: Hadoop HDFS) TestZKFailoverController.testOneOfEverything timesout - Key: HADOOP-8591 URL: https://issues.apache.org/jira/browse/HADOOP-8591 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Looks like the TestZKFailoverController timeout needs to be bumped. {noformat} java.lang.Exception: test timed out after 3 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460) at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648) at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334) at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590) at org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
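Bumping a JUnit 4 timeout is a one-line change to the @Test annotation. A generic illustration, not the actual TestZKFailoverController source; the 60-second value is arbitrary.
{code}
import org.junit.Test;

public class TimeoutExample {
  // Raising the annotation's timeout gives the graceful-failover
  // sequence more wall-clock time before JUnit kills the test.
  @Test(timeout = 60000)
  public void testOneOfEverything() throws Exception {
    // ... exercise the failover controller ...
  }
}
{code}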
[jira] [Updated] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout
[ https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HADOOP-8591: Component/s: test ha auto-failover TestZKFailoverController.testOneOfEverything timesout - Key: HADOOP-8591 URL: https://issues.apache.org/jira/browse/HADOOP-8591 Project: Hadoop Common Issue Type: Bug Components: auto-failover, ha, test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Looks like the TestZKFailoverController timeout needs to be bumped. {noformat} java.lang.Exception: test timed out after 3 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460) at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648) at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334) at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590) at org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain
[ https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413041#comment-13413041 ] Eli Collins commented on HADOOP-7836: - +1 looks good TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain Key: HADOOP-7836 URL: https://issues.apache.org/jira/browse/HADOOP-7836 Project: Hadoop Common Issue Type: Bug Components: ipc, test Affects Versions: 1.1.0 Reporter: Eli Collins Priority: Minor Attachments: HADOOP-7836.patch, hadoop-7836.txt TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some hosts. null expected:localhost[] but was:localhost[.localdomain] junit.framework.ComparisonFailure: null expected:localhost[] but was:localhost[.localdomain] null expected:[localhost] but was:[eli-thinkpad] junit.framework.ComparisonFailure: null expected:[localhost] but was:[eli-thinkpad] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain
[ https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved HADOOP-7836. - Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: (was: 1.1.0) I've committed this, thanks Daryn! Do we need a jira for the same test forward ported to trunk? TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain Key: HADOOP-7836 URL: https://issues.apache.org/jira/browse/HADOOP-7836 Project: Hadoop Common Issue Type: Bug Components: ipc, test Affects Versions: 1.1.0 Reporter: Eli Collins Priority: Minor Fix For: 1.2.0 Attachments: HADOOP-7836.patch, hadoop-7836.txt TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some hosts. null expected:localhost[] but was:localhost[.localdomain] junit.framework.ComparisonFailure: null expected:localhost[] but was:localhost[.localdomain] null expected:[localhost] but was:[eli-thinkpad] junit.framework.ComparisonFailure: null expected:[localhost] but was:[eli-thinkpad] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8591) TestZKFailoverController.testOneOfEverything timesout
[ https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413049#comment-13413049 ] Eli Collins commented on HADOOP-8591: - Yup, thanks. TestZKFailoverController.testOneOfEverything timesout - Key: HADOOP-8591 URL: https://issues.apache.org/jira/browse/HADOOP-8591 Project: Hadoop Common Issue Type: Bug Components: auto-failover, ha, test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Looks like the TestZKFailoverController timeout needs to be bumped. {noformat} java.lang.Exception: test timed out after 3 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460) at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648) at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334) at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590) at org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file, which produce by a mapreduce job, without native compression library
[ https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-6817: Summary: SequenceFile.Reader can't read gzip format compressed sequence file, which produce by a mapreduce job, without native compression library (was: SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library) SequenceFile.Reader can't read gzip format compressed sequence file, which produce by a mapreduce job, without native compression library - Key: HADOOP-6817 URL: https://issues.apache.org/jira/browse/HADOOP-6817 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2 Environment: Cluster:CentOS 5,jdk1.6.0_20 Client:Mac SnowLeopard,jdk1.6.0_20 Reporter: Wenjun Huang An hadoop job output a gzip compressed sequence file(whether record compressed or block compressed).The client program use SequenceFile.Reader to read this sequence file,when reading the client program shows the following exceptions: 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor Exception in thread main java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412) at com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28) I studied the code in org.apache.hadoop.io.SequenceFile.Reader.init method and read: // Initialize... *not* if this we are constructing a temporary Reader if (!tempReader) { valBuffer = new DataInputBuffer(); if (decompress) { valDecompressor = CodecPool.getDecompressor(codec); valInFilter = codec.createInputStream(valBuffer, valDecompressor); valIn = new DataInputStream(valInFilter); } else { valIn = valBuffer; } the problem seems to be caused by valBuffer = new DataInputBuffer(); ,because GzipCodec.createInputStream creates an instance of GzipInputStream whose constructor creates an instance of ResetableGZIPInputStream class.When ResetableGZIPInputStream's constructor calls it base class java.util.zip.GZIPInputStream's constructor ,it trys to read the empty valBuffer = new DataInputBuffer(); and get no content,so it throws an EOFException. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HADOOP-7836) TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain
[ https://issues.apache.org/jira/browse/HADOOP-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413105#comment-13413105 ] Daryn Sharp commented on HADOOP-7836: - I haven't checked but if trunk is indeed missing these tests -- and it's not that they moved? -- then yes we need to port to trunk. TestSaslRPC#testDigestAuthMethodHostBasedToken fails with hostname localhost.localdomain Key: HADOOP-7836 URL: https://issues.apache.org/jira/browse/HADOOP-7836 Project: Hadoop Common Issue Type: Bug Components: ipc, test Affects Versions: 1.1.0 Reporter: Eli Collins Priority: Minor Fix For: 1.2.0 Attachments: HADOOP-7836.patch, hadoop-7836.txt TestSaslRPC#testDigestAuthMethodHostBasedToken fails on branch-1 on some hosts. null expected:localhost[] but was:localhost[.localdomain] junit.framework.ComparisonFailure: null expected:localhost[] but was:localhost[.localdomain] null expected:[localhost] but was:[eli-thinkpad] junit.framework.ComparisonFailure: null expected:[localhost] but was:[eli-thinkpad] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8592) Hadoop-auth should use o.a.h.util.Time methods instead of System#currentTimeMillis
Eli Collins created HADOOP-8592: --- Summary: Hadoop-auth should use o.a.h.util.Time methods instead of System#currentTimeMillis Key: HADOOP-8592 URL: https://issues.apache.org/jira/browse/HADOOP-8592 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Eli Collins Priority: Minor HDFS-3641 moved HDFS' Time methods to common so they can be used by MR (and eventually others). We should replace uses of System#currentTimeMillis in MR with Time#now (or Time#monotonicNow when computing intervals, e.g. to sleep). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
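As a hedged illustration only (the wrapper class below is hypothetical; Time#now and Time#monotonicNow are the methods named in the issue), the intended replacement pattern looks roughly like this:
{code}
import org.apache.hadoop.util.Time;

public class TimeUsageSketch {
  // Wall-clock timestamp: Time.now() stands in for System.currentTimeMillis().
  public long timestamp() {
    return Time.now();
  }

  // Interval measurement: a monotonic clock is immune to system clock
  // adjustments (NTP steps, manual changes), so it is the right choice
  // when computing durations, e.g. around a sleep.
  public long measureSleep(long millis) throws InterruptedException {
    long start = Time.monotonicNow();
    Thread.sleep(millis);
    return Time.monotonicNow() - start;
  }
}
{code}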
[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.
[ https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413127#comment-13413127 ] Alejandro Abdelnur commented on HADOOP-8552: +1 Conflict: Same security.log.file for multiple users. - Key: HADOOP-8552 URL: https://issues.apache.org/jira/browse/HADOOP-8552 Project: Hadoop Common Issue Type: Bug Components: conf, security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. In the presence of multiple users, this can lead to a potential conflict. Adding the username to the log file name would avoid this scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
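To make the idea in HADOOP-8552 concrete, here is a hedged sketch of the fix's direction (the class and method are purely illustrative, and the actual patch presumably expresses this through the log4j configuration rather than in code): derive the security log file name from the current user so that two users on the same machine no longer write to the same SecurityAuth.audit file.
{code}
public class SecurityLogFileName {
  // Hypothetical helper: build a per-user default for hadoop.security.log.file.
  public static String defaultSecurityLogFile() {
    // Embedding the user name avoids two users' processes contending
    // for (or failing to open) the same shared audit file.
    return "SecurityAuth-" + System.getProperty("user.name") + ".audit";
  }

  public static void main(String[] args) {
    System.out.println(defaultSecurityLogFile());
  }
}
{code}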
[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
[ https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413138#comment-13413138 ] Giridharan Kesavan commented on HADOOP-7753: All the Hadoop releases ship both 32-bit and 64-bit libraries. Users can decide to use whichever 32-bit or 64-bit library is appropriate for their environment. So the answer is that we compile the 32-bit and 64-bit libs by setting CFLAGS and CXXFLAGS with the appropriate JDK - for more info, refer to the Hadoop release wiki. Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class Key: HADOOP-7753 URL: https://issues.apache.org/jira/browse/HADOOP-7753 Project: Hadoop Common Issue Type: Sub-task Components: io, native, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also implements a ReadaheadPool class for future use from HDFS and MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
[ https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413165#comment-13413165 ] Cristina L. Abad commented on HADOOP-7753: -- Yes, I know that. The issue is that some people may be using the 32-bit libraries on 64-bit architectures. This actually works fine except if you turn on these features and you are running on a system with the bug I mentioned. Anyway, this is a minor problem, just thought it would be a good idea to mention it here in case someone runs into the problem and doesn't know what's going on. Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class Key: HADOOP-7753 URL: https://issues.apache.org/jira/browse/HADOOP-7753 Project: Hadoop Common Issue Type: Sub-task Components: io, native, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: HADOOP-7753.branch-1.patch, HADOOP-7753.branch-1.patch, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also implements a ReadaheadPool class for future use from HDFS and MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.
[ https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413213#comment-13413213 ] Devaraj Das commented on HADOOP-8552: - Hi Karthik, is this on the client or on the server side? (Guessing it's on the client... please confirm.) In general, the audit log stuff doesn't make sense on the client side. It's meant to be used on the server side only (and in deployments I know about, security audit logging is turned off on the client side). Your patch will work, though. But I'll note that it might introduce compatibility issues due to the change of the log file name (if someone is collecting logs based on file names, etc.). Conflict: Same security.log.file for multiple users. - Key: HADOOP-8552 URL: https://issues.apache.org/jira/browse/HADOOP-8552 Project: Hadoop Common Issue Type: Bug Components: conf, security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. In the presence of multiple users, this can lead to a potential conflict. Adding the username to the log file name would avoid this scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8453) Add unit tests for winutils
[ https://issues.apache.org/jira/browse/HADOOP-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413303#comment-13413303 ] Chuan Liu commented on HADOOP-8453: --- Hi Bikas, I think we can still check this in as a separate, standalone test suite for 'winutils', and create a new JIRA for adding 'winutils'-related tests to TestShell. Add unit tests for winutils --- Key: HADOOP-8453 URL: https://issues.apache.org/jira/browse/HADOOP-8453 Project: Hadoop Common Issue Type: Task Components: test Affects Versions: 1.1.0, 0.24.0 Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: HADOOP-8453-branch-1-win-2.patch, HADOOP-8453-branch-1-win.patch In [Hadoop-8235|https://issues.apache.org/jira/browse/HADOOP-8235], we created a Windows console program, named ‘winutils’, to emulate some Linux command line utilities used by Hadoop. However, no tests were provided in the original patch. As this code is quite complicated and its complexity may even grow in the future, we think unit tests are necessary to ensure code quality as well as smooth future development. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.
[ https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413317#comment-13413317 ] Karthik Kambatla commented on HADOOP-8552: -- Devaraj, thanks for the feedback. It is on both the client and server sides; by server side, I mean the jobtracker/namenode. Thanks for pointing out the potential compatibility issue; I agree we need to note the incompatibility introduced by the log file name change. Conflict: Same security.log.file for multiple users. - Key: HADOOP-8552 URL: https://issues.apache.org/jira/browse/HADOOP-8552 Project: Hadoop Common Issue Type: Bug Components: conf, security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. In the presence of multiple users, this can lead to a potential conflict. Adding the username to the log file name would avoid this scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8499) Lower min.user.id to 500 for the tests
[ https://issues.apache.org/jira/browse/HADOOP-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HADOOP-8499: Priority: Minor (was: Major) Target Version/s: 2.0.1-alpha Affects Version/s: 2.0.0-alpha Summary: Lower min.user.id to 500 for the tests (was: fix mvn compile -Pnative on CentOS / RHEL / Fedora / SuSE / etc) ATM, reasonable to lower the min id to 500 for the tests? Lower min.user.id to 500 for the tests -- Key: HADOOP-8499 URL: https://issues.apache.org/jira/browse/HADOOP-8499 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HADOOP-8499.002.patch On Linux platforms where user IDs start at 500 rather than 1000, the build currently is broken. This includes CentOS, RHEL, Fedora, SuSE, and probably most other Linux platforms. It does happen to work on Debian and Ubuntu, which explains why Jenkins hasn't caught it yet. Other users will see something like this: {code} [INFO] Requested user cmccabe has id 500, which is below the minimum allowed 1000 [INFO] FAIL: test-container-executor [INFO] [INFO] 1 of 1 test failed [INFO] Please report to mapreduce-...@hadoop.apache.org [INFO] [INFO] make[1]: *** [check-TESTS] Error 1 [INFO] make[1]: Leaving directory `/home/cmccabe/hadoop4/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn -server/hadoop-yarn-server-nodemanager/target/native/container-executor' {code} And then the build fails. Since native unit tests are currently unskippable (HADOOP-8480) this makes the project unbuildable. The easy solution to this is to relax the constraint for the unit test. Since the unit test already writes its own configuration file, we just need to change it there. In general, I believe that it would make sense to change this to 500 across the board. I'm not aware of any Linuxes that create system users with IDs higher than or equal to 500. System user IDs tend to be below 200. However, if we do nothing else, we should at least fix the build by relaxing the constraint for unit tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.
[ https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413429#comment-13413429 ] Sanjay Radia commented on HADOOP-8457: -- The patch adds the public method FileStatus#isOwnedByUser(UserGroupInformation ugi) * It does not make sense to expose a low-level structure like UGI through a fundamental class like FileStatus. Make the parameter String user. * I hate the idea of adding a public method when getOwner().equals(user) is good enough - but FileStatus is subclassed, and the method is useful for the RawFileSystem's FileStatus. Is there another way to solve the problem in a simple way without adding such a method? ** e.g. put it in a util? That may not work, since the new code applies to the FileStatus of RawLocalFileSystem on a Windows filesystem and not to the FileStatus of HDFS where the client is running on a Windows box. Address file ownership issue for users in Administrators group on Windows. -- Key: HADOOP-8457 URL: https://issues.apache.org/jira/browse/HADOOP-8457 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 1.1.0, 0.24.0 Reporter: Chuan Liu Assignee: Ivan Mitic Priority: Minor Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, HADOOP-8457-branch-1-win_Admins.patch On Linux, the initial file owners are the creators. (I think this is true in general. If there are exceptions, please let me know.) On Windows, a file created by a user in the Administrators group has the initial owner ‘Administrators’, i.e. the Administrators group is the initial owner of the file. This leads to an exception when we check file ownership in the SecureIOUtils.checkStat() method; as a result, this method is disabled right now. We need to address this problem and enable the method on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
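To make the alternatives in this thread easier to compare, here is a hedged sketch (class names and bodies are illustrative, not the attached patch): the plain getOwner().equals(user) check that suffices on Unix-like systems, next to the kind of overridable check that a local-file-system FileStatus on Windows could specialize to also accept Administrators-owned files.
{code}
import org.apache.hadoop.fs.FileStatus;

public class OwnershipCheckSketch {
  // The simple check that is "good enough" where the owner is a single user name.
  static boolean isOwnedBy(FileStatus status, String user) {
    return status.getOwner().equals(user);
  }

  // Illustrative subclass showing why an overridable method is attractive:
  // on Windows, a file created by an administrator is initially owned by
  // the Administrators group, so a local FileStatus could widen the check.
  static class WindowsLocalFileStatusIdea extends FileStatus {
    boolean isOwnedByUser(String user) {
      return getOwner().equals(user)
          || "Administrators".equals(getOwner()); // plus a check that 'user' is an admin
    }
  }
}
{code}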
[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.
[ https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413436#comment-13413436 ] Ivan Mitic commented on HADOOP-8457: Thanks for the comments, Sanjay. bq. Does not make sense to expose a low level structure like ugi through a fundamental class like FileStatus. Make the parameter String user. I saw that FileSystem already had an API that accepts UGI ({{FileSystem#closeAllForUGI()}}), which is why I thought this was fine. On the other hand, I didn't want to expose this on the FileSystem, as it would require a caller to query the FileSystem twice if it wants to check both ownership and permissions. bq. Is there another way to solve the problem in a simple way without adding such a method? Exposing it through the API makes it easier to specialize through RawLocalFileStatus, as you noted. Yes, we could expose this functionality as a util function. However, it would only do the Administrators group check if {{FileSystem}} is {{instanceof LocalFileSystem}} (and on Windows). Do you believe this would be more appropriate? Address file ownership issue for users in Administrators group on Windows. -- Key: HADOOP-8457 URL: https://issues.apache.org/jira/browse/HADOOP-8457 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 1.1.0, 0.24.0 Reporter: Chuan Liu Assignee: Ivan Mitic Priority: Minor Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, HADOOP-8457-branch-1-win_Admins.patch On Linux, the initial file owners are the creators. (I think this is true in general. If there are exceptions, please let me know.) On Windows, a file created by a user in the Administrators group has the initial owner ‘Administrators’, i.e. the Administrators group is the initial owner of the file. This leads to an exception when we check file ownership in the SecureIOUtils.checkStat() method; as a result, this method is disabled right now. We need to address this problem and enable the method on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8487) Many HDFS tests use a test path intended for local file system tests
[ https://issues.apache.org/jira/browse/HADOOP-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413443#comment-13413443 ] Ivan Mitic commented on HADOOP-8487: Thanks for reviewing, Daryn. bq. In FileSystemTestHelper, does the final keyword need to be removed? We need this because we want tests to be able to override the value ({{TestFSMainOperationsWebHdfs}} is one example). This path is not always used in a local-path context, and this causes problems on Windows, as paths like {{c:/some/path/build/test/data}} are not valid DFS paths (because of the colon). bq. In TestFSMainOperationsLocalFileSystem, are any changes actually needed? Ie. why override just to call super? I think I saw this test failing on Windows because the super {{tearDown()}} was not called, causing subsequent tests to fail (could it be related to the JUnit version?). Will try to repro the problem and report back. bq. In all of test class changes, please default to build/test/data. Hardcoding /tmp may cause multiple test runs to collide. This is actually the fix for the tests. Similar to comment #1: {{build.test.data}} is a local path, and given that it is used in HDFS tests, it fails the valid-DFS-path check. IOW, these tests should not write to the local file system and, if I understood your worry correctly, should not collide with other tests. We also have the test name embedded in the path, so it should be easy to spot such cases if they exist. Does that make sense? Many HDFS tests use a test path intended for local file system tests Key: HADOOP-8487 URL: https://issues.apache.org/jira/browse/HADOOP-8487 Project: Hadoop Common Issue Type: Bug Components: test Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HADOOP-8487-branch-1-win(2).patch, HADOOP-8487-branch-1-win(3).patch, HADOOP-8487-branch-1-win.alternate.patch, HADOOP-8487-branch-1-win.patch Many tests use a test path intended for local tests that is set up by the build environment. In some cases the tests fail on platforms such as Windows because the path contains a c: -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
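For readers unfamiliar with the test-path convention being discussed, here is a hedged sketch of the usual pattern (the helper class is hypothetical; many Hadoop tests read the property as test.build.data, while the comment above writes it as build.test.data): resolve a scratch directory from a system property with a build/test/data fallback, keeping in mind that the resulting local path may contain a drive-letter colon on Windows and therefore must not be reused as a DFS path.
{code}
import java.io.File;
import org.apache.hadoop.fs.Path;

public class TestRootSketch {
  // Hypothetical helper: local scratch root for a given test.
  static Path localTestRoot(String testName) {
    String base = System.getProperty("test.build.data", "build/test/data");
    // On Windows this may resolve to something like C:/.../build/test/data,
    // which is fine as a local path but invalid as an HDFS path because of
    // the colon, hence the separate test root needed by HDFS tests.
    return new Path(new File(base, testName).getPath());
  }
}
{code}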
[jira] [Commented] (HADOOP-8582) Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.
[ https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413489#comment-13413489 ] Harsh J commented on HADOOP-8582: - Daryn, are you good with the patch's general approach, given the above? Paul, will you be sending an updated patch soon? If not, let me know, and I'm happy to tweak on your behalf and add in those changes. Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries. Key: HADOOP-8582 URL: https://issues.apache.org/jira/browse/HADOOP-8582 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.0.0-alpha Reporter: Paul Wilkinson Priority: Minor Attachments: HADOOP-8582-1.diff At present it is not possible to write or read block-compressed SequenceFiles using the GZIP codec without the native libraries being available. The SequenceFile.Writer code checks for the availability of native libraries and throws a useful exception, but the SequenceFile.Reader doesn't do the same: {noformat} Exception in thread main java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:67) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:95) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:104) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1493) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1480) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1475) at test.SequenceReader.read(SequenceReader.java:23) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
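As a rough sketch of the kind of guard the Writer already performs and the Reader could mirror (the class below is hypothetical, the exception message is illustrative, and the real patch may place the check inside GzipCodec or SequenceFile.Reader itself):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;

public class NativeGzipGuard {
  // Fail fast with a clear message instead of the obscure EOFException
  // that surfaces when GzipCodec falls back to the built-in Java streams.
  static void checkGzipSupport(CompressionCodec codec, Configuration conf)
      throws IOException {
    if (codec instanceof GzipCodec && !ZlibFactory.isNativeZlibLoaded(conf)) {
      throw new IOException("SequenceFiles compressed with GzipCodec require "
          + "the native-hadoop libraries, which are not loaded.");
    }
  }
}
{code}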
[jira] [Commented] (HADOOP-8457) Address file ownership issue for users in Administrators group on Windows.
[ https://issues.apache.org/jira/browse/HADOOP-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413503#comment-13413503 ] Bikas Saha commented on HADOOP-8457: IMO, calling FileStatus.isOwnedBy() seems like a natural API to use. I agree it would be nice to have something other than UGI to represent user/group information, but unfortunately there does not seem to be any such abstraction. I would ideally like to see FileStatus.getOwner().equals(ownerObj), where ownerObj is an object representing the owner that encapsulates users/groups, etc. Currently ownerObj is simply a string name, and that has worked because of the simple one-owner, one-group Unix model. Address file ownership issue for users in Administrators group on Windows. -- Key: HADOOP-8457 URL: https://issues.apache.org/jira/browse/HADOOP-8457 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 1.1.0, 0.24.0 Reporter: Chuan Liu Assignee: Ivan Mitic Priority: Minor Attachments: HADOOP-8457-branch-1-win_Admins(2).patch, HADOOP-8457-branch-1-win_Admins.patch On Linux, the initial file owners are the creators. (I think this is true in general. If there are exceptions, please let me know.) On Windows, a file created by a user in the Administrators group has the initial owner ‘Administrators’, i.e. the Administrators group is the initial owner of the file. This leads to an exception when we check file ownership in the SecureIOUtils.checkStat() method; as a result, this method is disabled right now. We need to address this problem and enable the method on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira