[jira] [Created] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
madhukara phatak created HADOOP-8521: Summary: Port StreamInputFormat to new Map Reduce API Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: madhukara phatak Assignee: madhukara phatak As of now, hadoop streaming still uses the old Hadoop API. This JIRA ports it to the new M/R API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
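For context on what is being ported: Hadoop Streaming runs arbitrary executables as map and reduce tasks, feeding input records over stdin and collecting tab-separated key/value pairs from stdout. A minimal word-count mapper in that style (an illustrative sketch, not code from the patch) could look like:

```python
def map_words(lines):
    # A streaming mapper reads lines from stdin and emits one
    # "key<TAB>value" record per word on stdout; here the stream is
    # modeled as an iterable of lines and the emitted records are
    # returned so the logic is easy to test.
    emitted = []
    for line in lines:
        for word in line.split():
            emitted.append(f"{word}\t1")
    return emitted

# The streaming framework would pipe an input split through this:
print(map_words(["the quick fox", "the fox"]))
```

Whether the job driver wires such an executable through the old mapred StreamInputFormat or a new mapreduce one is exactly the question this JIRA addresses.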
[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] madhukara phatak updated HADOOP-8521: - Description: As of now, hadoop streaming uses the old Hadoop M/R API. This JIRA ports it to the new M/R API. (was: As of now, hadoop streaming still uses the old Hadoop API. This JIRA ports it to the new M/R API.) Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521
[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] madhukara phatak updated HADOOP-8521: - Attachment: HADOOP-8521.patch Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521 Attachments: HADOOP-8521.patch
[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] madhukara phatak updated HADOOP-8521: - Status: Patch Available (was: Open) Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521
[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies
[ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398269#comment-13398269 ] Konstantin Shvachko commented on HADOOP-8468: - "3rd on local node of 2nd" How so? Junping, try to rewrite the policy I stated earlier using your terms for a 4-level topology with node-groups as the third level, and you will see many words change. If you put it in terms where virtual nodes are added as the fourth level, then you don't need to change a word in the old policy. I thought it's a good thing to keep old policies consistent with new use cases. It confirms (1) that it's a good policy, and (2) that it's a good design. "Agree. That's what I try to do previously also." What changed your mind? Sounds like the right direction to me. Umbrella of enhancements to support different failure and locality topologies - Key: HADOOP-8468 URL: https://issues.apache.org/jira/browse/HADOOP-8468 Project: Hadoop Common Issue Type: Bug Components: ha, io Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Priority: Critical Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch, Proposal for enchanced failure and locality topologies (revised-1.0).pdf, Proposal for enchanced failure and locality topologies.pdf The current hadoop network topology (described in some previous issues like HADOOP-692) worked well in the classic three-tier network when it came out. However, it does not take into account other failure models or changes in the infrastructure that can affect network bandwidth efficiency, such as virtualization. Virtualized platforms have the following characteristics that shouldn't be ignored by the hadoop topology when scheduling tasks, placing replicas, balancing, or fetching blocks for reading: 1. VMs on the same physical host are affected by the same hardware failure. In order to match the reliability of a physical deployment, replication of data across two virtual machines on the same host should be avoided. 2. The network between VMs on the same physical host has higher throughput and lower latency and does not consume any physical switch bandwidth. Thus, we propose to make the hadoop network topology extendable and to introduce a new level in the hierarchical topology, a node-group level, which maps well onto an infrastructure that is based on a virtualized environment.
[jira] [Created] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
Mike Percy created HADOOP-8522: -- Summary: ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.0.0-alpha, 1.0.3 Reporter: Mike Percy ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used. The issue is that finish() flushes the compressor buffer and writes the gzip CRC32 + data length trailer. After that, resetState() does not repeat the gzip header, but simply starts writing more deflate-compressed data. The resultant files are not readable by the Linux gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip files. The gzip format is specified in [RFC 1952|https://tools.ietf.org/html/rfc1952].
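What RFC 1952 calls a multi-member file is simply a concatenation of complete gzip members, each with its own header and CRC32/length trailer, which is what resetState() should start. A sketch with Python's gzip module (whose reader, like gunzip, accepts multi-member input) illustrates the expected layout:

```python
import gzip

def make_multi_member(chunks):
    # gzip.compress emits one complete RFC 1952 member (header,
    # deflate data, CRC32 + length trailer); concatenating members
    # is what a correct finish()/resetState() cycle should produce.
    return b"".join(gzip.compress(chunk) for chunk in chunks)

data = make_multi_member([b"first member\n", b"second member\n"])
# A multi-member-aware reader concatenates the decoded payloads,
# just as `gunzip` does:
assert gzip.decompress(data) == b"first member\nsecond member\n"
```

The bug described above is equivalent to emitting the second chunk's deflate data without the fresh member header, which no conforming reader can parse.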
[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398288#comment-13398288 ] Mike Percy commented on HADOOP-8522: Some additional info. Currently, resetState() works when using the native Zlib gzip implementation; the output appears to comply with the spec and works with gunzip because it writes the full header and trailer (basically concatenated gzip files). That may be one reason this bug has lain dormant for so long with the non-native implementation (serious users tend to use the native libs). So the problem is with the non-native gzip implementation. Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522
[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Percy updated HADOOP-8522: --- Attachment: HADOOP-8522-2.patch I am attaching a patch to make the behavior of the non-native resetState() consistent with the native resetState(), which will make them both compliant with RFC 1952 and gunzip. Implementation totally lifted from HBase: https://svn.apache.org/viewvc/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ReusableStreamGzipCodec.java?revision=1342856&view=markup I added one unit test which simply tests that the output is readable with GZIPInputStream, and one in which I had to comment out the assert() because the JDK GZIPInputStream cannot handle multi-member gzip files. I'm open to suggestions for improving the unit test... it looks like HBase actually stores the expected bytes and requires an exact match in its test. Testing done: manual inspection that the data generated via the 2nd unit test creates headers, trailers, CRC32 checksums, and lengths corresponding to the two members included. Also verified that the output of unit test 2 is readable with gunzip and that the output matches the provided input. Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522 Attachments: HADOOP-8522-2.patch
[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398292#comment-13398292 ] Mike Percy commented on HADOOP-8522: (note: trunk patch) Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522
[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies
[ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398294#comment-13398294 ] Junping Du commented on HADOOP-8468: Hi Konstantin, thanks for your comments. Please see my reply:
"If you put it in terms when virtual nodes are added as the fourth level, then you don't need to change a word in the old policy."
It still needs a slight change, as the first replica should be placed on the local virtual node rather than on the local node. Let me show two different ways of translating the original rules you listed above (in rule 2, I omit "on two different nodes" as it duplicates rule 0).
Original:
0. No more than one replica is placed at any one node.
1. First replica on the local node.
2. Second and third replicas are in the same rack.
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks.
The two translations are: 1) node, rack -> node, *nodegroup*; 2) node, rack -> *virtual node*, node, rack. The bolded words represent the additional layer.
Way 1:
0. No more than one replica is placed at any one *nodegroup*.
1. First replica on the local node.
2. Second and third replicas are in the same rack.
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks.
Way 2:
0. No more than one replica is placed at any one node.
1. First replica on the local *virtual node*.
2. Second and third replicas are in the same rack.
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks.
So you can see they are equivalent in wording.
Umbrella of enhancements to support different failure and locality topologies - Key: HADOOP-8468 URL: https://issues.apache.org/jira/browse/HADOOP-8468
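The way-1 translation above can be made concrete with a toy checker (a hypothetical helper for illustration, not Hadoop code) over 4-level paths of the form /rack/nodegroup/node; the writer-locality rule 1 is left out since it needs scheduling context:

```python
from collections import Counter

def valid_placement(replicas):
    # replicas: chosen target paths like "/rackA/group1/node1".
    racks = [p.split("/")[1] for p in replicas]
    groups = [tuple(p.split("/")[1:3]) for p in replicas]
    if max(Counter(groups).values()) > 1:
        return False  # rule 0: at most one replica per nodegroup
    if len(replicas) >= 3 and racks[1] != racks[2]:
        return False  # rule 2: second and third replicas share a rack
    return max(Counter(racks).values()) <= 2  # rule 3: at most two per rack

# Two replicas in one nodegroup (e.g. two VMs on one host) are rejected:
assert not valid_placement(["/rackA/g1/n1", "/rackA/g1/n2", "/rackB/g3/n5"])
assert valid_placement(["/rackA/g1/n1", "/rackB/g3/n5", "/rackB/g4/n7"])
```

Note how only rule 0's scope changes from node to nodegroup; the rack-level rules are untouched, which is the equivalence the comment argues for.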
[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies
[ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398316#comment-13398316 ] Junping Du commented on HADOOP-8468: "What changed your mind? Sounds like the right direction to me." From the above comments, you can see that way-1 inherits the original policy almost as closely as way-2 does. But way-1 brings more simplicity in implementation, for reasons such as: DatanodeDescriptor doesn't have to be remapped to an additional *virtual node* layer, and the NetworkTopology structure is easier to extend at an InnerNode than at a leaf node, etc. Thoughts? Umbrella of enhancements to support different failure and locality topologies - Key: HADOOP-8468 URL: https://issues.apache.org/jira/browse/HADOOP-8468
[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Percy updated HADOOP-8522: --- Attachment: (was: HADOOP-8522-2.patch) Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522 Attachments: HADOOP-8522-2a.patch
[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Percy updated HADOOP-8522: --- Attachment: HADOOP-8522-2a.patch Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522 Attachments: HADOOP-8522-2a.patch
[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398453#comment-13398453 ] Robert Joseph Evans commented on HADOOP-8521: - I am a bit confused here. I see that you added a new mapreduce StreamInputFormat, with the corresponding StreamXmlRecordReader and StreamBaseRecordReader, but how does this enable us to use the new MapReduce API? Can you update the documentation to provide some examples of how to use these new classes? Also, the test you added is not actually testing the new code at all. It is still testing the old input format code. I can delete the new code entirely and the test still passes. It looks like a great start, but I think there is some more wiring that needs to be done to make this work. Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521
[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HADOOP-8521: Target Version/s: 3.0.0 Status: Open (was: Patch Available) Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521
[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398461#comment-13398461 ] Daryn Sharp commented on HADOOP-8517: - The correct switch is {{-conf}}, and the ordering of the args is wrong. The correct cmdline is: {{hadoop fs -conf XXX -ls /}} --config option does not work with Hadoop installation on Windows - Key: HADOOP-8517 URL: https://issues.apache.org/jira/browse/HADOOP-8517 Project: Hadoop Common Issue Type: Bug Reporter: Trupti Dhavle I ran the following command: hadoop --config c:\directory\hadoop\conf fs -ls / I get the following error for the --config option: Unrecognized option: --config Could not create the Java virtual machine.
[jira] [Resolved] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-8517. - Resolution: Not A Problem Key: HADOOP-8517 URL: https://issues.apache.org/jira/browse/HADOOP-8517
[jira] [Commented] (HADOOP-8497) Shell needs a way to list amount of physical consumed space in a directory
[ https://issues.apache.org/jira/browse/HADOOP-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398469#comment-13398469 ] Daryn Sharp commented on HADOOP-8497: - Would a {{du}} option that reports sizes after multiplying by the replication factor address this issue? Shell needs a way to list amount of physical consumed space in a directory -- Key: HADOOP-8497 URL: https://issues.apache.org/jira/browse/HADOOP-8497 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Currently, there is no way to see the physical consumed space for a directory. du lists the logical (pre-replication) space, and fs -count only displays the consumed space when a quota is set. This makes it hard for administrators to set a quota on a directory, since they have no way to determine a reasonable value.
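The arithmetic behind such a {{du}} option is straightforward: physical consumption is each file's logical length multiplied by its replication factor. A toy illustration (made-up numbers, not the FsShell code):

```python
def physical_consumed(files):
    # files: iterable of (logical_bytes, replication_factor) pairs.
    # HDFS stores `replication_factor` copies of every block, so the
    # on-disk footprint is the logical size scaled by that factor.
    return sum(size * repl for size, repl in files)

# A 128 MB file at the default replication of 3, plus a 64 MB file at 2:
MB = 1024 ** 2
print(physical_consumed([(128 * MB, 3), (64 * MB, 2)]))  # 512 MB in bytes
```

This is also why the quota-setting problem in the description is hard today: the admin would have to do this multiplication per file by hand.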
[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems
[ https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398470#comment-13398470 ] Daryn Sharp commented on HADOOP-8496: - Does the directory contain a -0s entry? What is the exact cmdline and the exact contents of the directory? FsShell is broken with s3 filesystems - Key: HADOOP-8496 URL: https://issues.apache.org/jira/browse/HADOOP-8496 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 2.0.1-alpha Reporter: Alejandro Abdelnur Priority: Critical After setting up an S3 account and configuring the site.xml with the access key/password, when doing an ls on a non-empty bucket I get: {code} Found 4 items -ls: -0s Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...] {code} Note that while it correctly shows the number of items in the root of the bucket, it does not show the contents of the root. I've tried -get and -put and they work fine; accessing a folder in the bucket seems to be fully broken.
[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API
[ https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398472#comment-13398472 ] Tom White commented on HADOOP-8521: --- Also, MAPREDUCE-1122 would be a useful prerequisite for this. Key: HADOOP-8521 URL: https://issues.apache.org/jira/browse/HADOOP-8521
[jira] [Created] (HADOOP-8523) test-patch.sh doesn't validate patches before building
Jack Dintruff created HADOOP-8523: - Summary: test-patch.sh doesn't validate patches before building Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.1 Reporter: Jack Dintruff Priority: Trivial When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking whether the patch is valid. It would help devs if it checked first, just in case test-patch.sh is run with a bad patch file.
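The fail-fast check being requested could be as cheap as verifying the file looks like a unified diff before any build starts; a rough sketch of the idea (not the actual test-patch.sh logic):

```python
import re

HUNK_HEADER = re.compile(r"^@@ -\d+(,\d+)? \+\d+(,\d+)? @@")

def looks_like_unified_diff(text):
    # A plausible patch carries at least one ---/+++ file-header pair
    # and one @@ hunk header; rejecting anything else up front avoids
    # a long Hadoop build for an obviously malformed patch file.
    lines = text.splitlines()
    has_old = any(line.startswith("--- ") for line in lines)
    has_new = any(line.startswith("+++ ") for line in lines)
    has_hunk = any(HUNK_HEADER.match(line) for line in lines)
    return has_old and has_new and has_hunk

good = "--- a/f.txt\n+++ b/f.txt\n@@ -1 +1 @@\n-old\n+new\n"
assert looks_like_unified_diff(good)
assert not looks_like_unified_diff("this is not a patch")
```

A dry-run apply (e.g. patch's or git's check mode) would go further and also catch patches that do not apply, still without compiling anything.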
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Status: Open (was: Patch Available) Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523
[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Fix Version/s: 0.23.3 Target Version/s: 0.23.3 (was: 0.23.1) Affects Version/s: (was: 0.23.1) 0.23.3 Status: Patch Available (was: Open) test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Attachment: Hadoop-8523.patch test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Status: Patch Available (was: Open) test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Status: Open (was: Patch Available) test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Status: Patch Available (was: Open) test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8523) test-patch.sh is doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398509#comment-13398509 ] Hadoop QA commented on HADOOP-8523: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532885/Hadoop-8523.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 13 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1130//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1130//console This message is automatically generated. test-patch.sh is doesn't validate patches before building - Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems
[ https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398513#comment-13398513 ] Alejandro Abdelnur commented on HADOOP-8496: Daryn, no '-s0' entry. Using Hadoop 1.0.1, things work: {code} $ bin/hadoop fs -conf ~/aws-s3.xml -ls s3n://tucu/ Found 4 items -rwxrwxrwx 1 5 2012-06-08 14:00 /foo.txt drwxrwxrwx - 0 1969-12-31 16:00 /test -rwxrwxrwx 1 5 2012-06-08 13:53 /test.txt -rwxrwxrwx 1 5 2012-06-08 13:56 /test1.txt $ {code} Using Hadoop 2.0.0/trunk, things fail: {code} $ bin/hadoop fs -conf ~/aws-s3.xml -ls s3n://tucu/ Found 4 items -ls: -0s Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...] $ {code} FsShell is broken with s3 filesystems - Key: HADOOP-8496 URL: https://issues.apache.org/jira/browse/HADOOP-8496 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 2.0.1-alpha Reporter: Alejandro Abdelnur Priority: Critical After setting up a S3 account, configuring the site.xml with the accesskey/password, when doing an ls on a non-empty bucket I get: {code} Found 4 items -ls: -0s Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...] {code} Note that it correctly shows the number of items in the root of the bucket, but it does not show the contents of the root. I've tried -get and -put and they work fine; accessing a folder in the bucket seems to be fully broken.
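The `-ls: -0s` output above is the shape of a Java format-string error surfacing as a shell message. As a hedged illustration (the column name and width logic here are hypothetical, not taken from the actual FsShell source): if the listing code pads each column to the width of its widest value, an S3 listing whose owner/group column is empty would produce a width of 0, and `%-0s` is an illegal format specifier in Java:

```java
// Hypothetical reconstruction of the suspected failure mode, not the
// actual FsShell code: building a per-column pad width from data that
// can be empty yields the invalid specifier "%-0s".
public class LsFormatBug {
    public static void main(String[] args) {
        String owner = "";  // S3 listings carry no owner, so the max width is 0
        String fmt = "%-" + owner.length() + "s";  // builds "%-0s"
        try {
            String.format(fmt, owner);
        } catch (java.util.IllegalFormatException e) {
            // A shell that prints the exception message after the command
            // name would surface something like the "-ls: -0s" seen above.
            System.out.println("invalid format: " + fmt);
        }
    }
}
```

With a non-empty column the same construction is legal (`%-5s` pads to five characters), which would explain why only S3 listings trip it.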
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Summary: test-patch.sh doesn't validate patches before building (was: test-patch.sh is doesn't validate patches before building) test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398550#comment-13398550 ] Jonathan Eagles commented on HADOOP-8523: - Thanks for the patch, Jack. I like that this will save me some time, as I know I have hit this case before. Couple of comments. There are two modes this script runs in: from Jenkins to run the commit builds, and by developers for testing purposes. Jenkins mode relies on setup to download the patch file from the JIRA, so the check will need to happen after the patch is downloaded. Additionally, setup needs an unpatched build to check if trunk is stable and determine pre-patched javac warnings. Perhaps by adding a --dry-run option to smart-apply-patch this can be achieved. test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398561#comment-13398561 ] Jack Dintruff commented on HADOOP-8523: --- The smart-apply-patch.sh script executes a --dry-run of patch, so I'll add a patch that just moves my check to after setup is called. Thanks for pointing that out. test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
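The cheap pre-build check under discussion amounts to confirming the attachment is a plausible unified diff before any expensive build step; smart-apply-patch.sh delegates this to `patch` with a dry-run flag. A minimal sketch of the structural idea (this helper class is hypothetical, not the actual script logic):

```java
import java.util.List;

// Hypothetical illustration of a pre-build sanity check: a unified diff
// must contain an adjacent "--- old-file" / "+++ new-file" header pair.
public class PatchPrecheck {
    public static boolean looksLikeUnifiedDiff(List<String> lines) {
        boolean sawOldHeader = false;
        for (String line : lines) {
            if (line.startsWith("--- ")) {
                sawOldHeader = true;            // candidate old-file header
            } else if (sawOldHeader && line.startsWith("+++ ")) {
                return true;                    // matching new-file header
            } else {
                sawOldHeader = false;           // headers must be adjacent
            }
        }
        return false;
    }
}
```

Running a check like this (or `patch --dry-run`, which also verifies hunks apply) right after the patch is downloaded is what saves the wasted build time described in the issue.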
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Attachment: Hadoop-8523.patch Just moved the smart-apply-patch.sh call to run after setup, as per Jon's suggestion. test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.23.3 Reporter: Jack Dintruff Priority: Trivial Labels: newbie Fix For: 0.23.3 Attachments: Hadoop-8523.patch, Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Created] (HADOOP-8524) Allow users to get source of a Configuration parameter
Harsh J created HADOOP-8524: --- Summary: Allow users to get source of a Configuration parameter Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Harsh J Priority: Trivial When we load the various XMLs via the Configuration class, the source of the XML file (filename) is usually kept in the Configuration class but not exposed programmatically. It is presently exposed as comments such as "Loaded from mapred-site.xml" in the XML dump/serialization, but can't be accessed otherwise (via the Configuration API). For debugging purposes, it may be useful to expose this safely, such as an API for "where did this property come from?" queries on a specific property.
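The proposal can be sketched with a toy loader that remembers, per key, the last resource that set it; the class and method names below are hypothetical illustrations, not the eventual Hadoop API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of per-key source tracking for configuration properties.
// Hypothetical API: getPropertySource(key) answers "where did this
// property come from?" as proposed in the issue description.
public class TrackedConfiguration {
    private final Map<String, String> properties = new HashMap<>();
    private final Map<String, String> sources = new HashMap<>();

    // Simulates loading one resource (e.g. "mapred-site.xml"); a later
    // resource overwriting a key also overwrites its recorded source.
    public void addResource(String resourceName, Map<String, String> entries) {
        for (Map.Entry<String, String> e : entries.entrySet()) {
            properties.put(e.getKey(), e.getValue());
            sources.put(e.getKey(), resourceName);
        }
    }

    public String get(String key) { return properties.get(key); }

    public String getPropertySource(String key) { return sources.get(key); }
}
```

The "last writer wins" rule mirrors how later-loaded resources override earlier defaults, so the reported source is the file whose value is actually in effect.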
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Dintruff updated HADOOP-8523: -- Component/s: (was: scripts) build Target Version/s: 0.23.3, 2.0.1-alpha (was: 0.23.3) Affects Version/s: 2.0.1-alpha Fix Version/s: (was: 0.23.3) test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: Hadoop-8523.patch, Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398628#comment-13398628 ] Hadoop QA commented on HADOOP-8523: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532902/Hadoop-8523.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 13 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1131//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1131//console This message is automatically generated. test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: Hadoop-8523.patch, Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Commented] (HADOOP-8497) Shell needs a way to list amount of physical consumed space in a directory
[ https://issues.apache.org/jira/browse/HADOOP-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398630#comment-13398630 ] Todd Lipcon commented on HADOOP-8497: - Yes, IMO it would. Shell needs a way to list amount of physical consumed space in a directory -- Key: HADOOP-8497 URL: https://issues.apache.org/jira/browse/HADOOP-8497 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Currently, there is no way to see the physical consumed space for a directory. du lists the logical (pre-replication) space, and fs -count only displays the consumed space when a quota is set. This makes it hard for administrators to set a quota on a directory, since they have no way to determine a reasonable value.
[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398639#comment-13398639 ] Trupti Dhavle commented on HADOOP-8517: --- --config is a switch given to export the HADOOP_CONF_DIR path in the command line. When I run this on a non-Windows environment, the command works: /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop --config /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/conf dfs -ls / Found 3 items drwxr-xr-x - trupti supergroup 0 2012-06-12 14:09 /hdfsRegressionData drwxr-xr-x - trupti supergroup 0 2012-06-12 11:31 /tmp drwxr-xr-x - trupti supergroup 0 2012-06-20 10:12 /user /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop Usage: hadoop [--config confdir] COMMAND where COMMAND is one of: namenode -format format the DFS filesystem ... ... But the same switch is unavailable for Windows Hadoop. --config option does not work with Hadoop installation on Windows - Key: HADOOP-8517 URL: https://issues.apache.org/jira/browse/HADOOP-8517 Project: Hadoop Common Issue Type: Bug Reporter: Trupti Dhavle I ran the following command: hadoop --config c:\directory\hadoop\conf fs -ls / I get the following error for the --config option: Unrecognized option: --config Could not create the Java virtual machine.
[jira] [Reopened] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trupti Dhavle reopened HADOOP-8517: --- --config is a switch given to export the HADOOP_CONF_DIR path in the command line. When I run this on a non-Windows environment, the command works: /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop --config /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/conf dfs -ls / Found 3 items drwxr-xr-x - trupti supergroup 0 2012-06-12 14:09 /hdfsRegressionData drwxr-xr-x - trupti supergroup 0 2012-06-12 11:31 /tmp drwxr-xr-x - trupti supergroup 0 2012-06-20 10:12 /user /Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop Usage: hadoop [--config confdir] COMMAND where COMMAND is one of: namenode -format format the DFS filesystem ... ... But the same switch is unavailable for Windows Hadoop. --config option does not work with Hadoop installation on Windows - Key: HADOOP-8517 URL: https://issues.apache.org/jira/browse/HADOOP-8517 Project: Hadoop Common Issue Type: Bug Reporter: Trupti Dhavle I ran the following command: hadoop --config c:\directory\hadoop\conf fs -ls / I get the following error for the --config option: Unrecognized option: --config Could not create the Java virtual machine.
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-8524: Attachment: HADOOP-8524.patch Allow users to get source of a Configuration parameter -- Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch When we load the various XMLs via the Configuration class, the source of the XML file (filename) is usually kept in the Configuration class but not exposed programmatically. It is presently exposed as comments such as "Loaded from mapred-site.xml" in the XML dump/serialization, but can't be accessed otherwise (via the Configuration API). For debugging purposes, it may be useful to expose this safely, such as an API for "where did this property come from?" queries on a specific property.
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-8524: Target Version/s: 2.0.1-alpha, 3.0.0 Affects Version/s: 2.0.0-alpha Allow users to get source of a Configuration parameter -- Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch When we load the various XMLs via the Configuration class, the source of the XML file (filename) is usually kept in the Configuration class but not exposed programmatically. It is presently exposed as comments such as "Loaded from mapred-site.xml" in the XML dump/serialization, but can't be accessed otherwise (via the Configuration API). For debugging purposes, it may be useful to expose this safely, such as an API for "where did this property come from?" queries on a specific property.
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-8524: Status: Patch Available (was: Open) Allow users to get source of a Configuration parameter -- Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch When we load the various XMLs via the Configuration class, the source of the XML file (filename) is usually kept in the Configuration class but not exposed programmatically. It is presently exposed as comments such as "Loaded from mapred-site.xml" in the XML dump/serialization, but can't be accessed otherwise (via the Configuration API). For debugging purposes, it may be useful to expose this safely, such as an API for "where did this property come from?" queries on a specific property.
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398697#comment-13398697 ] Hadoop QA commented on HADOOP-8524: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532913/HADOOP-8524.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 13 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.fs.viewfs.TestViewFsTrash +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//console This message is automatically generated. Allow users to get source of a Configuration parameter -- Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch When we load the various XMLs via the Configuration class, the source of the XML file (filename) is usually kept in the Configuration class but not exposed programmatically. It is presently exposed as comments such as "Loaded from mapred-site.xml" in the XML dump/serialization, but can't be accessed otherwise (via the Configuration API). For debugging purposes, it may be useful to expose this safely, such as an API for "where did this property come from?" queries on a specific property.
[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398703#comment-13398703 ] Jonathan Eagles commented on HADOOP-8523: - This is looking better. If we can move it right after patch download and before the javac warnings build in setup, then we can see even more speedup! test-patch.sh doesn't validate patches before building -- Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: Hadoop-8523.patch, Hadoop-8523.patch When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking to see if the patch is invalid. It would help devs if it checked first just in case we run test-patch.sh with a bad patch file.
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated HADOOP-8523: Status: Open (was: Patch Available)

test-patch.sh doesn't validate patches before building
Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jack Dintruff updated HADOOP-8523: Attachment: HADOOP-8523.patch

Here's a more elegant fix, up for review, that doesn't require the initial build and allows the Jenkins machines to take advantage of this fix as well.

test-patch.sh doesn't validate patches before building
Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jack Dintruff updated HADOOP-8523: Status: Patch Available (was: Open)

test-patch.sh doesn't validate patches before building
Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Trivial Labels: newbie Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch
[jira] [Reopened] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha reopened HADOOP-8486: This has caused a build failure because Path.getUriPath is no longer available.

[javac] c:\Users\bikas\Code\hdp\src\test\org\apache\hadoop\io\TestSequenceFile.java:59: cannot find symbol
[javac] symbol  : method getUriPath()
[javac] location: class org.apache.hadoop.fs.Path
[javac] File f = new File(nonSeqFile.getUriPath());

Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
Key: HADOOP-8486 URL: https://issues.apache.org/jira/browse/HADOOP-8486 Project: Hadoop Common Issue Type: Bug Components: fs, io Affects Versions: 1.0.2, 1-win Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 1-win Attachments: HADOOP-8486-branch-1-win-(2).patch, HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, HADOOP-8486-branch-1-win.patch

I noticed this problem while working on porting Hive to run on Windows. Hive attempts to create this class object to validate the file format and ends up with a resource leak. Because of this leak, we can't move, rename, or delete files on Windows while there is an open file handle, whereas on UNIX we can perform all these operations with no issues even with open file handles. Please let me know if you see similar issues in any other places.
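The underlying bug pattern here (a constructor that opens a handle and then throws without releasing it) and its fix can be shown in isolation. This is an illustrative, self-contained sketch, not the actual SequenceFile code:

```java
import java.io.Closeable;
import java.io.IOException;

// A stand-in for an open file handle; records whether close() was called.
class Resource implements Closeable {
    boolean closed = false;
    public void close() { closed = true; }
}

// Sketch of the fix HADOOP-8486 describes: if construction fails after a
// handle is opened, close it before propagating the exception, so the file
// can still be moved/renamed/deleted on Windows.
public class SafeOpen {
    // Exposed only so the sketch can be inspected; not part of the pattern.
    static Resource lastOpened;

    static void openAndValidate(boolean valid) throws IOException {
        Resource r = new Resource();   // the "open file handle"
        lastOpened = r;
        try {
            if (!valid) {
                throw new IOException("not a SequenceFile");
            }
            // success: from here on the caller owns the handle
        } catch (IOException e) {
            r.close();                 // failure: release before rethrowing
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            openAndValidate(false);
        } catch (IOException expected) {
            System.out.println("handle released before rethrow: " + lastOpened.closed);
        }
    }
}
```

On UNIX a leaked descriptor merely lingers; on Windows it pins the file against rename and delete, which is why Hive's format probing surfaced the bug there first.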
[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jack Dintruff updated HADOOP-8523: Priority: Minor (was: Trivial), Issue Type: Improvement (was: Bug)

test-patch.sh doesn't validate patches before building
Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Minor Labels: newbie Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch
[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398731#comment-13398731 ]

Hadoop QA commented on HADOOP-8523:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532919/HADOOP-8523.patch against trunk revision.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 javadoc. The javadoc tool appears to have generated 13 warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1133//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1133//console

This message is automatically generated.

test-patch.sh doesn't validate patches before building
Key: HADOOP-8523 URL: https://issues.apache.org/jira/browse/HADOOP-8523 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jack Dintruff Priority: Minor Labels: newbie Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch
[jira] [Commented] (HADOOP-7967) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/HADOOP-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398740#comment-13398740 ]

Sanjay Radia commented on HADOOP-7967: Daryn, rather than do this in two or three steps, let us do a single patch to address this problem completely. I have summarized our discussion in MAPREDUCE-3825 in this [comment|https://issues.apache.org/jira/browse/MAPREDUCE-3825?focusedCommentId=13398726&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13398726].

Need generalized multi-token filesystem support
Key: HADOOP-7967 URL: https://issues.apache.org/jira/browse/HADOOP-7967 Project: Hadoop Common Issue Type: Bug Components: fs, security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HADOOP-7967-2.patch, HADOOP-7967-3.patch, HADOOP-7967-4.patch, HADOOP-7967-compat.patch, HADOOP-7967.patch

Multi-token filesystem support and its interactions with the MR {{TokenCache}} is problematic. The {{TokenCache}} tries to assume it has the knowledge to know if the tokens for a filesystem are available, which it can't possibly know for multi-token filesystems. Filtered filesystems are also problematic, such as har on viewfs. When mergeFs is implemented, it too will become a problem with the current implementation. Currently {{FileSystem}} will leak tokens even when some tokens are already present. The decision for token acquisition, and which tokens, should be pushed all the way down into the {{FileSystem}} level. The {{TokenCache}} should be ignorant and simply request tokens from each {{FileSystem}}.
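The design the description argues for (each FileSystem deciding which tokens it needs, with the TokenCache merely aggregating) can be sketched abstractly. All class and method names below are illustrative stand-ins, not Hadoop's actual API:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hedged sketch of the direction HADOOP-7967 proposes: token acquisition is
// pushed down into each filesystem, so filtered/composed filesystems such as
// har-on-viewfs can recurse into their children. Names are hypothetical.
abstract class FileSystemSketch {
    // Each filesystem contributes whatever tokens it requires.
    abstract Collection<String> collectTokens();
}

class SimpleFs extends FileSystemSketch {
    private final String token;
    SimpleFs(String token) { this.token = token; }
    Collection<String> collectTokens() { return List.of(token); }
}

// A filtered/merged filesystem just asks its children; the caller never needs
// to know how many underlying filesystems (and tokens) are involved.
class FilteredFs extends FileSystemSketch {
    private final FileSystemSketch[] children;
    FilteredFs(FileSystemSketch... children) { this.children = children; }
    Collection<String> collectTokens() {
        List<String> all = new ArrayList<>();
        for (FileSystemSketch fs : children) all.addAll(fs.collectTokens());
        return all;
    }
}

public class TokenCacheSketch {
    public static void main(String[] args) {
        FileSystemSketch viewfs =
            new FilteredFs(new SimpleFs("token:nn1"), new SimpleFs("token:nn2"));
        // The cache stays ignorant: it simply asks and aggregates.
        System.out.println(viewfs.collectTokens());
    }
}
```

The design choice mirrors the last sentence of the description: the cache cannot know how many tokens a composed filesystem needs, so only the filesystem itself can answer.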
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8524: Attachment: HADOOP-8524.patch

New patch that should fix the findbugs sync issue (wasn't sync'd over the internal DSes). The javadoc and the failing test warnings are unrelated to this one.

Allow users to get source of a Configuration parameter
Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch, HADOOP-8524.patch
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398773#comment-13398773 ]

Hadoop QA commented on HADOOP-8524:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532927/HADOOP-8524.patch against trunk revision.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 javadoc. The javadoc tool appears to have generated 13 warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1134//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1134//console

This message is automatically generated.

Allow users to get source of a Configuration parameter
Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch, HADOOP-8524.patch
[jira] [Updated] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kanna Karanam updated HADOOP-8486: Attachment: HADOOP-8486-branch-1-win-(5).patch

Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
Key: HADOOP-8486 URL: https://issues.apache.org/jira/browse/HADOOP-8486 Project: Hadoop Common Issue Type: Bug Components: fs, io Affects Versions: 1.0.2, 1-win Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 1-win Attachments: HADOOP-8486-branch-1-win-(2).patch, HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch
[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398794#comment-13398794 ]

Daryn Sharp commented on HADOOP-8517: My apologies... Too many variations of specifying a config.

--config option does not work with Hadoop installation on Windows
Key: HADOOP-8517 URL: https://issues.apache.org/jira/browse/HADOOP-8517 Project: Hadoop Common Issue Type: Bug Reporter: Trupti Dhavle

I ran the following command: hadoop --config c:\directory\hadoop\conf fs -ls /
I get the following error for the --config option:
Unrecognized option: --config
Could not create the Java virtual machine.
[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398793#comment-13398793 ]

Kanna Karanam commented on HADOOP-8486: Attached the fix for the build failure.

Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
Key: HADOOP-8486 URL: https://issues.apache.org/jira/browse/HADOOP-8486 Project: Hadoop Common Issue Type: Bug Components: fs, io Affects Versions: 1.0.2, 1-win Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 1-win Attachments: HADOOP-8486-branch-1-win-(2).patch, HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch
[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems
[ https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398817#comment-13398817 ]

Daryn Sharp commented on HADOOP-8496: Oh, I know what this is. The format string is angry about %-0s for the user/group fields. I'm positive there was a jira and patch to fix this. It must not have been committed.

FsShell is broken with s3 filesystems
Key: HADOOP-8496 URL: https://issues.apache.org/jira/browse/HADOOP-8496 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 2.0.1-alpha Reporter: Alejandro Abdelnur Priority: Critical

After setting up an S3 account and configuring the site.xml with the access key/password, when doing an ls on a non-empty bucket I get:
{code}
Found 4 items
-ls: -0s
Usage: hadoop fs [generic options]
-ls [-d] [-h] [-R] [path ...]
{code}
Note that while it correctly shows the number of items in the root of the bucket, it does not show the contents of the root. I've tried -get and -put and they work fine, but accessing a folder in the bucket seems to be fully broken.
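Daryn's diagnosis is easy to reproduce in isolation: Java's Formatter rejects a left-justify flag with a zero width, and FsShell surfaces only the exception message, which would explain the cryptic "-0s" in the output. A small sketch, assuming %-0s is indeed the failing format string:

```java
import java.util.IllegalFormatException;

// Reproduces the suspected "-ls: -0s" symptom: "%-0s" is an illegal format
// string, since the '-' (left-justify) flag demands a positive field width.
// An empty S3 user/group field producing a width of 0 would trigger this.
public class FormatBug {
    public static void main(String[] args) {
        try {
            String.format("%-0s", "hadoop"); // hypothetical zero-width user/group column
        } catch (IllegalFormatException e) {
            // A shell that prints only e.getMessage() yields a cryptic error
            System.out.println("ls would fail with: " + e.getMessage());
        }
    }
}
```

The fix direction this suggests is to clamp the computed column width to at least 1 (or drop the '-' flag when the width is zero) when building the listing format string.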
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398822#comment-13398822 ]

Robert Joseph Evans commented on HADOOP-8524: The code itself looks good to me. The only issue I have has nothing to do with your code.

In the past I have talked with others about improving the debuggability/traceability of these values. For Map/Reduce jobs the config is produced and then written out to job.xml. After that the config will only look like all of the settings came from job.xml. We lose traceability to the original configurations. This is even worse for the job.xml that gets sent to the job history server, because it is written out yet again after being read in by the AM, so even the comments in there only say job.xml.

I don't really expect you to fix this in your patch, unless you really want to. It would just be nice to have the API marked as @Unstable so that if I ever do get around to putting in better traceability we don't have to change this API. If you are OK with the idea and marking it unstable I will file a separate JIRA for actually adding in the traceability.

Allow users to get source of a Configuration parameter
Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch, HADOOP-8524.patch
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398823#comment-13398823 ]

Rohini Palaniswamy commented on HADOOP-8518: Tucu, the client should support server principal canonicalization through DNS. It is one of the standard practices, and many clients like curl and Firefox do it.
http://books.google.com/books?id=dGMd-uay-lkC&pg=PT232&lpg=PT232
http://docs.oracle.com/cd/E19253-01/816-4557/planning-25/index.html
Having to configure hadoop.security.auth_to_local for something that is a very common Kerberos practice/standard is not ideal.

SPNEGO client side should use KerberosName rules
Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha

Currently KerberosName is used only on the server side to resolve the client name; we should use it on the client side as well to resolve the server name before getting the Kerberos ticket.
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398826#comment-13398826 ]

Rohini Palaniswamy commented on HADOOP-8518: hadoop.security.auth_to_local can be used as an override if required.

SPNEGO client side should use KerberosName rules
Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8524: Attachment: HADOOP-8524.patch

Hey, that makes complete sense and I am also perfectly fine with the idea of marking this as unstable. I do love the idea of proper traceability and can work on it if no one's begun yet (or if you're not gonna be driving it yet). Do file one and chalk it out. This request arose out of Oozie configs at my end, however, not purely MR, but yes, the traceability will help there too. I've attached a patch to mark the API unstable. Thanks for the comment and for taking a look quickly!

Allow users to get source of a Configuration parameter
Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch
[jira] [Created] (HADOOP-8525) Provide Improved Traceability for Configuration
Robert Joseph Evans created HADOOP-8525:

Summary: Provide Improved Traceability for Configuration
Key: HADOOP-8525 URL: https://issues.apache.org/jira/browse/HADOOP-8525 Project: Hadoop Common Issue Type: Improvement Reporter: Robert Joseph Evans Priority: Trivial

Configuration provides basic traceability to see where a config setting came from, but once the configuration is written out, that information is written to a comment in the XML and then lost the next time the configuration is read back in. It would really be great to be able to store a complete history of where the config came from in the XML, so that it can then be retrieved later for debugging.
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398834#comment-13398834 ]

Alejandro Abdelnur commented on HADOOP-8518: @Rohini, so does this mean that we should deconstruct the URL, canonicalize the hostname using InetAddress, and recreate the URL before making the connection? And as fallback provide auth_to_local mapping for cases where the canonicalization does not work as expected? thx!!

SPNEGO client side should use KerberosName rules
Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha
[jira] [Assigned] (HADOOP-8525) Provide Improved Traceability for Configuration
[ https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans reassigned HADOOP-8525: Assignee: Robert Joseph Evans

Provide Improved Traceability for Configuration
Key: HADOOP-8525 URL: https://issues.apache.org/jira/browse/HADOOP-8525 Project: Hadoop Common Issue Type: Improvement Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Trivial
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398835#comment-13398835 ]

Robert Joseph Evans commented on HADOOP-8524: Great, I just filed HADOOP-8525 for this. If you want to work on it go right ahead; if not, I will try to throw together a quick patch sometime this week. And +1 for this patch. Looks good to me.

Allow users to get source of a Configuration parameter
Key: HADOOP-8524 URL: https://issues.apache.org/jira/browse/HADOOP-8524 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch
[jira] [Commented] (HADOOP-8525) Provide Improved Traceability for Configuration
[ https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398837#comment-13398837 ] Robert Joseph Evans commented on HADOOP-8525: - I think the only big change here is a small addition to the XML format, so that we can add in a list of previous files it was read from and remove the comment that more or less gives the same information. Internally, the Map storing this will now have to point to a list instead of a single value. It looks fairly straightforward.
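Robert's "map pointing to a list instead of a single value" idea can be sketched in plain Java. This is a toy model only; the class and method names are illustrative and are not Hadoop's actual Configuration internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the change described above: keep the ordered history of
 *  every resource a property was loaded from, not just the latest one. */
public class SourceHistory {
    // property name -> ordered list of resources that set it
    private final Map<String, List<String>> sources = new HashMap<>();

    public void set(String name, String source) {
        sources.computeIfAbsent(name, k -> new ArrayList<>()).add(source);
    }

    public List<String> getSources(String name) {
        return sources.getOrDefault(name, List.of());
    }

    public static void main(String[] args) {
        SourceHistory h = new SourceHistory();
        h.set("io.sort.mb", "core-default.xml");
        h.set("io.sort.mb", "mapred-site.xml"); // override appends, keeping history
        System.out.println(h.getSources("io.sort.mb")); // [core-default.xml, mapred-site.xml]
    }
}
```

On write-out, the list could be serialized as a child element of each property instead of today's lossy XML comment.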
[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398849#comment-13398849 ] Bikas Saha commented on HADOOP-8486: lgtm. Thanks for the quick fix. Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor -- Key: HADOOP-8486 URL: https://issues.apache.org/jira/browse/HADOOP-8486 Project: Hadoop Common Issue Type: Bug Components: fs, io Affects Versions: 1.0.2, 1-win Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 1-win Attachments: HADOOP-8486-branch-1-win-(2).patch, HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch I noticed this problem while working on porting HIVE to Windows. Hive attempts to create this class object to validate the file format and ends up with a resource leak. Because of this leak, we can't move, rename or delete the files on Windows when there is an open file handle, whereas on UNIX we can perform all these operations with no issues even with open file handles. Please let me know if you see similar issues in any other places.
[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used
[ https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398854#comment-13398854 ] Tom White commented on HADOOP-8522: --- The patch looks good to me. A small suggestion regarding the test: why not use GzipCodec to decompress too? Then you can have an assert in the test, and it checks roundtripping using Hadoop APIs. That may be one reason this bug has lain dormant for so long with the non-native implementation (serious users tend to use the native libs). Also, Hadoop has only supported concatenated gzip since HADOOP-6835, so files that had corrupt later members would have been ignored by versions of Hadoop prior to this. ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used Key: HADOOP-8522 URL: https://issues.apache.org/jira/browse/HADOOP-8522 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Mike Percy Attachments: HADOOP-8522-2a.patch ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used. The issue is that finish() flushes the compressor buffer and writes the gzip CRC32 + data length trailer. After that, resetState() does not repeat the gzip header, but simply starts writing more deflate-compressed data. The resultant files are not readable by the Linux gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip files. The gzip format is specified in [RFC 1952|https://tools.ietf.org/html/rfc1952]. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
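The multi-member format the report asks for can be demonstrated with plain java.util.zip rather than Hadoop's codec: each member carries its own RFC 1952 header and CRC32/length trailer, and a conforming reader concatenates the decompressed members. This is a sketch of the expected roundtrip behavior, not the HADOOP-8522 patch itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Builds a valid multi-member gzip stream: every member restarts with a
 *  fresh gzip header (the step ResetableGzipOutputStream reportedly skips
 *  after finish()/resetState()). */
public class MultiMemberGzip {
    static byte[] member(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(s.getBytes("UTF-8"));
        }
        return bos.toByteArray(); // one complete member: header + deflate data + trailer
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(member("hello "));
        file.write(member("world")); // second member begins with its own header

        GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(file.toByteArray()));
        System.out.println(new String(in.readAllBytes(), "UTF-8")); // hello world
    }
}
```

This also mirrors Tom's test suggestion: decompress with the same stack you compressed with and assert on the roundtripped bytes, instead of eyeballing the output with gunzip.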
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-8524: Resolution: Fixed Fix Version/s: 2.0.1-alpha Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Robert. Committed to branch-2 and trunk.
[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-8524: Target Version/s: (was: 2.0.1-alpha, 3.0.0)
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398877#comment-13398877 ] Hudson commented on HADOOP-8524: Integrated in Hadoop-Common-trunk-Commit #2378 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2378/]) HADOOP-8524. Allow users to get source of a Configuration parameter. (harsh) (Revision 1352689) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
[jira] [Resolved] (HADOOP-8409) Fix TestCommandLineJobSubmission and TestGenericOptionsParser to work for windows
[ https://issues.apache.org/jira/browse/HADOOP-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved HADOOP-8409. Resolution: Fixed Fix TestCommandLineJobSubmission and TestGenericOptionsParser to work for windows - Key: HADOOP-8409 URL: https://issues.apache.org/jira/browse/HADOOP-8409 Project: Hadoop Common Issue Type: Bug Components: fs, test, util Affects Versions: 1.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HADOOP-8409-branch-1-win.2.patch, HADOOP-8409-branch-1-win.3.patch, HADOOP-8409-branch-1-win.4.patch, HADOOP-8409-branch-1-win.patch Original Estimate: 168h Remaining Estimate: 168h There are multiple places in prod and test code where Windows paths are not handled properly. From a high level this could be summarized with: 1. Windows paths are not necessarily valid DFS paths (while Unix paths are) 2. Windows paths are not necessarily valid URIs (while Unix paths are) #1 causes a number of tests to fail because they implicitly assume that local paths are valid DFS paths (by extracting the DFS test path from for example test.build.data property) #2 causes issues when URIs are directly created on path strings passed in by the user -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398881#comment-13398881 ] Hudson commented on HADOOP-8524: Integrated in Hadoop-Hdfs-trunk-Commit #2448 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2448/]) HADOOP-8524. Allow users to get source of a Configuration parameter. (harsh) (Revision 1352689) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1339#comment-1339 ] Daryn Sharp commented on HADOOP-8518: - I'm not up to speed on spnego, so could you educate me as to why the client needs to canonicalize the remote host? SPNEGO client side should use KerberosName rules Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha currently KerberosName is used only on the server side to resolve the client name, we should use it on the client side as well to resolve the server name before getting the kerberos ticket. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398895#comment-13398895 ] Sanjay Radia commented on HADOOP-8486: -- This patch also directly applies to trunk - can you please provide a trunk patch.
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398901#comment-13398901 ] Kihwal Lee commented on HADOOP-8518: bq. I'm not up to speed on spnego, so could you educate me as to why the client needs to canonicalize the remote host? HADOOP-8043 might offer you some background/hint. SPNEGO client side should use KerberosName rules Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha currently KerberosName is used only on the server side to resolve the client name, we should use it on the client side as well to resolve the server name before getting the kerberos ticket. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398918#comment-13398918 ] Alejandro Abdelnur commented on HADOOP-8518: @Daryn, the hadoop-auth SPNEGO client creates a token with HTTP/HOST as the server principal, where HOST is the host specified in the URL. If you are using a hostname alias, then the resolved server principal will be HTTP/HOST-alias. The problem is that the KDC will not recognize this principal because it does not exist. This means that the hadoop-auth SPNEGO client should find out the real hostname to use as HOST. Hope this clarifies.
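The canonicalization Alejandro describes can be sketched with only java.net: resolve the hostname taken from the URL to its canonical form before building the HTTP/HOST principal. The class and method names here are illustrative; the actual hadoop-auth change may be implemented differently.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Sketch: build the SPNEGO service principal from the canonical hostname,
 *  so that a hostname alias in the URL still maps to the principal the KDC
 *  actually knows about. */
public class ServicePrincipal {
    static String forHost(String hostFromUrl) throws UnknownHostException {
        // An alias resolves (via DNS) to the host's canonical name, which is
        // what HTTP/<host> is registered as in the KDC.
        String canonical = InetAddress.getByName(hostFromUrl).getCanonicalHostName();
        return "HTTP/" + canonical;
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(forHost("localhost"));
    }
}
```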
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398929#comment-13398929 ] Rohini Palaniswamy commented on HADOOP-8518: Not as a fallback, but as an override. What I had done was to get the canonical name of the host from the URL to connect to and use it to construct the service principal's host part (HTTP/canonicalhostname). If a specific Configuration property was set saying what the FQDN of the service principal should be, then I used that instead of constructing the service principal from the URL. The override would also help if the service principal was in a different realm than the default realm. You can have a separate specific config parameter to specify the service principal override and use the rule mapping configuration.
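Rohini's override-first scheme reduces to a two-step lookup: an explicit config property wins, otherwise construct HTTP/canonical-host from the URL. The property key and helper below are hypothetical, used only to illustrate the precedence.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the override scheme described above: a configured principal
 *  (which may live in a non-default realm) takes precedence over the one
 *  derived from the URL's canonical hostname. */
public class PrincipalOverride {
    static String principal(Map<String, String> conf, String canonicalHost) {
        String override = conf.get("auth.spnego.server.principal"); // hypothetical key
        return override != null ? override : "HTTP/" + canonicalHost;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(principal(conf, "nn1.example.com"));
        conf.put("auth.spnego.server.principal", "HTTP/nn1.corp@OTHER.REALM");
        System.out.println(principal(conf, "nn1.example.com"));
    }
}
```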
[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter
[ https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398936#comment-13398936 ] Hudson commented on HADOOP-8524: Integrated in Hadoop-Mapreduce-trunk-Commit #2397 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2397/]) HADOOP-8524. Allow users to get source of a Configuration parameter. (harsh) (Revision 1352689) Result = FAILURE harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
[jira] [Assigned] (HADOOP-8519) ERROR level log message should probably be changed to INFO
[ https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson reassigned HADOOP-8519: - Assignee: Andy Isaacson ERROR level log message should probably be changed to INFO -- Key: HADOOP-8519 URL: https://issues.apache.org/jira/browse/HADOOP-8519 Project: Hadoop Common Issue Type: Task Affects Versions: 0.20.2 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago) Reporter: Jeff Lord Assignee: Andy Isaacson Datanode service is logging java.net.SocketTimeoutException at ERROR level. This message indicates that the datanode is not able to send data to the client because the client has stopped reading. This message is not really a cause for alarm and should be INFO level. 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode DatanodeRegistration(x.x.x.x:50010, storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, ipcPort=50020):DataXceiver java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.10.120.67:50010 remote=/10.10.120.67:59282] at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246) at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163) -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)
[ https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HADOOP-8519: -- Target Version/s: 2.0.1-alpha Issue Type: Bug (was: Task) Summary: idle client socket triggers DN ERROR log (should be INFO or DEBUG) (was: ERROR level log message should probably be changed to INFO) Simple to reproduce, just do
{code}
FileSystem fs = FileSystem.get(new Configuration());
DataInputStream f = fs.open(new Path(args[0]));
f.read(new byte[1024]);
Thread.sleep(500 * 1000);
{code}
The DN eventually gives up on the client socket and ERRORs the DataXceiver SocketTimeoutException. This is definitely not ERROR worthy, I would say DEBUG or INFO at most.
[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)
[ https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399028#comment-13399028 ] Jeff Lord commented on HADOOP-8519: --- Does this look right?
{code}
---
 .../hadoop/hdfs/server/datanode/DataXceiver.java |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
index 6c280d8..2fb5878 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
@@ -241,7 +241,7 @@ class DataXceiver extends Receiver implements Runnable {
     } catch(IOException e) {
       String msg = "opReadBlock " + block + " received exception " + e;
       LOG.info(msg);
-      sendResponse(s, ERROR, msg, dnConf.socketWriteTimeout);
+      sendResponse(s, INFO, msg, dnConf.socketWriteTimeout);
       throw e;
     }
{code}
[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor
[ https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399036#comment-13399036 ] Kanna Karanam commented on HADOOP-8486: --- @Sanjay - Trunk already has a fix for this, along with a lot of other code changes, so we may not require a patch for it. Please let me know if I am missing something. Thanks
[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)
[ https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399063#comment-13399063 ] Andy Isaacson commented on HADOOP-8519: --- The error is a little different on 2.0:
{code}
2012-06-21 18:28:36,251 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.122.87:50010 remote=/192.168.122.3:51436]
	at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
	at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
	at java.lang.Thread.run(Thread.java:679)
2012-06-21 18:28:36,252 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.122.87:50010, dest: /192.168.122.3:51436, bytes: 53697024, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-1988072026_1, offset: 0, srvID: DS-706541979-127.0.1.1-50010-1339724203679, blockid: BP-882164591-127.0.1.1-133972395:blk_-1935427635464392086_1010, duration: 482450603444
2012-06-21 18:28:36,252 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.122.87, storageID=DS-706541979-127.0.1.1-50010-1339724203679, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=CID-02666c8e-a05e-480f-94df-f5226414f260;nsid=1569472409;c=0):Got exception while serving BP-882164591-127.0.1.1-133972395:blk_-1935427635464392086_1010 to /192.168.122.3:51436 java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.122.87:50010 remote=/192.168.122.3:51436]
	at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
	at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
	at java.lang.Thread.run(Thread.java:679)
2012-06-21 18:28:36,253 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ubu-cdh-3:50010:DataXceiver error processing READ_BLOCK operation src: /192.168.122.3:51436 dest: /192.168.122.87:50010 java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.122.87:50010 remote=/192.168.122.3:51436]
	at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
	at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
	at java.lang.Thread.run(Thread.java:679)
{code}
idle client socket triggers DN ERROR log (should be INFO or DEBUG) -- Key: HADOOP-8519 URL:
[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)
[ https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399070#comment-13399070 ] Andy Isaacson commented on HADOOP-8519: --- It's super non-obvious, but the ERROR comes from the following horrifyingness in hdfs/server/datanode/BlockSender.java:
{code}
    } catch (IOException e) {
      /* Exception while writing to the client. Connection closure from
       * the other end is mostly the case and we do not care much about
       * it. But other things can go wrong, especially in transferTo(),
       * which we do not want to ignore.
       *
       * The message parsing below should not be considered as a good
       * coding example. NEVER do it to drive a program logic. NEVER.
       * It was done here because the NIO throws an IOException for EPIPE.
       */
      String ioem = e.getMessage();
      if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection reset")) {
        LOG.error("BlockSender.sendChunks() exception: ", e);
      }
      throw ioeToSocketException(e);
    }
{code}
idle client socket triggers DN ERROR log (should be INFO or DEBUG) -- Key: HADOOP-8519 URL: https://issues.apache.org/jira/browse/HADOOP-8519 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.2 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago) Reporter: Jeff Lord Assignee: Andy Isaacson The Datanode service is logging java.net.SocketTimeoutException at ERROR level. This message indicates that the datanode is not able to send data to the client because the client has stopped reading. It is not really a cause for alarm and should be logged at INFO level. 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode DatanodeRegistration(x.x.x.x:50010, storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, ipcPort=50020):DataXceiver java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write.
ch : java.nio.channels.SocketChannel[connected local=/10.10.120.67:50010 remote=/10.10.120.67:59282]
	at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
	at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)
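One plausible shape for the fix being discussed — treating an idle-client timeout as benign, the way sendChunks() already treats "Broken pipe" and "Connection reset" — is to key on the exception type instead of parsing the message. This is a hedged sketch of the idea, not the actual patch; `SendErrorClassifier` and `levelFor` are made-up names for illustration.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Decides which log level a send-side IOException deserves.
// SocketTimeoutException (client stopped reading) and peer-closed
// connections are routine; anything else may be a real datanode fault.
class SendErrorClassifier {
    static String levelFor(IOException e) {
        if (e instanceof SocketTimeoutException) {
            return "INFO";   // idle client; not a cause for alarm
        }
        String msg = e.getMessage();
        if (msg != null && (msg.startsWith("Broken pipe")
                || msg.startsWith("Connection reset"))) {
            return "INFO";   // connection closed by the peer
        }
        return "ERROR";      // unexpected failure, keep it loud
    }
}
```

Matching on `instanceof SocketTimeoutException` also sidesteps the message-parsing anti-pattern the BlockSender comment itself warns against, since the timeout case has a distinct exception type.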