[jira] [Created] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread madhukara phatak (JIRA)
madhukara phatak created HADOOP-8521:


 Summary: Port StreamInputFormat to new Map Reduce API
 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak


As of now , hadoop streaming still used old Hadoop API. This JIRA ports it to 
the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread madhukara phatak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

madhukara phatak updated HADOOP-8521:
-

Description: As of now , hadoop streaming uses old Hadoop M/R API. This 
JIRA ports it to the new M/R API.  (was: As of now , hadoop streaming still 
used old Hadoop API. This JIRA ports it to the new M/R API.)

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak

 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread madhukara phatak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

madhukara phatak updated HADOOP-8521:
-

Attachment: HADOOP-8521.patch

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Attachments: HADOOP-8521.patch


 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread madhukara phatak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

madhukara phatak updated HADOOP-8521:
-

Status: Patch Available  (was: Open)

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Attachments: HADOOP-8521.patch


 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies

2012-06-21 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398269#comment-13398269
 ] 

Konstantin Shvachko commented on HADOOP-8468:
-

 3rd on local node of 2nd

How so?

Junping try to rewrite the policy I stated earlier using your terms for 4-level 
topology with node-groups as the third level, and you will see many words 
change. If you put it in terms when virtual nodes are added as the fourth 
level, then you don't need to change a word in the old policy. I thought it's a 
good thing to keep old policies consistent with new use cases. Confirms (1) 
that it's a good policy, and (2) that it's a good design.

 Agree. That's what I try to do previously also.

What changed your mind? Sounds like the right direction to me.

 Umbrella of enhancements to support different failure and locality topologies
 -

 Key: HADOOP-8468
 URL: https://issues.apache.org/jira/browse/HADOOP-8468
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha, io
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Junping Du
Assignee: Junping Du
Priority: Critical
 Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch, 
 Proposal for enchanced failure and locality topologies (revised-1.0).pdf, 
 Proposal for enchanced failure and locality topologies.pdf


 The current hadoop network topology (described in some previous issues like: 
 Hadoop-692) works well in classic three-tiers network when it comes out. 
 However, it does not take into account other failure models or changes in the 
 infrastructure that can affect network bandwidth efficiency like: 
 virtualization. 
 Virtualized platform has following genes that shouldn't been ignored by 
 hadoop topology in scheduling tasks, placing replica, do balancing or 
 fetching block for reading: 
 1. VMs on the same physical host are affected by the same hardware failure. 
 In order to match the reliability of a physical deployment, replication of 
 data across two virtual machines on the same host should be avoided.
 2. The network between VMs on the same physical host has higher throughput 
 and lower latency and does not consume any physical switch bandwidth.
 Thus, we propose to make hadoop network topology extend-able and introduce a 
 new level in the hierarchical topology, a node group level, which maps well 
 onto an infrastructure that is based on a virtualized environment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)
Mike Percy created HADOOP-8522:
--

 Summary: ResetableGzipOutputStream creates invalid gzip files when 
finish() and resetState() are used
 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha, 1.0.3
Reporter: Mike Percy


ResetableGzipOutputStream creates invalid gzip files when finish() and 
resetState() are used. The issue is that finish() flushes the compressor buffer 
and writes the gzip CRC32 + data length trailer. After that, resetState() does 
not repeat the gzip header, but simply starts writing more deflate-compressed 
data. The resultant files are not readable by the Linux gunzip tool. 
ResetableGzipOutputStream should write valid multi-member gzip files.

The gzip format is specified in [RFC 1952|https://tools.ietf.org/html/rfc1952].


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398288#comment-13398288
 ] 

Mike Percy commented on HADOOP-8522:


Some additional info. Currently, resetState() works when using the native Zlib 
gzip implementation; the output appears to comply with the spec and works with 
gunzip because it writes the full header and trailer (basically concatenated 
gzip files). That may be one reason this bug has lain dormant for so long with 
the non-native implementation (serious users tend to use the native libs).

So, the problem is with the non-native gzip implementation.

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated HADOOP-8522:
---

Attachment: HADOOP-8522-2.patch

I am attaching a patch to make the behavior of non-native resetState() 
consistent with native resetState(), which will make them both compliant with 
RFC1952 and gunzip.

Implementation totally lifted from HBase:
https://svn.apache.org/viewvc/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ReusableStreamGzipCodec.java?revision=1342856view=markup

I added one unit test which simply tests that the output is readable with 
GZipInputStream, and one in which I had to comment-out the assert() because JDK 
GZipInputStream cannot handle multi-member gzip files. I'm open to suggestions 
for improving the unit test... it looks like HBase actually stores the expected 
bytes and requires an exact match in their test.

Testing done: manual inspection that the data generated via the 2nd unit test 
creates headers, trailers, crc32 checksums, and lengths corresponding to the 
two members included. Also verified that the output of unit test 2 is readable 
with gunzip and that the output matches the provided input.

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy
 Attachments: HADOOP-8522-2.patch


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398292#comment-13398292
 ] 

Mike Percy commented on HADOOP-8522:


(note: trunk patch)

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy
 Attachments: HADOOP-8522-2.patch


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies

2012-06-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398294#comment-13398294
 ] 

Junping Du commented on HADOOP-8468:


Hi Konstantin,
   Thanks for your comments. Please see my reply:

 If you put it in terms when virtual nodes are added as the fourth level, then 
 you don't need to change a word in the old policy.
Still need some slightly change as first replica should be placed on local 
virtual node but not node local. Let me show a two different way of translation 
the original rules you list above (in rule 2, I omit on two different nodes 
there as it is duplicated with rule 0).
Original:
0. No more than one replica is placed at any one node
1. First replica on the local node
2. Second and third replicas are in the same rack
3. Other replicas on random nodes with restriction that no more than two 
replicas are placed in the same rack, if there is enough racks.

two ways: 1) node, rack - node, *nodegroup*; 2) node, rack - *virtual node*, 
node, rack. The black word represent additional layer.
way 1:
0. No more than one replica is placed at any one *nodegroup*
1. First replica on the local node
2. Second and third replicas are in the same rack
3. Other replicas on random nodes with restriction that no more than two 
replicas are placed in the same rack, if there is enough racks
way 2:
0. No more than one replica is placed at any one node
1. First replica on the local *virtual node*
2. Second and third replicas are in the same rack
3. Other replicas on random nodes with restriction that no more than two 
replicas are placed in the same rack, if there is enough racks

So you can see it is equivalent in words. 

 Umbrella of enhancements to support different failure and locality topologies
 -

 Key: HADOOP-8468
 URL: https://issues.apache.org/jira/browse/HADOOP-8468
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha, io
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Junping Du
Assignee: Junping Du
Priority: Critical
 Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch, 
 Proposal for enchanced failure and locality topologies (revised-1.0).pdf, 
 Proposal for enchanced failure and locality topologies.pdf


 The current hadoop network topology (described in some previous issues like: 
 Hadoop-692) works well in classic three-tiers network when it comes out. 
 However, it does not take into account other failure models or changes in the 
 infrastructure that can affect network bandwidth efficiency like: 
 virtualization. 
 Virtualized platform has following genes that shouldn't been ignored by 
 hadoop topology in scheduling tasks, placing replica, do balancing or 
 fetching block for reading: 
 1. VMs on the same physical host are affected by the same hardware failure. 
 In order to match the reliability of a physical deployment, replication of 
 data across two virtual machines on the same host should be avoided.
 2. The network between VMs on the same physical host has higher throughput 
 and lower latency and does not consume any physical switch bandwidth.
 Thus, we propose to make hadoop network topology extend-able and introduce a 
 new level in the hierarchical topology, a node group level, which maps well 
 onto an infrastructure that is based on a virtualized environment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies

2012-06-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398316#comment-13398316
 ] 

Junping Du commented on HADOOP-8468:


 What changed your mind? Sounds like the right direction to me.
From above comments, you can see way-1 inherit original policy almost as much 
as way-2. But way-1 will take more simplicity in implementation for some 
reasons like: DatanodeDescriptor don't have to remap to additional *virtual 
node* layer, NetworkTopology structure is easier to extend in InnerNode rather 
than leaf node, etc. Thoughts?

 Umbrella of enhancements to support different failure and locality topologies
 -

 Key: HADOOP-8468
 URL: https://issues.apache.org/jira/browse/HADOOP-8468
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha, io
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Junping Du
Assignee: Junping Du
Priority: Critical
 Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch, 
 Proposal for enchanced failure and locality topologies (revised-1.0).pdf, 
 Proposal for enchanced failure and locality topologies.pdf


 The current hadoop network topology (described in some previous issues like: 
 Hadoop-692) works well in classic three-tiers network when it comes out. 
 However, it does not take into account other failure models or changes in the 
 infrastructure that can affect network bandwidth efficiency like: 
 virtualization. 
 Virtualized platform has following genes that shouldn't been ignored by 
 hadoop topology in scheduling tasks, placing replica, do balancing or 
 fetching block for reading: 
 1. VMs on the same physical host are affected by the same hardware failure. 
 In order to match the reliability of a physical deployment, replication of 
 data across two virtual machines on the same host should be avoided.
 2. The network between VMs on the same physical host has higher throughput 
 and lower latency and does not consume any physical switch bandwidth.
 Thus, we propose to make hadoop network topology extend-able and introduce a 
 new level in the hierarchical topology, a node group level, which maps well 
 onto an infrastructure that is based on a virtualized environment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated HADOOP-8522:
---

Attachment: (was: HADOOP-8522-2.patch)

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy
 Attachments: HADOOP-8522-2a.patch


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated HADOOP-8522:
---

Attachment: HADOOP-8522-2a.patch

 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy
 Attachments: HADOOP-8522-2a.patch


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398453#comment-13398453
 ] 

Robert Joseph Evans commented on HADOOP-8521:
-

I am a bit confused here.  I see that you added in a new mapreduce 
StreamInputFormat, with the corresponding StreamXmlRecordReader and 
StreamBaseRecordReader, but how does this enable us to use the new MapReduce 
API?  Can you update the documentation to provide some examples of how you can 
use these new classes you have added?  Also the test you have added in is not 
actually testing the new code at all.  It is still testing the old input format 
code.  I can delete the new code entirely and the test still passes.  It looks 
like a great start, but I think there is some more wiring that needs to be done 
to make this work.

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Attachments: HADOOP-8521.patch


 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8521:


Target Version/s: 3.0.0
  Status: Open  (was: Patch Available)

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Attachments: HADOOP-8521.patch


 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398461#comment-13398461
 ] 

Daryn Sharp commented on HADOOP-8517:
-

The correct switch is {{-conf}} and ordering of the args is wrong.  The correct 
cmdline is: {{hadoop fs -conf XXX -ls /}}

 --config option does not work with Hadoop installation on Windows
 -

 Key: HADOOP-8517
 URL: https://issues.apache.org/jira/browse/HADOOP-8517
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Trupti Dhavle

 I ran following command
 hadoop --config c:\directory\hadoop\conf fs -ls /
 I get following error for --config option
 Unrecognized option: --config
 Could not create the Java virtual machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-8517) --config option does not work with Hadoop installation on Windows

2012-06-21 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp resolved HADOOP-8517.
-

Resolution: Not A Problem

 --config option does not work with Hadoop installation on Windows
 -

 Key: HADOOP-8517
 URL: https://issues.apache.org/jira/browse/HADOOP-8517
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Trupti Dhavle

 I ran following command
 hadoop --config c:\directory\hadoop\conf fs -ls /
 I get following error for --config option
 Unrecognized option: --config
 Could not create the Java virtual machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8497) Shell needs a way to list amount of physical consumed space in a directory

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398469#comment-13398469
 ] 

Daryn Sharp commented on HADOOP-8497:
-

Would a {{du}} option that reports sizes after multiplying by the replication 
factor address this issue?

 Shell needs a way to list amount of physical consumed space in a directory
 --

 Key: HADOOP-8497
 URL: https://issues.apache.org/jira/browse/HADOOP-8497
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0
Reporter: Todd Lipcon
Assignee: Andy Isaacson

 Currently, there is no way to see the physical consumed space for a 
 directory. du lists the logical (pre-replication) space, and fs -count only 
 displays the consumed space when a quota is set. This makes it hard for 
 administrators to set a quota on a directory, since they have no way to 
 determine a reasonable value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398470#comment-13398470
 ] 

Daryn Sharp commented on HADOOP-8496:
-

Does the directory contain a -0s entry?  What is the exact cmdline and the 
exact contents of the directory?

 FsShell is broken with s3 filesystems
 -

 Key: HADOOP-8496
 URL: https://issues.apache.org/jira/browse/HADOOP-8496
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 2.0.1-alpha
Reporter: Alejandro Abdelnur
Priority: Critical

 After setting up a S3 account, configuring the site.xml with the 
 accesskey/password, when doing an ls on a non-empty bucket I get:
 {code}
 Found 4 items
 -ls: -0s
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}
 Note that it correctly shows the number of items in the root of the bucket, 
 it does not show the contents of the root.
 I've tried -get and -put and it works fine, accessing a folder in the bucket 
 seems to be fully broken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8521) Port StreamInputFormat to new Map Reduce API

2012-06-21 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398472#comment-13398472
 ] 

Tom White commented on HADOOP-8521:
---

Also, MAPREDUCE-1122 would be a useful pre-requisite for this.

 Port StreamInputFormat to new Map Reduce API
 

 Key: HADOOP-8521
 URL: https://issues.apache.org/jira/browse/HADOOP-8521
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: madhukara phatak
Assignee: madhukara phatak
 Attachments: HADOOP-8521.patch


 As of now , hadoop streaming uses old Hadoop M/R API. This JIRA ports it to 
 the new M/R API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)
Jack Dintruff created HADOOP-8523:
-

 Summary: test-patch.sh is doesn't validate patches before building
 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.1
Reporter: Jack Dintruff
Priority: Trivial


When running test-patch.sh with an invalid patch (not formatted properly) or 
one that doesn't compile, the script spends a lot of time building Hadoop 
before checking to see if the patch is invalid.  It would help devs if it 
checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Status: Open  (was: Patch Available)

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Fix Version/s: 0.23.3
 Target Version/s: 0.23.3  (was: 0.23.1)
Affects Version/s: (was: 0.23.1)
   0.23.3
   Status: Patch Available  (was: Open)

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Attachment: Hadoop-8523.patch

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Status: Patch Available  (was: Open)

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Status: Open  (was: Patch Available)

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Status: Patch Available  (was: Open)

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh is doesn't validate patches before building

2012-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398509#comment-13398509
 ] 

Hadoop QA commented on HADOOP-8523:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532885/Hadoop-8523.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1130//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1130//console

This message is automatically generated.

 test-patch.sh is doesn't validate patches before building
 -

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems

2012-06-21 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398513#comment-13398513
 ] 

Alejandro Abdelnur commented on HADOOP-8496:


Daryn, no '-s0' entry.

Using Hadoop 1.0.1 things works:

{code}
$ bin/hadoop fs -conf ~/aws-s3.xml -ls s3n://tucu/
Found 4 items
-rwxrwxrwx   1  5 2012-06-08 14:00 /foo.txt
drwxrwxrwx   -  0 1969-12-31 16:00 /test
-rwxrwxrwx   1  5 2012-06-08 13:53 /test.txt
-rwxrwxrwx   1  5 2012-06-08 13:56 /test1.txt
$ 
{code}

Using Hadoop 2.0.0/trunk things fail:

{code}
$ bin/hadoop fs -conf ~/aws-s3.xml -ls s3n://tucu/
Found 4 items
-ls: -0s
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
$ 
{code}


 FsShell is broken with s3 filesystems
 -

 Key: HADOOP-8496
 URL: https://issues.apache.org/jira/browse/HADOOP-8496
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 2.0.1-alpha
Reporter: Alejandro Abdelnur
Priority: Critical

 After setting up a S3 account, configuring the site.xml with the 
 accesskey/password, when doing an ls on a non-empty bucket I get:
 {code}
 Found 4 items
 -ls: -0s
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}
 Note that it correctly shows the number of items in the root of the bucket, 
 it does not show the contents of the root.
 I've tried -get and -put and it works fine, accessing a folder in the bucket 
 seems to be fully broken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Summary: test-patch.sh doesn't validate patches before building  (was: 
test-patch.sh is doesn't validate patches before building)

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398550#comment-13398550
 ] 

Jonathan Eagles commented on HADOOP-8523:
-

Thanks for the patch, Jack. I like that this will save me some time as I know I 
have hit this case before.

Couple of comments. There are two mode this script runs in, from Jenkins to run 
the commit builds and by developers for testing purposes. Jenkins mode relies 
on setup to download the patch file from the JIRA, so the check will need to 
happen after the patch is downloaded. Additionally, setup needs an unpatched 
build to check if trunk is stable and determine pre-patched javac warnings. 
Perhaps by adding a --dry-run option to smart-apply-patch this can be achieved.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398561#comment-13398561
 ] 

Jack Dintruff commented on HADOOP-8523:
---

The smart-apply-patch.sh executes a --dry-run of patch, so I'll add a patch 
that just moves my check until after setup is called.  Thanks for pointing that 
out.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Attachment: Hadoop-8523.patch

Just moved the smart-apply-patch.sh call to run after setup as per Jon's 
suggestion..

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.23.3
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.3

 Attachments: Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)
Harsh J created HADOOP-8524:
---

 Summary: Allow users to get source of a Configuration parameter
 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Reporter: Harsh J
Priority: Trivial


When we load the various XMLs via the Configuration class, the source of the 
XML file (filename) is usually kept in the Configuration class but not exposed 
programmatically. It is presently exposed as comments such as Loaded from 
mapred-site.xml in the XML dump/serialization but can't be accessed otherwise 
(Via the Configuration API).

For debugging/etc. purposes, it may be useful to expose this safely (such as an 
API for where did this property come from? queries for a specific property, 
via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

  Component/s: (was: scripts)
   build
 Target Version/s: 0.23.3, 2.0.1-alpha  (was: 0.23.3)
Affects Version/s: 2.0.1-alpha
Fix Version/s: (was: 0.23.3)

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398628#comment-13398628
 ] 

Hadoop QA commented on HADOOP-8523:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532902/Hadoop-8523.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1131//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1131//console

This message is automatically generated.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8497) Shell needs a way to list amount of physical consumed space in a directory

2012-06-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398630#comment-13398630
 ] 

Todd Lipcon commented on HADOOP-8497:
-

Yes, IMO it would.

 Shell needs a way to list amount of physical consumed space in a directory
 --

 Key: HADOOP-8497
 URL: https://issues.apache.org/jira/browse/HADOOP-8497
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0
Reporter: Todd Lipcon
Assignee: Andy Isaacson

 Currently, there is no way to see the physical consumed space for a 
 directory. du lists the logical (pre-replication) space, and fs -count only 
 displays the consumed space when a quota is set. This makes it hard for 
 administrators to set a quota on a directory, since they have no way to 
 determine a reasonable value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows

2012-06-21 Thread Trupti Dhavle (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398639#comment-13398639
 ] 

Trupti Dhavle commented on HADOOP-8517:
---

--config is a switch given to export the HADOOP_CONF_DIR path in the command 
line.
When I run this on non-windows environment, the command works-

/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop  --config 
/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/conf dfs -ls /
Found 3 items
drwxr-xr-x   - trupti supergroup  0 2012-06-12 14:09 /hdfsRegressionData
drwxr-xr-x   - trupti supergroup  0 2012-06-12 11:31 /tmp
drwxr-xr-x   - trupti supergroup  0 2012-06-20 10:12 /user

/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format format the DFS filesystem
 ...
 ...

But the same switch is unavailable for Windows Hadoop.

 --config option does not work with Hadoop installation on Windows
 -

 Key: HADOOP-8517
 URL: https://issues.apache.org/jira/browse/HADOOP-8517
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Trupti Dhavle

 I ran following command
 hadoop --config c:\directory\hadoop\conf fs -ls /
 I get following error for --config option
 Unrecognized option: --config
 Could not create the Java virtual machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (HADOOP-8517) --config option does not work with Hadoop installation on Windows

2012-06-21 Thread Trupti Dhavle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trupti Dhavle reopened HADOOP-8517:
---


--config is a switch given to export the HADOOP_CONF_DIR path in the command 
line.
When I run this on non-windows environment, the command works-
/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop --config 
/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/conf dfs -ls /
Found 3 items
drwxr-xr-x - trupti supergroup 0 2012-06-12 14:09 /hdfsRegressionData
drwxr-xr-x - trupti supergroup 0 2012-06-12 11:31 /tmp
drwxr-xr-x - trupti supergroup 0 2012-06-20 10:12 /user
/Users/trupti/WORK/Software/Hadoop/hadoop-1.0.3/bin/hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
namenode -format format the DFS filesystem
...
...
But the same switch is unavailable for Windows Hadoop.

 --config option does not work with Hadoop installation on Windows
 -

 Key: HADOOP-8517
 URL: https://issues.apache.org/jira/browse/HADOOP-8517
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Trupti Dhavle

 I ran following command
 hadoop --config c:\directory\hadoop\conf fs -ls /
 I get following error for --config option
 Unrecognized option: --config
 Could not create the Java virtual machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


Attachment: HADOOP-8524.patch

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


 Target Version/s: 2.0.1-alpha, 3.0.0
Affects Version/s: 2.0.0-alpha

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


Status: Patch Available  (was: Open)

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398697#comment-13398697
 ] 

Hadoop QA commented on HADOOP-8524:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532913/HADOOP-8524.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.fs.viewfs.TestViewFsTrash

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1132//console

This message is automatically generated.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398703#comment-13398703
 ] 

Jonathan Eagles commented on HADOOP-8523:
-

This is looking better. If we can move it right after patch download and before 
the javac warnings build in setup, then we can see even more speedup!

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated HADOOP-8523:


Status: Open  (was: Patch Available)

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Attachment: HADOOP-8523.patch

Here's a more elegant fix up for review that doesn't require the initial build 
and allows the Jenkins machines to take advantage of this fix as well.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

Status: Patch Available  (was: Open)

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Trivial
  Labels: newbie
 Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha reopened HADOOP-8486:



This has caused build failure because Path.getUriPath is no longer available.
[javac] 
c:\Users\bikas\Code\hdp\src\test\org\apache\hadoop\io\TestSequenceFile.java:59: 
cannot find symbol
[javac] symbol  : method getUriPath()
[javac] location: class org.apache.hadoop.fs.Path
[javac] File f = new File(nonSeqFile.getUriPath());

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Jack Dintruff (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Dintruff updated HADOOP-8523:
--

  Priority: Minor  (was: Trivial)
Issue Type: Improvement  (was: Bug)

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398731#comment-13398731
 ] 

Hadoop QA commented on HADOOP-8523:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532919/HADOOP-8523.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1133//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1133//console

This message is automatically generated.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Attachments: HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7967) Need generalized multi-token filesystem support

2012-06-21 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398740#comment-13398740
 ] 

Sanjay Radia commented on HADOOP-7967:
--

Daryn, rather than do this in two or three steps, let us do a single patch to 
address this problem completely. I have summarized our discussion in  
MAPREDUCE-3825  in this 
[comment.|https://issues.apache.org/jira/browse/MAPREDUCE-3825?focusedCommentId=13398726page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13398726]

 Need generalized multi-token filesystem support
 ---

 Key: HADOOP-7967
 URL: https://issues.apache.org/jira/browse/HADOOP-7967
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, security
Affects Versions: 0.23.1, 0.24.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HADOOP-7967-2.patch, HADOOP-7967-3.patch, 
 HADOOP-7967-4.patch, HADOOP-7967-compat.patch, HADOOP-7967.patch


 Multi-token filesystem support and its interactions with the MR 
 {{TokenCache}} is problematic.  The {{TokenCache}} tries to assume it has the 
 knowledge to know if the tokens for a filesystem are available, which it 
 can't possibly know for multi-token filesystems.  Filtered filesystems are 
 also problematic, such as har on viewfs.  When mergeFs is implemented, it too 
 will become a problem with the current implementation.  Currently 
 {{FileSystem}} will leak tokens even when some tokens are already present.
 The decision for token acquisition, and which tokens, should be pushed all 
 the way down into the {{FileSystem}} level.  The {{TokenCache}} should be 
 ignorant and simply request tokens from each {{FileSystem}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


Attachment: HADOOP-8524.patch

New patch that should fix the findbugs sync issue (wasn't sync'd over the 
internal DSes).

The javadoc and the failing test warnings are unrelated to this one.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398773#comment-13398773
 ] 

Hadoop QA commented on HADOOP-8524:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532927/HADOOP-8524.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1134//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1134//console

This message is automatically generated.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Kanna Karanam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kanna Karanam updated HADOOP-8486:
--

Attachment: HADOOP-8486-branch-1-win-(5).patch

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8517) --config option does not work with Hadoop installation on Windows

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398794#comment-13398794
 ] 

Daryn Sharp commented on HADOOP-8517:
-

My apologies...  Too many variations of specifying a config.

 --config option does not work with Hadoop installation on Windows
 -

 Key: HADOOP-8517
 URL: https://issues.apache.org/jira/browse/HADOOP-8517
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Trupti Dhavle

 I ran following command
 hadoop --config c:\directory\hadoop\conf fs -ls /
 I get following error for --config option
 Unrecognized option: --config
 Could not create the Java virtual machine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398793#comment-13398793
 ] 

Kanna Karanam commented on HADOOP-8486:
---

Attached the fix for build failure.

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8496) FsShell is broken with s3 filesystems

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398817#comment-13398817
 ] 

Daryn Sharp commented on HADOOP-8496:
-

Oh, I know what this is.  The format string is angry about %-0s for the 
user/group fields.  I'm positive there was a jira and patch to fix this.  It 
must not have been committed.

 FsShell is broken with s3 filesystems
 -

 Key: HADOOP-8496
 URL: https://issues.apache.org/jira/browse/HADOOP-8496
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 2.0.1-alpha
Reporter: Alejandro Abdelnur
Priority: Critical

 After setting up a S3 account, configuring the site.xml with the 
 accesskey/password, when doing an ls on a non-empty bucket I get:
 {code}
 Found 4 items
 -ls: -0s
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}
 Note that it correctly shows the number of items in the root of the bucket, 
 it does not show the contents of the root.
 I've tried -get and -put and it works fine, accessing a folder in the bucket 
 seems to be fully broken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398822#comment-13398822
 ] 

Robert Joseph Evans commented on HADOOP-8524:
-

The code itself looks good to me.  The only issue I have has nothing to do with 
your code.  In the past I have talked with others about improving the 
debuggability/traceability of these values.  For Map/Reduce jobs the config is 
produced and then written out to job.xml.  After that the config will only look 
like all of the settings came from job.xml. We lose traceability to the 
original configurations.  This is even worse for the job.xml that gets sent to 
the job history server, because it is written out yet again after being read in 
by the AM, so even the comments in there only say job.xml.

I don't really expect you to fix this in your patch, unless you really want to. 
 It would just be nice to have the API marked as @Unstable so that if I ever do 
get around to putting in better traceability we don't have to change this API.  
If you are OK with the idea and marking it unstable I will file a separate JIRA 
for actually adding in the traceability.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398823#comment-13398823
 ] 

Rohini Palaniswamy commented on HADOOP-8518:


Tucu,
   The client should support Server principal canonicalization through DNS. It 
is one of the standard practices and many clients like curl, Firefox do it. 

http://books.google.com/books?id=dGMd-uay-lkCpg=PT232lpg=PT232
http://docs.oracle.com/cd/E19253-01/816-4557/planning-25/index.html

Having to configure hadoop.security.auth_to_local for something that is a very 
common Kerberos practice/standard is not ideal. 

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398826#comment-13398826
 ] 

Rohini Palaniswamy commented on HADOOP-8518:


hadoop.security.auth_to_local can be used as for override if required. 

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


Attachment: HADOOP-8524.patch

Hey,

That makes complete sense and I am also perfectly fine with the idea of marking 
this as unstable. I do love the idea of proper traceability and can work on it 
if no one's begun yet (or if you're not gonna be driving it yet). Do file one 
and chalk it out.

This request arose out of Oozie configs however at my end, not purely MR but 
yes the traceability will help there too.

I've attached a patch to mark the API unstable.

Thanks for the comment and for taking a look quickly!

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-06-21 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created HADOOP-8525:
---

 Summary: Provide Improved Traceability for Configuration
 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Priority: Trivial


Configuration provides basic traceability to see where a config setting came 
from, but once the configuration is written out that information is written to 
a comment in the XML and then lost the next time the configuration is read back 
in.  It would really be great to be able to store a complete history of where 
the config came from in the XML, so that it can then be retrieved later for 
debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398834#comment-13398834
 ] 

Alejandro Abdelnur commented on HADOOP-8518:


@Rohini, so does this mean that we should deconstruct the URL, canonize the 
hostname using InetAddress and recreate the URL before making the connection? 
And as fallback provide auth_to_local mapping for cases the canonization does 
not work as expected? thx!!

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-06-21 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans reassigned HADOOP-8525:
---

Assignee: Robert Joseph Evans

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial

 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398835#comment-13398835
 ] 

Robert Joseph Evans commented on HADOOP-8524:
-

Great I just filed HADOOP-8525 for this.  If you want to work on it go right 
ahead, if not I will try to throw together a quick patch sometime this week.

and +1 for this patch.  Looks good to me.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-06-21 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398837#comment-13398837
 ] 

Robert Joseph Evans commented on HADOOP-8525:
-

I think the only big change here is a small addition to the XML format, so that 
we can add in a list of previous files it was read from and remove the comment 
that more or less gives the same information.  Internally the Map storing this 
into will have to now point to a list instead of a single value.  It looks 
fairly straight forward.  

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial

 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398849#comment-13398849
 ] 

Bikas Saha commented on HADOOP-8486:


lgtm. thanks for the quick fix.

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2012-06-21 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398854#comment-13398854
 ] 

Tom White commented on HADOOP-8522:
---

The patch looks good to me. A small suggestion regarding the test: why not use 
GzipCodec to decompress too? Then you can have an assert in the test, and it 
checks roundtripping using Hadoop APIs. 

 That may be one reason this bug has lain dormant for so long with the 
 non-native implementation (serious users tend to use the native libs).

Also, Hadoop has only supported concatenated gzip since HADOOP-6835, so files 
that had corrupt later members would have been ignored by versions of Hadoop 
prior to this.


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used
 

 Key: HADOOP-8522
 URL: https://issues.apache.org/jira/browse/HADOOP-8522
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Mike Percy
 Attachments: HADOOP-8522-2a.patch


 ResetableGzipOutputStream creates invalid gzip files when finish() and 
 resetState() are used. The issue is that finish() flushes the compressor 
 buffer and writes the gzip CRC32 + data length trailer. After that, 
 resetState() does not repeat the gzip header, but simply starts writing more 
 deflate-compressed data. The resultant files are not readable by the Linux 
 gunzip tool. ResetableGzipOutputStream should write valid multi-member gzip 
 files.
 The gzip format is specified in [RFC 
 1952|https://tools.ietf.org/html/rfc1952].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


   Resolution: Fixed
Fix Version/s: 2.0.1-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Robert. Committed to branch-2 and trunk.

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8524:


Target Version/s:   (was: 2.0.1-alpha, 3.0.0)

 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398877#comment-13398877
 ] 

Hudson commented on HADOOP-8524:


Integrated in Hadoop-Common-trunk-Commit #2378 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2378/])
HADOOP-8524. Allow users to get source of a Configuration parameter. 
(harsh) (Revision 1352689)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-8409) Fix TestCommandLineJobSubmission and TestGenericOptionsParser to work for windows

2012-06-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha resolved HADOOP-8409.


Resolution: Fixed

 Fix TestCommandLineJobSubmission and TestGenericOptionsParser to work for 
 windows
 -

 Key: HADOOP-8409
 URL: https://issues.apache.org/jira/browse/HADOOP-8409
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, test, util
Affects Versions: 1.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: HADOOP-8409-branch-1-win.2.patch, 
 HADOOP-8409-branch-1-win.3.patch, HADOOP-8409-branch-1-win.4.patch, 
 HADOOP-8409-branch-1-win.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 There are multiple places in prod and test code where Windows paths are not 
 handled properly. From a high level this could be summarized with:
 1. Windows paths are not necessarily valid DFS paths (while Unix paths are)
 2. Windows paths are not necessarily valid URIs (while Unix paths are)
 #1 causes a number of tests to fail because they implicitly assume that local 
 paths are valid DFS paths (by extracting the DFS test path from for example 
 test.build.data property)
 #2 causes issues when URIs are directly created on path strings passed in by 
 the user

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398881#comment-13398881
 ] 

Hudson commented on HADOOP-8524:


Integrated in Hadoop-Hdfs-trunk-Commit #2448 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2448/])
HADOOP-8524. Allow users to get source of a Configuration parameter. 
(harsh) (Revision 1352689)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1339#comment-1339
 ] 

Daryn Sharp commented on HADOOP-8518:
-

I'm not up to speed on spnego, so could you educate me as to why the client 
needs to canonicalize the remote host?

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398895#comment-13398895
 ] 

Sanjay Radia commented on HADOOP-8486:
--

This patch also directly applies to trunk - can you please provide a trunk 
patch.

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398901#comment-13398901
 ] 

Kihwal Lee commented on HADOOP-8518:


bq. I'm not up to speed on spnego, so could you educate me as to why the client 
needs to canonicalize the remote host? HADOOP-8043 might offer you some 
background/hint.

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398918#comment-13398918
 ] 

Alejandro Abdelnur commented on HADOOP-8518:


@Daryn, the hadoop-auth SPNEGO client creates a token with HTTP/HOST as 
server principal where HOST is the host specifid in the URL. If you are using 
a hostname alias, then the resolved server principal will be HTTPHOST-alias. 
Then problem is that the KDC will not recognize this principal because it does 
not exist. This means that the hadoop-auth SPNEGO client should find out what 
is the real hostname to use as HOST. Hope this clarifies.

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules

2012-06-21 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398929#comment-13398929
 ] 

Rohini Palaniswamy commented on HADOOP-8518:


Not as a fallback, but as a override. What I had done was to get the canonical 
name of the host from the URL to connect to and use it to construct the service 
principal's host part (HTTP/canonicalhostname). If a specific Configuration 
property was set as to what the FQDN of the service principal should be, then 
used that instead of constructing the service principal from the url. The 
override would help if the service prinicipal was in a different realm than the 
default realm too. You can have a separate specif config parameter to specify 
the service principal override and use the rule mapping configuration.

 SPNEGO client side should use KerberosName rules
 

 Key: HADOOP-8518
 URL: https://issues.apache.org/jira/browse/HADOOP-8518
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 1.1.0, 2.0.1-alpha


 currently KerberosName is used only on the server side to resolve the client 
 name, we should use it on the client side as well to resolve the server name 
 before getting the kerberos ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8524) Allow users to get source of a Configuration parameter

2012-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398936#comment-13398936
 ] 

Hudson commented on HADOOP-8524:


Integrated in Hadoop-Mapreduce-trunk-Commit #2397 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2397/])
HADOOP-8524. Allow users to get source of a Configuration parameter. 
(harsh) (Revision 1352689)

 Result = FAILURE
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1352689
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


 Allow users to get source of a Configuration parameter
 --

 Key: HADOOP-8524
 URL: https://issues.apache.org/jira/browse/HADOOP-8524
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8524.patch, HADOOP-8524.patch, HADOOP-8524.patch


 When we load the various XMLs via the Configuration class, the source of the 
 XML file (filename) is usually kept in the Configuration class but not 
 exposed programmatically. It is presently exposed as comments such as Loaded 
 from mapred-site.xml in the XML dump/serialization but can't be accessed 
 otherwise (Via the Configuration API).
 For debugging/etc. purposes, it may be useful to expose this safely (such as 
 an API for where did this property come from? queries for a specific 
 property, via an API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HADOOP-8519) ERROR level log message should probably be changed to INFO

2012-06-21 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-8519:
-

Assignee: Andy Isaacson

 ERROR level log message should probably be changed to INFO
 --

 Key: HADOOP-8519
 URL: https://issues.apache.org/jira/browse/HADOOP-8519
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 0.20.2
 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Reporter: Jeff Lord
Assignee: Andy Isaacson

 Datanode service is logging java.net.SocketTimeoutException at ERROR level.
 This message indicates that the datanode is not able to send data to the 
 client because the client has stopped reading. This message is not really a 
 cause for alarm and should be INFO level.
 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
 DatanodeRegistration(x.x.x.x:50010, 
 storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, 
 ipcPort=50020):DataXceiver
 java.net.SocketTimeoutException: 48 millis timeout while waiting for 
 channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
 local=/10.10.120.67:50010 remote=/10.10.120.67:59282]
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at 
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
 at 
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)

2012-06-21 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8519:
--

Target Version/s: 2.0.1-alpha
  Issue Type: Bug  (was: Task)
 Summary: idle client socket triggers DN ERROR log (should be INFO 
or DEBUG)  (was: ERROR level log message should probably be changed to INFO)

Simple to reproduce, just do
{code}
FileSystem fs = FileSystem.get(new Configuration());
DataInputStream f = fs.open(new Path(args[0]));
f.read(new byte[1024]);
Thread.sleep(500 * 1000);
{code}

The DN eventually gives up on the client socket and ERRORs the DataXceiver 
SocketTimeoutException.  This is definitely not ERROR worthy, I would say DEBUG 
or INFO at most.

 idle client socket triggers DN ERROR log (should be INFO or DEBUG)
 --

 Key: HADOOP-8519
 URL: https://issues.apache.org/jira/browse/HADOOP-8519
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.2
 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Reporter: Jeff Lord
Assignee: Andy Isaacson

 Datanode service is logging java.net.SocketTimeoutException at ERROR level.
 This message indicates that the datanode is not able to send data to the 
 client because the client has stopped reading. This message is not really a 
 cause for alarm and should be INFO level.
 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
 DatanodeRegistration(x.x.x.x:50010, 
 storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, 
 ipcPort=50020):DataXceiver
 java.net.SocketTimeoutException: 48 millis timeout while waiting for 
 channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
 local=/10.10.120.67:50010 remote=/10.10.120.67:59282]
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at 
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
 at 
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)

2012-06-21 Thread Jeff Lord (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399028#comment-13399028
 ] 

Jeff Lord commented on HADOOP-8519:
---

Does this look right?

---
 .../hadoop/hdfs/server/datanode/DataXceiver.java   |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
index 6c280d8..2fb5878 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
@@ -241,7 +241,7 @@ class DataXceiver extends Receiver implements Runnable {
   } catch(IOException e) {
 String msg = opReadBlock  + block +  received exception  + e; 
 LOG.info(msg);
-sendResponse(s, ERROR, msg, dnConf.socketWriteTimeout);
+sendResponse(s, INFO, msg, dnConf.socketWriteTimeout);
 throw e;
   }
   
-- 

 idle client socket triggers DN ERROR log (should be INFO or DEBUG)
 --

 Key: HADOOP-8519
 URL: https://issues.apache.org/jira/browse/HADOOP-8519
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.2
 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Reporter: Jeff Lord
Assignee: Andy Isaacson

 Datanode service is logging java.net.SocketTimeoutException at ERROR level.
 This message indicates that the datanode is not able to send data to the 
 client because the client has stopped reading. This message is not really a 
 cause for alarm and should be INFO level.
 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
 DatanodeRegistration(x.x.x.x:50010, 
 storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, 
 ipcPort=50020):DataXceiver
 java.net.SocketTimeoutException: 48 millis timeout while waiting for 
 channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
 local=/10.10.120.67:50010 remote=/10.10.120.67:59282]
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at 
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
 at 
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8486) Resource leak - Close the open resource handles (File handles) before throwing the exception from the SequenceFile constructor

2012-06-21 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399036#comment-13399036
 ] 

Kanna Karanam commented on HADOOP-8486:
---

@Sanjay - Trunk has a fix for it already with lot of other code changes so we 
may not require a patch for it. Please let me know if you see I am missing 
something. Thanks

 Resource leak - Close the open resource handles (File handles) before 
 throwing the exception from the SequenceFile constructor
 --

 Key: HADOOP-8486
 URL: https://issues.apache.org/jira/browse/HADOOP-8486
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, io
Affects Versions: 1.0.2, 1-win
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 1-win

 Attachments: HADOOP-8486-branch-1-win-(2).patch, 
 HADOOP-8486-branch-1-win-(3).patch, HADOOP-8486-branch-1-win-(4).patch, 
 HADOOP-8486-branch-1-win-(5).patch, HADOOP-8486-branch-1-win.patch


 I noticed this problem while I am working on porting HIVE to work on windows. 
 Hive is attempting to create this class object to validate the file format 
 and end up with resource leak. Because of this leak, we can’t move, rename or 
 delete the files on windows when there is an open file handle whereas in UNIX 
 we can perform all these operation with no issues even with open file handles.
 Please suggest me if you similar issues in any other places.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)

2012-06-21 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399063#comment-13399063
 ] 

Andy Isaacson commented on HADOOP-8519:
---

The error is a little different on 2.0:
{code}
2012-06-21 18:28:36,251 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
BlockSender.sendChunks() exception: 
java.net.SocketTimeoutException: 48 millis timeout while waiting for 
channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
local=/192.168.122.87:50010 remote=
/192.168.122.3:51436]
at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:679)
2012-06-21 18:28:36,252 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/192.168.122.87:50010, dest: /192.168.122.3:51436, bytes: 53697024, op: 
HDFS_READ, cliID: D
FSClient_NONMAPREDUCE_-1988072026_1, offset: 0, srvID: 
DS-706541979-127.0.1.1-50010-1339724203679, blockid: 
BP-882164591-127.0.1.1-133972395:blk_-1935427635464392086_1010, duration: 
482450603444
2012-06-21 18:28:36,252 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(192.168.122.87, 
storageID=DS-706541979-127.0.1.1-50010-1339724203679, infoPort=50075, i
pcPort=50020, 
storageInfo=lv=-40;cid=CID-02666c8e-a05e-480f-94df-f5226414f260;nsid=1569472409;c=0):Got
 exception while serving 
BP-882164591-127.0.1.1-133972395:blk_-19354276354643920
86_1010 to /192.168.122.3:51436
java.net.SocketTimeoutException: 48 millis timeout while waiting for 
channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
local=/192.168.122.87:50010 remote=
/192.168.122.3:51436]
at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:679)
2012-06-21 18:28:36,253 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
ubu-cdh-3:50010:DataXceiver error processing READ_BLOCK operation  src: 
/192.168.122.3:51436 dest: /192.168.122.87:50010
java.net.SocketTimeoutException: 48 millis timeout while waiting for 
channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
local=/192.168.122.87:50010 remote=/192.168.122.3:51436]
at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:164)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:203)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:482)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:634)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:679)
{code}

 idle client socket triggers DN ERROR log (should be INFO or DEBUG)
 --

 Key: HADOOP-8519
 URL: 

[jira] [Commented] (HADOOP-8519) idle client socket triggers DN ERROR log (should be INFO or DEBUG)

2012-06-21 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13399070#comment-13399070
 ] 

Andy Isaacson commented on HADOOP-8519:
---

Super not obvious, the ERROR is coming from 
hdfs/server/datanode/BlockSender.java following horrifyingness:
{code}
489 } catch (IOException e) {
490   /* Exception while writing to the client. Connection closure from
491* the other end is mostly the case and we do not care much about
492* it. But other things can go wrong, especially in transferTo(),
493* which we do not want to ignore.
494*
495* The message parsing below should not be considered as a good
496* coding example. NEVER do it to drive a program logic. NEVER.
497* It was done here because the NIO throws an IOException for EPIPE.
498*/
499   String ioem = e.getMessage();
500   if (!ioem.startsWith(Broken pipe)  !ioem.startsWith(Connection 
reset)) {
501 LOG.error(BlockSender.sendChunks() exception: , e);
502   }
503   throw ioeToSocketException(e);
504 }
{code}

 idle client socket triggers DN ERROR log (should be INFO or DEBUG)
 --

 Key: HADOOP-8519
 URL: https://issues.apache.org/jira/browse/HADOOP-8519
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.2
 Environment: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Reporter: Jeff Lord
Assignee: Andy Isaacson

 Datanode service is logging java.net.SocketTimeoutException at ERROR level.
 This message indicates that the datanode is not able to send data to the 
 client because the client has stopped reading. This message is not really a 
 cause for alarm and should be INFO level.
 2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
 DatanodeRegistration(x.x.x.x:50010, 
 storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, 
 ipcPort=50020):DataXceiver
 java.net.SocketTimeoutException: 48 millis timeout while waiting for 
 channel to be ready for write. ch : java.nio.channels.SocketChannel[connected 
 local=/10.10.120.67:50010 remote=/10.10.120.67:59282]
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at 
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
 at 
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira