[jira] [Commented] (HADOOP-8372) normalizeHostName() in NetUtils is not working properly in resolving a hostname starting with a numeric character
[ https://issues.apache.org/jira/browse/HADOOP-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271482#comment-13271482 ] Kihwal Lee commented on HADOOP-8372: {noformat} A host name (label) MUST NOT consist of all numeric values. {noformat} They are invalid host names, so the tests need to be fixed. I suggest you keep the patch as is and fix the broken tests. If a user puts an invalid host name in the config, it will fail eventually. If this method were to return the string as is in such cases, it would get translated the same way somewhere down the road. We could add full validation and make it blow up in this method, but I don't think the gain is worth the complexity of the check. normalizeHostName() in NetUtils is not working properly in resolving a hostname starting with a numeric character Key: HADOOP-8372 URL: https://issues.apache.org/jira/browse/HADOOP-8372 Project: Hadoop Common Issue Type: Bug Components: io, util Affects Versions: 1.0.0, 0.23.0 Reporter: Junping Du Assignee: Junping Du Attachments: HADOOP-8372.patch A valid host name can start with a numeric character (see RFC 952, RFC 1123, or http://www.zytrax.com/books/dns/apa/names.html), so it is possible in a production environment that users name their Hadoop nodes 1hosta, 2hostb, etc. But normalizeHostName() will treat such a hostname as an IP address and return it directly rather than resolving the real IP address. These nodes will fail to get the correct network topology if the topology script/TableMapping only contains their IPs (without hostnames).
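To make the bug concrete, here is a minimal sketch of the heuristic being discussed; it is illustrative, not the exact NetUtils source. The broken variant short-circuits on any leading digit, while the fixed variant only skips resolution for a full dotted-quad IPv4 literal.
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostNameNormalizer {
  // Flawed heuristic: any name starting with a digit is assumed to be an
  // IP address, so a valid hostname like "1hosta" is returned unresolved.
  public static String normalizeBroken(String name) {
    if (Character.isDigit(name.charAt(0))) {
      return name;
    }
    return resolve(name);
  }

  // Sketch of the fix: skip resolution only when the whole string is a
  // dotted-quad IPv4 literal, not merely when it starts with a digit.
  public static String normalizeFixed(String name) {
    if (name.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
      return name;
    }
    return resolve(name);
  }

  private static String resolve(String name) {
    try {
      return InetAddress.getByName(name).getHostAddress();
    } catch (UnknownHostException e) {
      return name; // leave unresolvable names as-is
    }
  }
}
{code}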
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276083#comment-13276083 ] Kihwal Lee commented on HADOOP-8240: We need this feature to make data copying and verification work across clusters with different configurations. I would appreciate any feedback. h4. Design Choices # *Add a new create method to FileSystem that allows the checksum type to be specified.* FileSystem#create() already allows specifying bytesPerChecksum. The new create method may accept a DataChecksum object. Users can use the existing DataChecksum.newDataChecksum(int type, int bytesPerChecksum) to create one. Users who want to specify a non-default type likely want to control bytesPerChecksum as well. # *Add checksum types to CreateFlags.* This approach minimizes interface changes, but may not be the most intuitive/consistent way. # *Add a method to FSDataOutputStream and DFSOutputStream to allow users to override default checksum parameters.* This method should fail if data has already been written. This is sort of like ioctl. If there are other tunables we want to support, we could generalize the API. But changing internal parameters (not encapsulated data) of an object at run time doesn't go well with typical Java semantics and may cause confusion, so we need to be careful about this. h4. Other previously discussed approaches # *Setting dfs.checksum.type.* The FileSystem cache causes it to stay the same after the creation of the DFSClient. Also, the conf is shared, so it can have unforeseen side effects. # *Disable the FileSystem cache.* Create a new Configuration and set dfs.checksum.type. Without the cache, the memory bloat is too much. # *Use the conf as part of the key in the FileSystem cache, in addition to UGI and scheme + authority.* Something along this line may work. A shallow comparison may not be enough; do we create a special hashCode/equals to make it safer? There will be memory bloat, but how much? It is still up to users to manage different configurations, and this may be more prone to mistakes. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: hadoop-8240.patch Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
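For illustration, design choice 1 might look like this from the caller's side. The create() overload taking a DataChecksum is the proposal and does not exist; DataChecksum.newDataChecksum(int type, int bytesPerChecksum) is the existing factory the comment refers to.
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.DataChecksum;

public class CreateWithChecksumSketch {
  // Hypothetical caller under design choice 1: build a DataChecksum with
  // the existing factory, then pass it to a proposed create() overload.
  static FSDataOutputStream createWithCrc32c(FileSystem fs, Path path)
      throws IOException {
    DataChecksum sum =
        DataChecksum.newDataChecksum(DataChecksum.CHECKSUM_CRC32C, 512);
    return fs.create(path, true, 4096, sum); // proposed overload, not a real API
  }
}
{code}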
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276270#comment-13276270 ] Kihwal Lee commented on HADOOP-8240: Thanks, Nicholas. I think what you suggested makes sense. I haven't thought about the FileContext side of the changes, though. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: hadoop-8240.patch Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Commented] (HADOOP-8518) SPNEGO client side should use KerberosName rules
[ https://issues.apache.org/jira/browse/HADOOP-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398901#comment-13398901 ] Kihwal Lee commented on HADOOP-8518: bq. I'm not up to speed on spnego, so could you educate me as to why the client needs to canonicalize the remote host? HADOOP-8043 might offer you some background/hints. SPNEGO client side should use KerberosName rules Key: HADOOP-8518 URL: https://issues.apache.org/jira/browse/HADOOP-8518 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 1.1.0, 2.0.1-alpha Currently KerberosName is used only on the server side to resolve the client name; we should use it on the client side as well to resolve the server name before getting the Kerberos ticket.
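A small sketch of the idea, assuming the hadoop-auth KerberosName API; the rule set below is just the built-in DEFAULT rule, used here as an example, not a recommended configuration.
{code:java}
import org.apache.hadoop.security.authentication.util.KerberosName;

public class ClientSideNameResolution {
  // Illustrative only: run the server principal through the same
  // KerberosName rules the server uses, before requesting a ticket.
  public static String resolveServerName(String serverPrincipal)
      throws Exception {
    KerberosName.setRules("DEFAULT"); // example rule set
    return new KerberosName(serverPrincipal).getShortName();
  }
}
{code}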
[jira] [Created] (HADOOP-8548) test-patch.sh shows an incorrect link in Jenkins builds
Kihwal Lee created HADOOP-8548: -- Summary: test-patch.sh shows an incorrect link in Jenkins builds Key: HADOOP-8548 URL: https://issues.apache.org/jira/browse/HADOOP-8548 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Kihwal Lee Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/$(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code}
[jira] [Updated] (HADOOP-8548) test-patch.sh shows an incorrect link in Jenkins builds
[ https://issues.apache.org/jira/browse/HADOOP-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8548: --- Description: Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/ 2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/ $(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code} was: Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/$(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code} test-patch.sh shows an incorrect link in Jenkins builds -- Key: HADOOP-8548 URL: https://issues.apache.org/jira/browse/HADOOP-8548 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Kihwal Lee Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/ 2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/ $(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code}
[jira] [Updated] (HADOOP-8548) test-patch.sh shows an incorrect link in Jenkins builds
[ https://issues.apache.org/jira/browse/HADOOP-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8548: --- Assignee: Kihwal Lee Status: Patch Available (was: Open) test-patch.sh shows an incorrect link in Jenkins builds -- Key: HADOOP-8548 URL: https://issues.apache.org/jira/browse/HADOOP-8548 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: hadoop-8548.patch Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/ 2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/ $(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code}
[jira] [Updated] (HADOOP-8548) test-patch.sh shows an incorrect link in Jenkins builds
[ https://issues.apache.org/jira/browse/HADOOP-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8548: --- Attachment: hadoop-8548.patch test-patch.sh shows an incorrect link in Jenkins builds -- Key: HADOOP-8548 URL: https://issues.apache.org/jira/browse/HADOOP-8548 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Kihwal Lee Attachments: hadoop-8548.patch Precommit builds show an incorrect link for javac warnings. {noformat} Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/ 2539//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt {noformat} Note that 'trunk' appears twice. Do we need $(basename $BASEDIR) in the following? Other places don't have it. {code:title=test-patch.sh} JIRA_COMMENT_FOOTER="Javac warnings: $BUILD_URL/artifact/trunk/ $(basename $BASEDIR)/patchprocess/diffJavacWarnings.txt" {code}
[jira] [Commented] (HADOOP-7753) Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class
[ https://issues.apache.org/jira/browse/HADOOP-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412008#comment-13412008 ] Kihwal Lee commented on HADOOP-7753: bq. btw, which jar pulled by webhdfs into Hadoop 1.x caused problems? we should try to fix that too. We've seen users having trouble with jersey, which was brought in by webhdfs. For 1.x it is already broken, like Eli said. We tend to refrain from introducing new dependencies and to be conservative about upping versions. Since users have adapted to the current broken status, changing them (add/remove/upgrade) causes a lot of pain. :( Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class Key: HADOOP-7753 URL: https://issues.apache.org/jira/browse/HADOOP-7753 Project: Hadoop Common Issue Type: Sub-task Components: io, native, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: HADOOP-7753.branch-1.patch, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt, hadoop-7753.txt This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also implements a ReadaheadPool class for future use from HDFS and MapReduce.
[jira] [Created] (HADOOP-8611) Allow fall-back to the shell-based implementation when JNI-based users-group mapping fails
Kihwal Lee created HADOOP-8611: -- Summary: Allow fall-back to the shell-based implementation when JNI-based users-group mapping fails Key: HADOOP-8611 URL: https://issues.apache.org/jira/browse/HADOOP-8611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 1.0.3 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 1.1.1, 0.23.3, 3.0.0, 2.2.0-alpha When the JNI-based users-group mapping is enabled, the process/command will fail if the native library, libhadoop.so, cannot be found. This mostly happens on the client side, where users may use Hadoop programmatically. Instead of failing, falling back to the shell-based implementation would be desirable. Depending on how the cluster is configured, use of the native netgroup mapping cannot be substituted by the shell-based default. For this reason, this behavior must be configurable, with the default being disabled.
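A minimal sketch of the fall-back idea, assuming the existing provider classes; the selection logic itself is illustrative, not the committed implementation.
{code:java}
import org.apache.hadoop.security.GroupMappingServiceProvider;
import org.apache.hadoop.security.JniBasedUnixGroupsMapping;
import org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
import org.apache.hadoop.util.NativeCodeLoader;

public class FallbackGroupsMapping {
  // Use the JNI-based provider when libhadoop.so loaded successfully;
  // otherwise fall back to the fork-based shell provider.
  public static GroupMappingServiceProvider choose() {
    if (NativeCodeLoader.isNativeCodeLoaded()) {
      return new JniBasedUnixGroupsMapping();
    }
    return new ShellBasedUnixGroupsMapping();
  }
}
{code}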
[jira] [Commented] (HADOOP-8599) Non empty response from FileSystem.getFileBlockLocations when asking for data beyond the end of file
[ https://issues.apache.org/jira/browse/HADOOP-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420983#comment-13420983 ] Kihwal Lee commented on HADOOP-8599: I think TestCombineFileInputFormat.testForEmptyFile started failing after this. The split size on an empty input file used to be 1, but it's now 0. Non empty response from FileSystem.getFileBlockLocations when asking for data beyond the end of file - Key: HADOOP-8599 URL: https://issues.apache.org/jira/browse/HADOOP-8599 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.3, 0.23.1, 2.0.0-alpha Reporter: Andrey Klochkov Assignee: Andrey Klochkov Fix For: 0.23.3, 3.0.0, 2.2.0-alpha Attachments: HADOOP-8859-branch-0.23.patch When FileSystem.getFileBlockLocations(file, start, len) is called with the start argument equal to the file size, the response is not empty. There is a test, TestGetFileBlockLocations.testGetFileBlockLocations2, which uses randomly generated start and len arguments when calling FileSystem.getFileBlockLocations, and the test fails randomly (when the generated start value equals the file size).
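A quick check of the fixed contract described in this issue: asking for block locations starting at the file size should return an empty array. A sketch using the standard FileSystem API:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BeyondEofLocations {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus status = fs.getFileStatus(new Path(args[0]));
    // start == file size, i.e. one byte past the last valid offset
    BlockLocation[] locations =
        fs.getFileBlockLocations(status, status.getLen(), 10);
    System.out.println("locations returned: " + locations.length); // expect 0
  }
}
{code}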
[jira] [Commented] (HADOOP-8599) Non empty response from FileSystem.getFileBlockLocations when asking for data beyond the end of file
[ https://issues.apache.org/jira/browse/HADOOP-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420985#comment-13420985 ] Kihwal Lee commented on HADOOP-8599: MAPREDUCE-4470 has been filed. Non empty response from FileSystem.getFileBlockLocations when asking for data beyond the end of file - Key: HADOOP-8599 URL: https://issues.apache.org/jira/browse/HADOOP-8599 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.3, 0.23.1, 2.0.0-alpha Reporter: Andrey Klochkov Assignee: Andrey Klochkov Fix For: 0.23.3, 3.0.0, 2.2.0-alpha Attachments: HADOOP-8859-branch-0.23.patch When FileSystem.getFileBlockLocations(file, start, len) is called with the start argument equal to the file size, the response is not empty. There is a test, TestGetFileBlockLocations.testGetFileBlockLocations2, which uses randomly generated start and len arguments when calling FileSystem.getFileBlockLocations, and the test fails randomly (when the generated start value equals the file size).
[jira] [Commented] (HADOOP-8633) Interrupted FsShell copies may leave tmp files
[ https://issues.apache.org/jira/browse/HADOOP-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426044#comment-13426044 ] Kihwal Lee commented on HADOOP-8633: Looks good to me. I like the use of {{TargetFileSystem}}. Interrupted FsShell copies may leave tmp files -- Key: HADOOP-8633 URL: https://issues.apache.org/jira/browse/HADOOP-8633 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HADOOP-8633.patch Interrupting a copy, e.g. via SIGINT, may cause tmp files to not be removed. If the user is copying large files, the remnants will eat into the user's quota.
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433606#comment-13433606 ] Kihwal Lee commented on HADOOP-8240: The new patch implements ChecksumOpt and updates the API in both FileSystem and FileContext. This patch also includes: - related changes in HDFS. - a new common test for ChecksumOpt and another test that is DFS-specific. - an updated MR test case due to the API change. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433615#comment-13433615 ] Kihwal Lee commented on HADOOP-8240: The branch-0.23 patch will be uploaded once the trunk/branch-2 patch is reviewed. The patch will be slightly different due to differences in the use of protocol buffers and encryption support. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435745#comment-13435745 ] Kihwal Lee commented on HADOOP-8240: bq. How about combining them? +1 for sure. I will update my patch to make use of HADOOP-8700. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Status: Patch Available (was: Open) Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Attachment: hadoop-8240-trunk-branch2.patch.txt Oops. The new test file was dropped while redoing the patch. Here is a new patch. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435852#comment-13435852 ] Kihwal Lee commented on HADOOP-8240: Also dropped HADOOP-8239 as a dependency. HDFS-3177 is still dependent on HADOOP-8239 and this jira. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Attachment: hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt The javac and javadoc warnings are fixed in the new patch. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-after-hadoop-8240.patch.txt Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-before-hadoop-8240.patch.txt The attached patch updates MD5MD5CRC32FileChecksum to handle the checksum type. The change is backward compatible. E.g. getFileChecksum() calls between 1.x and 2.x clusters work over hftp or webhdfs. The actual implementation of the updated call is in HDFS-3177. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436001#comment-13436001 ] Kihwal Lee commented on HADOOP-8239: bq. -1 tests included. The patch doesn't appear to include any new or modified tests. No new test is added in this patch. Part of the compatibility is checked by existing tests exercising getFileChecksum(). More tests are coming with HDFS-3177. {quote} -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverController {quote} This is a known issue. See HADOOP-8591. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Attachment: hadoop-8240-trunk-branch2.patch.txt Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437263#comment-13437263 ] Kihwal Lee commented on HADOOP-8240: The new patch addresses the review comments, except {{CHECKSUM_UNINIT}}/{{Type.UNINIT}}. If an unknown checksum type is read from a conf, we fall back to the {{NULL}} type. But when a user only specifies {{bytesPerChecksum}} through an old API, the configured type must be used. For this reason an unspecified checksum type cannot be treated as the {{NULL}} type, because that would mean disabling checksums. {{UNINIT}} is confusing, since it really means use the default. So I changed it to {{CHECKSUM_DEFAULT}}/{{Type.DEFAULT}}. Users may explicitly set the checksum type to this to let the system pick up the configured type, in addition to the old API scenario. {{bytesPerChecksum}} works similarly when set to -1. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an HDFS-specific one. The current proposal is to use CreateFlag.
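A sketch of the resolution rule this comment describes (an illustrative helper, not the actual ChecksumOpt code): DEFAULT and -1 mean "fall back to the configured value", while NULL explicitly disables checksums.
{code:java}
import org.apache.hadoop.util.DataChecksum;

public class ChecksumOptResolution {
  // Type.DEFAULT defers to the configured type; Type.NULL disables
  // checksums and is therefore never a stand-in for "unspecified".
  static DataChecksum.Type resolveType(DataChecksum.Type userType,
                                       DataChecksum.Type confType) {
    return userType == DataChecksum.Type.DEFAULT ? confType : userType;
  }

  // -1 plays the same "use the configured value" role for bytesPerChecksum.
  static int resolveBytesPerChecksum(int userBpc, int confBpc) {
    return userBpc == -1 ? confBpc : userBpc;
  }
}
{code}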
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437271#comment-13437271 ] Kihwal Lee commented on HADOOP-8239: I think XML is fine. XML parsing is done at the document level, so we can safely detect or ignore the existence of the extra parameter and not worry about the size of the data. I tried calling getFileChecksum() over Hftp between a patched 0.23 cluster and a 1.0.x cluster, and it worked fine both ways. The change you suggested does not solve the whole problem. The magic number is like a simple binary length field. Presence/absence of it tells you how much data you need to read. So the read side of the patched version works even when reading from an unpatched version, but it's not true the other way around: the unpatched version will always leave something unread in the stream. XML is nice in that it inherently has begin and end markers and is not sensitive to size changes. Since JsonUtil depends on these serialization/deserialization methods, I don't think it can obtain bidirectional compatibility by modifying only one side. If it had used XML and did not do the length check, it would have no such problem. A fully JSON-ized approach could have worked as well. One approach I can think of is to leave the current readFields()/write() methods unchanged. I think only WebHdfs is using them, and if that is true, we can make WebHdfs actually send and receive everything in JSON format and keep the current bytes JSON field as is. When it does not find the new fields in an old data source, it can do the old deserialization on bytes. Similarly, it should send everything in individual JSON fields as well as the old serialized bytes. It may be better to move the JSON util methods to MD5MD5CRC32FileChecksum.java, since they will have to know the internals of MD5MD5CRC32FileChecksum. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
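A sketch of the dual-format reading idea described above, with a plain Map standing in for the parsed JSON object; the field names and the legacy-bytes helper are illustrative assumptions, not the actual JsonUtil code.
{code:java}
import java.util.Map;

public class ChecksumJsonCompat {
  // Prefer the new explicit field; fall back to the legacy serialized
  // "bytes" field when talking to an old data source.
  static String algorithmOf(Map<String, Object> json) {
    Object algorithm = json.get("algorithm"); // new explicit field
    if (algorithm != null) {
      return (String) algorithm;
    }
    return legacyAlgorithmFromBytes((String) json.get("bytes"));
  }

  // Placeholder for the existing byte-level readFields() deserialization.
  private static String legacyAlgorithmFromBytes(String hexBytes) {
    throw new UnsupportedOperationException("legacy decoding sketch");
  }
}
{code}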
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437369#comment-13437369 ] Kihwal Lee commented on HADOOP-8239: bq. It may be better to move the JSON util methods to MD5MD5CRC32FileChecksum.java, since they will have to know the internals of MD5MD5CRC32FileChecksum. On second thought, the JsonUtil is in the o.a.h.hdfs.web package, so I will leave it there. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-before-hadoop-8240.patch.txt Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-after-hadoop-8240.patch.txt Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437413#comment-13437413 ] Kihwal Lee commented on HADOOP-8239: Just posted a new patch that makes getFileChecksum() interoperable. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437980#comment-13437980 ] Kihwal Lee commented on HADOOP-8239: I think adding a new class is a good idea. Since DFS.getFileChecksum is expected to return MD5MD5CRC32FileChecksum in a lot of places, subclassing MD5MD5CRC32FileChecksum for each variant could work. We can regard CRC32 in MD5MD5CRC32FileChecksum as a generic term for any 32-bit CRC algorithm. At least that is the case in current 2.0/trunk. If we go with this, subclassing MD5MD5CRC32FileChecksum for each variant makes sense. The following is what I am thinking: *In MD5MD5CRC32FileChecksum* The constructor sets crcType to DataChecksum.Type.CRC32
{code}
/** getAlgorithmName() will use it to construct the name */
public DataChecksum.Type getCrcType() {
  return crcType;
}

public ChecksumOpt getChecksumOpt() {
  return new ChecksumOpt(getCrcType(), bytesPerCrc);
}
{code}
*Subclass MD5MD5CRC32GzipFileChecksum* The constructor sets crcType to DataChecksum.Type.CRC32 *Subclass MD5MD5CRC32CastagnoliFileChecksum* The constructor sets crcType to DataChecksum.Type.CRC32C *Interoperability* - Any existing user/hadoop code that expects MD5MD5CRC32FileChecksum from DFS.getFileChecksum() will continue to work. - Any new code that makes use of the new getChecksumOpt() will work as long as DFSClient#getFileChecksum() creates and returns the right object. This will be done in HDFS-3177; without it, everything will default to CRC32, which is the current behavior of branch-2/trunk. - A newer client calling getFileChecksum() against an old cluster over hftp or webhdfs will work (always CRC32). - An older client calling getFileChecksum() against a newer cluster: if the remote file on the newer cluster is in CRC32, both hftp and webhdfs work. If it is CRC32C or anything else, hftp will have a checksum mismatch. In webhdfs, the client will get an algorithm field that won't match anything the old MD5MD5CRC32FileChecksum can create; WebHdfsFileSystem will generate an IOException ("Algorithm not matched"). I think this is reasonable. What do you think? Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
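If this scheme were adopted, one of the subclasses might look like the sketch below; the constructor shape is assumed from the existing MD5MD5CRC32FileChecksum(bytesPerCRC, crcPerBlock, md5) constructor.
{code:java}
import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
import org.apache.hadoop.io.MD5Hash;
import org.apache.hadoop.util.DataChecksum;

public class MD5MD5CRC32CastagnoliFileChecksum
    extends MD5MD5CRC32FileChecksum {
  public MD5MD5CRC32CastagnoliFileChecksum(int bytesPerCRC,
      long crcPerBlock, MD5Hash md5) {
    super(bytesPerCRC, crcPerBlock, md5);
  }

  @Override
  public DataChecksum.Type getCrcType() {
    // Castagnoli polynomial (CRC32C), as opposed to the gzip CRC32
    return DataChecksum.Type.CRC32C;
  }
}
{code}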
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437988#comment-13437988 ] Kihwal Lee commented on HADOOP-8239: Correction: MD5MD5CRC32FileChecksum#getCrcType() is not needed. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-trunk-branch2.patch.txt The new patch adds a separate class for each checksum type used in MD5MD5CRC32FileChecksum. MD5MD5CRC32FileChecksum has the new getCrcType() and the subclasses override it. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438192#comment-13438192 ] Kihwal Lee commented on HADOOP-8239: Bad patch. I will fix it and re-upload in a bit. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-trunk-branch2.patch.txt Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Status: Patch Available (was: Open) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or httpfs.
[jira] [Updated] (HADOOP-8700) Move the checksum type constants to an enum
[ https://issues.apache.org/jira/browse/HADOOP-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8700: --- Attachment: hadoop-8700-branch-0.23.patch.txt Attaching the patch for branch 0.23. Existing patch has conflicts mainly due to context differences. Move the checksum type constants to an enum --- Key: HADOOP-8700 URL: https://issues.apache.org/jira/browse/HADOOP-8700 Project: Hadoop Common Issue Type: Improvement Components: util Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.23.3, 2.2.0-alpha Attachments: c8700_20120815b.patch, c8700_20120815.patch, hadoop-8700-branch-0.23.patch.txt In DataChecksum, there are constants for crc types, crc names and crc sizes. We should move them to an enum for better coding style. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8700) Move the checksum type constants to an enum
[ https://issues.apache.org/jira/browse/HADOOP-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8700: --- Fix Version/s: 0.23.3 Move the checksum type constants to an enum --- Key: HADOOP-8700 URL: https://issues.apache.org/jira/browse/HADOOP-8700 Project: Hadoop Common Issue Type: Improvement Components: util Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.23.3, 2.2.0-alpha Attachments: c8700_20120815b.patch, c8700_20120815.patch, hadoop-8700-branch-0.23.patch.txt In DataChecksum, there are constants for crc types, crc names and crc sizes. We should move them to an enum for better coding style. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
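To make the refactoring concrete, here is a minimal sketch of what the enum could look like; the numeric ids and the valueOf() helper are assumptions of this sketch, not the shape of the committed patch:
{code}
// Sketch only: replaces DataChecksum's parallel int-constant, name-string
// and size-constant groups with a single enum.
public enum Type {
  NULL(0, 0),     // no checksum
  CRC32(1, 4),    // zlib polynomial
  CRC32C(2, 4);   // Castagnoli polynomial

  public final int id;    // wire-format identifier (assumed values)
  public final int size;  // checksum width in bytes

  Type(int id, int size) {
    this.id = id;
    this.size = size;
  }

  // Lookup helper standing in for the old switch over int constants.
  public static Type valueOf(int id) {
    for (Type t : values()) {
      if (t.id == id) {
        return t;
      }
    }
    throw new IllegalArgumentException("Unknown checksum type id " + id);
  }
}
{code}
Keeping the id, width and name together in one enum means adding a new checksum type can no longer leave the three old constant groups out of sync.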
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Attachment: hadoop-8240-branch-0.23-alone.patch.txt An equivalent patch for branch-0.23 is attached. It has a dependency on HADOOP-8700. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 2.2.0-alpha Attachments: hadoop-8240-branch-0.23-alone.patch.txt, hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also checksum-related API is at Filesystem-level, so we prefer something at that level, not hdfs-specific one. Current proposal is to use CreatFlag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8240: --- Fix Version/s: 0.23.3 Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 2.2.0-alpha Attachments: hadoop-8240-branch-0.23-alone.patch.txt, hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also checksum-related API is at Filesystem-level, so we prefer something at that level, not hdfs-specific one. Current proposal is to use CreatFlag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
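As a rough illustration of the feature tracked here, a hedged sketch of a create() call that passes checksum parameters per file; the ChecksumOpt-style argument and the exact overload are assumptions of the sketch rather than the final committed API:
{code}
// Sketch: create one file with CRC32C and 512 bytes per checksum, overriding
// the cluster default for this stream only. Signature is illustrative.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Options.ChecksumOpt ckOpt =
    new Options.ChecksumOpt(DataChecksum.Type.CRC32C, 512);
FSDataOutputStream out = fs.create(
    new Path("/user/alice/part-00000"),          // hypothetical path
    FsPermission.getFileDefault(),
    EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE),
    4096,                                        // buffer size
    (short) 3,                                   // replication
    128L * 1024 * 1024,                          // block size
    null,                                        // progress callback
    ckOpt);                                      // per-create checksum override
out.close();
{code}
Passing the override at create() time sidesteps the FileSystem-cache problem entirely: nothing in the shared Configuration or cached FileSystem object has to change.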
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438371#comment-13438371 ] Kihwal Lee commented on HADOOP-8239: bq. -1 tests included. The patch doesn't appear to include any new or modified tests. Additional test cases will be in HDFS-3177. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or httpfs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438456#comment-13438456 ] Kihwal Lee commented on HADOOP-8239: bq. MD5MD5CRC32GzipFileChecksum and MD5MD5CRC32CastagnoliFileChecksum should not have the following fields. The last patch is supposed to fix this, but the files were not added. Sorry about that. bq. DataChecksum.MIXED is not used. Why do we need it? Could we add it later? Any file system implementation that's using MD5MD5CRC32FileChecksum will need it, since a file can contain blocks with different checksum types. This is not desired, but at least we should be able to detect it. So I think it belongs here and will be used by HDFS-3177. I will post the corrected patch. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or httpfs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
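A sketch of the subclass shape under review in this thread, with the crc type carried by the class rather than by extra fields; the constructor and the getCrcType() override are assumptions based on the discussion above:
{code}
// Sketch: one subclass per crc type, so the type survives in the composite
// MD5-of-MD5-of-CRC file checksum without new serialized fields.
public class MD5MD5CRC32GzipFileChecksum extends MD5MD5CRC32FileChecksum {
  public MD5MD5CRC32GzipFileChecksum(int bytesPerCRC, long crcPerBlock,
      MD5Hash md5) {
    super(bytesPerCRC, crcPerBlock, md5);
  }

  @Override
  public DataChecksum.Type getCrcType() {
    return DataChecksum.Type.CRC32;  // gzip/zlib polynomial
  }
}
{code}
A sibling MD5MD5CRC32CastagnoliFileChecksum would return CRC32C, and a file whose blocks disagree on checksum type would surface the MIXED value discussed above so callers can at least detect the condition.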
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-trunk-branch2.patch.txt Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.1.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or httpfs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8239) Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8239: --- Attachment: hadoop-8239-branch-0.23.patch.txt Attaching the patch for branch-0.23. Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used -- Key: HADOOP-8239 URL: https://issues.apache.org/jira/browse/HADOOP-8239 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 2.2.0-alpha Attachments: hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-after-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-before-hadoop-8240.patch.txt, hadoop-8239-branch-0.23.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt, hadoop-8239-trunk-branch2.patch.txt In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended to carry the information on the actual checksum type being used. The interoperability between the extended version and branch-1 should be guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or httpfs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8060) Add a capability to use of consistent checksums for append and copy
[ https://issues.apache.org/jira/browse/HADOOP-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13442514#comment-13442514 ] Kihwal Lee commented on HADOOP-8060: In HDFS-3177, Sanjay suggested that the one checksum type per file be enforced architecturally, rather than DFSClient doing it using the existing facility. The changes in HDFS-3177 still allow DistCp, etc. to discover and set checksum parameters so that the results of getFileChecksum() on copies can match. I will resolve this jira with a modified summary. I expect Sanjay to file a new Jira when he has a proposal. Add a capability to use of consistent checksums for append and copy --- Key: HADOOP-8060 URL: https://issues.apache.org/jira/browse/HADOOP-8060 Project: Hadoop Common Issue Type: Bug Components: fs, util Affects Versions: 0.23.0, 0.23.1, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee After the improved CRC32C checksum feature became default, some of use cases involving data movement are no longer supported. For example, when running DistCp to copy from a file stored with the CRC32 checksum to a new cluster with the CRC32C set to default checksum, the final data integrity check fails because of mismatch in checksums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8060) Add a capability to discover and set checksum types per file.
[ https://issues.apache.org/jira/browse/HADOOP-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8060: --- Summary: Add a capability to discover and set checksum types per file. (was: Add a capability to use of consistent checksums for append and copy) Add a capability to discover and set checksum types per file. - Key: HADOOP-8060 URL: https://issues.apache.org/jira/browse/HADOOP-8060 Project: Hadoop Common Issue Type: Bug Components: fs, util Affects Versions: 0.23.0, 0.23.1, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee After the improved CRC32C checksum feature became default, some of use cases involving data movement are no longer supported. For example, when running DistCp to copy from a file stored with the CRC32 checksum to a new cluster with the CRC32C set to default checksum, the final data integrity check fails because of mismatch in checksums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8060) Add a capability to discover and set checksum types per file.
[ https://issues.apache.org/jira/browse/HADOOP-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8060: --- Fix Version/s: 2.2.0-alpha 3.0.0 0.23.3 Add a capability to discover and set checksum types per file. - Key: HADOOP-8060 URL: https://issues.apache.org/jira/browse/HADOOP-8060 Project: Hadoop Common Issue Type: Bug Components: fs, util Affects Versions: 0.23.0, 0.23.1, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 3.0.0, 2.2.0-alpha After the improved CRC32C checksum feature became default, some of use cases involving data movement are no longer supported. For example, when running DistCp to copy from a file stored with the CRC32 checksum to a new cluster with the CRC32C set to default checksum, the final data integrity check fails because of mismatch in checksums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-8060) Add a capability to discover and set checksum types per file.
[ https://issues.apache.org/jira/browse/HADOOP-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HADOOP-8060. Resolution: Fixed Add a capability to discover and set checksum types per file. - Key: HADOOP-8060 URL: https://issues.apache.org/jira/browse/HADOOP-8060 Project: Hadoop Common Issue Type: Bug Components: fs, util Affects Versions: 0.23.0, 0.23.1, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 3.0.0, 2.2.0-alpha After the improved CRC32C checksum feature became default, some of use cases involving data movement are no longer supported. For example, when running DistCp to copy from a file stored with the CRC32 checksum to a new cluster with the CRC32C set to default checksum, the final data integrity check fails because of mismatch in checksums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8689) Make trash a server side configuration option
[ https://issues.apache.org/jira/browse/HADOOP-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13442648#comment-13442648 ] Kihwal Lee commented on HADOOP-8689: In HDFS-3856, namenode does System.exit(1), because getServerDefaults() is not allowed on backup nodes. Make trash a server side configuration option - Key: HADOOP-8689 URL: https://issues.apache.org/jira/browse/HADOOP-8689 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.2.0-alpha Attachments: hadoop-8689.txt, hadoop-8689.txt Per ATM's suggestion in HADOOP-8598 for v2 let's make {{fs.trash.interval}} configured server side. If it is not configured server side then the client side configuration is used. The {{fs.trash.checkpoint.interval}} option is already server side as the emptier runs in the NameNode. Clients may manually run an emptier via hadoop org.apache.hadoop.fs.Trash but it's OK if it uses a separate interval. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
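For reference, the server-side knob in question; a minimal core-site.xml sketch with an example value:
{code}
<!-- core-site.xml on the NameNode. With HADOOP-8689 the server-side value is
     advertised to clients through getServerDefaults(); a client falls back to
     its own fs.trash.interval only when the server leaves it unset.
     The value is in minutes; 0 disables trash. 1440 here is just an example. -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
{code}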
[jira] [Commented] (HADOOP-8783) Improve RPC.Server's digest auth
[ https://issues.apache.org/jira/browse/HADOOP-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453093#comment-13453093 ] Kihwal Lee commented on HADOOP-8783: +1 (non-binding) Looks good to me. I hope better testing will be added with the client-side changes. Improve RPC.Server's digest auth Key: HADOOP-8783 URL: https://issues.apache.org/jira/browse/HADOOP-8783 Project: Hadoop Common Issue Type: Sub-task Components: ipc, security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HADOOP-8783.patch, HADOOP-8783.patch RPC.Server should always allow digest auth (tokens) if a secret manager if present. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8806) libhadoop.so: search java.library.path when calling dlopen
[ https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455813#comment-13455813 ] Kihwal Lee commented on HADOOP-8806: bq. These libraries can be bundled in the $HADOOP_ROOT/lib/native directory. For example, the -Dbundle.snappy build option copies libsnappy.so to this directory. However, snappy can't be loaded from this directory unless LD_LIBRARY_PATH is set to include this directory. If this is only about MR jobs, isn't setting {{LD_LIBRARY_PATH}} in {{mapreduce.admin.user.env}} enough? libhadoop.so: search java.library.path when calling dlopen -- Key: HADOOP-8806 URL: https://issues.apache.org/jira/browse/HADOOP-8806 Project: Hadoop Common Issue Type: Improvement Reporter: Colin Patrick McCabe Priority: Minor libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}. These libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory. For example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this directory. However, snappy can't be loaded from this directory unless {{LD_LIBRARY_PATH}} is set to include this directory. Should we also search {{java.library.path}} when loading these libraries? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
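For concreteness, the workaround suggested in this comment might look like the following in mapred-site.xml; the path is an example and should point at wherever the bundled native libraries actually live:
{code}
<!-- Sets LD_LIBRARY_PATH in the environment of MR task JVMs so that
     dlopen() can find bundled libraries such as libsnappy.so / libz.so. -->
<property>
  <name>mapreduce.admin.user.env</name>
  <value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native</value>
</property>
{code}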
[jira] [Moved] (HADOOP-8932) JNI-based user-group mapping modules can be too chatty on lookup failures
[ https://issues.apache.org/jira/browse/HADOOP-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee moved HDFS-4064 to HADOOP-8932: -- Component/s: (was: security) security Target Version/s: (was: 3.0.0, 2.0.3-alpha, 0.23.5) Affects Version/s: (was: 0.23.5) (was: 2.0.3-alpha) (was: 3.0.0) 0.23.5 2.0.3-alpha 3.0.0 Key: HADOOP-8932 (was: HDFS-4064) Project: Hadoop Common (was: Hadoop HDFS) JNI-based user-group mapping modules can be too chatty on lookup failures - Key: HADOOP-8932 URL: https://issues.apache.org/jira/browse/HADOOP-8932 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee On a user/group lookup failure, JniBasedUnixGroupsMapping and JniBasedUnixGroupsNetgroupMapping are logging the full stack trace at WARN level. Since the caller of these methods is already logging errors, this is not needed. In branch-1, just one line is logged, so we don't need this change there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8932) JNI-based user-group mapping modules can be too chatty on lookup failures
[ https://issues.apache.org/jira/browse/HADOOP-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8932: --- Attachment: hadoop-8932.patch.txt JNI-based user-group mapping modules can be too chatty on lookup failures - Key: HADOOP-8932 URL: https://issues.apache.org/jira/browse/HADOOP-8932 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: hadoop-8932.patch.txt On a user/group lookup failure, JniBasedUnixGroupsMapping and JniBasedUnixGroupsNetgroupMapping are logging the full stack trace at WARN level. Since the caller of these methods is already logging errors, this is not needed. In branch-1, just one line is logged, so we don't need this change there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8932) JNI-based user-group mapping modules can be too chatty on lookup failures
[ https://issues.apache.org/jira/browse/HADOOP-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-8932: --- Status: Patch Available (was: Open) JNI-based user-group mapping modules can be too chatty on lookup failures - Key: HADOOP-8932 URL: https://issues.apache.org/jira/browse/HADOOP-8932 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: hadoop-8932.patch.txt On a user/group lookup failure, JniBasedUnixGroupsMapping and JniBasedUnixGroupsNetgroupMapping are logging the full stack trace at WARN level. Since the caller of these methods is already logging errors, this is not needed. In branch-1, just one line is logged, so we don't need this change there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8932) JNI-based user-group mapping modules can be too chatty on lookup failures
[ https://issues.apache.org/jira/browse/HADOOP-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477094#comment-13477094 ] Kihwal Lee commented on HADOOP-8932: No test is included since the change is about log message. JNI-based user-group mapping modules can be too chatty on lookup failures - Key: HADOOP-8932 URL: https://issues.apache.org/jira/browse/HADOOP-8932 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: hadoop-8932.patch.txt On a user/group lookup failure, JniBasedUnixGroupsMapping and JniBasedUnixGroupsNetgroupMapping are logging the full stack trace at WARN level. Since the caller of these methods is already logging errors, this is not needed. In branch-1, just one line is logged, so we don't need this change there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
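A minimal sketch of the change this issue describes: keep a one-line WARN for operators and reserve the full stack trace for DEBUG. The surrounding method and names are illustrative, not the exact patch:
{code}
// Sketch: the caller already logs lookup failures, so avoid dumping a full
// stack trace at WARN on every miss.
static List<String> getGroups(String user) {
  try {
    return getGroupsForUserJni(user);   // hypothetical JNI-backed lookup
  } catch (Exception e) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Error getting groups for " + user, e);  // full trace here only
    } else {
      LOG.warn("Error getting groups for " + user + ": " + e.getMessage());
    }
    return Collections.emptyList();
  }
}
{code}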
[jira] [Commented] (HADOOP-7977) Allow Hadoop clients and services to run in an OSGi container
[ https://issues.apache.org/jira/browse/HADOOP-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488027#comment-13488027 ] Kihwal Lee commented on HADOOP-7977: Do you have any jira/patch that adds the hadoop-karaf module? Allow Hadoop clients and services to run in an OSGi container - Key: HADOOP-7977 URL: https://issues.apache.org/jira/browse/HADOOP-7977 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 0.24.0 Environment: OSGi client runtime (Spring c), possibly service runtime (e.g. Apache Karaf) Reporter: Steve Loughran Priority: Minor There's been past discussion on running Hadoop client and service code in OSGi. This JIRA issue exists to wrap up the needs and issues. # client-side use of public Hadoop APIs would seem most important. # service-side deployments could offer benefits. The non-standard Hadoop Java security configuration may interfere with this goal. # testing would all be functional with dependencies on external services, to make things harder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9013) UGI should not hardcode loginUser's authenticationType
[ https://issues.apache.org/jira/browse/HADOOP-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491818#comment-13491818 ] Kihwal Lee commented on HADOOP-9013: +1 This is a straightforward change. Looks good to me. UGI should not hardcode loginUser's authenticationType -- Key: HADOOP-9013 URL: https://issues.apache.org/jira/browse/HADOOP-9013 Project: Hadoop Common Issue Type: Sub-task Components: fs, security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HADOOP-9013.patch {{UGI.loginUser}} assumes that the user's auth type for security on = kerberos, security off = simple. It should instead use the configured auth type. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()
[ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13497130#comment-13497130 ] Kihwal Lee commented on HADOOP-8240: Thanks, Uma for pointing this out. I will take a look and file a jira if necessary. BlockScan on those blocks become less useful as you said, but it may still cover certain rare failure modes. Anyways, we may be able to optimize it. Allow users to specify a checksum type on create() -- Key: HADOOP-8240 URL: https://issues.apache.org/jira/browse/HADOOP-8240 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 0.23.3, 2.0.2-alpha Attachments: hadoop-8240-branch-0.23-alone.patch.txt, hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also checksum-related API is at Filesystem-level, so we prefer something at that level, not hdfs-specific one. Current proposal is to use CreatFlag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9037) Bug in test-patch.sh and precommit build process
Kihwal Lee created HADOOP-9037: -- Summary: Bug in test-patch.sh and precommit build process Key: HADOOP-9037 URL: https://issues.apache.org/jira/browse/HADOOP-9037 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Kihwal Lee Priority: Critical In HDFS-4171, the precommit build failed, but it still posted +1 for every category.
{noformat}
======================================================================
======================================================================
    Running tests.
======================================================================
======================================================================
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/dev-support/test-patch.sh: line 713: [: hadoop-hdfs-project/hadoop-hdfs: binary operator expected
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9037) Bug in test-patch.sh and precommit build process
[ https://issues.apache.org/jira/browse/HADOOP-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13497210#comment-13497210 ] Kihwal Lee commented on HADOOP-9037: At line 713,
{noformat}
if [ -n $hdfs_modules ]; then
{noformat}
{{$hdfs_modules}} was "hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs". Since -n is a unary operator, the {{test}} complained. Bug in test-patch.sh and precommit build process Key: HADOOP-9037 URL: https://issues.apache.org/jira/browse/HADOOP-9037 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Kihwal Lee Priority: Critical In HDFS-4171, the precommit build failed, but it still posted +1 for every category.
{noformat}
======================================================================
======================================================================
    Running tests.
======================================================================
======================================================================
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/dev-support/test-patch.sh: line 713: [: hadoop-hdfs-project/hadoop-hdfs: binary operator expected
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HADOOP-9037) Bug in test-patch.sh and precommit build process
[ https://issues.apache.org/jira/browse/HADOOP-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HADOOP-9037: -- Assignee: Kihwal Lee Bug in test-patch.sh and precommit build process Key: HADOOP-9037 URL: https://issues.apache.org/jira/browse/HADOOP-9037 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical In HDFS-4171, the precommit build failed, but it still posted +1 for every category.
{noformat}
======================================================================
======================================================================
    Running tests.
======================================================================
======================================================================
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/dev-support/test-patch.sh: line 713: [: hadoop-hdfs-project/hadoop-hdfs: binary operator expected
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9037) Bug in test-patch.sh and precommit build process
[ https://issues.apache.org/jira/browse/HADOOP-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-9037: --- Attachment: hadoop-9037.patch The patch adds quotes around the argument to the unary operator. It may have a leading space, but that only happens when there actually is a module, so the -n test is still valid. Bug in test-patch.sh and precommit build process Key: HADOOP-9037 URL: https://issues.apache.org/jira/browse/HADOOP-9037 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: hadoop-9037.patch In HDFS-4171, the precommit build failed, but it still posted +1 for every category.
{noformat}
======================================================================
======================================================================
    Running tests.
======================================================================
======================================================================
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/dev-support/test-patch.sh: line 713: [: hadoop-hdfs-project/hadoop-hdfs: binary operator expected
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
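For reference, the one-line shape of the fix described in the attachment comment above: quoting makes the operand a single word, so a multi-module list no longer breaks the unary -n test.
{noformat}
# before: word-splits when $hdfs_modules holds more than one module
if [ -n $hdfs_modules ]; then
# after: quoted, so [ -n ... ] always sees exactly one argument
if [ -n "$hdfs_modules" ]; then
{noformat}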
[jira] [Updated] (HADOOP-9037) Bug in test-patch.sh and precommit build process
[ https://issues.apache.org/jira/browse/HADOOP-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-9037: --- Status: Patch Available (was: Open) Bug in test-patch.sh and precommit build process Key: HADOOP-9037 URL: https://issues.apache.org/jira/browse/HADOOP-9037 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: hadoop-9037.patch In HDFS-4171, the precommit build failed, but it still posted +1 for every category.
{noformat}
======================================================================
======================================================================
    Running tests.
======================================================================
======================================================================
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/dev-support/test-patch.sh: line 713: [: hadoop-hdfs-project/hadoop-hdfs: binary operator expected
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9039) Lower the logging level of the initialization messages coming from users-group mapping modules.
Kihwal Lee created HADOOP-9039: -- Summary: Lower the logging level of the initialization messages coming from users-group mapping modules. Key: HADOOP-9039 URL: https://issues.apache.org/jira/browse/HADOOP-9039 Project: Hadoop Common Issue Type: Bug Reporter: Kihwal Lee -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-9039) Lower the logging level of the initialization messages coming from users-group mapping modules.
[ https://issues.apache.org/jira/browse/HADOOP-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HADOOP-9039. Resolution: Duplicate Lower the logging level of the initialization messages coming from users-group mapping modules. --- Key: HADOOP-9039 URL: https://issues.apache.org/jira/browse/HADOOP-9039 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.0.2-alpha, 0.23.4 Reporter: Kihwal Lee There are INFO messages coming out whenever some of the users-group mapping modules are used. This is annoying to users. We should change it to DEBUG. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HADOOP-9042) Add a test for umask in FileSystemContractBaseTest
[ https://issues.apache.org/jira/browse/HADOOP-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reopened HADOOP-9042: This HDFS test case is failing. The precommit build did not run any hdfs test, so it didn't catch this.
{panel}
Running org.apache.hadoop.hdfs.TestHDFSFileSystemContract
Tests run: 30, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 37.774 sec <<< FAILURE!
testMkdirsWithUmask(org.apache.hadoop.hdfs.TestHDFSFileSystemContract)  Time elapsed: 1.046 sec  <<< FAILURE!
junit.framework.AssertionFailedError: expected:<461> but was:<493>
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.failNotEquals(Assert.java:283)
    at junit.framework.Assert.assertEquals(Assert.java:64)
    at junit.framework.Assert.assertEquals(Assert.java:182)
    at junit.framework.Assert.assertEquals(Assert.java:188)
    at org.apache.hadoop.fs.FileSystemContractBaseTest.testMkdirsWithUmask(FileSystemContractBaseTest.java:170)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:243)
    at junit.framework.TestSuite.run(TestSuite.java:238)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
{panel}
Add a test for umask in FileSystemContractBaseTest -- Key: HADOOP-9042 URL: https://issues.apache.org/jira/browse/HADOOP-9042 Project: Hadoop Common Issue Type: Test Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.3-alpha Attachments: HDFS-3975.001.patch, HDFS-3975.002.patch, HDFS-3975.003.patch, HDFS-3975.004.patch, HDFS-3975.005.patch, HDFS-3975.006.patch Add a unit test to make sure {{umask}} is working correctly in FileSystemContractBaseTest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9108) Add a method to clear terminateCalled to ExitUtil for test cases
Kihwal Lee created HADOOP-9108: -- Summary: Add a method to clear terminateCalled to ExitUtil for test cases Key: HADOOP-9108 URL: https://issues.apache.org/jira/browse/HADOOP-9108 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 0.23.5, 2.0.2-alpha Reporter: Kihwal Lee Currently once terminateCalled is set, it will stay set since it's a class static variable. This can break test cases where multiple test cases run in one jvm. In MiniDfsCluster, it should be cleared during shutdown for the next test case to run properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-9108) Add a method to clear terminateCalled to ExitUtil for test cases
[ https://issues.apache.org/jira/browse/HADOOP-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HADOOP-9108. Resolution: Invalid Add a method to clear terminateCalled to ExitUtil for test cases Key: HADOOP-9108 URL: https://issues.apache.org/jira/browse/HADOOP-9108 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 2.0.2-alpha, 0.23.5 Reporter: Kihwal Lee Currently once terminateCalled is set, it will stay set since it's a class static variable. This can break test cases where multiple test cases run in one jvm. In MiniDfsCluster, it should be cleared during shutdown for the next test case to run properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HADOOP-9108) Add a method to clear terminateCalled to ExitUtil for test cases
[ https://issues.apache.org/jira/browse/HADOOP-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reopened HADOOP-9108: Assignee: Kihwal Lee I found out the necessary changes have already been made in trunk and branch-2 by HDFS-3663 and HDFS-3765. But we cannot simply pull these patches to branch-0.23 because HDFS-3765 contains more than just ExitUtil change. I will use this jira to implement something equivalent for branch-0.23. Since this is for tests, a slight divergence should be of no concern. Add a method to clear terminateCalled to ExitUtil for test cases Key: HADOOP-9108 URL: https://issues.apache.org/jira/browse/HADOOP-9108 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 2.0.2-alpha, 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee Currently once terminateCalled is set, it will stay set since it's a class static variable. This can break test cases where multiple test cases run in one jvm. In MiniDfsCluster, it should be cleared during shutdown for the next test case to run properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
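A minimal sketch of the reset hook being proposed for branch-0.23; the method name mirrors what trunk gained via HDFS-3663/HDFS-3765, but the exact shape here is an assumption:
{code}
public final class ExitUtil {
  private static volatile boolean terminateCalled = false;

  public static boolean terminateCalled() {
    return terminateCalled;
  }

  // Called from test teardown (e.g. MiniDFSCluster shutdown) so that a
  // terminate() in one test case cannot poison later tests in the same JVM.
  public static void resetFirstExitException() {
    terminateCalled = false;
  }
}
{code}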
[jira] [Updated] (HADOOP-9108) Add a method to clear terminateCalled to ExitUtil for test cases
[ https://issues.apache.org/jira/browse/HADOOP-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-9108: --- Target Version/s: 0.23.6 (was: 3.0.0, 2.0.3-alpha, 0.23.6) Affects Version/s: (was: 2.0.2-alpha) Add a method to clear terminateCalled to ExitUtil for test cases Key: HADOOP-9108 URL: https://issues.apache.org/jira/browse/HADOOP-9108 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 0.23.5 Reporter: Kihwal Lee Assignee: Kihwal Lee Currently once terminateCalled is set, it will stay set since it's a class static variable. This can break test cases where multiple test cases run in one jvm. In MiniDfsCluster, it should be cleared during shutdown for the next test case to run properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-10158) SPNEGO should work with multiple interfaces/SPNs.
[ https://issues.apache.org/jira/browse/HADOOP-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10158: Priority: Critical (was: Major) SPNEGO should work with multiple interfaces/SPNs. - Key: HADOOP-10158 URL: https://issues.apache.org/jira/browse/HADOOP-10158 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Kihwal Lee Assignee: Daryn Sharp Priority: Critical Attachments: HADOOP-10158.patch, HADOOP-10158_multiplerealms.patch, HADOOP-10158_multiplerealms.patch, HADOOP-10158_multiplerealms.patch This is the list of internal servlets added by namenode.
| Name | Auth | Need to be accessible by end users |
| StartupProgressServlet | none | no |
| GetDelegationTokenServlet | internal SPNEGO | yes |
| RenewDelegationTokenServlet | internal SPNEGO | yes |
| CancelDelegationTokenServlet | internal SPNEGO | yes |
| FsckServlet | internal SPNEGO | yes |
| GetImageServlet | internal SPNEGO | no |
| ListPathsServlet | token in query | yes |
| FileDataServlet | token in query | yes |
| FileChecksumServlets | token in query | yes |
| ContentSummaryServlet | token in query | yes |
GetDelegationTokenServlet, RenewDelegationTokenServlet, CancelDelegationTokenServlet and FsckServlet are accessed by end users, but hard-coded to use the internal SPNEGO filter. If a name node HTTP server binds to multiple external IP addresses, the internal SPNEGO service principal name may not work with an address to which end users are connecting. The current SPNEGO implementation in Hadoop is limited to using a single service principal per filter. If the underlying hadoop kerberos authentication handler cannot easily be modified, we can at least create a separate auth filter for the end-user facing servlets so that their service principals can be independently configured. If not defined, it should fall back to the current behavior. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HADOOP-9640) RPC Congestion Control with FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881428#comment-13881428 ] Kihwal Lee commented on HADOOP-9640: Can low priority requests starve higher priority requests? If a low priority call queue is full and all reader threads are blocked on put() for adding calls belonging to that queue, newly arriving higher priority requests won't get processed even if their corresponding queue is not full. If the request rate stays greater than service rate for some time in this state, the listen queue will likely overflow and all types of requests will suffer regardless of priority. RPC Congestion Control with FairCallQueue - Key: HADOOP-9640 URL: https://issues.apache.org/jira/browse/HADOOP-9640 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0, 2.2.0 Reporter: Xiaobo Peng Labels: hdfs, qos, rpc Attachments: MinorityMajorityPerformance.pdf, NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, faircallqueue5.patch, faircallqueue6.patch, faircallqueue7_with_runtime_swapping.patch, rpc-congestion-control-draft-plan.pdf Several production Hadoop cluster incidents occurred where the Namenode was overloaded and failed to respond. We can improve quality of service for users during namenode peak loads by replacing the FIFO call queue with a [Fair Call Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf]. (this plan supersedes rpc-congestion-control-draft-plan). Excerpted from the communication of one incident, “The map task of a user was creating huge number of small files in the user directory. Due to the heavy load on NN, the JT also was unable to communicate with NN...The cluster became responsive only once the job was killed.” Excerpted from the communication of another incident, “Namenode was overloaded by GetBlockLocation requests (Correction: should be getFileInfo requests. the job had a bug that called getFileInfo for a nonexistent file in an endless loop). All other requests to namenode were also affected by this and hence all jobs slowed down. Cluster almost came to a grinding halt…Eventually killed jobtracker to kill all jobs that are running.” Excerpted from HDFS-945, “We've seen defective applications cause havoc on the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories (60k files) etc.” -- This message was sent by Atlassian JIRA (v6.1.5#6160)
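A toy model of the failure mode raised in the comment above, not FairCallQueue itself: if shared reader threads do a blocking put() into one full low-priority sub-queue, they stop draining sockets for everyone, so high-priority calls stall behind them even when their own queue has room.
{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class PrioritizedCallQueues {
  private final BlockingQueue<Runnable>[] queues;

  @SuppressWarnings("unchecked")
  PrioritizedCallQueues(int levels, int capacityPerLevel) {
    queues = new BlockingQueue[levels];
    for (int i = 0; i < levels; i++) {
      queues[i] = new ArrayBlockingQueue<>(capacityPerLevel);
    }
  }

  // Hazard: blocks the (shared) reader thread when queues[level] is full,
  // starving every other priority level of intake.
  void putBlocking(int level, Runnable call) throws InterruptedException {
    queues[level].put(call);
  }

  // Alternative: fail fast so a saturated low-priority level only throttles
  // its own producers; the caller can back off or reject the connection.
  boolean offer(int level, Runnable call) {
    return queues[level].offer(call);
  }
}
{code}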
[jira] [Commented] (HADOOP-10295) Allow distcp to automatically identify the checksum type of source files and use it for the target
[ https://issues.apache.org/jira/browse/HADOOP-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13884212#comment-13884212 ] Kihwal Lee commented on HADOOP-10295: - Thanks for working on this, Jing. One thing to note is that the block size needs to be identical in addition to the checksum parameters in order for the checksums to match. So it might make more sense to introduce an option to preserve the two together. Allow distcp to automatically identify the checksum type of source files and use it for the target -- Key: HADOOP-10295 URL: https://issues.apache.org/jira/browse/HADOOP-10295 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HADOOP-10295.000.patch, hadoop-10295.patch Currently while doing distcp, users can use -Ddfs.checksum.type to specify the checksum type in the target FS. This works fine if all the source files are using the same checksum type. If files in the source cluster have mixed types of checksum, users have to either use -skipcrccheck or have checksum mismatching exception. Thus we may need to consider adding a new option to distcp so that it can automatically identify the original checksum type of each source file and use the same checksum type in the target FS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
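A hedged sketch of the per-file discovery proposed in this issue: read the source file's checksum, and when it exposes a crc type, carry the checksum parameters and the block size over to the target create(). The getCrcType()/getBytesPerCRC() accessors are assumptions of this sketch.
{code}
// Sketch: derive target create() parameters from the source file.
FileChecksum srcChecksum = srcFs.getFileChecksum(srcPath);
Options.ChecksumOpt checksumOpt = null;
if (srcChecksum instanceof MD5MD5CRC32FileChecksum) {
  MD5MD5CRC32FileChecksum c = (MD5MD5CRC32FileChecksum) srcChecksum;
  checksumOpt = new Options.ChecksumOpt(c.getCrcType(), c.getBytesPerCRC());
}
// Per the comment above, block size must match as well for the composite
// file checksums of source and copy to compare equal.
long blockSize = srcFs.getFileStatus(srcPath).getBlockSize();
{code}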
[jira] [Created] (HADOOP-10314) The ls command help still shows outdated 0.16 format.
Kihwal Lee created HADOOP-10314: --- Summary: The ls command help still shows outdated 0.16 format. Key: HADOOP-10314 URL: https://issues.apache.org/jira/browse/HADOOP-10314 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Kihwal Lee The description of output format is vastly outdated. It was changed after version 0.16.
{noformat}
$ hadoop fs -help ls
-ls [-d] [-h] [-R] [<path> ...]:
    List the contents that match the specified file pattern. If path is not
    specified, the contents of /user/<currentUser> will be listed. Directory
    entries are of the form
        dirName (full path) <dir>
    and file entries are of the form
        fileName(full path) <r n> size
    where n is the number of replicas specified for the file and size is the
    size of the file, in bytes.
  -d  Directories are listed as plain files.
  -h  Formats the sizes of files in a human-readable fashion rather than a
      number of bytes.
  -R  Recursively list the contents of directories.
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HADOOP-10315) Log the original exception when getGroups() fail in UGI.
Kihwal Lee created HADOOP-10315: --- Summary: Log the original exception when getGroups() fail in UGI. Key: HADOOP-10315 URL: https://issues.apache.org/jira/browse/HADOOP-10315 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0, 0.23.10 Reporter: Kihwal Lee In UserGroupInformation, getGroupNames() swallows the original exception. There have been many occasions that more information on the original exception could have helped.
{code}
public synchronized String[] getGroupNames() {
  ensureInitialized();
  try {
    List<String> result = groups.getGroups(getShortUserName());
    return result.toArray(new String[result.size()]);
  } catch (IOException ie) {
    LOG.warn("No groups available for user " + getShortUserName());
    return new String[0];
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
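The fix this issue asks for is small: pass the caught exception to the logger so the cause survives. A sketch of just the changed catch block (message text illustrative):
{code}
} catch (IOException ie) {
  // Log the original exception, not just the fact that the lookup failed.
  LOG.warn("No groups available for user " + getShortUserName(), ie);
  return new String[0];
}
{code}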
[jira] [Comment Edited] (HADOOP-9880) SASL changes from HADOOP-9421 breaks Secure HA NN
[ https://issues.apache.org/jira/browse/HADOOP-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741810#comment-13741810 ] Kihwal Lee edited comment on HADOOP-9880 at 2/19/14 10:55 PM: -- This is a slightly more appealing hack than HDFS-3083. I've moved the call to the NN-specific {{checkAvailableForRead}} from the RPC layer into the NN's secret manager so it's only called when token auth is being performed. However, the current method signatures only allow {{InvalidToken}} to be thrown. So rather than change a bunch of signatures that may impact other projects, I've tunneled the {{StandbyException}} in the cause of an {{InvalidToken}}. The RPC server will unwrap the nested exception.. was (Author: daryn): This is a slightly more appealing hack than HDFS-3083. I've moved the call to the NN-specific {{checkAvailableForRead}} from the RPC layer into the NN's secret manager so it's only called when token auth is being performed. However, the current method signatures only allow {{InvalidToken}} to be thrown. So rather than change a bunch of signatures that may impact other projects, I've tunneled the {{StandbyException}} in the cause of an {{InvalidToken}}. The RPC server will unwrap the nested exception. SASL changes from HADOOP-9421 breaks Secure HA NN -- Key: HADOOP-9880 URL: https://issues.apache.org/jira/browse/HADOOP-9880 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Kihwal Lee Assignee: Daryn Sharp Priority: Blocker Fix For: 2.1.1-beta Attachments: HADOOP-9880.patch buildSaslNegotiateResponse() will create a SaslRpcServer with TOKEN auth. When create() is called against it, secretManager.checkAvailableForRead() is called, which fails in HA standby. Thus HA standby nodes cannot be transitioned to active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
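A sketch of the tunneling described in the comment above, given that the secret-manager signatures only permit InvalidToken: wrap the StandbyException as the cause, then unwrap it on the RPC server side. Method names and context are assumptions, not the committed patch.
{code}
// NN secret manager, reached only during token (digest) auth:
private void checkHaStateForTokenAuth() throws SecretManager.InvalidToken {
  try {
    namesystem.checkAvailableForRead();       // NN-specific HA read check
  } catch (StandbyException se) {
    SecretManager.InvalidToken wrapper =
        new SecretManager.InvalidToken("StandbyException");
    wrapper.initCause(se);                    // tunnel the real failure
    throw wrapper;
  }
}

// RPC server side: surface the tunneled exception to the client.
private static IOException unwrap(SecretManager.InvalidToken it) {
  if (it.getCause() instanceof StandbyException) {
    return (StandbyException) it.getCause();
  }
  return it;
}
{code}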
[jira] [Commented] (HADOOP-10314) The ls command help still shows outdated 0.16 format.
[ https://issues.apache.org/jira/browse/HADOOP-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918232#comment-13918232 ] Kihwal Lee commented on HADOOP-10314: - +1 the patch looks good. The ls command help still shows outdated 0.16 format. - Key: HADOOP-10314 URL: https://issues.apache.org/jira/browse/HADOOP-10314 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Kihwal Lee Assignee: Rushabh S Shah Labels: newbie Attachments: patch-10314-v2.patch, patch-10314.patch The description of output format is vastly outdated. It was changed after version 0.16.
{noformat}
$ hadoop fs -help ls
-ls [-d] [-h] [-R] [<path> ...]:
    List the contents that match the specified file pattern. If path is not
    specified, the contents of /user/<currentUser> will be listed. Directory
    entries are of the form
        dirName (full path) <dir>
    and file entries are of the form
        fileName(full path) <r n> size
    where n is the number of replicas specified for the file and size is the
    size of the file, in bytes.
  -d  Directories are listed as plain files.
  -h  Formats the sizes of files in a human-readable fashion rather than a
      number of bytes.
  -R  Recursively list the contents of directories.
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10314) The ls command help still shows outdated 0.16 format.
[ https://issues.apache.org/jira/browse/HADOOP-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10314: Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this to trunk, branch-2 and branch-2.4. Thanks for working on the fix, Rushabh. The ls command help still shows outdated 0.16 format. - Key: HADOOP-10314 URL: https://issues.apache.org/jira/browse/HADOOP-10314 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Kihwal Lee Assignee: Rushabh S Shah Labels: newbie Fix For: 3.0.0, 2.4.0 Attachments: patch-10314-v2.patch, patch-10314.patch The description of output format is vastly outdated. It was changed after version 0.16.
{noformat}
$ hadoop fs -help ls
-ls [-d] [-h] [-R] [<path> ...]:
    List the contents that match the specified file pattern. If path is not
    specified, the contents of /user/<currentUser> will be listed. Directory
    entries are of the form
        dirName (full path) <dir>
    and file entries are of the form
        fileName(full path) <r n> size
    where n is the number of replicas specified for the file and size is the
    size of the file, in bytes.
  -d  Directories are listed as plain files.
  -h  Formats the sizes of files in a human-readable fashion rather than a
      number of bytes.
  -R  Recursively list the contents of directories.
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-9986) HDFS Compatible ViewFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HADOOP-9986. Resolution: Invalid HDFS Compatible ViewFileSystem -- Key: HADOOP-9986 URL: https://issues.apache.org/jira/browse/HADOOP-9986 Project: Hadoop Common Issue Type: Bug Reporter: Lohit Vijayarenu Fix For: 2.0.6-alpha Multiple scripts and projects, such as pig, hive, and elephantbird, refer to HDFS URIs as hdfs://namenodehostport/ or hdfs:/// . In a federated namespace this causes problems because the supported scheme for federation is viewfs:// . We would have to force all users to change their scripts/programs to be able to access a federated cluster. It would be great if there was a way to map the viewfs scheme to the hdfs scheme without exposing it to users. Opening this JIRA to get input from people who have thought about this in their clusters. In our clusters we ended up creating another class, HDFSCompatibleViewFileSystem, which hijacks both fs.hdfs.impl and fs.viewfs.impl and passes filesystem calls down to ViewFileSystem. Is there any suggested approach other than this? -- This message was sent by Atlassian JIRA (v6.2#6252)
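The class described above was never posted, so the following is a hypothetical reconstruction of what it could look like: a ViewFileSystem subclass that answers to the "hdfs" scheme, so existing hdfs:// URIs resolve through the viewfs mount table without users changing their scripts.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.viewfs.ViewFileSystem;

// Hypothetical sketch, not the class referenced in the issue.
public class HDFSCompatibleViewFileSystem extends ViewFileSystem {
  public HDFSCompatibleViewFileSystem() throws IOException {
  }

  @Override
  public String getScheme() {
    // Claim the "hdfs" scheme while delegating all behavior to ViewFileSystem.
    return "hdfs";
  }
}
{code}
It would be wired in with something like {{conf.set("fs.hdfs.impl", HDFSCompatibleViewFileSystem.class.getName())}}; the usual viewfs mount-table link entries still have to be defined, and depending on how strictly {{ViewFileSystem.initialize()}} checks its URI scheme, that method may need overriding as well.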
[jira] [Created] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
Kihwal Lee created HADOOP-10442: --- Summary: Group look-up can cause segmentation fault when certain JNI-based mapping module is used. Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Priority: Blocker When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
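Not part of the issue, but a plausible interim mitigation is to steer group lookups away from the JNI path entirely. The {{hadoop.security.group.mapping}} key and {{ShellBasedUnixGroupsMapping}} class are standard Hadoop names; whether this avoids the crash in a given deployment is an assumption.
{code}
import org.apache.hadoop.conf.Configuration;

public class GroupMappingWorkaround {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The shell-based mapping forks `id` instead of calling getgrouplist()
    // through JNI, so it sidesteps the crashing native code path.
    conf.set("hadoop.security.group.mapping",
        "org.apache.hadoop.security.ShellBasedUnixGroupsMapping");
    System.out.println(conf.get("hadoop.security.group.mapping"));
  }
}
{code}
In practice this would be set in core-site.xml on the NameNode rather than in code.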
[jira] [Commented] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948350#comment-13948350 ] Kihwal Lee commented on HADOOP-10442: - The return code handling was modified in HADOOP-10087. This is the only change in the JNI user-group mapping modules between 2.2 and 2.3. Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Priority: Blocker When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10442: Attachment: HADOOP-10442.patch Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Priority: Blocker Attachments: HADOOP-10442.patch When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10442: Status: Patch Available (was: Open) Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Priority: Blocker Attachments: HADOOP-10442.patch When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HADOOP-10442: --- Assignee: Kihwal Lee Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HADOOP-10442.patch When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948832#comment-13948832 ] Kihwal Lee commented on HADOOP-10442: - A 2.3 NN has been running with this fix for some time. The NN crashed every 3-5 hours before the fix. Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HADOOP-10442.patch When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10442) Group look-up can cause segmentation fault when certain JNI-based mapping module is used.
[ https://issues.apache.org/jira/browse/HADOOP-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955576#comment-13955576 ] Kihwal Lee commented on HADOOP-10442: - [~cmccabe]: I also think the version of nslcd we used is buggy. The return code handling before your change was just masking it, but it likely had other side effects. I observed many lookup timeouts in the NN prior to crashes, while my own program, calling the same libc functions on the same box at the same time, had no issues. The nslcd lookup timeout was configured to be 20 seconds in /etc/nslcd.conf. {panel} 12:15:21,106 WARN security.Groups: Potential performance problem: getGroups(user=) took 20020 milliseconds. 12:15:21,107 WARN security.UserGroupInformation: No groups available for user {panel} bq. Also, looking at this more closely, I believe we mishandle the case where the user is a member of no groups. This would be a pretty odd configuration (I wonder if it's possible?). Getting no groups after a successful getpwnam() can probably only happen when the user was removed between the two calls. All other cases might be considered errors. I saw cases of an admin user being refused permission for certain operations; it was fixed after the refresh command was issued. It must have hit the no-group error while building the ACL, and the result was negatively cached. If there were no negative caching, user-level retries would have worked. So the solution might be to let the native code return 0 even on error conditions, as you suggested, but to make the netgroup modules skip negative caching when a valid user name has no netgroups. Group look-up can cause segmentation fault when certain JNI-based mapping module is used. - Key: HADOOP-10442 URL: https://issues.apache.org/jira/browse/HADOOP-10442 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.4.0, 2.5.0 Attachments: HADOOP-10442.patch When JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is used, we get segmentation faults very often. The same system ran 2.2 for months without any problem, but as soon as it was upgraded to 2.3, it started crashing. This resulted in multiple name node crashes per day. The server was running nslcd (nss-pam-ldapd-0.7.5-15.el6_3.2). We did not see this problem on the servers running sssd. There was one change in the C code that modified the return code handling after the getgrouplist() call: if the function returns 0 or a value less than -1, it does realloc() instead of returning failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
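One way to implement the "skip negative caching" idea at the caching layer — a hypothetical sketch with an invented {{Lookup}} interface, not the actual {{Groups}} implementation:
{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NoNegativeCacheGroups {
  interface Lookup {
    List<String> groupsFor(String user);  // e.g., backed by the JNI module
  }

  private final Map<String, List<String>> cache =
      new ConcurrentHashMap<String, List<String>>();
  private final Lookup lookup;

  NoNegativeCacheGroups(Lookup lookup) {
    this.lookup = lookup;
  }

  // Cache only successful, non-empty lookups. An empty result may mean a
  // transient lookup failure rather than "this user truly has no groups",
  // so it is passed through to the caller but never cached; a later retry
  // then goes back to the real lookup instead of a stale negative entry.
  List<String> getGroups(String user) {
    List<String> cached = cache.get(user);
    if (cached != null) {
      return cached;
    }
    List<String> groups = lookup.groupsFor(user);
    if (groups != null && !groups.isEmpty()) {
      cache.put(user, groups);
    }
    return groups;
  }
}
{code}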
[jira] [Created] (HADOOP-10454) Provide FileContext version of har file system
Kihwal Lee created HADOOP-10454: --- Summary: Provide FileContext version of har file system Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
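For context, FileContext plugs into {{AbstractFileSystem}} rather than {{FileSystem}}, and {{DelegateToFileSystem}} is the stock bridge between the two. A plausible shape for HarFs is therefore a thin wrapper like the sketch below; the attached patch may differ in details.
{code}
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DelegateToFileSystem;
import org.apache.hadoop.fs.HarFileSystem;

// Sketch: expose the existing HarFileSystem through the AbstractFileSystem
// interface that FileContext requires.
public class HarFsSketch extends DelegateToFileSystem {
  HarFsSketch(URI theUri, Configuration conf)
      throws IOException, URISyntaxException {
    // (uri, wrapped FileSystem, conf, scheme, authority required?)
    super(theUri, new HarFileSystem(), conf, "har", false);
  }
}
{code}
Such a class would be registered under the {{fs.AbstractFileSystem.har.impl}} configuration key so that FileContext can resolve har:// paths.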
[jira] [Updated] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10454: Attachment: HADOOP-10454.patch [~knoguchi] and [~jlowe] did the actual work. Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10454: Assignee: Kihwal Lee Status: Patch Available (was: Open) Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13956530#comment-13956530 ] Kihwal Lee edited comment on HADOOP-10454 at 4/1/14 2:11 PM: - Attaching the patch. [~knoguchi] and [~jlowe] did the actual work. was (Author: kihwal): [~knoguchi] and [~jlowe] did the actual work. Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10454: Target Version/s: 0.23.11, 2.5.0 (was: 2.5.0) Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13959358#comment-13959358 ] Kihwal Lee commented on HADOOP-10454: - Committed to branch-0.23 as well. Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10454) Provide FileContext version of har file system
[ https://issues.apache.org/jira/browse/HADOOP-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10454: Resolution: Fixed Fix Version/s: 2.5.0 0.23.11 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, branch-2 and branch-0.23. Provide FileContext version of har file system -- Key: HADOOP-10454 URL: https://issues.apache.org/jira/browse/HADOOP-10454 Project: Hadoop Common Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 0.23.11, 2.5.0 Attachments: HADOOP-10454.patch Add support for HarFs, the FileContext version of HarFileSystem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10498) Add support for proxy server
[ https://issues.apache.org/jira/browse/HADOOP-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969569#comment-13969569 ] Kihwal Lee commented on HADOOP-10498: - According to the IETF draft, http://tools.ietf.org/html/draft-ietf-appsawg-http-forwarded-10 , the header format may change to include information about multiple proxies involved in forwarding. Even if this becomes the standard, proxy servers will most likely still send the de facto X-Forwarded-For header for compatibility reasons. For this reason, I think it is reasonable to assume X-Forwarded-For for now. If the proposal becomes a standard, it will be more than just a header-name change, so making it configurable doesn't seem to add much value. +1 for the patch. Add support for proxy server Key: HADOOP-10498 URL: https://issues.apache.org/jira/browse/HADOOP-10498 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HADOOP-10498.patch HDFS-6218 HDFS-6219 require support for configurable proxy servers. -- This message was sent by Atlassian JIRA (v6.2#6252)
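For reference, a minimal sketch of how a server behind a trusted proxy would pick the original client out of the de facto header — illustrative code, not the attached patch:
{code}
public class XForwardedForParser {
  // X-Forwarded-For carries a comma-separated chain: "client, proxy1, proxy2".
  // The leftmost entry is the address the first proxy saw, which is what a
  // server sitting behind a trusted proxy wants to log and authorize against.
  static String originalClient(String headerValue) {
    if (headerValue == null || headerValue.isEmpty()) {
      return null;
    }
    return headerValue.split(",")[0].trim();
  }

  public static void main(String[] args) {
    System.out.println(originalClient("203.0.113.7, 10.1.2.3"));  // 203.0.113.7
  }
}
{code}
Note the header is only trustworthy when the request demonstrably came from a proxy the server trusts; anyone can set it.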
[jira] [Commented] (HADOOP-10501) Server#getHandlers() accesses handlers without synchronization
[ https://issues.apache.org/jira/browse/HADOOP-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969624#comment-13969624 ] Kihwal Lee commented on HADOOP-10501: - It's currently only meant for unit tests. Do you have any other use case in mind? Server#getHandlers() accesses handlers without synchronization -- Key: HADOOP-10501 URL: https://issues.apache.org/jira/browse/HADOOP-10501 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} Iterable<? extends Thread> getHandlers() { return Arrays.asList(handlers); } {code} All the other methods accessing handlers are synchronized methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10501) Server#getHandlers() accesses handlers without synchronization
[ https://issues.apache.org/jira/browse/HADOOP-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969712#comment-13969712 ] Kihwal Lee commented on HADOOP-10501: - It should be okay as long as it is called after Server#start(). Adding {{synchronized}} won't do much good, but won't hurt either. Server#getHandlers() accesses handlers without synchronization -- Key: HADOOP-10501 URL: https://issues.apache.org/jira/browse/HADOOP-10501 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} Iterable<? extends Thread> getHandlers() { return Arrays.asList(handlers); } {code} All the other methods accessing handlers are synchronized methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
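If one did want to harden the accessor beyond the call-after-start() contract, a synchronized, copying getter is the obvious shape. A self-contained sketch (not a proposed patch for Server itself):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HandlerSnapshot {
  private Thread[] handlers;  // only written while holding the object lock

  synchronized void start(int numHandlers) {
    handlers = new Thread[numHandlers];
    for (int i = 0; i < numHandlers; i++) {
      handlers[i] = new Thread("Handler-" + i);
    }
  }

  // Synchronized and copying: callers get a consistent snapshot even if
  // they race with start(), at the cost of one small allocation per call.
  synchronized List<Thread> getHandlers() {
    if (handlers == null) {
      return new ArrayList<Thread>();
    }
    return new ArrayList<Thread>(Arrays.asList(handlers));
  }

  public static void main(String[] args) {
    HandlerSnapshot s = new HandlerSnapshot();
    s.start(3);
    System.out.println(s.getHandlers().size());  // 3
  }
}
{code}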
[jira] [Updated] (HADOOP-7688) When a servlet filter throws an exception in init(..), the Jetty server failed silently.
[ https://issues.apache.org/jira/browse/HADOOP-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-7688: --- Fix Version/s: 0.23.11 When a servlet filter throws an exception in init(..), the Jetty server failed silently. - Key: HADOOP-7688 URL: https://issues.apache.org/jira/browse/HADOOP-7688 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0, 0.24.0 Reporter: Tsz Wo Nicholas Sze Assignee: Uma Maheswara Rao G Fix For: 1.2.0, 3.0.0, 2.0.3-alpha, 0.23.11 Attachments: HADOOP-7688-branch-1.patch, HADOOP-7688-branch-2.patch, HADOOP-7688.patch, filter-init-exception-test.patch, org.apache.hadoop.http.TestServletFilter-output.txt When a servlet filter throws a ServletException in init(..), the exception is logged by Jetty but not re-thrown to the caller. As a result, the Jetty server fails silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10504) Document proxy server support
[ https://issues.apache.org/jira/browse/HADOOP-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969789#comment-13969789 ] Kihwal Lee commented on HADOOP-10504: - {{hadoop-common-project/hadoop-common/src/site/apt/SecureMode.apt.vm}} needs to be updated. Since the change is already in, I am making it a blocker for 2.5.0. Document proxy server support - Key: HADOOP-10504 URL: https://issues.apache.org/jira/browse/HADOOP-10504 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 3.0.0, 2.5.0 Reporter: Daryn Sharp Document http proxy support introduced by HADOOP-10498. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10504) Document proxy server support
[ https://issues.apache.org/jira/browse/HADOOP-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-10504: Component/s: (was: util) documentation Priority: Blocker (was: Major) Target Version/s: 2.5.0 (was: 3.0.0, 2.5.0) Document proxy server support - Key: HADOOP-10504 URL: https://issues.apache.org/jira/browse/HADOOP-10504 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 3.0.0, 2.5.0 Reporter: Daryn Sharp Priority: Blocker Document http proxy support introduced by HADOOP-10498. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10522) JniBasedUnixGroupMapping mishandles errors
Kihwal Lee created HADOOP-10522: --- Summary: JniBasedUnixGroupMapping mishandles errors Key: HADOOP-10522 URL: https://issues.apache.org/jira/browse/HADOOP-10522 Project: Hadoop Common Issue Type: Bug Reporter: Kihwal Lee Priority: Critical The mishandling of errors in the JNI user-to-groups mapping modules can cause segmentation faults in subsequent calls. Here are the bugs: 1) If {{hadoop_user_info_fetch()}} returns an error code that is not ENOENT, the error may not be handled at all. This bug was found by [~cnauroth]. 2) In {{hadoop_user_info_fetch()}} and {{hadoop_group_info_fetch()}}, the global {{errno}} is used directly. This is not thread-safe and could be the cause of some failures that disappeared after enabling the big lookup lock. 3) In the above methods, there is no limit on retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
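The affected code is C, but the control flow the fix needs is easy to sketch. An illustrative Java version covering bugs 2 and 3 — the {{NativeLookup}} interface and the bound of 5 are hypothetical, not the actual JNI module:
{code}
import java.io.IOException;

public class BoundedRetryLookup {
  private static final int MAX_RETRIES = 5;  // assumption: any small bound

  interface NativeLookup {
    // Mirrors getgrouplist()-style calls: >= 0 on success, < 0 on failure,
    // with the error code reported through `err` rather than a shared errno.
    int fetch(String user, int[] err);
  }

  static int fetchWithBoundedRetries(String user, NativeLookup call)
      throws IOException {
    int lastErr = 0;
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      int[] err = new int[1];
      int rc = call.fetch(user, err);
      if (rc >= 0) {
        return rc;  // success
      }
      lastErr = err[0];  // capture the per-call error before retrying
    }
    // Bug 3's fix in miniature: give up after a bounded number of attempts
    // and surface the last error instead of retrying forever.
    throw new IOException("group lookup failed for " + user
        + " after " + MAX_RETRIES + " attempts, err=" + lastErr);
  }
}
{code}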