[jira] Commented: (HADOOP-894) dfs client protocol should allow asking for parts of the block map

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496236 ] Hadoop QA commented on HADOOP-894: -- +1

[jira] Created: (HADOOP-1376) RandomWriter should be tweaked to generate input data for terasort

2007-05-16 Thread Devaraj Das (JIRA)
RandomWriter should be tweaked to generate input data for terasort -- Key: HADOOP-1376 URL: https://issues.apache.org/jira/browse/HADOOP-1376 Project: Hadoop Issue Type:

[jira] Commented: (HADOOP-1205) The open method of FSNamesystem should be synchronized

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496259 ] Hadoop QA commented on HADOOP-1205: --- Integrated in Hadoop-Nightly #90 (See

[jira] Commented: (HADOOP-1358) seek call ignores result of skipBytes(int)

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496260 ] Hadoop QA commented on HADOOP-1358: --- Integrated in Hadoop-Nightly #90 (See
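For readers unfamiliar with the bug class named in the HADOOP-1358 title: `DataInput.skipBytes(int)` may skip fewer bytes than requested and returns the number actually skipped, so a seek that ignores the return value can land at the wrong offset. The following is a minimal sketch of one way to guard against that, not the committed patch; the class `SkipFullyExample` and the helper `skipFully` are hypothetical names introduced here for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class SkipFullyExample {
    /**
     * Hypothetical helper: skips exactly n bytes or throws EOFException.
     * It keeps calling skipBytes until the requested count is consumed.
     */
    public static void skipFully(DataInputStream in, int n) throws IOException {
        int remaining = n;
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped <= 0) {
                // skipBytes made no progress; read one byte to tell
                // "temporarily nothing skipped" apart from end of stream.
                if (in.read() < 0) {
                    throw new EOFException("stream ended with " + remaining + " bytes left to skip");
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5}));
        skipFully(in, 3);
        System.out.println(in.readByte()); // the byte at offset 3
    }
}
```

The design choice is to treat a short skip as a condition to retry rather than as success, and to surface a premature end of stream as an exception instead of silently continuing.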

[jira] Updated: (HADOOP-1340) md5 file in filecache should inherit replication factor from the file it belongs to.

2007-05-16 Thread Tom White (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HADOOP-1340: -- Resolution: Fixed Status: Resolved (was: Patch Available) I've just committed this. Thanks

[jira] Updated: (HADOOP-1355) Possible null pointer dereference in TaskLogAppender.append(LoggingEvent)

2007-05-16 Thread Tom White (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HADOOP-1355: -- Resolution: Fixed Status: Resolved (was: Patch Available) I've just committed this. Thanks

[jira] Updated: (HADOOP-1306) DFS Scalability: Reduce the number of getAdditionalBlock RPCs on the namenode

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HADOOP-1306: - Attachment: fineGrainLocks3.patch Merged patch with latest trunk. DFS Scalability:

[jira] Updated: (HADOOP-1306) DFS Scalability: Reduce the number of getAdditionalBlock RPCs on the namenode

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HADOOP-1306: - Attachment: (was: fineGrainLocks2.patch) DFS Scalability: Reduce the number of

[jira] Created: (HADOOP-1377) Creation time and modification time for hadoop files and directories

2007-05-16 Thread dhruba borthakur (JIRA)
Creation time and modification time for hadoop files and directories Key: HADOOP-1377 URL: https://issues.apache.org/jira/browse/HADOOP-1377 Project: Hadoop Issue Type:

Re: [jira] Commented: (HADOOP-1374) TaskTracker falls into an infinite loop.

2007-05-16 Thread Nigel Daley
This could be related to HADOOP-1332. On May 15, 2007, at 11:35 PM, Owen O'Malley (JIRA) wrote: [ https://issues.apache.org/jira/browse/HADOOP-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496206 ] Owen O'Malley commented on HADOOP-1374:

[jira] Updated: (HADOOP-894) dfs client protocol should allow asking for parts of the block map

2007-05-16 Thread Nigel Daley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nigel Daley updated HADOOP-894: --- Fix Version/s: (was: 0.13.0) 0.14.0 Moving to 0.14 as this is not a bug.

[jira] Assigned: (HADOOP-1375) a simple parser for hbase.

2007-05-16 Thread Jim Kellerman (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Kellerman reassigned HADOOP-1375: - Assignee: Jim Kellerman a simple parser for hbase. --

Re: Many Checksum Errors

2007-05-16 Thread Doug Cutting
[ Moving discussion to hadoop-dev. -drc ] Raghu Angadi wrote: This is good validation of how important ECC memory is. Currently HDFS client deletes a block when it notices a checksum error. After moving to Block level CRCs soon, we should make Datanode re-validate the block before deciding to

[jira] Resolved: (HADOOP-1106) Number of DataNodes less than target replication causes NameNode WARN message every millisecond

2007-05-16 Thread Nigel Daley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nigel Daley resolved HADOOP-1106. - Resolution: Cannot Reproduce I can't reproduce this anymore either. Number of DataNodes less

Re: Many Checksum Errors

2007-05-16 Thread Raghu Angadi
Doug Cutting wrote: [ Moving discussion to hadoop-dev. -drc ] Raghu Angadi wrote: This is good validation of how important ECC memory is. Currently HDFS client deletes a block when it notices a checksum error. After moving to Block level CRCs soon, we should make Datanode re-validate the block

[jira] Updated: (HADOOP-1357) Call to equals() comparing different types in CopyFiles.cleanup(Configuration, JobConf, String, String)

2007-05-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated HADOOP-1357: - Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this.

[jira] Created: (HADOOP-1378) DataNode log message includes toString of an array

2007-05-16 Thread Nigel Daley (JIRA)
DataNode log message includes toString of an array -- Key: HADOOP-1378 URL: https://issues.apache.org/jira/browse/HADOOP-1378 Project: Hadoop Issue Type: Bug Components: dfs
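As background on the symptom in the HADOOP-1378 title: concatenating a Java array into a string calls `Object.toString()`, which prints a type tag and hash code rather than the elements. A minimal sketch of the usual remedy follows; the class name, method name, and message text are hypothetical and are not taken from the DataNode source.

```java
import java.util.Arrays;

public class ArrayLogExample {
    /** Builds a log line that lists the array elements instead of the array's identity. */
    public static String describe(String[] targets) {
        // "..." + targets would yield something like "[Ljava.lang.String;@1b6d3586";
        // Arrays.toString renders the contents.
        return "Transmitting block to " + Arrays.toString(targets);
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[] {"node1", "node2"}));
    }
}
```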

[jira] Updated: (HADOOP-1356) ValueHistogram.addNextValue(Object) ignores return value of String.substring(int, int)

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-1356: -- Assignee: Runping Qi ValueHistogram.addNextValue(Object) ignores return value of

[jira] Created: (HADOOP-1379) Integrate Findbugs into nightly build process

2007-05-16 Thread Nigel Daley (JIRA)
Integrate Findbugs into nightly build process - Key: HADOOP-1379 URL: https://issues.apache.org/jira/browse/HADOOP-1379 Project: Hadoop Issue Type: New Feature Components: test

[jira] Commented: (HADOOP-1242) dfs upgrade/downgrade problems

2007-05-16 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496371 ] Konstantin Shvachko commented on HADOOP-1242: - I think we should target a more general task here (if

[jira] Commented: (HADOOP-1379) Integrate Findbugs into nightly build process

2007-05-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496372 ] Doug Cutting commented on HADOOP-1379: -- Can you please attach some sample findbugs output? I have no problem

RE: Many Checksum Errors

2007-05-16 Thread Hairong Kuang
What Doug suggested makes sense. We should make the initial buffer size to be bytesPerChecksum and the user defined buffer size to be the size of the second buffer. This will also solve most of the problems that I described in HADOOP-1124. Hairong -Original Message- From: Raghu Angadi

Re: Many Checksum Errors

2007-05-16 Thread Doug Cutting
Raghu Angadi wrote: In my implementation of block-level CRCs (does not affect ChecksumFileSystem in HADOOP-928), we don't buffer checksum data at all. That sounds like a good approach. I look forward to seeing the patch. We could remove buffering altogether in FileSystem level and let

Re: Many Checksum Errors

2007-05-16 Thread Raghu Angadi
Hairong Kuang wrote: What Doug suggested makes sense. We should make the initial buffer size to be bytesPerChecksum and the user defined buffer size to be the size of the second buffer. This will also solve most of the problems that I described in HADOOP-1124. But this will not fix the same

Re: Many Checksum Errors

2007-05-16 Thread Doug Cutting
Raghu Angadi wrote: But this will not fix the same problem with block-level checksums. Pretty soon, HDFS will not use ChecksumFileSystem at all. I'd hope that block-level checksums do not replicate logic from ChecksumFileSystem. Rather they should probably share substantial portions of

Re: Many Checksum Errors

2007-05-16 Thread Raghu Angadi
Doug Cutting wrote: Raghu Angadi wrote: In my implementation of block-level CRCs (does not affect ChecksumFileSystem in HADOOP-928), we don't buffer checksum data at all. That sounds like a good approach. I look forward to seeing the patch. I will prepare a temporary patch with the

[jira] Updated: (HADOOP-234) Hadoop Pipes for writing map/reduce jobs in C++ and python

2007-05-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated HADOOP-234: Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks,

Re: Many Checksum Errors

2007-05-16 Thread Raghu Angadi
Doug Cutting wrote: Raghu Angadi wrote: But this will not fix the same problem with block-level checksums. Pretty soon, HDFS will not use ChecksumFileSystem at all. I'd hope that block-level checksums do not replicate logic from ChecksumFileSystem. Rather they should probably share

[jira] Updated: (HADOOP-1379) Integrate Findbugs into nightly build process

2007-05-16 Thread Nigel Daley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nigel Daley updated HADOOP-1379: Attachment: hadoop-findbugs-report.html Findbugs is tunable, both in the detectors it uses and

[jira] Commented: (HADOOP-1374) TaskTracker falls into an infinite loop.

2007-05-16 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496397 ] Konstantin Shvachko commented on HADOOP-1374: - Not really. The first time I have seen it on a pure

[jira] Updated: (HADOOP-1356) ValueHistogram.addNextValue(Object) ignores return value of String.substring(int, int)

2007-05-16 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-1356: --- Attachment: patch-1356.txt A fix of two line changes is attached.
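The bug pattern in the issue title arises because Java strings are immutable: `String.substring(int, int)` returns a new string and leaves the receiver unchanged, so a call whose result is discarded does nothing. The sketch below illustrates the pattern and its correction in general terms; it is not the attached patch, and `SubstringExample` and `firstField` are hypothetical names.

```java
public class SubstringExample {
    /** Returns the text before the first tab, or the whole value if there is no tab. */
    public static String firstField(String value) {
        int tab = value.indexOf('\t');
        if (tab >= 0) {
            // Bug pattern: value.substring(0, tab);  (result discarded, value unchanged)
            value = value.substring(0, tab); // keep the returned string
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(firstField("abc\t12"));
    }
}
```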

[jira] Updated: (HADOOP-1356) ValueHistogram.addNextValue(Object) ignores return value of String.substring(int, int)

2007-05-16 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-1356: --- Status: Patch Available (was: Open) ValueHistogram.addNextValue(Object) ignores return value of

[jira] Commented: (HADOOP-1356) ValueHistogram.addNextValue(Object) ignores return value of String.substring(int, int)

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496402 ] Hadoop QA commented on HADOOP-1356: --- +1 http://issues.apache.org/jira/secure/attachment/12357495/patch-1356.txt

[jira] Assigned: (HADOOP-1363) waitForCompletion() calls Thread.sleep() with a lock held

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1363: - Assignee: Owen O'Malley waitForCompletion() calls Thread.sleep() with a lock held
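The concern behind this issue title is that `Thread.sleep()` does not release a held monitor, so other threads that need the same lock are blocked for the whole sleep. One common alternative is `Object.wait(long)`, which releases the monitor while waiting. The sketch below shows that general technique under the assumption that completion is signalled by another thread; it is not the change made in 1363.patch, and the class `WaitExample` and its members are hypothetical.

```java
public class WaitExample {
    private final Object lock = new Object();
    private boolean done = false;

    /** Called by the worker thread when the job finishes. */
    public void markDone() {
        synchronized (lock) {
            done = true;
            lock.notifyAll(); // wake any thread blocked in waitForCompletion
        }
    }

    /** Blocks until markDone has been called. */
    public void waitForCompletion() throws InterruptedException {
        synchronized (lock) {
            while (!done) {
                // wait releases the monitor while blocked, so markDone can
                // acquire it; Thread.sleep here would keep the lock held.
                lock.wait(100);
            }
        }
    }

    public boolean isDone() {
        synchronized (lock) {
            return done;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WaitExample job = new WaitExample();
        Thread worker = new Thread(job::markDone);
        worker.start();
        job.waitForCompletion();
        worker.join();
        System.out.println(job.isDone());
    }
}
```

The loop around `wait` re-checks the condition after every wake-up, which guards against spurious wake-ups and timeouts.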

[jira] Assigned: (HADOOP-1368) Inconsistent synchronization of 3 fields in JobInProgress.java

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1368: - Assignee: Owen O'Malley Inconsistent synchronization of 3 fields in JobInProgress.java

[jira] Assigned: (HADOOP-1369) Inconsistent synchronization of TaskTracker fields

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1369: - Assignee: Owen O'Malley Inconsistent synchronization of TaskTracker fields

[jira] Assigned: (HADOOP-1364) Inconsistent synchronization of SequenceFile$Reader.noBufferedValues; locked 66% of time

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1364: - Assignee: Owen O'Malley Inconsistent synchronization of

[jira] Updated: (HADOOP-1363) waitForCompletion() calls Thread.sleep() with a lock held

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-1363: -- Attachment: 1363.patch waitForCompletion() calls Thread.sleep() with a lock held

[jira] Updated: (HADOOP-1363) waitForCompletion() calls Thread.sleep() with a lock held

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-1363: -- Status: Patch Available (was: Open) waitForCompletion() calls Thread.sleep() with a lock

[jira] Commented: (HADOOP-1363) waitForCompletion() calls Thread.sleep() with a lock held

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496423 ] Hadoop QA commented on HADOOP-1363: --- +1 http://issues.apache.org/jira/secure/attachment/12357499/1363.patch

[jira] Commented: (HADOOP-1079) DFS Scalability: optimize processing time of block reports

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496425 ] dhruba borthakur commented on HADOOP-1079: -- The BlockReport exists because of the following reasons. 1.

[jira] Created: (HADOOP-1380) We should have a util.Subprocess class with utilities for starting subprocesses

2007-05-16 Thread Owen O'Malley (JIRA)
We should have a util.Subprocess class with utilities for starting subprocesses --- Key: HADOOP-1380 URL: https://issues.apache.org/jira/browse/HADOOP-1380 Project: Hadoop

[jira] Created: (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2007-05-16 Thread Owen O'Malley (JIRA)
The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes - Key: HADOOP-1381 URL:

[jira] Updated: (HADOOP-1368) Inconsistent synchronization of 3 fields in JobInProgress.java

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-1368: -- Status: Patch Available (was: Open) Inconsistent synchronization of 3 fields in

[jira] Assigned: (HADOOP-1079) DFS Scalability: optimize processing time of block reports

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur reassigned HADOOP-1079: Assignee: dhruba borthakur DFS Scalability: optimize processing time of block

[jira] Updated: (HADOOP-1368) Inconsistent synchronization of 3 fields in JobInProgress.java

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-1368: -- Attachment: 1368.patch Inconsistent synchronization of 3 fields in JobInProgress.java

[jira] Commented: (HADOOP-1079) DFS Scalability: optimize processing time of block reports

2007-05-16 Thread Raghu Angadi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496433 ] Raghu Angadi commented on HADOOP-1079: -- Wouldn't this result in a positive feedback loop for load on

[jira] Commented: (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2007-05-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496435 ] Doug Cutting commented on HADOOP-1381: -- reduce the overhead by a factor of 500 But if the overhead is

[jira] Commented: (HADOOP-1226) makeQualified should return an instance of a DfsPath when passed a DfsPath

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496439 ] Owen O'Malley commented on HADOOP-1226: --- +1 makeQualified should return an instance of a DfsPath when

[jira] Commented: (HADOOP-1298) adding user info to file

2007-05-16 Thread Kurtis Heimerl (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496441 ] Kurtis Heimerl commented on HADOOP-1298: Could I get this patch reviewed some time soon? adding user info

[jira] Commented: (HADOOP-1368) Inconsistent synchronization of 3 fields in JobInProgress.java

2007-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496437 ] Hadoop QA commented on HADOOP-1368: --- +1 http://issues.apache.org/jira/secure/attachment/12357503/1368.patch

[jira] Commented: (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2007-05-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496429 ] Doug Cutting commented on HADOOP-1381: -- Why would this be better? The current design is to add them as

[jira] Updated: (HADOOP-1361) seek calls in 3 io classes ignore result of skipBytes(int)

2007-05-16 Thread Hairong Kuang (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HADOOP-1361: -- Attachment: io_skip.patch seek calls in 3 io classes ignore result of skipBytes(int)

[jira] Assigned: (HADOOP-1361) seek calls in 3 io classes ignore result of skipBytes(int)

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1361: - Assignee: Hairong Kuang seek calls in 3 io classes ignore result of skipBytes(int)

[jira] Commented: (HADOOP-1079) DFS Scalability: optimize processing time of block reports

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496443 ] dhruba borthakur commented on HADOOP-1079: -- A short discussion with Sameer revealed that the case where

[jira] Updated: (HADOOP-1079) DFS Scalability: optimize processing time of block reports

2007-05-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HADOOP-1079: - Attachment: blockReportPeriod.patch Here is a sample patch that increases the

[jira] Commented: (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2007-05-16 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496432 ] Owen O'Malley commented on HADOOP-1381: --- If your input splits are roughly 128MB or so, putting in a sync