[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-18 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Status: Open (was: Patch Available) > Get representative hprof information fr

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-18 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Status: Patch Available (was: Open) > Get representative hprof information fr

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-18 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Attachment: profile-4.patch I addressed Chris' concerns, including create a unit t

[jira] Commented: (HADOOP-2404) HADOOP-2185 breaks compatibility with hadoop-0.15.0

2008-01-18 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560527#action_12560527 ] Owen O'Malley commented on HADOOP-2404: --- Konst just pointed out that w

[jira] Commented: (HADOOP-2404) HADOOP-2185 breaks compatibility with hadoop-0.15.0

2008-01-18 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560524#action_12560524 ] Owen O'Malley commented on HADOOP-2404: --- We've dealt with this case

[jira] Commented: (HADOOP-2284) BasicTypeSorterBase.compare calls progress on each compare

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560236#action_12560236 ] Owen O'Malley commented on HADOOP-2284: --- +1, as long as the unit tes

[jira] Updated: (HADOOP-2572) TaskLogServlet returns 410 when trying to access log early in task life

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2572: -- Status: Open (was: Patch Available) This should be re-written to test if the file e

[jira] Updated: (HADOOP-2469) WritableUtils.clone should take Configuration rather than JobConf

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2469: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Updated: (HADOOP-2603) SequenceFileAsBinaryInputFormat

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2603: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Updated: (HADOOP-2188) RPC should send a ping rather than use client timeouts

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2188: -- Fix Version/s: (was: 0.16.0) > RPC should send a ping rather than use client

[jira] Resolved: (HADOOP-2477) Unit test fails on Windows: TestCopyFiles.testCopyFromLocalToDfs

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-2477. --- > Unit test fails on Windows: TestCopyFiles.testCopyFromLo

[jira] Reopened: (HADOOP-2477) Unit test fails on Windows: TestCopyFiles.testCopyFromLocalToDfs

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reopened HADOOP-2477: --- > Unit test fails on Windows: TestCopyFiles.testCopyFromLo

[jira] Resolved: (HADOOP-2477) Unit test fails on Windows: TestCopyFiles.testCopyFromLocalToDfs

2008-01-17 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-2477. --- Resolution: Fixed HADOOP-2476 fixed this issue. > Unit test fails on

Re: [VOTE] Release Hadoop 0.15.3 (candidate 0)

2008-01-17 Thread Owen O'Malley
On Jan 15, 2008, at 5:10 PM, Nigel Daley wrote: I've created a candidate build for Hadoop 0.15.3. http://people.apache.org/~nigel/hadoop-0.15.3-candidate-0/ Should we release this? +1. I've run the release on a Mac in a single-node configuration and it worked fine.

[jira] Created: (HADOOP-2644) Remove the warning for attempting to override a final parameter

2008-01-17 Thread Owen O'Malley (JIRA)
Reporter: Owen O'Malley We should remove the warning that final configuration attributes are being overridden. In particular, in the task's logs, I get: {code} 2008-01-17 10:40:43,659 WARN org.apache.hadoop.conf.Configuration: /Users/oom/var/hadoop-trial/mapred/local/taskTracke

[jira] Created: (HADOOP-2628) the combiner output counter is really counting number of input keys to the combiner

2008-01-16 Thread Owen O'Malley (JIRA)
Project: Hadoop Issue Type: Bug Reporter: Owen O'Malley Fix For: 0.16.0 Currently, each key that is read by the combiner bumps the number of combiner outputs. We should have both combiner input keys and combiner output record counters.

[jira] Created: (HADOOP-2627) the map task output servlet doesn't protect against ".." attacks

2008-01-16 Thread Owen O'Malley (JIRA)
Issue Type: Bug Components: mapred Reporter: Owen O'Malley Fix For: 0.17.0 The servlet we use to export the map outputs doesn't protect itself against ".." attacks. However, because the code adds a /file.out.index and /file.out to it, it can on

[jira] Commented: (HADOOP-2614) dfs web interfaces should run as a configurable user account

2008-01-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559286#action_12559286 ] Owen O'Malley commented on HADOOP-2614: --- My security hackles get

[jira] Commented: (HADOOP-2431) Test HDFS File Permissions

2008-01-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559145#action_12559145 ] Owen O'Malley commented on HADOOP-2431: --- I'm with Doug on this one

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Status: Patch Available (was: Open) This patch is fixed with regard to trunk.

[jira] Updated: (HADOOP-2610) [HQL] Clarify the return policy

2008-01-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2610: -- Description: Current ReturnMsg class and the return policy are problematic. Let'

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Attachment: profile-3.patch Updated to trunk. > Get representative hprof informat

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Status: Open (was: Patch Available) Updating to current trunk. > Get representati

[jira] Updated: (HADOOP-2603) SequenceFileAsBinaryInputFormat

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2603: -- Status: Open (was: Patch Available) This looks good, except you need to add java doc fo

[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558834#action_12558834 ] owen.omalley edited comment on HADOOP-1985 at 1/14/08 3:09 PM: -

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558834#action_12558834 ] Owen O'Malley commented on HADOOP-1985: --- I'm worried about the t

[jira] Resolved: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0

2008-01-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-2516. --- Resolution: Won't Fix Assignee: Owen O'Malley (was: Arun C Murthy) I

[jira] Commented: (HADOOP-2581) Counters and other useful stats should be logged into Job History log

2008-01-11 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558154#action_12558154 ] Owen O'Malley commented on HADOOP-2581: --- It is already in trunk and there

[jira] Commented: (HADOOP-2581) Counters and other useful stats should be logged into Job History log

2008-01-11 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558101#action_12558101 ] Owen O'Malley commented on HADOOP-2581: --- The counters *are* logged in jo

[jira] Commented: (HADOOP-2560) Combining multiple input blocks into one mapper

2008-01-11 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558042#action_12558042 ] Owen O'Malley commented on HADOOP-2560: --- I like that approach, Doug. We sh

[jira] Issue Comment Edited: (HADOOP-2116) Job.local.dir to be exposed to tasks

2008-01-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557918#action_12557918 ] owen.omalley edited comment on HADOOP-2116 at 1/10/08 10:42 PM:

[jira] Commented: (HADOOP-2116) Job.local.dir to be exposed to tasks

2008-01-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557918#action_12557918 ] Owen O'Malley commented on HADOOP-2116: --- *Ugh* is right. I'd propo

[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227

2008-01-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557858#action_12557858 ] Owen O'Malley commented on HADOOP-2570: --- I agree with Milind that the best

[jira] Commented: (HADOOP-2298) ant target without source and docs

2008-01-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557855#action_12557855 ] Owen O'Malley commented on HADOOP-2298: --- +1 on "binary" target

[jira] Updated: (HADOOP-2399) Input key and value to combiner and reducer should be reused

2008-01-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2399: -- Attachment: reuse-obj.patch This is a rough patch that isn't quite working yet.

[jira] Updated: (HADOOP-2562) globPaths does not support {ab,cd} as it claims to

2008-01-09 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2562: -- Fix Version/s: (was: 0.16.0) 0.15.3 > globPaths does not sup

Re: Slip Hadoop 0.16.0 Feature Freeze?

2008-01-09 Thread Owen O'Malley
On Jan 9, 2008, at 11:08 AM, Nigel Daley wrote: So any objections to 0.16 feature freeze on Tuesday Jan 15 EOD? I think we should stick with the Friday feature freeze. How about Jan 18? -- Owen

[jira] Updated: (HADOOP-2092) Pipes C++ task does not die even if the Java tasks die

2008-01-08 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2092: -- Status: Open (was: Patch Available) A couple of things bother me: 1. There is no lo

Re: [VOTE] Release Hadoop 0.15.2 (candidate 0)

2008-01-07 Thread Owen O'Malley
On Dec 29, 2007, at 11:09 PM, Nigel Daley wrote: I've created a candidate build for Hadoop 0.15.2. http://people.apache.org/~nigel/hadoop-0.15.2-candidate-0/ Should we release this? +1. I installed and ran a distributed word count on my Mac laptop. -- Owen

Re: [VOTE] Release Hadoop 0.15.2 (candidate 0)

2008-01-03 Thread Owen O'Malley
On Jan 3, 2008, at 3:41 AM, Arun C Murthy wrote: Nigel Daley wrote: I've created a candidate build for Hadoop 0.15.2. http://people.apache.org/~nigel/hadoop-0.15.2-candidate-0/ Should we release this? What is the general opinion on http://issues.apache.org/jira/browse/HADOOP-2516? I re

[jira] Resolved: (HADOOP-308) Task Tracker does not handle the case of read only local dir case correctly

2008-01-03 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-308. -- Resolution: Duplicate Jiras that are fixed as part of other jiras should be marked dupli

[jira] Reopened: (HADOOP-308) Task Tracker does not handle the case of read only local dir case correctly

2008-01-03 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reopened HADOOP-308: -- > Task Tracker does not handle the case of read only local dir case c

[jira] Commented: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0

2008-01-03 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555805#action_12555805 ] Owen O'Malley commented on HADOOP-2516: --- There isn't a way

[jira] Commented: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0

2008-01-03 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1285#action_1285 ] Owen O'Malley commented on HADOOP-2516: --- getTracker was removed, because t

[jira] Commented: (HADOOP-2501) Implement utility-tools for working with SequenceFiles

2007-12-29 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554923 ] Owen O'Malley commented on HADOOP-2501: --- The data can be extracted via: {code} bin/hadoop fs -text bla

[jira] Resolved: (HADOOP-1661) TextOutputFormat should ignore NullWritables like null values.

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-1661. --- Resolution: Duplicate Fix Version/s: 0.16.0 This was fixed by HADOOP

[jira] Updated: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2425: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Commented: (HADOOP-2469) WritableUtils.clone should take Configuration rather than JobConf

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554537 ] Owen O'Malley commented on HADOOP-2469: --- I think so. It is an unlikely method for users to

[jira] Commented: (HADOOP-2284) BasicTypeSorterBase.compare calls progress on each compare

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554528 ] Owen O'Malley commented on HADOOP-2284: --- I agree with Devaraj. The cost of gettimeofday is huge when put

[jira] Updated: (HADOOP-2285) TextInputFormat is slow compared to reading files.

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2285: -- Status: Patch Available (was: Open) > TextInputFormat is slow compared to readi

[jira] Updated: (HADOOP-2285) TextInputFormat is slow compared to reading files.

2007-12-26 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2285: -- Attachment: fast-line.patch Ok, here is a patch that does: 1. Avoids encoding the data

[jira] Commented: (HADOOP-2208) Reduce frequency of Counter updates in the task tracker status

2007-12-24 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554316 ] Owen O'Malley commented on HADOOP-2208: --- I'm pretty worried about the approach of this patch. I

[jira] Assigned: (HADOOP-2285) TextInputFormat is slow compared to reading files.

2007-12-23 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-2285: - Assignee: Owen O'Malley > TextInputFormat is slow compared to

[jira] Commented: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-21 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554092 ] Owen O'Malley commented on HADOOP-2425: --- Chris ran a benchmark that writes 2g of data through befor

[jira] Updated: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2425: -- Status: Patch Available (was: Open) > TextOutputFormat should special c

[jira] Updated: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2425: -- Attachment: text-out-format-2.patch I forgot NullWritable last time, which I

[jira] Commented: (HADOOP-2228) Jobs fail because job.xml exists

2007-12-15 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552095 ] Owen O'Malley commented on HADOOP-2228: --- +1 this makes sense. If the submit failed on the client, but no

[jira] Resolved: (HADOOP-2213) Job submission gets Job tracker still initializing message while Namenode is in safemode

2007-12-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-2213. --- Resolution: Won't Fix > Job submission gets Job tracker still initializing

[jira] Commented: (HADOOP-2433) Streaming: org.apache.hadoop.mapred.lib.IdentityMapper should not inserted unnecessary keys

2007-12-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552032 ] Owen O'Malley commented on HADOOP-2433: --- The problem of course is that TextInputFormat returns the offs

[jira] Updated: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2425: -- Attachment: text-out-format.patch This patch changes: 1. TextOutputFormat detects Tex

[jira] Updated: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-14 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2425: -- Status: Patch Available (was: Open) > TextOutputFormat should special c

[jira] Created: (HADOOP-2428) we should add a wait for non-safe mode and call dfsadmin -report in start-dfs

2007-12-14 Thread Owen O'Malley (JIRA)
Issue Type: Improvement Components: scripts Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 I think we should add a call to wait for safe mode exit and print the dfs report to show upgrades that are in progress.

Re: New to Hadoop

2007-12-14 Thread Owen O'Malley
On Dec 14, 2007, at 3:33 AM, Goel, Ankur wrote: As a part of my medium and long term plans I intend to continuously use hadoop for executing my Map/Reduce jobs as well as make active contributions to Hadoop in the form of patches for existing bugs/features/enhancements that exist in Hadoop JIRA

[jira] Resolved: (HADOOP-1827) Reducer.reduce method's OutputCollector is too strict, it shoudn't need the key to be WritableComparable

2007-12-13 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-1827. --- Resolution: Won't Fix > Reducer.reduce method's OutputCollector is too s

[jira] Created: (HADOOP-2425) TextOutputFormat should special case Text

2007-12-13 Thread Owen O'Malley (JIRA)
Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 TextOutputFormat is spending a noticeable amount of time encoding and then decoding the bytes in Text objects before they are sent to the output. We should handle this as a special case.

[jira] Updated: (HADOOP-2248) Word count example is spending 24% of the time in incrCounter

2007-12-13 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2248: -- Attachment: counter-speedup-3.patch I found and fixed the problem. The last problem

[jira] Updated: (HADOOP-2248) Word count example is spending 24% of the time in incrCounter

2007-12-13 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2248: -- Status: Open (was: Patch Available) > Word count example is spending 24% of the

[jira] Updated: (HADOOP-2248) Word count example is spending 24% of the time in incrCounter

2007-12-13 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2248: -- Status: Patch Available (was: Open) Resubmitting for QA. > Word count example is

Re: Hadoop TLP?

2007-12-13 Thread Owen O'Malley
On Dec 13, 2007, at 3:21 PM, Doug Cutting wrote: I think it is time that we make Hadoop a top level project (TLP) at Apache +1

[jira] Created: (HADOOP-2399) Input key and value to combiner and reducer should be reused

2007-12-10 Thread Owen O'Malley (JIRA)
Components: mapred Affects Versions: 0.15.1 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 Currently, the input key and value are recreated on every iteration for input to the combiner and reducer. It would speed up the system substa

[jira] Updated: (HADOOP-2248) Word count example is spending 24% of the time in incrCounter

2007-12-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2248: -- Status: Patch Available (was: Open) > Word count example is spending 24% of the

[jira] Updated: (HADOOP-2248) Word count example is spending 24% of the time in incrCounter

2007-12-10 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2248: -- Attachment: counter-speedup-2.patch Fixes the parts that went out of date, while waitin

[jira] Commented: (HADOOP-2385) Validate configuration parameters

2007-12-09 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549926 ] Owen O'Malley commented on HADOOP-2385: --- I can see the motivation/frustration for this, but it doesn't

[jira] Created: (HADOOP-2383) The combiner in pipes is closed before the last values are passed in.

2007-12-07 Thread Owen O'Malley (JIRA)
Issue Type: Bug Components: pipes Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 Currently the last spill is sent to the combiner after the close method is called.

[jira] Updated: (HADOOP-2158) hdfsListDirectory in libhdfs does not scale

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2158: -- Status: Patch Available (was: Open) Promoting this for Christian. > hdfsListDire

[jira] Updated: (HADOOP-2359) PendingReplicationMonitor thread received exception. java.lang.InterruptedException

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2359: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Updated: (HADOOP-2313) build does not fail when libhdfs build fails

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2313: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Updated: (HADOOP-2271) chmod in ant package target fails

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2271: -- Resolution: Fixed Fix Version/s: 0.16.0 Status: Resolved (was:

[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549578 ] Owen O'Malley commented on HADOOP-2232: --- Makund, Can you please run a 500 node sort and look for

[jira] Updated: (HADOOP-2085) Map-side joins on sorted, equally-partitioned datasets

2007-12-07 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2085: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed

[jira] Updated: (HADOOP-2376) The sort example shouldn't override the number of maps

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2376: -- Status: Patch Available (was: Open) > The sort example shouldn't override the n

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Status: Patch Available (was: Open) > Get representative hprof information fr

[jira] Updated: (HADOOP-2376) The sort example shouldn't override the number of maps

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2376: -- Attachment: sort-no-maps.patch This just pulls out the code from the example that is t

[jira] Created: (HADOOP-2376) The sort example shouldn't override the number of maps

2007-12-06 Thread Owen O'Malley (JIRA)
Components: examples Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 The sort example currently overrides the number of maps. It should just take the default, because the current behavior can end up with a bad number of maps by default. In part

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Attachment: profile-2.patch This patch is the previous one where the client gets the pr

[jira] Commented: (HADOOP-2367) Get representative hprof information from tasks

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549326 ] Owen O'Malley commented on HADOOP-2367: --- The race condition between the subprocess finishing and the

[jira] Created: (HADOOP-2375) Task tracker should wait for the process to exit before declaring the task successful or failed.

2007-12-06 Thread Owen O'Malley (JIRA)
-2375 Project: Hadoop Issue Type: Bug Components: mapred Reporter: Owen O'Malley Currently when a task declares it is done, the status in the task tracker is changed immediately. Instead it should wait for the subprocess to actually be done befo

Re: JDiff for Hadoop?

2007-12-06 Thread Owen O'Malley
On Dec 6, 2007, at 1:05 PM, Doug Cutting wrote: Someone has suggested adding JDiff output to Lucene's documentation. Should we add this to Hadoop's documentation too? +1

[jira] Updated: (HADOOP-2367) Get representative hprof information from tasks

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2367: -- Attachment: profile.patch This patch adds a config variable to enable profiling of the

[jira] Resolved: (HADOOP-1327) Doc on Streaming

2007-12-06 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HADOOP-1327. --- Resolution: Fixed Fix Version/s: 0.16.0 I just committed this. Thanks, Rob!

[jira] Created: (HADOOP-2367) Get representative hprof information from tasks

2007-12-06 Thread Owen O'Malley (JIRA)
Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 It would be great to get a representative (2 or 3) sample of builtin java profiler for a sample of maps and reduces. I'd store the information in the userlog directory and make it a

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Attachment: throughput.patch I forgot to restart the time on the local read and

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Status: Open (was: Patch Available) > create a micro-benchmark for measure local-fil

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Status: Patch Available (was: Open) Need to be re-reviewed by QA. > create

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Attachment: (was: throughput.patch) > create a micro-benchmark for measure lo

[jira] Assigned: (HADOOP-1327) Doc on Streaming

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HADOOP-1327: - Assignee: Rob Weltman > Doc on Streaming

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Attachment: throughput.patch This benchmark reads and writes files using ja

[jira] Updated: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-05 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-2342: -- Status: Patch Available (was: Open) > create a micro-benchmark for measure local-fil

[jira] Created: (HADOOP-2359) PendingReplicationMonitor thread received exception. java.lang.InterruptedException

2007-12-05 Thread Owen O'Malley (JIRA)
: Hadoop Issue Type: Bug Components: dfs Affects Versions: 0.16.0 Reporter: Owen O'Malley Assignee: dhruba borthakur Fix For: 0.16.0 I sometimes get the message: 07/12/05 19:01:36 WARN fs.FSNamesystem: PendingReplicationMonitor t

[jira] Commented: (HADOOP-2342) create a micro-benchmark for measure local-file versus hdfs read

2007-12-04 Thread Owen O'Malley (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548351 ] Owen O'Malley commented on HADOOP-2342: --- The numbers I'm seeing for reads and writes on 10GB are:
