Re: Issue in reduce phase with SortedMapWritable and custom Writables as values
Okay, I think I'm getting closer, but now I'm running into another problem. First off, I created my own CustomMapWritable that extends MapWritable and invokes AbstractMapWritable.addToMap() to add my custom classes. Now the map/reduce phases actually complete and the job as a whole completes. However, when I try to use the SequenceFile API to later read the output data, I'm getting a strange exception. First the code:

  FileSystem fileSys = FileSystem.get(conf);
  SequenceFile.Reader reader = new SequenceFile.Reader(fileSys, inFile, conf);
  Text key = new Text();
  CustomWritable stats = new CustomWritable();
  reader.next(key, stats);
  reader.close();

And now the exception that's thrown:

  java.io.IOException: can't find class: com.test.CustomStatsWritable because com.test.CustomStatsWritable
    at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:210)
    at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:145)
    at com.test.CustomStatsWritable.readFields(UserStatsWritable.java:49)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
    ...

Any ideas?

Thanks, Ryan

On Tue, Sep 9, 2008 at 12:36 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:

Hello,

I'm attempting to use a SortedMapWritable with a LongWritable as the key and a custom implementation of org.apache.hadoop.io.Writable as the value. I notice that my program works fine when I use another primitive wrapper (e.g. Text) as the value, but fails with the following exception when I use my custom Writable instance:

  2008-09-08 23:25:02,072 INFO org.apache.hadoop.mapred.ReduceTask: Initiating in-memory merge with 1 segments...
  2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
  2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 5492 bytes
  2008-09-08 23:25:02,099 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200809082247_0005_r_00_0 Merge of the inmemory files threw an exception:
  java.io.IOException: Intermedate merge failed
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
  Caused by: java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
    at org.apache.hadoop.io.SortedMapWritable.readFields(SortedMapWritable.java:179)
    ...

I noticed that the AbstractMapWritable class has a protected addToMap(Class clazz) method. Do I somehow need to let my SortedMapWritable instance know about my custom Writable value? I've properly implemented the custom Writable object (it just contains a few primitives, like longs and ints).

Any insight is appreciated.

Thanks, Ryan
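For reference, a minimal sketch of the subclass being described, assuming the custom value class is the com.test.CustomStatsWritable that appears in the stack trace (the names come from this thread, not from real code):

package com.test;

import org.apache.hadoop.io.MapWritable;

// Hypothetical sketch: register the custom value class with the map's
// class-to-id table so readFields() can re-instantiate it on the reading
// side. The same idea applies to a SortedMapWritable subclass.
public class CustomMapWritable extends MapWritable {
  public CustomMapWritable() {
    super();
    // AbstractMapWritable.addToMap(Class) is protected, so it is callable here
    addToMap(CustomStatsWritable.class);
  }
}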
Re: public IP for datanode on EC2
I think most people try to avoid allowing remote access for security reasons. If you can add a file, I can mount your filesystem too, maybe even delete things. Whereas with EC2-only filesystems, your files are *only* exposed to everyone else that knows or can scan for your IPAddr and ports. I imagine that the access to the ports used by HDFS could be restricted to specific IPs using the EC2 group (ec2-authorize) or any other firewall mechanism if necessary. Could anyone confirm that there is no conf parameter I could use to force the address of my DataNodes? Thanks Julien -- DigitalPebble Ltd http://www.digitalpebble.com
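Purely as an illustration of the ec2-authorize idea: the group name and source address below are made up, 50010 is the default DataNode data port and 50070 the NameNode web UI port in this Hadoop generation, and the old EC2 API tools syntax is assumed:

# Illustrative only: restrict HDFS ports to a single trusted source address
ec2-authorize my-hadoop-group -P tcp -p 50010 -s 203.0.113.7/32
ec2-authorize my-hadoop-group -P tcp -p 50070 -s 203.0.113.7/32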
Re: Issue in reduce phase with SortedMapWritable and custom Writables as values
Based on some similar problems that I found others were having in the mailing lists, it looks like the solution was to list my Map/Reduce job JAR in the conf/hadoop-env.sh file under HADOOP_CLASSPATH. After doing that and re-submitting the job, it all worked fine! I guess the MapWritable class somehow doesn't share the same classpath as the program that actually submits the job conf. Is this expected?

Thanks, Ryan
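The fix described above amounts to one line in conf/hadoop-env.sh on the machine that submits the job (the jar path is illustrative):

# conf/hadoop-env.sh
export HADOOP_CLASSPATH=/path/to/job-with-custom-writables.jar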
Re: Thinking about retrieving DFS metadata from datanodes!!!
This isn't a very stable direction. You really don't want multiple distinct methods for storing the metadata, because discrepancies are very bad. High Availability (HA) is a very important medium term goal for HDFS, but it will likely be done using multiple NameNodes and ZooKeeper. -- Owen
Re: Failing MR jobs!
On Sep 7, 2008, at 12:26 PM, Erik Holstad wrote:

Hi! I'm trying to run a MR job, but it keeps on failing and I can't understand why. Sometimes it shows output at 66% and sometimes 98% or so. I had a couple of exceptions before that I didn't catch, which made the job fail. The log file from the task can be found at: http://pastebin.com/m4414d369

From the logs it looks like the TaskTracker killed your reduce task because it didn't report any progress for 10 mins, which is the default timeout. FWIW it's probably because _one_ of the calls to your 'reduce' function got stuck trying to communicate with one of the external resources you are using...

Arun
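For illustration, the usual ways around that timeout are to call Reporter.progress() while waiting on the external resource, or to raise mapred.task.timeout in the job conf. A minimal sketch against the 0.17/0.18-era API follows; the external call is hypothetical and not Erik's actual code:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Sketch: keep the TaskTracker informed during slow external calls so the
// attempt is not killed after mapred.task.timeout (10 minutes by default).
public class SlowExternalReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      Text value = values.next();
      // callSlowExternalService(value);  // hypothetical external call
      reporter.progress();                // heartbeat: "still making progress"
      output.collect(key, value);
    }
  }
}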
Re: distcp failing
I'm not sure that's the issue. I basically tarred up the hadoop directory from the cluster and copied it over to the non-datanode machine, but I do agree I've likely got a setting wrong, since I can run distcp from the namenode and it works fine. The question is which one.

On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball [EMAIL PROTECTED] wrote:

It is likely that your mapred.system.dir and/or fs.default.name settings are incorrect on the non-datanode machine that you are launching the task from. These two settings (in your conf/hadoop-site.xml file) must match the settings on the cluster itself.

- Aaron

On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico [EMAIL PROTECTED] wrote:

I'm attempting to load data into hadoop (version 0.17.1) from a non-datanode machine in the cluster. I can run jobs and copyFromLocal works fine, but when I try to use distcp I get the below. I don't understand the error; can anyone help? Thanks

  blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite file:///Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
  08/09/07 23:56:06 INFO util.CopyFiles: srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
  08/09/07 23:56:06 INFO util.CopyFiles: destPath=/user/mdidomenico/1gTestfile1
  08/09/07 23:56:07 INFO util.CopyFiles: srcCount=1
  With failures, global counters are inaccurate; consider running with -i
  Copy failed: org.apache.hadoop.ipc.RemoteException: java.io.IOException: /tmp/hadoop-hadoop/mapred/system/job_200809072254_0005/job.xml: No such file or directory
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:215)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:149)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1155)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1136)
    at org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:175)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1755)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    at org.apache.hadoop.ipc.Client.call(Client.java:557)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
    at $Proxy1.submitJob(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.submitJob(Unknown Source)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:758)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
    at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
    at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
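For illustration, the two settings Aaron names live in conf/hadoop-site.xml on the submitting machine; the values below are made up and must match whatever the cluster itself uses:

<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:9000</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
</property>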
Re: distcp failing
A little more digging, and it appears I cannot run distcp as someone other than hadoop on the namenode.

  /tmp/hadoop-hadoop/mapred/system/job_200809091231_0005/job.xml

Looking at this directory from the error file, the system directory does not exist on the namenode; I only have a local directory.
Re: specifying number of nodes for job
On Mon, Sep 8, 2008 at 4:26 PM, Sandy [EMAIL PROTECTED] wrote:

In all seriousness though, why is this not possible? Is there something about the MapReduce model of parallel computation that I am not understanding? Or is this more of an arbitrary implementation choice made by the Hadoop framework? If so, I am curious why this is the case. What are the benefits?

It is possible to do with changes to Hadoop. There was a jira filed for it, but I don't think anyone has worked on it. (HADOOP-2573) For Map/Reduce, it is a design goal that the number of tasks, not nodes, is the important metric. You want a job to be able to run with any given cluster size. For scalability testing, you could just remove task trackers...

-- Owen
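For reference, "removing task trackers" for a scalability test can be done per node with the stock daemon script; a sketch, to be run on each slave you want out of the pool:

bin/hadoop-daemon.sh stop tasktracker
# ...and later, to bring the node back into the pool:
bin/hadoop-daemon.sh start tasktracker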
Re: distcp failing
Manually creating the system directory gets me past the first error, but now I get this. I'm not necessarily sure it's a step forward, though, because the map task never shows up in the jobtracker.

  [EMAIL PROTECTED] hadoop-0.17.1]$ bin/hadoop distcp file:///home/mdidomenico/1gTestfile 1gTestfile
  08/09/09 13:12:06 INFO util.CopyFiles: srcPaths=[file:/home/mdidomenico/1gTestfile]
  08/09/09 13:12:06 INFO util.CopyFiles: destPath=1gTestfile
  08/09/09 13:12:07 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:07 INFO dfs.DFSClient: Abandoning block blk_5758513071638050362
  08/09/09 13:12:13 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:13 INFO dfs.DFSClient: Abandoning block blk_1691495306775808049
  08/09/09 13:12:17 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:17 INFO dfs.DFSClient: Abandoning block blk_1027634596973755899
  08/09/09 13:12:19 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:19 INFO dfs.DFSClient: Abandoning block blk_4535302510016050282
  08/09/09 13:12:23 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:23 INFO dfs.DFSClient: Abandoning block blk_7022658012001626339
  08/09/09 13:12:25 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:25 INFO dfs.DFSClient: Abandoning block blk_-4509681241839967328
  08/09/09 13:12:29 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:29 INFO dfs.DFSClient: Abandoning block blk_8318033979013580420
  08/09/09 13:12:31 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
  08/09/09 13:12:31 WARN dfs.DFSClient: Error Recovery for block blk_-4509681241839967328 bad datanode[0]
  08/09/09 13:12:35 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:35 INFO dfs.DFSClient: Abandoning block blk_2848354798649979411
  08/09/09 13:12:41 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
  08/09/09 13:12:41 WARN dfs.DFSClient: Error Recovery for block blk_2848354798649979411 bad datanode[0]
  Exception in thread "Thread-0" java.util.ConcurrentModificationException
    at java.util.TreeMap$PrivateEntryIterator.nextEntry(Unknown Source)
    at java.util.TreeMap$KeyIterator.next(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
  08/09/09 13:12:41 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
  08/09/09 13:12:41 INFO dfs.DFSClient: Abandoning block blk_9189111926428577428
Re: Monthly Hadoop User Group Meeting (Bay Area)
Chris K Wensel wrote: doh, conveniently collides with the GridGain and GridDynamics presentations: http://web.meetup.com/66/calendar/8561664/

Bay Area Hadoop User Group meetings are held on the third Wednesday of every month. This has been on the calendar for quite a while.

Doug
Re: distcp failing
Apparently the fix to my original error is that hadoop is set up for a single local machine out of the box, and I had to change these directories to be on hdfs instead of under hadoop.tmp.dir:

  <property>
    <name>mapred.local.dir</name>
    <value>/hadoop/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>
  <property>
    <name>mapred.temp.dir</name>
    <value>/hadoop/mapred/temp</value>
  </property>

So now distcp works as a non-hadoop user and mapred works as a non-hadoop user from the name node. However, from a workstation I now get this:

  blue:hadoop-0.17.1 mdidomenico$ bin/hadoop distcp file:///Users/mdidomenico/hadoop/1gTestfile 1gTestfile-1
  08/09/09 13:44:19 INFO util.CopyFiles: srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
  08/09/09 13:44:19 INFO util.CopyFiles: destPath=1gTestfile-1
  08/09/09 13:44:20 INFO util.CopyFiles: srcCount=1
  08/09/09 13:44:22 INFO mapred.JobClient: Running job: job_200809091332_0004
  08/09/09 13:44:23 INFO mapred.JobClient: map 0% reduce 0%
  08/09/09 13:44:31 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_00_0, Status : FAILED
  java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
  08/09/09 13:44:50 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_00_1, Status : FAILED
  java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
  08/09/09 13:45:07 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_00_2, Status : FAILED
  java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
  08/09/09 13:45:26 INFO mapred.JobClient: map 100% reduce 100%
  With failures, global counters are inaccurate; consider running with -i
  Copy failed: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
    at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
    at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
Re: Monthly Hadoop User Group Meeting (Bay Area)
Chris K Wensel wrote: doh, conveniently collides with the GridGain and GridDynamics presentations: http://web.meetup.com/66/calendar/8561664/

Bay Area Hadoop User Group meetings are held on the third Wednesday of every month. This has been on the calendar for quite a while. Doug

maybe I should have said, coincidentally.

-- Chris K Wensel [EMAIL PROTECTED] http://chris.wensel.net/ http://www.cascading.org/
Re: Thinking about retrieving DFS metadata from datanodes!!!
+1 - from the perspective of the data nodes, dfs is just a block-level store and is thus much more robust and scalable. On 9/9/08 9:14 AM, Owen O'Malley [EMAIL PROTECTED] wrote: This isn't a very stable direction. You really don't want multiple distinct methods for storing the metadata, because discrepancies are very bad. High Availability (HA) is a very important medium term goal for HDFS, but it will likely be done using multiple NameNodes and ZooKeeper. -- Owen
Re: distcp failing
Looking in the task tracker log, I see this. The file does exist on my local workstation, but it does not exist on the namenode/datanodes in my cluster. So that raises the question: have I misunderstood the use of distcp, or is there still something wrong? I'm looking for something that will read a file from my workstation and load it into the dfs, but instead of going through the namenode like copyFromLocal seems to do, I'd like it to load the data via the datanodes directly. If distcp doesn't do it this way, is there anything that will?

  2008-09-09 14:00:54,418 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
  2008-09-09 14:00:54,662 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
  2008-09-09 14:00:54,894 INFO org.apache.hadoop.util.CopyFiles: FAIL 1gTestfile : java.io.FileNotFoundException: File file:/Users/mdidomenico/hadoop/1gTestfile does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:242)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.init(ChecksumFileSystem.java:116)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:274)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:380)
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.copy(CopyFiles.java:366)
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:493)
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:268)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
  2008-09-09 14:01:03,950 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
  java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
    at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
Re: Hadoop Streaming and Multiline Input
If I understand your question correctly, you need to write your own FileInputFormat. Please see http://hadoop.apache.org/core/docs/r0.18.0/api/index.html for details.

Regards, Tim

On Sat, Sep 6, 2008 at 9:20 PM, Dennis Kubes [EMAIL PROTECTED] wrote:

Is it possible to set a multiline text input in streaming to be used as a single record? For example, say I wanted to scan a webpage for a specific regex that is multiline, is this possible in streaming?

Dennis
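As a rough illustration of the "write your own FileInputFormat" suggestion, here is a hedged sketch against the old (0.18-era) org.apache.hadoop.mapred API that hands each file to the mapper as a single record, so a multi-line regex can be applied to the whole page. Class names are made up, and wiring it into streaming takes extra care (the class must be on the job classpath and passed with -inputformat):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Sketch: treat each input file as one record so multi-line patterns can be
// matched against the whole page at once.
public class WholeFileInputFormat extends FileInputFormat<NullWritable, Text> {

  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;  // never split a file across mappers
  }

  public RecordReader<NullWritable, Text> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }

  static class WholeFileRecordReader implements RecordReader<NullWritable, Text> {
    private final FileSplit split;
    private final JobConf job;
    private boolean done = false;

    WholeFileRecordReader(FileSplit split, JobConf job) {
      this.split = split;
      this.job = job;
    }

    public boolean next(NullWritable key, Text value) throws IOException {
      if (done) {
        return false;
      }
      Path file = split.getPath();
      FSDataInputStream in = file.getFileSystem(job).open(file);
      try {
        // file length == split length because the file is not splitable
        byte[] contents = new byte[(int) split.getLength()];
        IOUtils.readFully(in, contents, 0, contents.length);
        value.set(contents, 0, contents.length);
      } finally {
        IOUtils.closeStream(in);
      }
      done = true;
      return true;
    }

    public NullWritable createKey() { return NullWritable.get(); }
    public Text createValue() { return new Text(); }
    public long getPos() { return done ? split.getLength() : 0; }
    public float getProgress() { return done ? 1.0f : 0.0f; }
    public void close() { }
  }
}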
Re: Hadoop Streaming and Multiline Input
If your webpage is xml-tagged and you are looking into using streaming, this might help: http://hadoop.apache.org/core/docs/r0.18.0/streaming.html#How+do+I+parse+XML+documents+using+streaming%3F

-Lohit

----- Original Message ----
From: Jim Twensky [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Sent: Tuesday, September 9, 2008 11:23:37 AM
Subject: Re: Hadoop Streaming and Multiline Input
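The FAQ entry Lohit points to boils down to streaming's -inputreader option; a hedged sketch of the invocation (jar name, paths, mapper, and the <page> tags are illustrative):

bin/hadoop jar hadoop-0.18.0-streaming.jar \
  -input /pages \
  -output /pages-out \
  -mapper my-regex-mapper.sh \
  -inputreader "StreamXmlRecord,begin=<page>,end=</page>" \
  -jobconf mapred.reduce.tasks=0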
Question on Streaming
Hello,

I need to use Hadoop Streaming to run several instances of a single program on different files. Before doing it, I wrote a simple test application as the mapper, which basically outputs the standard input without doing anything useful. So it looks like the following:

  ---echo.sh---
  echo Running mapper, input is $1
  ---echo.sh---

For the input, I created a single text file input.txt that has numbers from 1 to 10, one on each line, so it goes like:

  ---input.txt---
  1
  2
  ..
  10
  ---input.txt---

I uploaded input.txt to the hdfs://stream/ directory and then ran the Hadoop Streaming utility as follows:

  bin/hadoop jar hadoop-0.18.0-streaming.jar \
    -input /stream \
    -output /trash \
    -mapper echo.sh \
    -file echo.sh \
    -jobconf mapred.reduce.tasks=0

From what I understood in the streaming tutorial, I expected that each mapper would run an instance of echo.sh with one of the lines in input.txt, so I expected to get output in the form of:

  Running mapper, input is 2
  Running mapper, input is 5
  ...

and so on, but I got only two output files, part-0 and part-1, that contain the string "Running mapper, input is ". As far as I can see, the mappers ran the mapper script echo.sh without the standard input. I basically followed the tutorial and I'm confused now, so could you please tell me what I'm missing here?

Thanks in advance, Jim
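For context, streaming mappers receive their records on standard input rather than as command-line arguments, so a test script along these lines (a hedged sketch, not Jim's actual script) would echo each line back:

#!/bin/sh
# Streaming hands the mapper its records on stdin, one per line; $1 is empty.
while read line; do
  echo "Running mapper, input is $line"
done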
output multiple values?
I have a simple reducer that computes the average by doing a sum/count. But I want to output both the average and the count for a given key, not just the average. Is it possible to output both values from the same invocation of the reducer? Or do I need two reducer invocations? If I try to call output.collect() twice from the reducer and label the key with type=avg or type=count, I get a bunch of garbage out. Please let me know if you have any suggestions.

Thanks, Shirley
Re: output multiple values?
On Sep 9, 2008, at 12:20 PM, Shirley Cohen wrote:

I have a simple reducer that computes the average by doing a sum/count. But I want to output both the average and the count for a given key, not just the average. Is it possible to output both values from the same invocation of the reducer? Or do I need two reducer invocations? If I try to call output.collect() twice from the reducer and label the key with type=avg or type=count, I get a bunch of garbage out. Please let me know if you have any suggestions.

I'd be tempted to define a type like:

  class AverageAndCount implements Writable {
    private long sum;
    private long count;
    ...
    public String toString() {
      return "avg = " + (sum / (double) count) + ", count = " + count;
    }
  }

Then you could use your reducer as both a combiner and reducer and you would get both values out if you use TextOutputFormat. That said, it should absolutely work to do collect twice.

-- Owen
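Filling in the elided parts purely as illustration, a hedged guess at what a complete version might look like (this is not Owen's actual class):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Sketch: carries both the running sum and the count, so the same class can
// serve as combiner output and as the final reducer output.
public class AverageAndCount implements Writable {
  private long sum;
  private long count;

  public AverageAndCount() { }               // required for Hadoop reflection

  public AverageAndCount(long sum, long count) {
    this.sum = sum;
    this.count = count;
  }

  public void merge(AverageAndCount other) { // handy when used as a combiner
    sum += other.sum;
    count += other.count;
  }

  public void write(DataOutput out) throws IOException {
    out.writeLong(sum);
    out.writeLong(count);
  }

  public void readFields(DataInput in) throws IOException {
    sum = in.readLong();
    count = in.readLong();
  }

  public String toString() {
    return "avg = " + (sum / (double) count) + ", count = " + count;
  }
}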
Re: Simple Survey
Quick reminder to take the survey. We know more than a dozen companies are using Hadoop. heh http://www.scaleunlimited.com/survey.html thanks! chris On Sep 8, 2008, at 10:43 AM, Chris K Wensel wrote: Hey all Scale Unlimited is putting together some case studies for an upcoming class and wants to get a snapshot of what the Hadoop user community looks like. If you have 2 minutes, please feel free to take the short anonymous survey below: http://www.scaleunlimited.com/survey.html All results will be public. cheers, chris -- Chris K Wensel [EMAIL PROTECTED] http://chris.wensel.net/ http://www.cascading.org/ -- Chris K Wensel [EMAIL PROTECTED] http://chris.wensel.net/ http://www.cascading.org/
Re: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
I'm not sure whether this is the same issue or not, but on my 4-slave cluster, setting the below parameter doesn't seem to fix the issue. What I'm seeing is that occasionally data nodes stop responding for up to 10 minutes at a time. In this case, the TaskTrackers will mark the nodes as dead, and occasionally the namenode will mark them as dead as well (you can see the Last Contact time steadily increase for a random node every half hour or so). This seems to be happening during times of high disk utilization.

-- Stefan

From: Espen Amble Kolstad [EMAIL PROTECTED]
Reply-To: core-user@hadoop.apache.org
Date: Mon, 8 Sep 2008 12:40:01 +0200
To: core-user@hadoop.apache.org
Subject: Re: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data

There's a JIRA on this already: https://issues.apache.org/jira/browse/HADOOP-3831

Setting dfs.datanode.socket.write.timeout=0 in hadoop-site.xml seems to do the trick for now.

Espen

On Mon, Sep 8, 2008 at 11:24 AM, Espen Amble Kolstad [EMAIL PROTECTED] wrote:

Hi,

Thanks for the tip! I tried revision 692572 of the 0.18 branch, but I still get the same errors.

On Sunday 07 September 2008 09:42:43 Dhruba Borthakur wrote:

The DFS errors might have been caused by http://issues.apache.org/jira/browse/HADOOP-4040

thanks, dhruba

On Sat, Sep 6, 2008 at 6:59 AM, Devaraj Das [EMAIL PROTECTED] wrote:

These exceptions are apparently coming from the dfs side of things. Could someone from the dfs side please look at these?

On 9/5/08 3:04 PM, Espen Amble Kolstad [EMAIL PROTECTED] wrote:

Hi,

Thanks! The patch applies without change to hadoop-0.18.0, and should be included in a 0.18.1. However, I'm still seeing:

in hadoop.log:
  2008-09-05 11:13:54,805 WARN dfs.DFSClient - Exception while reading from blk_3428404120239503595_2664 of /user/trank/segments/20080905102650/crawl_generate/part-00010 from somehost:50010: java.io.IOException: Premeture EOF from inputStream

in datanode.log:
  2008-09-05 11:15:09,554 WARN dfs.DataNode - DatanodeRegistration(somehost:50010, storageID=DS-751763840-somehost-50010-1219931304453, infoPort=50075, ipcPort=50020):Got exception while serving blk_-4682098638573619471_2662 to /somehost: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/somehost:50010 remote=/somehost:45244]

These entries in datanode.log happen a few minutes apart, repeatedly. I've reduced the number of map-tasks so load on this node is below 1.0, with 5GB of free memory (so it's not resource starvation).

Espen

On Thu, Sep 4, 2008 at 3:33 PM, Devaraj Das [EMAIL PROTECTED] wrote:

I started a profile of the reduce-task. I've attached the profiling output. It seems from the samples that ramManager.waitForDataToMerge() doesn't actually wait. Has anybody seen this behavior?

This has been fixed in HADOOP-3940

On 9/4/08 6:36 PM, Espen Amble Kolstad [EMAIL PROTECTED] wrote:

I have the same problem on our cluster. It seems the reducer-tasks are using all cpu, long before there's anything to shuffle. I started a profile of the reduce-task. I've attached the profiling output. It seems from the samples that ramManager.waitForDataToMerge() doesn't actually wait. Has anybody seen this behavior?

Espen

On Thursday 28 August 2008 06:11:42 wangxu wrote:

Hi, all

I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar, and running hadoop on one namenode and 4 slaves. Attached is my hadoop-site.xml, and I didn't change the file hadoop-default.xml. When data in segments are large, this kind of error occurs:

  java.io.IOException: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1462)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1312)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1417)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
    at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1646)
    at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1712)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1787)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:104)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
    at org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordReader.java:112)
    at ...
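For reference, the workaround Espen describes is a single property in conf/hadoop-site.xml (shown here as a sketch; a value of 0 disables the datanode's socket write timeout, per HADOOP-3831):

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>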
Re: Simple Survey
Unfortunately there is a problem with the survey. I was unable to answer question 9 ('How much data is stored on your Hadoop cluster (in GB)?') correctly: it would not let me enter more than 10TB (we currently have 45TB of data in our cluster; actual unique data, not a sum of the disk used by all of its replicas). Other than that, I tried :-)
Hadoop 0.18 stable?
Hi,

When is the hadoop 0.18 version expected to be stable? I was looking into upgrading to it. Are there any known critical issues that we've run into in this version?

Thanks, Deepika
Re: Simple Survey
How weird. I'll forward this on and see if they can fix it. Thanks for letting me know.

chris

-- Chris K Wensel [EMAIL PROTECTED] http://chris.wensel.net/ http://www.cascading.org/
Re: Hadoop 0.18 stable?
I would suggest 0.18.1 (released or not). Yahoo! still hasn't deployed 0.18 widely yet, but it got a fair amount of smaller-scale testing across the user base.

Raghu.

Deepika Khera wrote:

Hi,

When is the hadoop 0.18 version expected to be stable? I was looking into upgrading to it. Are there any known critical issues that we've run into in this version?

Thanks, Deepika
Re: Thinking about retrieving DFS metadata from datanodes!!!
Thanks for paying attention to my tentative idea! What I had in mind isn't another way to store the metadata, but a last-resort way to recover valuable data in the cluster when something catastrophic destroys the metadata on all of the NameNodes. For example, if a terrorist attack or natural disaster destroyed half of the cluster's nodes, including every NameNode, we could still recover as much data as possible with this mechanism, and we'd have a good chance of recovering the entire contents of the cluster because of the original replication.

Any suggestion is appreciated!

-- Sorry for my english!! 明
number of tasks on a node.
Hi. How can a node find out how many tasks are being run on it at a given time? I want tasktracker nodes (which are assigned from Amazon EC2) to shut down if nothing is being run for some period of time, but I don't yet see the right way of implementing this.
Re: number of tasks on a node.
You can monitor the task or node status through the web pages provided.

On Wed, Sep 10, 2008 at 11:24 AM, Edward J. Yoon [EMAIL PROTECTED] wrote:

TaskTrackers communicate with the JobTracker, so I guess you can handle it via the JobTracker.

-Edward

On Wed, Sep 10, 2008 at 12:03 PM, Dmitry Pushkarev [EMAIL PROTECTED] wrote:

Hi. How can a node find out how many tasks are being run on it at a given time? I want tasktracker nodes (which are assigned from Amazon EC2) to shut down if nothing is being run for some period of time, but I don't yet see the right way of implementing this.

-- Best regards, Edward J. Yoon [EMAIL PROTECTED] http://blog.udanax.org

-- 朱盛凯 Jash Zhu 复旦大学软件学院 Software School, Fudan University
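One hedged way to act on that via the JobTracker (a sketch only: it checks cluster-wide running-task counts rather than per-node activity, and leaves the actual instance shutdown to a wrapper script):

import java.io.IOException;
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Sketch: poll the JobTracker; if no map or reduce tasks have been running
// for some period, the EC2 instance could decide to shut itself down.
public class IdleCheck {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf();
    JobClient client = new JobClient(conf);
    ClusterStatus status = client.getClusterStatus();
    int running = status.getMapTasks() + status.getReduceTasks();
    System.out.println("tasks currently running on the cluster: " + running);
    // if (running == 0) { /* e.g. have a wrapper script terminate this instance */ }
  }
}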