Re: Issue in reduce phase with SortedMapWritable and custom Writables as values

2008-09-09 Thread Ryan LeCompte
Okay, I think I'm getting closer but now I'm running into another problem.

First off, I created my own CustomMapWritable that extends MapWritable
and invokes AbstractMapWritable.addToMap() to add my custom classes.
Now the map/reduce phases actually complete and the job as a whole
completes. However, when I try to use the SequenceFile API to later
read the output data, I'm getting a strange exception. First the code:

FileSystem fileSys = FileSystem.get(conf);
SequenceFile.Reader reader = new SequenceFile.Reader(fileSys, inFile,
conf);
Text key = new Text();
CustomWritable stats = new CustomWritable();
reader.next(key, stats);
reader.close();

And now the exception that's thrown:

java.io.IOException: can't find class: com.test.CustomStatsWritable
because com.test.CustomStatsWritable
at 
org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:210)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:145)
at com.test.CustomStatsWritable.readFields(UserStatsWritable.java:49)
at 
org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
...

Any ideas?

Thanks,
Ryan


On Tue, Sep 9, 2008 at 12:36 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:
 Hello,

 I'm attempting to use a SortedMapWritable with a LongWritable as the
 key and a custom implementation of org.apache.hadoop.io.Writable as
 the value. I notice that my program works fine when I use another
 primitive wrapper (e.g. Text) as the value, but fails with the
 following exception when I use my custom Writable instance:

 2008-09-08 23:25:02,072 INFO org.apache.hadoop.mapred.ReduceTask:
 Initiating in-memory merge with 1 segments...
 2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Merging
 1 sorted segments
 2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Down to
 the last merge-pass, with 1 segments left of total size: 5492 bytes
 2008-09-08 23:25:02,099 WARN org.apache.hadoop.mapred.ReduceTask:
 attempt_200809082247_0005_r_00_0 Merge of the inmemory files threw
 an exception: java.io.IOException: Intermedate merge failed
at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
at 
 org.apache.hadoop.io.SortedMapWritable.readFields(SortedMapWritable.java:179)
...

 I noticed that the AbstractMapWritable class has a protected
 addToMap(Class clazz) method. Do I somehow need to let my
 SortedMapWritable instance know about my custom Writable value? I've
 properly implemented the custom Writable object (it just contains a
 few primitives, like longs and ints).

 Any insight is appreciated.

 Thanks,
 Ryan
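
For reference, a minimal sketch of the subclass Ryan describes: it extends MapWritable
and registers the custom value class through the protected AbstractMapWritable.addToMap().
The class names follow the ones used in this thread and are otherwise illustrative, not
his actual code.

import org.apache.hadoop.io.MapWritable;

import com.test.CustomStatsWritable;

public class CustomMapWritable extends MapWritable {
  public CustomMapWritable() {
    super();
    // Register the custom value class with AbstractMapWritable's class/id map
    // so it can be resolved when the map is deserialized.
    addToMap(CustomStatsWritable.class);
  }
}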



Re: public IP for datanode on EC2

2008-09-09 Thread Julien Nioche
 I think most people try to avoid allowing remote access for security
 reasons. If you can add a file, I can mount your filesystem too, maybe even
 delete things. Whereas with EC2-only filesystems, your files are *only*
 exposed to everyone else that knows or can scan for your IPAddr and ports.


I imagine that the access to the ports used by HDFS could be restricted to
specific IPs using the EC2 group (ec2-authorize) or any other firewall
mechanism if necessary.

Could anyone confirm that there is no conf parameter I could use to force
the address of my DataNodes?

Thanks

Julien

-- 
DigitalPebble Ltd
http://www.digitalpebble.com


Re: Issue in reduce phase with SortedMapWritable and custom Writables as values

2008-09-09 Thread Ryan LeCompte
Based on some similar problems that I found others were having in the
mailing lists, it looks like the solution was to list my Map/Reduce
job JAR in the conf/hadoop-env.sh file under HADOOP_CLASSPATH. After
doing that and re-submitting the job, it all worked fine! I guess the
MapWritable class somehow doesn't share the same classpath as the
program that actually submits the job conf. Is this expected?

Thanks,
Ryan


On Tue, Sep 9, 2008 at 9:44 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:
 Okay, I think I'm getting closer but now I'm running into another problem.

 First off, I created my own CustomMapWritable that extends MapWritable
 and invokes AbstractMapWritable.addToMap() to add my custom classes.
 Now the map/reduce phases actually complete and the job as a whole
 completes. However, when I try to use the SequenceFile API to later
 read the output data, I'm getting a strange exception. First the code:

 FileSystem fileSys = FileSystem.get(conf);
 SequenceFile.Reader reader = new SequenceFile.Reader(fileSys, inFile,
 conf);
 Text key = new Text();
 CustomWritable stats = new CustomWritable();
 reader.next(key, stats);
 reader.close();

 And now the exception that's thrown:

 java.io.IOException: can't find class: com.test.CustomStatsWritable
 because com.test.CustomStatsWritable
at 
 org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:210)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:145)
at com.test.CustomStatsWritable.readFields(UserStatsWritable.java:49)
at 
 org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
at 
 org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
 ...

 Any ideas?

 Thanks,
 Ryan


 On Tue, Sep 9, 2008 at 12:36 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:
 Hello,

 I'm attempting to use a SortedMapWritable with a LongWritable as the
 key and a custom implementation of org.apache.hadoop.io.Writable as
 the value. I notice that my program works fine when I use another
 primitive wrapper (e.g. Text) as the value, but fails with the
 following exception when I use my custom Writable instance:

 2008-09-08 23:25:02,072 INFO org.apache.hadoop.mapred.ReduceTask:
 Initiating in-memory merge with 1 segments...
 2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Merging
 1 sorted segments
 2008-09-08 23:25:02,077 INFO org.apache.hadoop.mapred.Merger: Down to
 the last merge-pass, with 1 segments left of total size: 5492 bytes
 2008-09-08 23:25:02,099 WARN org.apache.hadoop.mapred.ReduceTask:
 attempt_200809082247_0005_r_00_0 Merge of the inmemory files threw
 an exception: java.io.IOException: Intermedate merge failed
at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
at 
 org.apache.hadoop.io.SortedMapWritable.readFields(SortedMapWritable.java:179)
...

 I noticed that the AbstractMapWritable class has a protected
 addToMap(Class clazz) method. Do I somehow need to let my
 SortedMapWritable instance know about my custom Writable value? I've
 properly implemented the custom Writable object (it just contains a
 few primitives, like longs and ints).

 Any insight is appreciated.

 Thanks,
 Ryan




Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-09 Thread Owen O'Malley
This isn't a very stable direction. You really don't want multiple distinct
methods for storing the metadata, because discrepancies are very bad. High
Availability (HA) is a very important medium term goal for HDFS, but it will
likely be done using multiple NameNodes and ZooKeeper.

-- Owen


Re: Failing MR jobs!

2008-09-09 Thread Arun C Murthy


On Sep 7, 2008, at 12:26 PM, Erik Holstad wrote:


Hi!
I'm trying to run a MR job, but it keeps on failing and I can't understand
why.
Sometimes it shows output at 66% and sometimes 98% or so.
I had a couple of exceptions before that I didn't catch that made the job
fail.


The log file from the task can be found at:
http://pastebin.com/m4414d369



From the logs it looks like the TaskTracker killed your reduce task  
because it didn't report any progress for 10 mins, which is the  
default timeout.


FWIW it's probably because _one_ of the calls to your 'reduce'  
function got stuck trying to communicate with one of the external  
resources you are using...


Arun
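
If the reduce genuinely has to wait on external resources, the usual remedies are to
report progress from inside the loop or to raise mapred.task.timeout (10 minutes by
default). A minimal sketch against the 0.18-era mapred API; the key/value types and the
external call are placeholders, not Erik's actual job:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class SlowReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      Text value = values.next();
      // ... potentially slow call to an external resource would go here ...
      reporter.progress();                       // tell the TaskTracker the task is alive
      reporter.setStatus("still working on " + key);
      output.collect(key, value);
    }
  }
}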


Re: distcp failing

2008-09-09 Thread Michael Di Domenico
i'm not sure that's the issue, i basically tarred up the hadoop directory
from the cluster and copied it over to the non-data node
but i do agree i've likely got a setting wrong, since i can run distcp from
the namenode and it works fine.  the question is which one

On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball [EMAIL PROTECTED]wrote:

 It is likely that you mapred.system.dir and/or fs.default.name settings
 are
 incorrect on the non-datanode machine that you are launching the task from.
 These two settings (in your conf/hadoop-site.xml file) must match the
 settings on the cluster itself.

 - Aaron

 On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico
 [EMAIL PROTECTED]wrote:

  I'm attempting to load data into hadoop (version 0.17.1), from a
  non-datanode machine in the cluster.  I can run jobs and copyFromLocal
  works
  fine, but when i try to use distcp i get the below.  I don't understand
  what the error means, can anyone help?
  Thanks
 
  blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite
  file:///Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
  08/09/07 23:56:06 INFO util.CopyFiles:
  srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
  08/09/07 23:56:06 INFO util.CopyFiles:
  destPath=/user/mdidomenico/1gTestfile1
  08/09/07 23:56:07 INFO util.CopyFiles: srcCount=1
  With failures, global counters are inaccurate; consider running with -i
  Copy failed: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
  /tmp/hadoop-hadoop/mapred/system/job_200809072254_0005/job.xml: No such
  file
  or directory
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:215)
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:149)
 at
  org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1155)
 at
  org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1136)
 at
  org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:175)
 at
  org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1755)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
 at java.lang.reflect.Method.invoke(Unknown Source)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
 
 at org.apache.hadoop.ipc.Client.call(Client.java:557)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
 at $Proxy1.submitJob(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:585)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.submitJob(Unknown Source)
 at
 org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:758)
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
 at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
 at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
 



Re: distcp failing

2008-09-09 Thread Michael Di Domenico
a little more digging and it appears i cannot run distcp as someone other
than hadoop on the namenode
 /tmp/hadoop-hadoop/mapred/system/job_200809091231_0005/job.xml

looking at this directory from the error file, the system directory does
not exist on the namenode, i only have a local directory

On Tue, Sep 9, 2008 at 12:41 PM, Michael Di Domenico [EMAIL PROTECTED]
 wrote:

 i'm not sure that's the issue, i basically tarred up the hadoop directory
 from the cluster and copied it over to the non-data node
 but i do agree i've likely got a setting wrong, since i can run distcp from
 the namenode and it works fine.  the question is which one

 On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball [EMAIL PROTECTED]wrote:

 It is likely that you mapred.system.dir and/or fs.default.name settings
 are
 incorrect on the non-datanode machine that you are launching the task
 from.
 These two settings (in your conf/hadoop-site.xml file) must match the
 settings on the cluster itself.

 - Aaron

 On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico
 [EMAIL PROTECTED]wrote:

  I'm attempting to load data into hadoop (version 0.17.1), from a
  non-datanode machine in the cluster.  I can run jobs and copyFromLocal
  works
   fine, but when i try to use distcp i get the below.  I don't understand
   what the error means, can anyone help?
  Thanks
 
  blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite
  file:///Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
  08/09/07 23:56:06 INFO util.CopyFiles:
  srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
  08/09/07 23:56:06 INFO util.CopyFiles:
  destPath=/user/mdidomenico/1gTestfile1
  08/09/07 23:56:07 INFO util.CopyFiles: srcCount=1
  With failures, global counters are inaccurate; consider running with -i
  Copy failed: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
  /tmp/hadoop-hadoop/mapred/system/job_200809072254_0005/job.xml: No such
  file
  or directory
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:215)
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:149)
 at
  org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1155)
 at
  org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1136)
 at
  org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:175)
 at
  org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1755)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown
 Source)
 at java.lang.reflect.Method.invoke(Unknown Source)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
 
 at org.apache.hadoop.ipc.Client.call(Client.java:557)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
 at $Proxy1.submitJob(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:585)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.submitJob(Unknown Source)
 at
 org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:758)
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
 at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
 at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
 





Re: specifying number of nodes for job

2008-09-09 Thread Owen O'Malley
On Mon, Sep 8, 2008 at 4:26 PM, Sandy [EMAIL PROTECTED] wrote:

 In all seriousness though, why is this not possible? Is there something
 about the MapReduce model of parallel computation that I am not
 understanding? Or this more of an arbitrary implementation choice made by
 the Hadoop framework? If so, I am curious why this is the case. What are
 the
 benefits?


It is possible to do with changes to Hadoop. There was a jira filed for it,
but I don't think anyone has worked on it. (HADOOP-2573) For Map/Reduce it
is a design goal that the number of tasks, not nodes, is the important metric.
You want a job to be able to run with any given cluster size. For
scalability testing, you could just remove task trackers...

-- Owen
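
For comparison, the knobs that do exist today control task counts rather than node
counts; a small illustrative snippet (the numbers are arbitrary):

import org.apache.hadoop.mapred.JobConf;

public class TaskCounts {
  public static void main(String[] args) {
    JobConf job = new JobConf();
    job.setNumMapTasks(200);    // only a hint; the actual count follows the input splits
    job.setNumReduceTasks(32);  // honored as given
    System.out.println("reduces = " + job.getNumReduceTasks());
  }
}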


Re: distcp failing

2008-09-09 Thread Michael Di Domenico
manually creating the system directory gets me past the first error, but
now i get this.  i'm not necessarily sure it's a step forward though, because
the map task never shows up in the jobtracker
[EMAIL PROTECTED] hadoop-0.17.1]$ bin/hadoop distcp
file:///home/mdidomenico/1gTestfile 1gTestfile
08/09/09 13:12:06 INFO util.CopyFiles:
srcPaths=[file:/home/mdidomenico/1gTestfile]
08/09/09 13:12:06 INFO util.CopyFiles: destPath=1gTestfile
08/09/09 13:12:07 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:07 INFO dfs.DFSClient: Abandoning block
blk_5758513071638050362
08/09/09 13:12:13 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:13 INFO dfs.DFSClient: Abandoning block
blk_1691495306775808049
08/09/09 13:12:17 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:17 INFO dfs.DFSClient: Abandoning block
blk_1027634596973755899
08/09/09 13:12:19 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:19 INFO dfs.DFSClient: Abandoning block
blk_4535302510016050282
08/09/09 13:12:23 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:23 INFO dfs.DFSClient: Abandoning block
blk_7022658012001626339
08/09/09 13:12:25 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:25 INFO dfs.DFSClient: Abandoning block
blk_-4509681241839967328
08/09/09 13:12:29 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:29 INFO dfs.DFSClient: Abandoning block
blk_8318033979013580420
08/09/09 13:12:31 WARN dfs.DFSClient: DataStreamer Exception:
java.io.IOException: Unable to create new block.
08/09/09 13:12:31 WARN dfs.DFSClient: Error Recovery for block
blk_-4509681241839967328 bad datanode[0]
08/09/09 13:12:35 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:35 INFO dfs.DFSClient: Abandoning block
blk_2848354798649979411
08/09/09 13:12:41 WARN dfs.DFSClient: DataStreamer Exception:
java.io.IOException: Unable to create new block.
08/09/09 13:12:41 WARN dfs.DFSClient: Error Recovery for block
blk_2848354798649979411 bad datanode[0]
Exception in thread "Thread-0" java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(Unknown Source)
at java.util.TreeMap$KeyIterator.next(Unknown Source)
at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
at
org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
at
org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
at
org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/09/09 13:12:41 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
08/09/09 13:12:41 INFO dfs.DFSClient: Abandoning block
blk_9189111926428577428

On Tue, Sep 9, 2008 at 1:03 PM, Michael Di Domenico
[EMAIL PROTECTED]wrote:

 a little more digging and it appears i cannot run distcp as someone other
  than hadoop on the namenode
  /tmp/hadoop-hadoop/mapred/system/job_200809091231_0005/job.xml

  looking at this directory from the error file, the system directory does
 not exist on the namenode, i only have a local directory


 On Tue, Sep 9, 2008 at 12:41 PM, Michael Di Domenico 
 [EMAIL PROTECTED] wrote:

 i'm not sure that's the issue, i basically tarred up the hadoop directory
 from the cluster and copied it over to the non-data node
 but i do agree i've likely got a setting wrong, since i can run distcp
 from the namenode and it works fine.  the question is which one

 On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball [EMAIL PROTECTED]wrote:

 It is likely that you mapred.system.dir and/or fs.default.name settings
 are
 incorrect on the non-datanode machine that you are launching the task
 from.
 These two settings (in your conf/hadoop-site.xml file) must match the
 settings on the cluster itself.

 - Aaron

 On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico
 [EMAIL PROTECTED]wrote:

  I'm attempting to load data into hadoop (version 0.17.1), from a
  non-datanode machine in the cluster.  I can run jobs and copyFromLocal
  works
   fine, but when i try to use distcp i get the below.  I don't understand
   what the error means, can anyone help?
  Thanks
 
  blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite
  file:///Users/mdidomenico/hadoop/1gTestfile
 /user/mdidomenico/1gTestfile
  08/09/07 23:56:06 INFO util.CopyFiles:
  srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
  08/09/07 23:56:06 INFO 

Re: Monthly Hadoop User Group Meeting (Bay Area)

2008-09-09 Thread Doug Cutting

Chris K Wensel wrote:
doh, conveniently collides with the GridGain and GridDynamics 
presentations:


http://web.meetup.com/66/calendar/8561664/


Bay Area Hadoop User Group meetings are held on the third Wednesday 
every month.  This has been on the calendar for quite a while.


Doug


Re: distcp failing

2008-09-09 Thread Michael Di Domenico
Apparently, the fix to my original error is because hadoop is setup for a
single local machine out of the box and i had to change these directories
<property>
  <name>mapred.local.dir</name>
  <value>/hadoop/mapred/local</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
</property>
<property>
  <name>mapred.temp.dir</name>
  <value>/hadoop/mapred/temp</value>
</property>

to be hdfs instead of hadoop.tmp.dir

So now distcp works as a non-hadoop user and mapred works as a non-hadoop
user from the name node, however, from a workstation i get this now

blue:hadoop-0.17.1 mdidomenico$ bin/hadoop distcp
file:///Users/mdidomenico/hadoop/1gTestfile 1gTestfile-1
08/09/09 13:44:19 INFO util.CopyFiles:
srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
08/09/09 13:44:19 INFO util.CopyFiles: destPath=1gTestfile-1
08/09/09 13:44:20 INFO util.CopyFiles: srcCount=1
08/09/09 13:44:22 INFO mapred.JobClient: Running job: job_200809091332_0004
08/09/09 13:44:23 INFO mapred.JobClient:  map 0% reduce 0%
08/09/09 13:44:31 INFO mapred.JobClient: Task Id :
task_200809091332_0004_m_00_0, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at
org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

08/09/09 13:44:50 INFO mapred.JobClient: Task Id :
task_200809091332_0004_m_00_1, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at
org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

08/09/09 13:45:07 INFO mapred.JobClient: Task Id :
task_200809091332_0004_m_00_2, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at
org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

08/09/09 13:45:26 INFO mapred.JobClient:  map 100% reduce 100%
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)


On Tue, Sep 9, 2008 at 1:14 PM, Michael Di Domenico
[EMAIL PROTECTED]wrote:

 manually creating the system directory gets me past the first error, but
  now i get this.  i'm not necessarily sure it's a step forward though, because
 the map task never shows up in the jobtracker
 [EMAIL PROTECTED] hadoop-0.17.1]$ bin/hadoop distcp
 file:///home/mdidomenico/1gTestfile 1gTestfile
 08/09/09 13:12:06 INFO util.CopyFiles:
 srcPaths=[file:/home/mdidomenico/1gTestfile]
 08/09/09 13:12:06 INFO util.CopyFiles: destPath=1gTestfile
 08/09/09 13:12:07 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:07 INFO dfs.DFSClient: Abandoning block
 blk_5758513071638050362
 08/09/09 13:12:13 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:13 INFO dfs.DFSClient: Abandoning block
 blk_1691495306775808049
 08/09/09 13:12:17 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:17 INFO dfs.DFSClient: Abandoning block
 blk_1027634596973755899
 08/09/09 13:12:19 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:19 INFO dfs.DFSClient: Abandoning block
 blk_4535302510016050282
 08/09/09 13:12:23 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:23 INFO dfs.DFSClient: Abandoning block
 blk_7022658012001626339
 08/09/09 13:12:25 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:25 INFO dfs.DFSClient: Abandoning block
 blk_-4509681241839967328
 08/09/09 13:12:29 INFO dfs.DFSClient: Exception in createBlockOutputStream
 java.io.IOException: Could not read from stream
 08/09/09 13:12:29 INFO dfs.DFSClient: Abandoning block
 blk_8318033979013580420
 08/09/09 13:12:31 WARN dfs.DFSClient: 

Re: Monthly Hadoop User Group Meeting (Bay Area)

2008-09-09 Thread Chris K Wensel

Chris K Wensel wrote:
doh, conveniently collides with the GridGain and GridDynamics  
presentations:

http://web.meetup.com/66/calendar/8561664/


Bay Area Hadoop User Group meetings are held on the third Wednesday  
every month.  This has been on the calendar for quite a while.


Doug



maybe I should have said, coincidentally.

--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/



Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-09 Thread Pete Wyckoff
+1 - 

from the perspective of the data nodes, dfs is just a block-level store and
is thus much more robust and scalable.



On 9/9/08 9:14 AM, Owen O'Malley [EMAIL PROTECTED] wrote:

 This isn't a very stable direction. You really don't want multiple distinct
 methods for storing the metadata, because discrepancies are very bad. High
 Availability (HA) is a very important medium term goal for HDFS, but it will
 likely be done using multiple NameNodes and ZooKeeper.
 
 -- Owen



Re: distcp failing

2008-09-09 Thread Michael Di Domenico
Looking in the task tracker log, i see this
This file does exist on my local workstation, but it does not exist on the
namenode/datanodes in my cluster.  So it begs the question of whether i
misunderstood the use of distcp or whether there is still something wrong.

I'm looking for something that will read a file from my workstation and load
it into the dfs, but instead of going through the namenode like
copyFromLocal seems to do, i'd like it to load the data via the datanodes
directly. If distcp doesn't do it this way, is there anything that will?

2008-09-09 14:00:54,418 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=MAP, sessionId=
2008-09-09 14:00:54,662 INFO org.apache.hadoop.mapred.MapTask:
numReduceTasks: 0
2008-09-09 14:00:54,894 INFO org.apache.hadoop.util.CopyFiles: FAIL
1gTestfile : java.io.FileNotFoundException: File
file:/Users/mdidomenico/hadoop/1gTestfile does not exist.
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:242)
at
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.init(ChecksumFileSystem.java:116)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:274)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:380)
at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.copy(CopyFiles.java:366)
at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:493)
at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:268)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

2008-09-09 14:01:03,950 WARN org.apache.hadoop.mapred.TaskTracker: Error
running child
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at
org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)


On Tue, Sep 9, 2008 at 1:47 PM, Michael Di Domenico
[EMAIL PROTECTED]wrote:

 Apparently, the fix to my original error is because hadoop is setup for a
 single local machine out of the box and i had to change these directories
  <property>
    <name>mapred.local.dir</name>
    <value>/hadoop/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>
  <property>
    <name>mapred.temp.dir</name>
    <value>/hadoop/mapred/temp</value>
  </property>

 to be hdfs instead of hadoop.tmp.dir

 So now distcp works as a non-hadoop user and mapred works as a non-hadoop
 user from the name node, however, from a workstation i get this now

 blue:hadoop-0.17.1 mdidomenico$ bin/hadoop distcp
 file:///Users/mdidomenico/hadoop/1gTestfile 1gTestfile-1
 08/09/09 13:44:19 INFO util.CopyFiles:
 srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
 08/09/09 13:44:19 INFO util.CopyFiles: destPath=1gTestfile-1
 08/09/09 13:44:20 INFO util.CopyFiles: srcCount=1
 08/09/09 13:44:22 INFO mapred.JobClient: Running job: job_200809091332_0004
 08/09/09 13:44:23 INFO mapred.JobClient:  map 0% reduce 0%
 08/09/09 13:44:31 INFO mapred.JobClient: Task Id :
 task_200809091332_0004_m_00_0, Status : FAILED
 java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
 at
 org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
 at
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

 08/09/09 13:44:50 INFO mapred.JobClient: Task Id :
 task_200809091332_0004_m_00_1, Status : FAILED
 java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
 at
 org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
 at
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

 08/09/09 13:45:07 INFO mapred.JobClient: Task Id :
 task_200809091332_0004_m_00_2, Status : FAILED
 java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
 at
 org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
 at
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

 08/09/09 13:45:26 INFO mapred.JobClient:  map 100% reduce 100%
 With failures, global counters are inaccurate; consider running with -i
 Copy failed: java.io.IOException: Job failed!
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
 at 
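
For reference, copyFromLocal (and the FileSystem API it uses) already streams the file
contents directly to the DataNodes; the NameNode only hands out block locations and
records the metadata. A small sketch of doing the same load programmatically, reusing the
example paths from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LoadIntoHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // reads hadoop-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);       // the cluster named by fs.default.name
    // The file bytes go straight to the DataNode write pipeline; only block
    // allocation and metadata updates involve the NameNode.
    fs.copyFromLocalFile(new Path("file:///Users/mdidomenico/hadoop/1gTestfile"),
                         new Path("/user/mdidomenico/1gTestfile"));
    fs.close();
  }
}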

Re: Hadoop Streaming and Multiline Input

2008-09-09 Thread Jim Twensky
If I understand your question correctly, you need to write your own
FileInputFormat. Please see
http://hadoop.apache.org/core/docs/r0.18.0/api/index.html for details.

Regards,
Tim

On Sat, Sep 6, 2008 at 9:20 PM, Dennis Kubes [EMAIL PROTECTED] wrote:

 Is it possible to set a multiline text input in streaming to be used as a
 single record?  For example say I wanted to scan a webpage for a specific
 regex that is multiline, is this possible in streaming?

 Dennis
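
Jim's suggestion amounts to an input format that refuses to split files and hands each
whole file to the map as one record, so a multi-line regex can be applied to the complete
page. A rough, untested sketch against the 0.18-era org.apache.hadoop.mapred API (class
names are examples); with streaming it would be shipped in a jar and named via
-inputformat:

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;   // keep each file in a single split so one map sees the whole page
  }

  public RecordReader<NullWritable, BytesWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }

  static class WholeFileRecordReader implements RecordReader<NullWritable, BytesWritable> {
    private final FileSplit split;
    private final JobConf conf;
    private boolean processed = false;

    WholeFileRecordReader(FileSplit split, JobConf conf) {
      this.split = split;
      this.conf = conf;
    }

    public boolean next(NullWritable key, BytesWritable value) throws IOException {
      if (processed) {
        return false;
      }
      byte[] contents = new byte[(int) split.getLength()];
      Path file = split.getPath();
      FileSystem fs = file.getFileSystem(conf);
      FSDataInputStream in = fs.open(file);
      try {
        IOUtils.readFully(in, contents, 0, contents.length);   // slurp the whole file
      } finally {
        IOUtils.closeStream(in);
      }
      value.set(contents, 0, contents.length);
      processed = true;
      return true;
    }

    public NullWritable createKey() { return NullWritable.get(); }
    public BytesWritable createValue() { return new BytesWritable(); }
    public long getPos() { return processed ? split.getLength() : 0; }
    public float getProgress() { return processed ? 1.0f : 0.0f; }
    public void close() { }
  }
}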



Re: Hadoop Streaming and Multiline Input

2008-09-09 Thread lohit
If your webpage is xml tagged and you are looking into using streaming.
This might help 
http://hadoop.apache.org/core/docs/r0.18.0/streaming.html#How+do+I+parse+XML+documents+using+streaming%3F
-Lohit



- Original Message 
From: Jim Twensky [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Sent: Tuesday, September 9, 2008 11:23:37 AM
Subject: Re: Hadoop Streaming and Multiline Input

If I understand your question correctly, you need to write your own
FileInputFormat. Please see
http://hadoop.apache.org/core/docs/r0.18.0/api/index.html for details.

Regards,
Tim

On Sat, Sep 6, 2008 at 9:20 PM, Dennis Kubes [EMAIL PROTECTED] wrote:

 Is it possible to set a multiline text input in streaming to be used as a
 single record?  For example say I wanted to scan a webpage for a specific
 regex that is multiline, is this possible in streaming?

 Dennis




Question on Streaming

2008-09-09 Thread Jim Twensky
Hello, I need to use Hadoop Streaming to run several instances of a single
program on different files. Before doing it, I wrote a simple test
application as the mapper, which basically outputs the standard input
without doing anything useful. So it looks like the following:

---echo.sh--
echo "Running mapper, input is $1"
---echo.sh--

For the input, I created a single text file input.txt that has number from 1
to 10 on each line, so it goes like:

---input.txt---
1
2
..
10
---input.txt---

I uploaded input.txt on hdfs://stream/ directory and then ran Hadoop
Streaming utility as follows:

bin/hadoop jar hadoop-0.18.0-streaming.jar  \
-input /stream \
-output /trash \
-mapper echo.sh \
-file echo.sh \
-jobconf mapred.reduce.tasks=0

and from what I understood in the streaming tutorial, I expected that each
mapper would run an instance of echo.sh with one of the lines in input.txt
so I expected to get an output in the form of

Running mapper, input is 2
Running mapper, input is 5
...
and so on, but I got only two output files, part-0 and part-1, that
contain the string "Running mapper, input is ". As far as I can see, the
mappers ran the mapper script echo.sh without the standard input. I basically
followed the tutorial and I'm confused now, so could you please tell me what
I'm missing here?

Thanks in advance,
Jim


output multiple values?

2008-09-09 Thread Shirley Cohen
I have a simple reducer that computes the average by doing a sum/ 
count. But I want to output both the average and the count for a  
given key, not just the average. Is it possible to output both values  
from the same invocation of the reducer? Or do I need two reducer  
invocations? If I try to call output.collect() twice from the reducer  
and label the key with type=avg or type=count, I get a bunch of  
garbage out. Please let me know if you have any suggestions.


Thanks,

Shirley


Re: output multiple values?

2008-09-09 Thread Owen O'Malley


On Sep 9, 2008, at 12:20 PM, Shirley Cohen wrote:

I have a simple reducer that computes the average by doing a sum/ 
count. But I want to output both the average and the count for a  
given key, not just the average. Is it possible to output both  
values from the same invocation of the reducer? Or do I need two  
reducer invocations? If I try to call output.collect() twice from  
the reducer and label the key with type=avg or type=count, I get  
a bunch of garbage out. Please let me know if you have any  
suggestions.


I'd be tempted to define a type like:

class AverageAndCount implements Writable {
  private long sum;
  private long count;
  ...
  public String toString() {
     return "avg = " + (sum / (double) count) + ", count = " + count;
  }
}

Then you could use your reducer as both a combiner and reducer and you  
would get both values out if you use TextOutputFormat. That said, it  
should absolutely work to do collect twice.


-- Owen
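
A slightly fuller version of that idea, with the serialization filled in; the field
handling and the set() helper are assumptions rather than Owen's actual code:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class AverageAndCount implements Writable {
  private long sum;
  private long count;

  public void set(long sum, long count) {
    this.sum = sum;
    this.count = count;
  }

  public void write(DataOutput out) throws IOException {
    out.writeLong(sum);
    out.writeLong(count);
  }

  public void readFields(DataInput in) throws IOException {
    sum = in.readLong();
    count = in.readLong();
  }

  public String toString() {
    return "avg = " + (sum / (double) count) + ", count = " + count;
  }
}

With output.collect(key, value) and TextOutputFormat, toString() then puts both the
average and the count on the output line, as described above.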


Re: Simple Survey

2008-09-09 Thread Chris K Wensel


Quick reminder to take the survey. We know more than a dozen companies  
are using Hadoop. heh


http://www.scaleunlimited.com/survey.html   

thanks!
chris

On Sep 8, 2008, at 10:43 AM, Chris K Wensel wrote:


Hey all

Scale Unlimited is putting together some case studies for an  
upcoming class and wants to get a snapshot of what the Hadoop user  
community looks like.


If you have 2 minutes, please feel free to take the short anonymous  
survey below:


http://www.scaleunlimited.com/survey.html

All results will be public.

cheers,
chris

--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/



--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/



Re: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data

2008-09-09 Thread Stefan Will
I'm not sure whether this is the same issue or not, but on my 4 slave
cluster, setting the below parameter doesn't seem to fix the issue.

What I'm seeing is that occasionally data nodes stop responding for up to 10
minutes at a time. In this case, the TaskTrackers will mark the nodes as
dead, and occasionally the namenode will mark them as dead as well (you can
see the "Last Contact" time steadily increase for a random node at a time
every half hour or so).

This seems to be happening during times of high disk utilization.

-- Stefan



 From: Espen Amble Kolstad [EMAIL PROTECTED]
 Reply-To: core-user@hadoop.apache.org
 Date: Mon, 8 Sep 2008 12:40:01 +0200
 To: core-user@hadoop.apache.org
 Subject: Re: Could not obtain block: blk_-2634319951074439134_1129
 file=/user/root/crawl_debug/segments/20080825053518/content/part-2/data
 
 There's a JIRA on this already:
 https://issues.apache.org/jira/browse/HADOOP-3831
 Setting dfs.datanode.socket.write.timeout=0 in hadoop-site.xml seems
 to do the trick for now.
 
 Espen
 
 On Mon, Sep 8, 2008 at 11:24 AM, Espen Amble Kolstad [EMAIL PROTECTED] 
 wrote:
 Hi,
 
 Thanks for the tip!
 
 I tried revision 692572 of the 0.18 branch, but I still get the same errors.
 
 On Sunday 07 September 2008 09:42:43 Dhruba Borthakur wrote:
 The DFS errors might have been caused by
 
 http://issues.apache.org/jira/browse/HADOOP-4040
 
 thanks,
 dhruba
 
 On Sat, Sep 6, 2008 at 6:59 AM, Devaraj Das [EMAIL PROTECTED] wrote:
 These exceptions are apparently coming from the dfs side of things. Could
 someone from the dfs side please look at these?
 
 On 9/5/08 3:04 PM, Espen Amble Kolstad [EMAIL PROTECTED] wrote:
 Hi,
 
 Thanks!
 The patch applies without change to hadoop-0.18.0, and should be
 included in a 0.18.1.
 
 However, I'm still seeing:
 in hadoop.log:
 2008-09-05 11:13:54,805 WARN  dfs.DFSClient - Exception while reading
 from blk_3428404120239503595_2664 of
 /user/trank/segments/20080905102650/crawl_generate/part-00010 from
  somehost:50010: java.io.IOException: Premeture EOF from inputStream
 
 in datanode.log:
 2008-09-05 11:15:09,554 WARN  dfs.DataNode -
 DatanodeRegistration(somehost:50010,
 storageID=DS-751763840-somehost-50010-1219931304453, infoPort=50075,
 ipcPort=50020):Got exception while serving
 blk_-4682098638573619471_2662 to
 /somehost:
 java.net.SocketTimeoutException: 48 millis timeout while waiting
 for channel to be ready for write. ch :
 java.nio.channels.SocketChannel[connected local=/somehost:50010
 remote=/somehost:45244]
 
 These entries in datanode.log happens a few minutes apart repeatedly.
 I've reduced # map-tasks so load on this node is below 1.0 with 5GB of
 free memory (so it's not resource starvation).
 
 Espen
 
 On Thu, Sep 4, 2008 at 3:33 PM, Devaraj Das [EMAIL PROTECTED] wrote:
 I started a profile of the reduce-task. I've attached the profiling
 output. It seems from the samples that ramManager.waitForDataToMerge()
 doesn't actually wait.
  Has anybody seen this behavior?
 
 This has been fixed in HADOOP-3940
 
 On 9/4/08 6:36 PM, Espen Amble Kolstad [EMAIL PROTECTED] wrote:
 I have the same problem on our cluster.
 
 It seems the reducer-tasks are using all cpu, long before there's
 anything to
 shuffle.
 
 I started a profile of the reduce-task. I've attached the profiling
 output. It seems from the samples that ramManager.waitForDataToMerge()
 doesn't actually wait.
  Has anybody seen this behavior?
 
 Espen
 
 On Thursday 28 August 2008 06:11:42 wangxu wrote:
 Hi, all
 I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar,
 and running hadoop on one namenode and 4 slaves.
 Attached is my hadoop-site.xml, and I didn't change the file
 hadoop-default.xml
 
 when data in segments are large, this kind of error occurs:
 
 java.io.IOException: Could not obtain block:
 blk_-2634319951074439134_1129
 file=/user/root/crawl_debug/segments/20080825053518/content/part-
 2/data at
 org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClie
 nt.jav a:1462) at
 org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.
 java:1 312) at
 org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:14
 17) at java.io.DataInputStream.readFully(DataInputStream.java:178)
 at
 org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.j
 ava:64 ) at
 org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102
 ) at
 org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java
 :1646) at
 org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceF
 ile.ja va:1712) at
 org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile
 .java: 1787) at
 org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(Seq
 uenceF ileRecordReader.java:104) at
 org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRe
 cordRe ader.java:79) at
 org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordR
 eader. java:112) at
 

Re: Simple Survey

2008-09-09 Thread John Kane
Unfortunately there is a problem with the survey. I was unable to answer
correctly question '9. How much data is stored on your Hadoop cluster (in
GB)?' It would not let me enter more than 10TB (we currently have 45TB of
data in our cluster; actual data, not a sum of disk used (with all of its
replicas) but unique data).

Other than that, I tried :-)

On Tue, Sep 9, 2008 at 4:01 PM, Chris K Wensel [EMAIL PROTECTED] wrote:


 Quick reminder to take the survey. We know more than a dozen companies are
 using Hadoop. heh

 http://www.scaleunlimited.com/survey.html

 thanks!
 chris


 On Sep 8, 2008, at 10:43 AM, Chris K Wensel wrote:

  Hey all

 Scale Unlimited is putting together some case studies for an upcoming
 class and wants to get a snapshot of what the Hadoop user community looks
 like.

 If you have 2 minutes, please feel free to take the short anonymous survey
 below:

 http://www.scaleunlimited.com/survey.html

 All results will be public.

 cheers,
 chris

 --
 Chris K Wensel
 [EMAIL PROTECTED]
 http://chris.wensel.net/
 http://www.cascading.org/


 --
 Chris K Wensel
 [EMAIL PROTECTED]
 http://chris.wensel.net/
 http://www.cascading.org/




Hadoop 0.18 stable?

2008-09-09 Thread Deepika Khera
Hi ,

 

When is the hadoop 0.18 version expected to be stable? I was looking
into upgrading to it.

 

Are there any known critical issues that we've run into in this version?

 

Thanks,

Deepika



Re: Simple Survey

2008-09-09 Thread Chris K Wensel

how weird. i'll forward this on and see if they can fix it.

thanks for letting me know.

chris

On Sep 9, 2008, at 4:22 PM, John Kane wrote:

Unfortunately there is a problem with the survey. I was unable to  
answer
correctly question '9. How much data is stored on your Hadoop  
cluster (in
GB)?' It would not let me enter more than 10TB (we currently have  
45TB of
data in our cluster; actual data, not a sum of disk used (with all  
of its

replicas) but unique data).

Other than that, I tried :-)

On Tue, Sep 9, 2008 at 4:01 PM, Chris K Wensel [EMAIL PROTECTED]  
wrote:




Quick reminder to take the survey. We know more than a dozen  
companies are

using Hadoop. heh

http://www.scaleunlimited.com/survey.html

thanks!
chris


On Sep 8, 2008, at 10:43 AM, Chris K Wensel wrote:

Hey all


Scale Unlimited is putting together some case studies for an  
upcoming
class and wants to get a snapshot of what the Hadoop user  
community looks

like.

If you have 2 minutes, please feel free to take the short  
anonymous survey

below:

http://www.scaleunlimited.com/survey.html

All results will be public.

cheers,
chris

--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/



--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/




--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/



Re: Hadoop 0.18 stable?

2008-09-09 Thread Raghu Angadi


I would suggest 0.18.1 (released or not).  Yahoo! still hasn't
deployed 0.18 widely yet, but it got a fair amount of smaller-scale
testing across the user base.


Raghu.

Deepika Khera wrote:

Hi ,

 


When is the hadoop 0.18 version expected to be stable? I was looking
into upgrading to it.

 


Are there any known critical issues that we've run into in this version?

 


Thanks,

Deepika






Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-09 Thread 叶双明
Thanks for paying attention to my tentative idea!

What I was thinking about isn't how to store the metadata, but a final (or last-resort)
way to recover valuable data in the cluster when the worst happens (something that
destroys the metadata on all of the multiple NameNodes), e.g. a terrorist attack or
natural disaster destroying half of the cluster nodes including all the NameNodes. We
could recover as much data as possible by this mechanism, and have a big chance of
recovering the entire data of the cluster because of the original replication.

Any suggestion is appreciate!

2008/9/10 Pete Wyckoff [EMAIL PROTECTED]

 +1 -

 from the perspective of the data nodes, dfs is just a block-level store and
 is thus much more robust and scalable.



 On 9/9/08 9:14 AM, Owen O'Malley [EMAIL PROTECTED] wrote:

  This isn't a very stable direction. You really don't want multiple
 distinct
  methods for storing the metadata, because discrepancies are very bad.
 High
  Availability (HA) is a very important medium term goal for HDFS, but it
 will
  likely be done using multiple NameNodes and ZooKeeper.
 
  -- Owen




-- 
Sorry for my english!! 明


number of tasks on a node.

2008-09-09 Thread Dmitry Pushkarev
Hi.

 

How can a node find out how many tasks are being run on it at a given time?

I want TaskTracker nodes (which are assigned from Amazon EC2) to shut down if
nothing is being run for some period of time, but I don't yet see the right way of
implementing this.



Re: number of tasks on a node.

2008-09-09 Thread Shengkai Zhu
You can monitor the task or node status through the web pages provided.

On Wed, Sep 10, 2008 at 11:24 AM, Edward J. Yoon [EMAIL PROTECTED]wrote:

 TaskTrackers communicate with the JobTracker, so I guess you can
 handle it via the JobTracker.

 -Edward

 On Wed, Sep 10, 2008 at 12:03 PM, Dmitry Pushkarev [EMAIL PROTECTED]
 wrote:
  Hi.
 
 
 
  How can a node find out how many tasks are being run on it at a given time?

  I want TaskTracker nodes (which are assigned from Amazon EC2) to shut down if
  nothing is being run for some period of time, but I don't yet see the right way of
  implementing this.
 
 



 --
 Best regards, Edward J. Yoon
 [EMAIL PROTECTED]
 http://blog.udanax.org




-- 

朱盛凯

Jash Zhu

复旦大学软件学院

Software School, Fudan University
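
Along the lines Edward suggests, a monitor could poll the JobTracker through the
JobClient API and shut EC2 nodes down once the cluster has been idle long enough. This is
a sketch only: it reports cluster-wide counts (per-node detail lives in the JobTracker
web UI that Shengkai mentions), and the actual shutdown logic is left out.

import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class IdleCheck {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();              // must point at the cluster's JobTracker
    JobClient client = new JobClient(conf);
    ClusterStatus status = client.getClusterStatus();
    int running = status.getMapTasks() + status.getReduceTasks();
    System.out.println(status.getTaskTrackers() + " trackers, "
        + running + " tasks currently running");
    // A shutdown script could poll this periodically and release EC2 nodes
    // once 'running' has stayed at zero for long enough.
  }
}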