Hadoop streaming - Subprocess failed

2012-08-29 Thread Periya.Data
Hi, I am running a map-reduce job in Python and I get this error message. I do not understand what it means. Output is not written to HDFS. I am using CDH3u3. Any suggestion is appreciated. MapAttempt TASK_TYPE=MAP TASKID=task_201208232245_2812_m_00

RE: Metrics ..

2012-08-29 Thread Wong, David (DMITS)
Here's a snippet of tasktracker metrics using Metrics2. (I think there were (more) gaps in the pre-metrics2 versions.) Note that you'll need to have hadoop-env.sh and hadoop-metrics2.properties set up on all the nodes you want reports from. 1345570905436 ugi.ugi: context=ugi,
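
For reference, the setup described above amounts to a few lines of hadoop-metrics2.properties; a minimal sketch (the sink name, file name, and period are illustrative):

    # Write TaskTracker metrics to a local file every 10 seconds.
    *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
    *.period=10
    tasktracker.sink.file.filename=tasktracker-metrics.out

As noted above, this file must be present on every node whose daemons should report.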

Re: Metrics ..

2012-08-29 Thread Mark Olimpiati
Hi David, I enabled the jvm.class in hadoop-metrics.properties; your output seems to be from something else (dfs.class or mapred.class), which reports Hadoop daemons' performance. For example, your output shows processName=TaskTracker, which I'm not looking for. How can I report jvm

no output written to HDFS

2012-08-29 Thread Periya.Data
Hi All, My Hadoop streaming job (in Python) runs to completion (both map and reduce say 100% complete). But when I look at the output directory in HDFS, the part files are empty. I do not know what might be causing this behavior. I understand that the percentages represent the records that

Re: no output written to HDFS

2012-08-29 Thread Bertrand Dechoux
Do you observe the same thing when running without Hadoop? (cat, map, sort and then reduce) Could you provide the counters of your job? You should be able to get them from the job tracker interface. The most probable answer, without more information, is that your reducer does not output any
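
The local test suggested above is a plain shell pipeline; a sketch (script and file names are illustrative):

    cat input.txt | python mapper.py | sort | python reducer.py > local-output.txt

If local-output.txt is also empty, the problem is in the reducer logic rather than in Hadoop.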

Minimum Input Split Size for a map job

2012-08-29 Thread cat mys
Hello, I'm currently developing a MapReduce application, and I want my map job to take more data than a single line, as the default configuration does (e.g. 64 KB input split size). How can I change the input split size for the map job? Are there any configuration files that I have to edit or a
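
With the old (mapred) API of that era, raising the minimum split size is one way to hand each map task more data; a hedged sketch (MyJob and the 128 MB value are placeholders):

    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(MyJob.class);  // MyJob is a placeholder
    // Force larger splits by raising the minimum split size:
    conf.setLong("mapred.min.split.size", 128L * 1024 * 1024);
    // Alternatively, NLineInputFormat hands each mapper a fixed number of lines:
    // conf.setInputFormat(org.apache.hadoop.mapred.lib.NLineInputFormat.class);
    // conf.setInt("mapred.line.input.format.linespermap", 1000);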

Re: example usage of s3 file system

2012-08-29 Thread Håvard Wahl Kongsgård
see also http://wiki.apache.org/hadoop/AmazonS3 On Tue, Aug 28, 2012 at 9:14 AM, Chris Collins chris_j_coll...@yahoo.com wrote: Hi, I am trying to use the Hadoop filesystem abstraction with S3, but in my tinkering I am not having a great deal of success. I am particularly interested in the
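
The wiki page boils down to something like the following against the native S3 filesystem (the bucket name and credentials are placeholders):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3ListExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");      // placeholder
        conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");  // placeholder
        // List the bucket through the Hadoop filesystem abstraction:
        FileSystem fs = FileSystem.get(URI.create("s3n://my-bucket/"), conf);
        for (FileStatus status : fs.listStatus(new Path("s3n://my-bucket/"))) {
          System.out.println(status.getPath());
        }
      }
    }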

Custom InputFormat error

2012-08-29 Thread Chen He
Hi guys, I met an interesting problem when implementing my own custom InputFormat, which extends FileInputFormat. (I rewrote the RecordReader class but not the InputSplit class.) My RecordReader will take the following format as a basic record: (my RecordReader extends LineRecordReader. It returns

Re: example usage of s3 file system

2012-08-29 Thread Chris Collins
Thanks Haavard, I am aware of that page but I am not sure why you are pointing me to it. This really looks like a bug where Jets3tNativeFileSystemStore is parsing a response from jets3t: it's looking for ResponseCode=404 but actually getting ResponseCode: 404. I don't see how it ever worked

Re: HBase and MapReduce data locality

2012-08-29 Thread N Keywal
Inline. Just a set of "you're right" :-). It's documented here: http://hbase.apache.org/book.html#regions.arch.locality On Wed, Aug 29, 2012 at 8:06 AM, Robert Dyer rd...@iastate.edu wrote: Ok, but does that imply that only 1 of your compute nodes is promised to have all of the data for any given

Re: Custom InputFormat error

2012-08-29 Thread Harsh J
Hi Chen, Do your record reader and mapper handle the case where one map split may not exactly get the whole record? Your case is not very different from the newlines logic presented here: http://wiki.apache.org/hadoop/HadoopMapReduce On Wed, Aug 29, 2012 at 11:13 AM, Chen He airb...@gmail.com

Re: MRBench Maps strange behaviour

2012-08-29 Thread Bejoy KS
Hi Gaurav, You can get the information on the number of map tasks in the job from the JT web UI itself. Regards, Bejoy KS. Sent from handheld, please excuse typos. -Original Message- From: Gaurav Dasgupta gdsay...@gmail.com Date: Wed, 29 Aug 2012 13:14:11 To: user@hadoop.apache.org

Re: Job does not run with EOFException

2012-08-29 Thread Caetano Sauer
I am able to browse the web UI and telnet/netcat the tasktracker host and port, so the connection is being established. Is there any way I can confirm whether it is really some kind of version conflict? The EOF when doing readInt() seems like a protocol incompatibility. By the way, the tasktracker

Re: MRBench Maps strange behaviour

2012-08-29 Thread praveenesh kumar
Then the question arises: how is MRBench using the parameters? According to the mail he sent, he is running MRBench with the following parameters: *hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar mrbench -maps 10 -reduces 10*. I guess he is expecting MRBench to launch 10 mappers and 10

RE: hadoop 1.0.3 equivalent of MultipleTextOutputFormat

2012-08-29 Thread Tony Burton
Or, is it possible to request that the functionality provided by MultipleTextOutputFormat be supported by the new Hadoop API? Thanks, Tony -Original Message- From: Tony Burton [mailto:tbur...@sportingindex.com] Sent: 28 August 2012 14:37 To: user@hadoop.apache.org Subject: RE: hadoop
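
One commonly suggested workaround, assuming the new-API MultipleOutputs class is available in your distribution, is to derive the output path from the key; a sketch (the class and path naming are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class PartitionByKeyReducer
        extends Reducer<Text, Text, NullWritable, Text> {
      private MultipleOutputs<NullWritable, Text> out;

      @Override
      protected void setup(Context context) {
        out = new MultipleOutputs<NullWritable, Text>(context);
      }

      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        for (Text value : values) {
          // The third argument is a base output path under the job's output
          // directory, giving per-key files much like MultipleTextOutputFormat.
          out.write(NullWritable.get(), value, key.toString() + "/part");
        }
      }

      @Override
      protected void cleanup(Context context)
          throws IOException, InterruptedException {
        out.close();
      }
    }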

Re: Controlling on which node a reducer will be executed

2012-08-29 Thread Eduard Skaley
Thank you for your reply, Harsh J. I really need this feature, because it would speed up my use case a lot. Could you give me a hint where to look in the sources for a good starting point for an implementation? Which classes are involved? Hi Eduard, This isn't impossible, just

Metrics ..

2012-08-29 Thread Mark Olimpiati
Hi, I enabled metrics.properties to use FileContext, in which jvm metrics values are written to a file as follows: jvm.metrics: hostName=localhost, processName=MAP, sessionId=, gcCount=10, gcTimeMillis=130, logError=0, logFatal=0, logInfo=21, logWarn=0, memHeapCommittedM=180.1211,
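
For reference, the metrics1 setup being described looks something like this in hadoop-metrics.properties (the period and file name are illustrative):

    # Emit jvm metrics to a local file every 10 seconds.
    jvm.class=org.apache.hadoop.metrics.file.FileContext
    jvm.period=10
    jvm.fileName=/tmp/jvm-metrics.log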

unsubscribe

2012-08-29 Thread Jay
From: Dan Yi d...@mediosystems.com To: user@hadoop.apache.org Sent: Wednesday, August 29, 2012 12:57 PM Subject: unsubscribe

unsubscribe

2012-08-29 Thread Fahd Albinali

unsubscribe

2012-08-29 Thread Ahmed Nagy
unsubscribe

Re: Custom InputFormat error

2012-08-29 Thread Chen He
Hi Harsh, Thank you for your reply. Do you mean I need to change the FileSplit to avoid the errors I mentioned? Regards! Chen On Wed, Aug 29, 2012 at 2:46 AM, Harsh J ha...@cloudera.com wrote: Hi Chen, Do your record reader and mapper handle the case where one map split may not

Re: Delays in worker node jobs

2012-08-29 Thread Vinod Kumar Vavilapalli
Do you know if you have enough job-load on the system? One way to look at this is to look for running map/reduce tasks on the JT UI at the same time you are looking at the node's CPU usage. Collecting Hadoop metrics via a metrics collection system, say Ganglia, will let you match up the
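
Pointing the metrics1 contexts at Ganglia is a small config change; a sketch for hadoop-metrics.properties (the gmond host and port are placeholders):

    # Send mapred metrics to a Ganglia gmond every 10 seconds.
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    mapred.period=10
    mapred.servers=gmond-host:8649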

Re: Delays in worker node jobs

2012-08-29 Thread Terry Healy
Thanks guys. Unfortunately I had started the datanode by a local command rather than from start-all.sh, so the relevant parts of the logs were lost. I was watching the CPU loads on all 8 cores via gkrellm at the time, and they were definitely quiet. After a few minutes the jobs seemed to get in sync

Re: Custom InputFormat error

2012-08-29 Thread Harsh J
No, what I mean is that your RecordReader should be able to handle the case where it starts from the middle of a record and hence cannot read any record (i.e., return false, or whatever is appropriate, right up front). On Wed, Aug 29, 2012 at 1:27 PM, Chen He airb...@gmail.com wrote: Hi Harsh, Thank you for
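
A sketch of the pattern being described, against the new-API LineRecordReader (isRecordStart is a placeholder for whatever marks the start of your multi-line record):

    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

    public class MultiLineRecordReader extends LineRecordReader {
      @Override
      public boolean nextKeyValue() throws IOException {
        // Skip ahead line by line until a record boundary; LineRecordReader
        // already stops at the end of the split, so if no boundary is found
        // we simply report that this split holds no record for us.
        while (super.nextKeyValue()) {
          if (isRecordStart(getCurrentValue())) {
            return true; // positioned on the first line of a record
          }
        }
        return false; // split started (and ended) mid-record
      }

      private boolean isRecordStart(Text line) {
        return line.toString().startsWith("("); // placeholder boundary test
      }
    }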

RE: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread sathyavageeswaran
I have tried every trick to get myself unsubscribed. Yesterday I got a mail saying you can't unsubscribe once subscribed. -Original Message- From: Andy Isaacson [mailto:a...@cloudera.com] Sent: 30 August 2012 03:25 To: user@hadoop.apache.org Cc: Dan Yi; Jay Subject: How to unsubscribe (was

Re: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread Ted Dunning
That was a stupid joke. It wasn't real advice. Have you sent email to the specific email address listed? On Thu, Aug 30, 2012 at 12:35 AM, sathyavageeswaran sat...@morisonmenon.com wrote: I have tried every trick to get self unsubscribed. Yesterday I got a mail saying you can't unsubscribe

RE: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread sathyavageeswaran
Of course I have sent emails to all permutations and combinations of the emails listed, with appropriate subject matter. From: Ted Dunning [mailto:tdunn...@maprtech.com] Sent: 30 August 2012 10:12 To: user@hadoop.apache.org Cc: Dan Yi; Jay Subject: Re: How to unsubscribe (was Re: unsubscribe)

Re: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread Ted Dunning
Can you say which addresses you sent emails to? The merging of mailing lists may have left you subscribed to a different group than you expected, so your assumptions may not match what is required. If you provide a specific list, somebody might be able to help you. Also, I was asking about

RE: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread sathyavageeswaran
I sent to user-unsubscr...@hadoop.apache.org in the beginning. After that I mailed, in vain, to the links that the automatic mail sends. Later I thought of all the permutations that are possible using the 15 letters and one character that are in the email address of

Re: How to unsubscribe (was Re: unsubscribe)

2012-08-29 Thread Ted Dunning
Nicely done. This seems to indicate that there might be a bug in the mailing list management configuration. Not sure how that would happen with ezmlm, but it now seems more plausible. One last question, though: is the email address that you sent the unsubscribe requests from the same as the mailing