Hi hadoopman,
You can put the large data into your HDFS using hadoop fs -put src dest,
and then you can use alter table xxx add partition(x) location 'dest'.
2011/5/11 amit jaiswal amit_...@yahoo.com
Hi,
What is the meaning of 'union' over here? Is there any hadoop job with 1
(or few)
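(An aside on the first reply above: if the upload step needs to happen from
code rather than the shell, FileSystem offers the same operation as fs -put.
A minimal sketch; the paths are placeholders:)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsUpload {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Programmatic equivalent of: hadoop fs -put src dest
        fs.copyFromLocalFile(new Path("src"), new Path("dest"));
        fs.close();
      }
    }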
Hi,
I'm running some experiments using hadoop streaming.
I always get an output_dir/part-0 file at the end, but I wonder:
when exactly will this filename show up? When it's completely written,
or will it already show up while the mapreduce framework is still
writing to it? Is the write atomic?
Hi all,
The String[] that is output by InputSplit.getLocations() gives the list
of nodes where the input split resides.
But the node detail is represented as either the IP address or the hostname
(e.g. an entry in the list could be either 10.72.147.109 or mattHDFS1
(hostname)). Is it
Hello,
I am writing an MR job where the distribution of the keys emitted by the Map
phase is not known beforehand and so I can't create the partitions for the
TotalOrderPartitioner. I would like to sample those keys to create the
partitions and then run the job that will process the whole
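For what it's worth, InputSampler is designed for exactly this: sample the
input up front, write a partition file, and hand it to TotalOrderPartitioner.
A rough sketch against the new (org.apache.hadoop.mapreduce) API; the
sampling parameters and the partition-file path are placeholders to tune:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
    import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

    public static void configureTotalOrder(Job job) throws Exception {
      // Sample roughly 10% of keys, capped at 10000 samples drawn
      // from at most 10 splits -- tune these for your data volume.
      InputSampler.Sampler<Text, Text> sampler =
          new InputSampler.RandomSampler<Text, Text>(0.1, 10000, 10);

      // Where the computed partition boundaries get written.
      TotalOrderPartitioner.setPartitionFile(job.getConfiguration(),
          new Path("/tmp/partitions.lst"));

      // Runs the sampling client-side before the real job launches.
      InputSampler.writePartitionFile(job, sampler);

      job.setPartitionerClass(TotalOrderPartitioner.class);
    }

On 0.20 the equivalent classes live under org.apache.hadoop.mapred.lib
instead.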
Is it possible to get a host-address to host-name mapping in the JIP?
Someone please help me with this!
Thanks,
Matthew
On Thu, May 12, 2011 at 5:36 PM, Matthew John tmatthewjohn1...@gmail.com wrote:
Hi all,
The String[] that is output by the InputSplit.getLocations() gives the list
of
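There may not be anything built into the JIP for this, but one workaround is
to normalize whatever getLocations() returns on your side. A sketch, assuming
forward and reverse DNS (or /etc/hosts) entries are consistent across the
cluster:

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    // Turns either "10.72.147.109" or "mattHDFS1" into a canonical
    // hostname, so split locations can be compared uniformly.
    public static String canonicalHost(String location)
        throws UnknownHostException {
      return InetAddress.getByName(location).getCanonicalHostName();
    }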
Hello,
I have a four node hadoop cluster running hadoop v.0.20.2 on CentOS 5.6.
Here is my layout:
Name01.hadoop.stage (namenode)
Name02.hadoop.stage (sec namenode / jobtracker)
Data01.hadoop.stage (data node)
Data02.hadoop.stage (data node)
When trying to run a benchmark test for
The creation of the part-n files is atomic. When you run a MR job, these
files are created in the directory output_dir/_temporary and moved to
output_dir after each file is closed for writing. This move is atomic,
hence as long as you don't try to read these files from the temporary directory
(which I see
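A practical corollary: newer releases also have FileOutputCommitter drop a
_SUCCESS marker into the output directory when the job completes, so a reader
can wait for that before touching the part files. A sketch; note the marker
can be disabled in configuration and isn't present in all versions, so treat
its existence as an assumption about your setup:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Returns true once the job has published its success marker;
    // only then are all part files guaranteed to be in place.
    public static boolean outputReady(Configuration conf, Path outputDir)
        throws IOException {
      FileSystem fs = outputDir.getFileSystem(conf);
      return fs.exists(new Path(outputDir, "_SUCCESS"));
    }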
Hi there,
I'm experiencing some unusual behaviour on our 0.20.2 hadoop cluster.
Randomly (periodically), we're getting "Call to namenode" failures on
tasktrackers, causing tasks to fail:
2011-05-12 14:36:37,462 WARN org.apache.hadoop.mapred.TaskRunner:
attempt_201105090819_059_m_0038_0 Child Error
For one long running job we are noticing that the mapper JVMs do not exit
even after the mapper is done. Any suggestions on why this could be
happening?
The java processes get cleaned up if I do a hadoop job -kill job_id. The
java processes get cleaned up if I run it in a smaller batch and the
Hi there,
Apologies if this comes through twice, but I sent the mail a few hours
ago and haven't seen it on the mailing list.
I'm experiencing some unusual behaviour on our 0.20.2 hadoop cluster.
Randomly (periodically), we're getting "Call to namenode" failures on
tasktrackers causing tasks to
Which version of hadoop are you running?
Are you running on linux?
-Joey
On Thu, May 12, 2011 at 1:39 PM, Adi adi.pan...@gmail.com wrote:
For one long running job we are noticing that the mapper JVMs do not exit
even after the mapper is done. Any suggestions on why this could be
happening?
Which version of hadoop are you running?
Hadoop 0.21.0 with some patches.
Are you running on linux?
Yes
Linux 2.6.18-238.9.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_21
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build
Hello,
I am trying to set up Hadoop HDFS in a cluster for the first time. So far I have
been using pseudo-distributed mode on my PC at home and everything was working
perfectly.
The NameNode starts but the DataNode doesn't start, and the log contains the
following:
2011-05-13 04:01:13,663 INFO
Is that all the messages in the datanode log? Do you see any SHUTDOWN message
also?
-Bharath
From: Panayotis Antonopoulos antonopoulos...@hotmail.com
To: common-user@hadoop.apache.org
Sent: Thursday, May 12, 2011 6:07 PM
Subject: Datanode doesn't start but
Yes, that is a general solution for controlling the number of output files.
However, if you need to control the number of outputs dynamically, how could
you do that?
If an output file's name is 'A', the number of those output files needs to
be 5.
If an output file's name is 'B', the number of those output files
Hadoop 0.21.0 with some patches.
Hadoop 0.21.0 doesn't get much use, so I'm not sure how much help I can be.
2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
You can control the number of reducers by calling
job.setNumReduceTasks() before you launch it.
-Joey
On Thu, May 12, 2011 at 6:33 PM, Jun Young Kim juneng...@gmail.com wrote:
Yes, that is a general solution for controlling the number of output files.
However, if you need to control the number of outputs
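To make both options concrete: the static case is the one-liner Joey mentions
(job.setNumReduceTasks(5) in the driver), and for runtime routing the new-API
MultipleOutputs can name files dynamically. A sketch; the 'A'/'B' routing
echoes the earlier example, and note each base name still yields at most one
file per reducer, so the per-name file count is bounded by the reducer count
rather than set directly:

    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class RoutingReducer extends Reducer<Text, Text, Text, Text> {
      private MultipleOutputs<Text, Text> out;

      @Override
      protected void setup(Context context) {
        out = new MultipleOutputs<Text, Text>(context);
      }

      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        for (Text v : values) {
          // Route each record to a base file name chosen at runtime
          // ("A", "B", ...) instead of the default part-r-* name.
          out.write(key, v, key.toString());
        }
      }

      @Override
      protected void cleanup(Context context)
          throws IOException, InterruptedException {
        out.close();
      }
    }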
2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
Your logs showed that Hadoop tried to kill processes but the kill
command claimed they didn't exist. The
Is there a reason for using OpenJDK and not Sun's JDK?
Also... I believe there were noted issues with the .17 JDK. I will look for a
link and post it if I can find it.
Otherwise, I have seen this behaviour before: Hadoop detaches from the JVM
and stops seeing it.
I think your problem lies in
Hi
I'm using FileInputFormat, which will logically split files into splits
according to their sizes. Can the mapper get a pointer to these splits, and
know which split it is assigned?
I tried looking at the Reporter class to see how it prints the
logical splits on the UI for each
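For reference while the thread unfolds below: a mapper can always inspect its
own split (though, as Owen notes further down, not anyone else's). A sketch
with the new API; with the old API or streaming, the same details surface as
the map.input.file / map.input.start job properties, which (if memory serves)
MapTask fills in when it localizes the split:

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    public class SplitAwareMapper
        extends Mapper<LongWritable, Text, Text, Text> {

      @Override
      protected void setup(Context context) {
        // Only this task's own split is visible; there is no API for
        // peeking at the splits assigned to other mappers.
        FileSplit split = (FileSplit) context.getInputSplit();
        System.err.println("file=" + split.getPath()
            + " start=" + split.getStart()
            + " length=" + split.getLength());
      }
    }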
On Thu, May 12, 2011 at 8:59 PM, Mark question markq2...@gmail.com wrote:
Hi
I'm using FileInputFormat, which will logically split files into splits
according to their sizes. Can the mapper get a pointer to these splits, and
know which split it is assigned?
Look at
Thanks for the reply Owen, I only knew about map.input.file.
So there is no way I can see the other possible splits (start+length)? Like
some function that returns the map.input.file and map.input.offset strings of
the other mappers?
Thanks,
Mark
On Thu, May 12, 2011 at 9:08 PM, Owen O'Malley
By user-specified, do you mean the job name you set via
JobConf.setJobName("myTask")?
Then using the same object you can recall the name as follows:

    JobConf conf = new JobConf();
    conf.setJobName("myTask");
    String name = conf.getJobName();
~Cheers
Mark
On Tue, May 10, 2011 at 10:16 AM, Mark Zand mz...@basistech.com wrote:
While I can get
On Thu, May 12, 2011 at 9:23 PM, Mark question markq2...@gmail.com wrote:
So there is no way I can see the other possible splits (start+length)? Like
some function that returns the map.input.file and map.input.offset strings of
the other mappers?
No, there isn't any way to do it using the
Then which class is filling the
Thanks again Owen, hopefully the last one: which class fills in
map.input.file and map.input.offset, so that I can extend it with a function
that returns these strings?
Thanks,
Mark
On Thu, May 12, 2011 at 10:07 PM, Owen O'Malley omal...@apache.org wrote:
One of the reasons I can think of is a version mismatch. You may
want to ensure that the job in question was not carrying a separate
version of Hadoop along with it inside its jar, perhaps?
On Fri, May 13, 2011 at 12:42 AM, Sidney Simmons
ssimm...@nmitconsulting.co.uk wrote:
Hi there,
I'm
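One quick way to test for the mismatch suspected above: print the Hadoop
version that actually wins on the task classpath. VersionInfo has been in
Hadoop for a long time, though treat the exact fields printed here as
version-dependent:

    import org.apache.hadoop.util.VersionInfo;

    public class WhichHadoop {
      public static void main(String[] args) {
        // Reports whichever Hadoop build is resolved on this classpath;
        // run it with the job's full classpath to spot a bundled copy.
        System.out.println("version:  " + VersionInfo.getVersion());
        System.out.println("compiled: " + VersionInfo.getDate()
            + " by " + VersionInfo.getUser());
      }
    }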
Have you defined the IP
of the DN in the slaves file?
Sent from my iPhone
On May 12, 2011, at 7:27 PM, Bharath Mundlapudi bharathw...@yahoo.com wrote:
Is that all the messages in the datanode log? Do you see any SHUTDOWN message
also?
-Bharath