Tom,
What I meant to say was that doing this is well supported by the
existing APIs/libraries:
- The class MultipleOutputs supports providing a filename for an
output. See the MultipleOutputs.addNamedOutput usage [1].
- The type NullWritable is a special Writable that doesn't do
anything. So
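For what it's worth, a minimal sketch of wiring those two pieces together in a driver, using the old mapred API; the "summary" output name and the key/value types here are just illustrative:

```java
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class NamedOutputDriver {
    public static void configureOutputs(JobConf conf) {
        // Register a named output; records written to it end up in
        // files prefixed with "summary" under the job's output dir.
        // NullWritable.get() can then be used as a do-nothing key.
        MultipleOutputs.addNamedOutput(conf, "summary",
                TextOutputFormat.class, NullWritable.class, Text.class);
    }
}
```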
Hi,
I am working on an open source project,
Nectar (https://github.com/zinnia-phatak-dev/Nectar), where
I am trying to create Hadoop jobs depending upon the user input. I was
using the Java Process API to run the bin/hadoop shell script to submit the
jobs, but that doesn't seem like a good way because the process
A simple job.submit(…) or JobClient.runJob(jobConf) submits your job
right from the Java API. Does this not work for you? If not, what
error do you see?
Forking out and launching from a system process is a bad idea unless
there's absolutely no other way.
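For reference, a bare-bones driver along those lines might look like this (MyJob, the job name, and the assumption that the cluster configuration is on the classpath are all placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        // Picks up *-site.xml cluster settings from the classpath.
        Job job = new Job(new Configuration());
        job.setJarByClass(SubmitExample.class);
        job.setJobName("example");
        // Submits asynchronously; use job.waitForCompletion(true)
        // instead to block and print progress.
        job.submit();
    }
}
```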
On Tue, Jul 26, 2011 at 3:28 PM, madhu phatak
Hi Madhu,
You can submit the jobs using the Job API's programmatically from any
system. The job submission code can be written this way.
// Create a new Job
Job job = new Job(new Configuration());
job.setJarByClass(MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
job.submit();
Hi
I am using the same APIs, but I am not able to run the jobs by just adding
the configuration files and jars. It never creates a job in Hadoop; it just
shows "cleaning up staging area" and fails.
On Tue, Jul 26, 2011 at 3:46 PM, Devaraj K devara...@huawei.com wrote:
Hi Madhu,
You can
Madhu,
Do you get a specific error message / stack trace? Could you also
paste your JT logs?
On Tue, Jul 26, 2011 at 4:05 PM, madhu phatak phatak@gmail.com wrote:
Hi
I am using the same APIs but i am not able to run the jobs by just adding
the configuration files and jars . It never
I am using JobControl.addJob() to add a job, running the JobControl in
a separate thread, and using JobControl.allFinished() to see whether all jobs
have completed. Does this work the same as Job.submit()?
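For reference, that JobControl pattern looks roughly like this (the group name and the polling interval are arbitrary):

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class JobControlExample {
    public static void run(JobConf conf) throws Exception {
        JobControl control = new JobControl("my-group");
        // jobcontrol.Job wraps a JobConf (and optional dependencies).
        control.addJob(new Job(conf));
        // JobControl is a Runnable; drive it from its own thread.
        Thread t = new Thread(control);
        t.start();
        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        control.stop();
    }
}
```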
On Tue, Jul 26, 2011 at 4:08 PM, Harsh J ha...@cloudera.com wrote:
Madhu,
Do you get a specific error
Yes. Internally, it calls regular submit APIs.
On Tue, Jul 26, 2011 at 4:32 PM, madhu phatak phatak@gmail.com wrote:
I am using JobControl.add() to add a job and running job control in
a separate thread and using JobControl.allFinished() to see all jobs
completed or not . Is this work same
Madhu,
Can you check the client logs to see whether any error/exception occurs while
submitting the job?
Devaraj K
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Tuesday, July 26, 2011 5:01 PM
To: common-user@hadoop.apache.org
Subject: Re: Submitting and running
Hi Harsh,
Cool, thanks for the details. For anyone interested, with your tip
and description I was able to find an example in the Hadoop in
Action book (Chapter 7, p. 168).
Another question, though: it doesn't look like MultipleOutputs will
let me control the filename on a per-key (per map)
Hi all,
I am attempting to implement MultipleOutputFormat to write data to multiple
files depending on the output keys and values. Can somebody provide a
working example of how to implement this in Hadoop 0.20.2?
Thanks!
--
Roger Chen
UC Davis Genome Center
Good afternoon Bobby,
Thanks so much, it's working excellently now, and the speed is also reasonable.
Thanks again.
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehd...@miners.utep.edu
From: ev...@yahoo-inc.com
To:
package com.shopkick.util;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;
public class MultiFileOutput extends MultipleTextOutputFormat<Text, Text> {
@Override
protected String generateFileNameForKeyValue(Text key, Text value,
String name) {
// Route each record to a file named after its key.
return key.toString();
}
}
Dear All:
I am trying to run Hadoop on Windows 7 so as to test programs before moving to
Unix/Linux. I have downloaded Hadoop 0.20.2 and Eclipse 3.6 because I want
to use the plugin. I am also using cygwin. However, I set the environment
variable for JAVA_HOME and added the
Roger,
Beyond Ayon's example answer, I'd like you to note that the newer API
will *not* carry a supported MultipleOutputFormat, as it has been
obsoleted in favor of MultipleOutputs, which is easier to use,
thread-safe, and comes with an example to look at, at [1].
[1] -
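For anyone following along, a rough sketch of the newer-API MultipleOutputs usage (the "counts" name and the key/value types are illustrative; the addNamedOutput call belongs in the driver):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// In the driver, before submission:
// MultipleOutputs.addNamedOutput(job, "counts",
//         TextOutputFormat.class, Text.class, IntWritable.class);

public class CountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    private MultipleOutputs<Text, IntWritable> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<Text, IntWritable>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        // Writes to the named output instead of the default one.
        mos.write("counts", key, new IntWritable(sum));
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        mos.close();  // flush the extra output files
    }
}
```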
Tom,
You can theoretically add N named outputs from a single
task itself, even from within the map() calls (addNamedOutput or
addMultiNamedOutput checks internally for dupes, so you don't have
to). So yes, you can keep adding outputs and using them per key, and
given your earlier
Try using virtual box/vmware and downloading either an image that has hadoop on
it or a linux image and installing it there.
Good luck
James.
On 2011-07-26, at 12:33 PM, A Df wrote:
Dear All:
I am trying to run Hadoop on Windows 7 so as to test programs before moving
to Unix/Linux. I
A Df,
(Inlines)
On Wed, Jul 27, 2011 at 12:03 AM, A Df abbey_dragonfor...@yahoo.com wrote:
Dear All:
I am trying to run Hadoop on Windows 7 so as to test programs before moving
to Unix/Linux. I have downloaded the Hadoop 0.20.2 and Eclipse 3.6 because I
want to use the plugin. I am also
Hi A Df,
I haven't set up Hadoop under cygwin, but I use cygwin a lot.
One thing I would suggest is to use the bash shell in cygwin and use the
following format for the $PATH additions:
PATH=$PATH:/cygdrive/c/cygwin/bin:/cygdrive/c/cygwin/usr/bin
My understanding is that the stable version of
A Df,
Try reinstalling Java to a friendlier location (without spaces) - c:\java
rather than c:\Program Files - from the error message it appears it's
choking on the space ~ I've encountered this very same problem.
JAVA_HOME to be the root of your Java installation which I changed to
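In cygwin's bash that would look something like the following (the JDK path is just an example; use your actual space-free install location):

```shell
# Assumed install path; point this at wherever the JDK actually lives.
export JAVA_HOME=/cygdrive/c/java/jdk1.6.0_25
export PATH="$PATH:$JAVA_HOME/bin"
```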
Harsh:
See inline (at the **). I hope it's easy to follow; as for the other responses, I
was not sure how to reply to get everything into one. Sorry for top posting!
Eric, where would I put the line below? Please explain in newbie terms, thanks:
A Df,
On Wed, Jul 27, 2011 at 1:42 AM, A Df abbey_dragonfor...@yahoo.com wrote:
Harsh:
See (inline at the **) I hope its easy to follow and for the other responses,
I was not sure how to respond to get everything into one. Sorry for top
posting!
Np! I don't strongly enforce a style of
The problem I'm facing right now is the configuration needed for
MultipleOutputs: JobConf is deprecated now, and I am unable to do the
equivalent with Configuration. I set the configuration of the job by:
Job job = new Job(getConf());
but when I'm trying to use this line in my
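For context, the 0.20.x (mapred) MultipleOutputs is wired up through a JobConf rather than a Configuration/Job pair, roughly like this (the driver class and named output are illustrative):

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class OldApiDriver {
    public static JobConf buildConf() {
        JobConf conf = new JobConf(OldApiDriver.class);
        // The 0.20.x MultipleOutputs only accepts a JobConf here,
        // which is why a mapreduce.Job/Configuration pair won't fit.
        MultipleOutputs.addNamedOutput(conf, "errors",
                TextOutputFormat.class, Text.class, Text.class);
        return conf;
    }
}
```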
Gotcha, my bad then. The Hadoop distribution I use provides a
backported MO, so I overlooked this particular issue while replying.
Still, the warning holds as the versions roll ahead. But I
believe the refactor would not be that much of a pain, so perhaps it's
a non-worry.
On Wed, Jul 27,
Hi
I am submitting the job as follows
java -cp
Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
kkk11fffrrw 1
I get the log in CLI
Hi,
I want to build Hadoop 0.20.2 from source using the Eclipse IDE. Can anyone
help me with this?
Regards,
Vighnesh
Hi Vighnesh,
Step 1) Download the code base from the Apache SVN repository.
Step 2) In the root folder you can find the build.xml file. In that folder just execute
a) ant and b) ant eclipse
This will generate the Eclipse project settings files.
After this you can directly import this project into your
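Roughly, the steps above as commands (the SVN URL/branch is from memory; check the Hadoop site for the exact repository path for 0.20.2):

```shell
# Branch path is an assumption; verify against the Apache SVN layout.
svn checkout \
  http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 \
  hadoop-0.20
cd hadoop-0.20
ant          # compile
ant eclipse  # generate the Eclipse project settings files
```

After that, use File > Import > Existing Projects into Workspace in Eclipse and point it at the checkout directory.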