Well you should always set an output directory, but in your case I see that the job still ran.
J-D

On Thu, Jul 23, 2009 at 12:52 PM, bharath vissapragada
<bharathvissapragada1...@gmail.com> wrote:
> I have set "c.setOutputFormat(NullOutputFormat.class);" otherwise it
> shows the error "Output directory not set in JobConf."
>
> I think this is causing trouble ... any idea?
>
> On Thu, Jul 23, 2009 at 10:12 PM, bharath vissapragada
> <bharathvissapragada1...@gmail.com> wrote:
>
>> I have tried Apache Commons Logging ...
>>
>> Instead of printing the row, I wrote log.error(row), and
>> even then I got the same output as follows:
>>
>> 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>> 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>> 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>> 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>> 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>> 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>> 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>> 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>> 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
>> 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>>
>> On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans
>> <jdcry...@apache.org> wrote:
>>
>>> And you don't need any more config to run local MR jobs on HBase. But
>>> you do need Hadoop when running MR jobs on HBase on a cluster.
>>>
>>> Also your code is running fine as you could see; the real question is
>>> where the stdout goes when in local mode. When you ran your other
>>> MR jobs, it was on a working Hadoop setup, right?
>>> So you were looking at the logs in the web UI? One simple thing to do
>>> is your debugging with a logger, so you are sure to see your output, as
>>> I already proposed. Another simple thing is to get a pseudo-distributed
>>> setup, run your HBase MR jobs with Hadoop, and get your logs like I'm
>>> sure you did before.
>>>
>>> J-D
>>>
>>> On Thu, Jul 23, 2009 at 11:54 AM, bharath vissapragada
>>> <bharathvissapragada1...@gmail.com> wrote:
>>>
>>> > I am really thankful to you, J-D, for replying in spite of your busy
>>> > schedule. I am still in a learning stage and there are no good guides
>>> > on HBase other than its own one, so please bear with me; I really
>>> > appreciate your help.
>>> >
>>> > Now I got your point that there is no need of Hadoop while running
>>> > HBase MR programs, but I am confused about the config. I have only
>>> > set JAVA_HOME in "hbase-env.sh" and other than that I didn't do
>>> > anything, so I wonder if my conf was wrong or there is some error in
>>> > that simple code, because stdout worked for me while writing
>>> > MapReduce programs.
>>> >
>>> > Thanks once again!
>>> >
>>> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans
>>> > <jdcry...@apache.org> wrote:
>>> >
>>> >> The code itself is very simple; I was referring to your own
>>> >> description of your situation. You say you use standalone HBase, yet
>>> >> you talk about Hadoop configuration. You also talk about the
>>> >> JobTracker web UI, which is of no use since you run local jobs
>>> >> directly on HBase.
>>> >>
>>> >> J-D
>>> >>
>>> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath vissapragada
>>> >> <bharathvissapragada1...@gmail.com> wrote:
>>> >>
>>> >> > I used stdout for debugging while writing code in Hadoop MR
>>> >> > programs and it worked fine.
>>> >> > Can you please tell me which part of the code you found confusing
>>> >> > so that I can explain it a bit more clearly?
>>> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans
>>> >> > <jdcry...@apache.org> wrote:
>>> >> >
>>> >> >> What you wrote is a bit confusing to me, sorry.
>>> >> >>
>>> >> >> The usual way to debug MR jobs is to define a logger and post with
>>> >> >> either info or debug level, not sysout like you did. I'm not even
>>> >> >> sure where the standard output is logged when using a local job.
>>> >> >> Also, since this is local, you won't see anything in your
>>> >> >> host:50030 web UI. So use Apache Commons Logging and you should
>>> >> >> see your output.
>>> >> >>
>>> >> >> J-D
>>> >> >>
>>> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath vissapragada
>>> >> >> <bharathvissapragada1...@gmail.com> wrote:
>>> >> >>
>>> >> >> > Thanks for your reply, J-D. I'm doing it from the command line.
>>> >> >> > I'm pasting some part of the code here:
>>> >> >> >
>>> >> >> > public void mapp(ImmutableBytesWritable row, RowResult value,
>>> >> >> >     OutputCollector<Text, Text> output, Reporter reporter)
>>> >> >> >     throws IOException {
>>> >> >> >   System.out.println(row);
>>> >> >> > }
>>> >> >> >
>>> >> >> > public JobConf createSubmittableJob(String[] args) throws IOException {
>>> >> >> >   JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>>> >> >> >   c.set("col.name", args[1]);
>>> >> >> >   c.set("operator.name", args[2]);
>>> >> >> >   c.set("val.name", args[3]);
>>> >> >> >   IdentityTableMap.initJob(args[0], args[1], this.getClass(), c);
>>> >> >> >   c.setOutputFormat(NullOutputFormat.class);
>>> >> >> >   return c;
>>> >> >> > }
>>> >> >> >
>>> >> >> > As you can see, I'm just printing the value of the row in the
>>> >> >> > map, but I can't see it in the terminal. I only want the map
>>> >> >> > phase, so I didn't write any reduce phase. Is my JobConf
>>> >> >> > correct?
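The advice in this thread is to route debug output through a logger rather than System.out, because stdout from a local-mode map task may never reach your terminal. As a self-contained illustration of that idea — using only the JDK's java.util.logging as a stand-in for the Apache Commons Logging API the thread recommends, and a hypothetical row key — a logger attached to an explicit FileHandler always writes somewhere you control:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class LoggerDemo {
    public static void main(String[] args) throws IOException {
        // Attach a file handler so log output lands in a file we choose,
        // independent of where the task's stdout/stderr happen to go.
        Path logFile = Files.createTempFile("map-debug", ".log");
        Logger log = Logger.getLogger("map.debug");
        log.setUseParentHandlers(false);          // keep the console quiet
        FileHandler handler = new FileHandler(logFile.toString());
        handler.setFormatter(new SimpleFormatter());
        log.addHandler(handler);

        String row = "row-0001";                  // hypothetical map input key
        log.severe("row: " + row);                // analogous to log.error(row)

        handler.close();                          // flush before reading back
        String contents = Files.readString(logFile);
        System.out.println(contents.contains("row-0001")); // prints "true"
    }
}
```

With Commons Logging, the equivalent pattern in the mapper would be a `LogFactory.getLog(...)` field and `LOG.error(row)`; in a real Hadoop task the framework's log configuration decides the destination, so the output shows up in the task logs rather than on your screen.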
>>> >> >> > Also, as I have already asked, how do I check the job logs and
>>> >> >> > the web interface like "localhost:<port>/jobTracker.jsp", since
>>> >> >> > I'm running in local mode?
>>> >> >> >
>>> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans
>>> >> >> > <jdcry...@apache.org> wrote:
>>> >> >> >
>>> >> >> >> What output do you need exactly? I see that you have 8 output
>>> >> >> >> records in your reduce task, so if you take a look in your
>>> >> >> >> output folder or table (I don't know which sink you used) you
>>> >> >> >> should see them.
>>> >> >> >>
>>> >> >> >> Also, did you run your MR inside Eclipse or on the command line?
>>> >> >> >>
>>> >> >> >> Thx,
>>> >> >> >>
>>> >> >> >> J-D
>>> >> >> >>
>>> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath vissapragada
>>> >> >> >> <bhara...@students.iiit.ac.in> wrote:
>>> >> >> >>
>>> >> >> >> > This is the output I got. It seems everything is fine, but no
>>> >> >> >> > output!
>>> >> >> >> >
>>> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>>> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>>> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>>> >> >> >> >
>>> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada
>>> >> >> >> > <bhara...@students.iiit.ac.in> wrote:
>>> >> >> >> >
>>> >> >> >> >> Since I haven't started the cluster, I can't even see the
>>> >> >> >> >> details in "localhost:<port>/jobTracker.jsp". I didn't even
>>> >> >> >> >> add anything to hadoop/conf/hadoop-site.xml.
>>> >> >> >> >>
>>> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada
>>> >> >> >> >> <bhara...@students.iiit.ac.in> wrote:
>>> >> >> >> >>
>>> >> >> >> >>> Hi all,
>>> >> >> >> >>>
>>> >> >> >> >>> I wanted to run HBase in standalone mode to check my HBase
>>> >> >> >> >>> MR programs.
>>> >> >> >> >>> I have downloaded a built version of hbase-0.20 and I have
>>> >> >> >> >>> hadoop 0.19.3.
>>> >> >> >> >>>
>>> >> >> >> >>> I have set JAVA_HOME in both of them, then I started HBase
>>> >> >> >> >>> and inserted some tables using the Java API. Now I have
>>> >> >> >> >>> written some MR programs on HBase, and when I run them they
>>> >> >> >> >>> run perfectly without any errors and all the map-reduce
>>> >> >> >> >>> statistics are displayed correctly, but I get no output.
>>> >> >> >> >>>
>>> >> >> >> >>> I have one doubt now: how does HBase recognize Hadoop in
>>> >> >> >> >>> standalone mode (I haven't even started my Hadoop)? Even
>>> >> >> >> >>> simple print statements do not work; no output is displayed
>>> >> >> >> >>> on the screen. I doubt my config.
>>> >> >> >> >>>
>>> >> >> >> >>> Do I need to add some config to run them? Please reply.