Let's use HADOOP-2179 to figure out what's going on, Holger. Would you mind uploading your configuration files -- both hadoop and hbase -- and putting a link there to the data you are using so I can duplicate your setup locally?

Thanks,
St.Ack


Holger Stenzhorn wrote:
Hi,

During my latest "experiments" I was always using the local mode for Hadoop/Hbase. In the following I give you the respective Reducer classes that I use for testing. Employing the...
- "TestFileReducer", the MapReduce job completes flawlessly.
- "TestBaseReducer", the job crashes (see the log below and, attached, a version with DEBUG turned on).

Cheers,
Holger

Code:
------

 public static class TestFileReducer extends MapReduceBase
     implements Reducer<Text, Text, Text, Text> {
   public void reduce(Text key, Iterator<Text> values,
                      OutputCollector<Text, Text> output,
                      Reporter reporter) throws IOException {
     // Concatenate all values for this key, one per line.
     StringBuilder builder = new StringBuilder();
     while (values.hasNext()) {
       builder.append(values.next() + "\n");
     }
     output.collect(key, new Text(builder.toString()));
   }
 }

 public static class TestBaseReducer extends MapReduceBase
     implements Reducer<Text, Text, Text, MapWritable> {
   public void reduce(Text key, Iterator<Text> values,
                      OutputCollector<Text, MapWritable> output,
                      Reporter reporter) throws IOException {
     // Concatenate all values for this key, one per line.
     StringBuilder builder = new StringBuilder();
     while (values.hasNext()) {
       builder.append(values.next() + "\n");
     }
     // Store the concatenation as a single cell in the 'triples:' column
     // family, keyed by the hash of the content.
     String string = builder.toString();
     MapWritable value = new MapWritable();
     value.put(new Text("triples:" + string.hashCode()),
               new ImmutableBytesWritable(string.getBytes()));
     output.collect(key, value);
   }
 }
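
For reference, a rough sketch of how a driver could wire up the hbase case. The actual TriplesTest driver is not included in this mail, so the class name, input path, and the 'triples' table name below are placeholders, and the mapper/input-format wiring is elided:

 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hbase.mapred.TableOutputFormat;
 import org.apache.hadoop.io.MapWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapred.JobClient;
 import org.apache.hadoop.mapred.JobConf;

 public class TriplesTestDriver {
   public static void main(String[] args) throws Exception {
     JobConf job = new JobConf(TriplesTestDriver.class);
     job.setInputPath(new Path(args[0]));                 // input dir (assumed)
     // Send reduce output to hbase instead of to files.
     job.setOutputFormat(TableOutputFormat.class);
     job.set(TableOutputFormat.OUTPUT_TABLE, "triples");  // assumed table name
     job.setOutputKeyClass(Text.class);
     job.setOutputValueClass(MapWritable.class);
     job.setReducerClass(TriplesTest.TestBaseReducer.class);
     JobClient.runJob(job);
   }
 }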



Test case log:
--------------

07/11/09 15:17:22 INFO mapred.JobClient:  map 100% reduce 83%
07/11/09 15:17:25 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:28 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:31 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:34 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:37 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:22:24 WARN mapred.LocalJobRunner: job_local_1
java.net.SocketTimeoutException: timed out waiting for rpc response
       at org.apache.hadoop.ipc.Client.call(Client.java:484)
       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
       at $Proxy1.batchUpdate(Unknown Source)
       at org.apache.hadoop.hbase.HTable.commit(HTable.java:724)
       at org.apache.hadoop.hbase.HTable.commit(HTable.java:701)
       at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:89)
       at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:63)
       at org.apache.hadoop.mapred.ReduceTask$2.collect(ReduceTask.java:308)
       at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:78)
       at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:52)
       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:326)
       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:164)
Exception in thread "main" java.io.IOException: Job failed!
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
       at TriplesTest.run(TriplesTest.java:181)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at TriplesTest.main(TriplesTest.java:187)


Master log:
-----------

Attached as GZIP...

Cheers,
Holger

stack wrote:
Holger Stenzhorn wrote:
Since I am just a beginner with Hadoop and Hbase, I cannot really tell whether an exception is a "real" one or just a "hint" -- but exceptions always look scary... :-)
Yeah.  You might add your POV to the issue.

Anyway, I cut the allowed heap for both the server and the test class down to 2GB. In this case I get the following (not too optimistic) results... I am posting all the log output that I think might be necessary, since I cannot say which exceptions are important and which are not.
Looks like your poor old mapreduce job failed when it tried to write a
record to hbase.
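
To rule the MapReduce side in or out, you could also try a single write
against the table outside of MapReduce. A minimal sketch, assuming the
startUpdate/put/commit HTable API visible in your stack trace (the
'triples' table and the column name are guesses based on your reducer):

 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HTable;
 import org.apache.hadoop.io.Text;

 public class TriplesWriteCheck {
   public static void main(String[] args) throws Exception {
     // Open the (assumed) 'triples' table using the local hbase config.
     HTable table = new HTable(new HBaseConfiguration(), new Text("triples"));
     // Write one cell and commit; if this also times out, the problem
     // is in hbase itself rather than in the MapReduce job.
     long lockid = table.startUpdate(new Text("testrow"));
     table.put(lockid, new Text("triples:check"), "hello".getBytes());
     table.commit(lockid);
   }
 }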

Looking at the master log, what's odd is that the catalog .META. table
has no mention of the 'triples' table.  Has it been created?  (It may not
be showing because you are not running with logging at DEBUG level.
As of yesterday or so you can set the DEBUG level from the UI by browsing
to the 'Log Level' servlet at http://MASTER_HOST:PORT -- usually
http://localhost:60000/ -- and setting the package
'org.apache.hadoop.hbase' to DEBUG.)
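
If you want the DEBUG setting to survive restarts, the standard log4j
route should work too; a line like the following in conf/log4j.properties:

 # Enable DEBUG for all hbase classes:
 log4j.logger.org.apache.hadoop.hbase=DEBUG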

But then you hit an OOME (OutOfMemoryError).

You might try outputting back to the local filesystem rather than to hbase
-- use something like TextOutputFormat instead of TableOutputFormat.
If that works, then there is an issue w/ writing output to hbase.
Please open a JIRA, paste your MR program, and let's figure out a way to
get the data file across.
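
Concretely, that swap would look something like this in your driver (a
sketch against whatever JobConf your driver builds; the output directory
name is made up):

 // Replace the TableOutputFormat wiring with plain file output;
 // the rest of the job can stay as it is.
 job.setOutputFormat(org.apache.hadoop.mapred.TextOutputFormat.class);
 job.setOutputPath(new org.apache.hadoop.fs.Path("triples-out")); // assumed
 job.setOutputValueClass(org.apache.hadoop.io.Text.class);
 job.setReducerClass(TriplesTest.TestFileReducer.class); // the one that works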

Thanks for your patience H,
St.Ack

