Hi,
During my latest "experiments" I have always been running Hadoop/HBase
in local mode.
Below I give you the two Reducer classes that I use for testing.
Employing the...
- "TestFileReducer", the MapReduce job completes flawlessly.
- "TestBaseReducer", the job crashes (see the log below and attached,
with DEBUG turned on).
Code:
------
// Works: concatenates all values for a key and writes them out as plain Text.
public static class TestFileReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output,
            Reporter reporter) throws IOException {
        StringBuilder builder = new StringBuilder();
        while (values.hasNext()) {
            builder.append(values.next() + "\n");
        }
        output.collect(key, new Text(builder.toString()));
    }
}
// Crashes: concatenates the values as above, but writes them into HBase
// as a MapWritable via TableOutputFormat.
public static class TestBaseReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, MapWritable> {

    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, MapWritable> output,
            Reporter reporter) throws IOException {
        StringBuilder builder = new StringBuilder();
        while (values.hasNext()) {
            builder.append(values.next() + "\n");
        }
        String triples = builder.toString();
        MapWritable value = new MapWritable();
        value.put(new Text("triples:" + triples.hashCode()),
                new ImmutableBytesWritable(triples.getBytes()));
        output.collect(key, value);
    }
}
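(For context, the HBase output side is presumably wired up roughly as
follows in the driver -- a hedged sketch; only TableOutputFormat and the
'triples' table are confirmed by the stack trace and master log below,
the rest of the setup is assumed:)

// Hypothetical driver excerpt: send the reducer's MapWritable output
// into the 'triples' HBase table via TableOutputFormat.
JobConf conf = new JobConf(TriplesTest.class);
conf.setReducerClass(TestBaseReducer.class);
conf.setOutputFormat(TableOutputFormat.class);
conf.set(TableOutputFormat.OUTPUT_TABLE, "triples"); // table named in the master log
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(MapWritable.class);
JobClient.runJob(conf);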
Test case log:
--------------
07/11/09 15:17:22 INFO mapred.JobClient: map 100% reduce 83%
07/11/09 15:17:25 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:28 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:31 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:34 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:17:37 INFO mapred.LocalJobRunner: reduce > reduce
07/11/09 15:22:24 WARN mapred.LocalJobRunner: job_local_1
java.net.SocketTimeoutException: timed out waiting for rpc response
at org.apache.hadoop.ipc.Client.call(Client.java:484)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
at $Proxy1.batchUpdate(Unknown Source)
at org.apache.hadoop.hbase.HTable.commit(HTable.java:724)
at org.apache.hadoop.hbase.HTable.commit(HTable.java:701)
at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:89)
at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:63)
at org.apache.hadoop.mapred.ReduceTask$2.collect(ReduceTask.java:308)
at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:78)
at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:52)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:326)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:164)
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
at TriplesTest.run(TriplesTest.java:181)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at TriplesTest.main(TriplesTest.java:187)
Master log:
-----------
Attached as GZIP...
Cheers,
Holger
stack wrote:
> Holger Stenzhorn wrote:
>> Since I am just a beginner at Hadoop and HBase I cannot really tell
>> whether an exception is a "real" one or just a "hint" - but
>> exceptions always look scary... :-)
> Yeah. You might add your POV to the issue.
>> Anyway, when I cut the allowed heap for both the server and the
>> test class down to 2GB, I just get the following (not too
>> optimistic) results...
>> Well, I post all the logs that I think might be necessary, since I
>> cannot tell which exceptions are important and which are not.
> Looks like your poor old mapreduce job failed when it tried to write a
> record to hbase.
> Looking at the master log, what's odd is that the catalog .META. table
> has no mention of the 'triples' table. Has it been created? (It may not
> be showing because you are not running with logging at DEBUG level.
> As of yesterday or so you can set the DEBUG level from the UI by
> browsing to the 'Log Level' servlet at http://MASTER_HOST:PORT --
> usually http://localhost:60000/ -- and setting the package
> 'org.apache.hadoop.hbase' to DEBUG.)
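(For a quick programmatic check of whether the 'triples' table exists,
something like the following should work -- a minimal sketch, assuming
the HBaseAdmin/HBaseConfiguration API of this HBase version:)

import org.apache.hadoop.hbase.HBaseAdmin;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.io.Text;

public class TriplesTableCheck {
    public static void main(String[] args) throws Exception {
        // Connect to the (local) master using the default HBase configuration.
        HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
        // Ask the catalog whether the 'triples' table is known.
        System.out.println("triples exists: "
                + admin.tableExists(new Text("triples")));
    }
}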
> But then you OOME.
> You might try outputting back to the local filesystem rather than to
> HDFS -- use something like TextOutputFormat instead of
> TableOutputFormat.
> If that works, then there is an issue w/ writing output to hbase.
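(A minimal sketch of that switch, assuming a JobConf-based driver like
the one in TriplesTest; the output path and the remaining job setup are
hypothetical:)

// Hypothetical driver excerpt: write the reduce output as plain text
// to the local filesystem instead of into the HBase table.
JobConf conf = new JobConf(TriplesTest.class);
conf.setReducerClass(TestFileReducer.class);      // the variant emitting Text
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);
conf.setOutputFormat(TextOutputFormat.class);     // instead of TableOutputFormat
conf.setOutputPath(new Path("/tmp/triples-out")); // hypothetical local path
JobClient.runJob(conf);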
> Please open a JIRA, paste your MR program, and let's figure out a way
> to get the data file across.
> Thanks for your patience H,
> St.Ack