[ https://issues.apache.org/jira/browse/HBASE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724984#action_12724984 ]
Lars George commented on HBASE-1385:
------------------------------------
Re: the error I get for the test, here is what I see:
{code}
java.lang.NullPointerException
    at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:899)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.runTestOnTable(TestTableMapReduce.java:142)
    at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.testMultiRegionTable(TestTableMapReduce.java:121)
{code}
I traced this down and found this in getSerializer():
{code}
if (serialization.accept(c)) {
  return (Serialization<T>) serialization;
}
{code}
which does this in the default WritableSerialization class:
{code}
public boolean accept(Class<?> c) {
  return Writable.class.isAssignableFrom(c);
}
{code}
So the class handed in must implement Writable, which makes sense given the
name WritableSerialization. Looking at where getSerializer() is called, I see
this in JobClient:
{code}
T[] array = (T[]) splits.toArray(
    new org.apache.hadoop.mapreduce.InputSplit[splits.size()]);
...
SerializationFactory factory = new SerializationFactory(conf);
Serializer<T> serializer =
    factory.getSerializer((Class<T>) array[0].getClass());
...
{code}
So InputSplit *must* implement Writable! Looking at the old (deprecated, mind
you!) InputSplit in mapred:
{code}
@Deprecated
public interface InputSplit extends Writable {
...
{code}
which is fine, but the new one in mapreduce does this:
{code}
public abstract class InputSplit {
...
{code}
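and that's that. Broken! The new InputSplit is not a Writable, so the default
serialization does not accept it, which would explain the NullPointerException
above. So I can either add Writable myself at a higher level and hope for the
best, or...? Suggestions?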
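For illustration, here is a rough sketch of what "adding it at a higher level" could look like. It is deliberately standalone: the Writable and InputSplit types below are simplified stand-ins written for this example, not the real Hadoop classes, and WritableTableSplit, PlainSplit and SplitSerializationSketch are hypothetical names. Storing the row keys as Strings is also just an assumption to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for org.apache.hadoop.io.Writable (assumption: simplified copy).
interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Stand-in for the new mapreduce InputSplit: abstract, and NOT a Writable.
abstract class InputSplit {
  public abstract long getLength();
}

// A split that only extends the new base class; accept() rejects it.
class PlainSplit extends InputSplit {
  @Override public long getLength() { return 0; }
}

// Hypothetical split that adds Writable itself, one level above InputSplit.
class WritableTableSplit extends InputSplit implements Writable {
  private String startRow = "";
  private String endRow = "";

  // No-arg constructor so an instance can be created before readFields().
  public WritableTableSplit() {}

  public WritableTableSplit(String startRow, String endRow) {
    this.startRow = startRow;
    this.endRow = endRow;
  }

  public String getStartRow() { return startRow; }
  public String getEndRow() { return endRow; }

  @Override public long getLength() { return 0; }

  @Override public void write(DataOutput out) throws IOException {
    out.writeUTF(startRow);
    out.writeUTF(endRow);
  }

  @Override public void readFields(DataInput in) throws IOException {
    startRow = in.readUTF();
    endRow = in.readUTF();
  }
}

class SplitSerializationSketch {
  // Same check as the accept() method quoted above.
  static boolean accept(Class<?> c) {
    return Writable.class.isAssignableFrom(c);
  }

  // Writes the split to a byte array and reads it back into a fresh instance.
  static WritableTableSplit roundTrip(WritableTableSplit split) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      split.write(new DataOutputStream(bytes));
      WritableTableSplit copy = new WritableTableSplit();
      copy.readFields(new DataInputStream(
          new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(accept(PlainSplit.class));         // false
    System.out.println(accept(WritableTableSplit.class)); // true
    WritableTableSplit copy = roundTrip(new WritableTableSplit("aaa", "mmm"));
    System.out.println(copy.getStartRow() + " .. " + copy.getEndRow());
  }
}
```

The point is only that a split class which implements Writable on its own passes the accept() check, while one that merely extends the new InputSplit does not. Whether that is the right fix, or whether the new InputSplit itself should change, is the open question.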
> Revamp TableInputFormat, needs updating to match hadoop 0.20.x AND remove bit
> where we can make < maps than regions
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-1385
> URL: https://issues.apache.org/jira/browse/HBASE-1385
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.21.0
>
> Attachments: 1385-v2.patch, 1385-v3.patch, 1385-v4.patch, 1385.patch,
> mr.patch
>
>
> Update TIF to match new MR.
> Remove the bit of logic where we will use number of configured maps as splits
> count rather than regions.