Hello All
I have a map program that processes ORC files. In the driver I set the ORC input format:

job.setInputFormatClass(OrcNewInputFormat.class);

With OrcNewInputFormat the map value is an OrcStruct, so in the mapper the Writable value passed as a parameter is cast to OrcStruct:

OrcStruct record = (OrcStruct) value;
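For reference, the mapper is shaped roughly like this (the class name, output types, and the comment about field extraction are placeholders, not my actual code):

```java
// Sketch only: MyMapper and the Text output types are hypothetical;
// the real mapper does its own processing on the OrcStruct.
import java.io.IOException;
import org.apache.hadoop.hive.ql.io.orc.OrcStruct;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<NullWritable, Writable, Text, Text> {
    @Override
    protected void map(NullWritable key, Writable value, Context context)
            throws IOException, InterruptedException {
        // The Writable delivered by OrcNewInputFormat is an OrcStruct
        OrcStruct record = (OrcStruct) value;
        // ... extract fields from the record and emit output here ...
    }
}
```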
I want to test this mapper using MRUnit. For this, in the setup method of the unit test I create an ORC file:

OrcFile.createWriter(testFilePath,
    OrcFile.writerOptions(conf)
        .inspector(inspector)
        .stripeSize(100000)
        .bufferSize(10000)
        .version(OrcFile.Version.V_0_12));
Then in the test method I read it back and invoke the mapper through MRUnit. Below is the code:

// Read the ORC file
Reader reader = OrcFile.createReader(fs, testFilePath);
RecordReader recordRdr = reader.rows();
OrcStruct row = null;
List<OrcStruct> mapData = new ArrayList<>();
while (recordRdr.hasNext()) {
    row = (OrcStruct) recordRdr.next(row);
    mapData.add(row);
}

// Test the mapper processing with the 1st record
initializeSerde(mapDriver.getConfiguration());
Writable writable = getWritable(mapData.get(0));
mapDriver.withCacheFile(strCachePath)
         .withInput(NullWritable.get(), writable);
mapDriver.runTest();
But while running the test case I get the error below:

java.lang.UnsupportedOperationException: can't write the bundle
    at org.apache.hadoop.hive.ql.io.orc.OrcSerde$OrcSerdeRow.write(OrcSerde.java:61)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:80)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:97)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:110)
    at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:675)
    at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:679)
    at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:120)
    at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:210)
Looking at the trace, MRUnit copies the input pair through WritableSerialization, which ends up calling OrcSerde$OrcSerdeRow.write(); that method is not supported, hence the test case errors out.

How do we unit test a mapper that processes ORC files? Is there some other way, or what needs to be changed in what I am doing?
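In case it helps frame the question: one workaround I am considering (sketch only, the class and method names are made up) is to pull the per-record logic out of map() into a plain method that takes already-extracted field values. The mapper would read fields out of the OrcStruct and delegate; the unit test would then call the method directly with plain Java values, so MRUnit never has to serialize an OrcStruct at all:

```java
// Sketch only: RecordProcessor, transform(), and the name/count fields
// are hypothetical stand-ins for the real per-record logic.
public class RecordProcessor {

    // Example transformation: join a name and a count into "name:count".
    // The mapper would call this after extracting the fields from OrcStruct.
    public static String transform(String name, int count) {
        return name + ":" + count;
    }

    public static void main(String[] args) {
        System.out.println(transform("alice", 3)); // prints "alice:3"
    }
}
```

The downside is that this tests the record logic but not the OrcStruct extraction or the MRUnit wiring itself, which is why I would still like to know whether the MRUnit route can be made to work.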
Thanks in advance for the help.
br
Sandeep