[ https://issues.apache.org/jira/browse/HBASE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947240#comment-14947240 ]
Jerry He commented on HBASE-14557:
----------------------------------

I was also looking into Export followed by Import into a bulk path. Interestingly, my quick test indicated that it does not have the problem. I thought it would have the same problem, given that NoTagsKeyValue goes into the client Result and the MR OutputClass is set to the KeyValue class as well.

After looking into the code a bit more, here is why:

Export gets Results from the server, containing Cells as NoTagsKeyValue, and writes them into a SequenceFile. Import reads Results back from the SequenceFile. Going into the SequenceFile and coming out of it involves MR serialization, and our ResultSerialization seems to always emit KeyValue objects.
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ResultSerialization.java

Eventually it comes down to ProtobufUtil.toCell(Cell):
{code}
public static Cell toCell(final CellProtos.Cell cell) {
  // Doing this is going to kill us if we do it for all data passed.
  // St.Ack 20121205
  return CellUtil.createCell(cell.getRow().toByteArray(),
      cell.getFamily().toByteArray(),
      cell.getQualifier().toByteArray(),
      cell.getTimestamp(),
      (byte)cell.getCellType().getNumber(),
      cell.getValue().toByteArray());
}
{code}
which emits a new KeyValue.

In the same ResultSerialization class, but for Result94Deserializer, we directly have:
{code}
kvs.add(new KeyValue(buf, offset, keyLength))
{code}
This is similar to [~anoop.hbase]'s idea, which I like as well, since we will still use the backing buffer.
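To illustrate why a NoTagsKeyValue is rejected when the job's output value class is KeyValue, here is a minimal, self-contained sketch. It does not use the real HBase or Hadoop classes: KeyValue, NoTagsKeyValue, collect and toPlainKeyValue below are simplified stand-ins made up for this example. It assumes the map output collector compares the value's runtime class with the configured class exactly (not with instanceof), which is what the "Type mismatch in value from map" error suggests, and it shows that rewrapping the same backing array in a plain KeyValue would pass the check.

```java
import java.io.IOException;

public class TypeCheckSketch {

    // Simplified stand-in for org.apache.hadoop.hbase.KeyValue (assumption).
    static class KeyValue {
        final byte[] bytes;
        KeyValue(byte[] bytes) { this.bytes = bytes; }
    }

    // Simplified stand-in for NoTagsKeyValue, modeled as a KeyValue subclass.
    static class NoTagsKeyValue extends KeyValue {
        NoTagsKeyValue(byte[] bytes) { super(bytes); }
    }

    // Hypothetical collector: rejects any value whose runtime class is not
    // exactly the expected class, so subclasses fail as well.
    static void collect(Object value, Class<?> expected) throws IOException {
        if (value.getClass() != expected) {
            throw new IOException("Type mismatch in value from map: expected "
                + expected.getName() + ", received " + value.getClass().getName());
        }
    }

    // Hypothetical workaround: wrap the same backing array in a plain
    // KeyValue. The bytes are shared, not copied.
    static KeyValue toPlainKeyValue(KeyValue kv) {
        return kv.getClass() == KeyValue.class ? kv : new KeyValue(kv.bytes);
    }

    public static void main(String[] args) throws IOException {
        KeyValue fromServer = new NoTagsKeyValue(new byte[] {1, 2, 3});
        try {
            collect(fromServer, KeyValue.class);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // The rewrapped value has the exact expected class and is accepted.
        collect(toPlainKeyValue(fromServer), KeyValue.class);
        System.out.println("accepted after rewrapping");
    }
}
```

In this sketch the rewrapped object shares the original byte array, which mirrors the point about still using the backing buffer rather than copying each field.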
> MapReduce WALPlayer issue with NoTagsKeyValue
> ---------------------------------------------
>
>                 Key: HBASE-14557
>                 URL: https://issues.apache.org/jira/browse/HBASE-14557
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 2.0.0
>            Reporter: Jerry He
>
> Running MapReduce WALPlayer to convert WAL into HFiles:
> {noformat}
> 15/10/05 20:28:08 INFO mapred.JobClient: Task Id : attempt_201508031611_0029_m_000000_0, Status : FAILED
> java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.hbase.KeyValue, recieved org.apache.hadoop.hbase.NoTagsKeyValue
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:997)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:689)
>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>         at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>         at org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:111)
>         at org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:96)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:368)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(AccessController.java:369)
>         at javax.security.auth.Subject.doAs(Subject.java:572)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {noformat}