[ 
https://issues.apache.org/jira/browse/HBASE-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257934#comment-14257934
 ] 

pangxiaoxi commented on HBASE-12757:
------------------------------------

protected void setup(Context context) {
    TableSplit split = (TableSplit) context.getInputSplit();
    if (split != null) {
        start_row = split.getStartRow();
        end_row = split.getEndRow();
        .....
    }
}

public void map(ImmutableBytesWritable row, Result columns, Context context) {
    ...
    byte[] rowkey = row.get();
    if (Bytes.compareTo(start_row, rowkey) > 0 || Bytes.compareTo(end_row, rowkey) < 0) {
        TableSplit split = (TableSplit) context.getInputSplit();
        if (split != null) {
            System.err.println(String.format(
                "location=%1$s,start_row=%2$s ,rowkey=%3$s ,end_row=%4$s",
                split.getRegionLocation(),
                HbaseUtil.printRowkey(split.getStartRow()),
                HbaseUtil.printRowkey(rowkey),
                HbaseUtil.printRowkey(split.getEndRow())));
        }
    }
    .................
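
The range check in map() above can be written more strictly. The following is only a sketch, not code from the reporter's job. It assumes the usual HBase convention that a region covers the half-open interval [start row, end row) and that an empty start or end row means "unbounded" (first or last region). Under that assumption, the check above would flag every row of the last region, and would accept a rowkey equal to end_row. The class name RowRangeCheck and its methods are hypothetical; the byte comparison is a plain unsigned lexicographic compare written out by hand so that the example has no HBase dependency.

```java
import java.nio.charset.StandardCharsets;

public class RowRangeCheck {

    /** Lexicographic comparison of two byte arrays, treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    /**
     * Returns true if rowkey lies in [startRow, endRow).
     * Assumption: an empty startRow or endRow means that side is unbounded.
     */
    static boolean inRange(byte[] startRow, byte[] endRow, byte[] rowkey) {
        boolean afterStart = startRow.length == 0 || compareUnsigned(rowkey, startRow) >= 0;
        boolean beforeEnd = endRow.length == 0 || compareUnsigned(rowkey, endRow) < 0;
        return afterStart && beforeEnd;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Values taken from the example line in the issue description.
        byte[] start = ascii("D9CB114FD09A82A3_0000000000000000_m_43DAAA689D4AFC86");
        byte[] rowkey = ascii("D323E1D0A51E5185_0000000000000000_m_75686B8924108044");
        byte[] end = ascii("DB0C4FC44E6D80C1_0000000000000000_m_E956CC65322BA3E5");
        System.out.println(inRange(start, end, rowkey)); // prints false
    }
}
```

With the values from the issue description, the rowkey sorts before start_row ("D3..." < "D9..."), so the sketch also reports it as out of range. The reported behaviour is therefore not explained by the boundary handing of the check alone.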

> MR map's input rowkey out of range of current Region 
> -----------------------------------------------------
>
>                 Key: HBASE-12757
>                 URL: https://issues.apache.org/jira/browse/HBASE-12757
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, hbase
>    Affects Versions: 0.94.7
>         Environment: hadoop 1.1.2, r1440782
> hbase 0.94.7
> linux  2.6.32-279.el6.x86_64
>            Reporter: pangxiaoxi
>            Priority: Critical
>
> I execute a MapReduce job that scans the whole table. Sometimes the rowkey of
> a map input value is outside the range of the current Region (taken from the
> InputSplit). This may lose data or return unwanted data.
> e.g.
> location=datanode11,start_row=D9CB114FD09A82A3_0000000000000000_m_43DAAA689D4AFC86 ,rowkey=D323E1D0A51E5185_0000000000000000_m_75686B8924108044 ,end_row=DB0C4FC44E6D80C1_0000000000000000_m_E956CC65322BA3E5



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
