I don't know whether the HBase shell scan command uses ColumnCountGetFilter. The absence of compaction could explain why the same cell is displayed twice. But when I filter on one colfam, I get only 1 cell ... from the wrong colfam (as if the cell were stored in the wrong HFile) ...
When I add clones of my KeyValues to my Put in the reducer, the data is written correctly (both of my colfams get filled). It seems strange that a MapReduce client can make such a mess of the storage...

Regards,

--
Damien

2012/11/11 Varun Sharma <[email protected]>

> I have not looked at this in detail, but does this eventually use the
> ColumnCountGetFilter? If yes, then this will actually also include up to
> one older version, since filters run before version tracking. See JIRA
> https://issues.apache.org/jira/browse/HBASE-5257, which has a fix.
> Remember that versions are always kept in the memstore and only cleaned up
> when the memstore is flushed out as an HFile.
>
> On Fri, Nov 9, 2012 at 8:52 AM, Damien Hardy <[email protected]>
> wrote:
>
> > Ok, I can reply to myself ...
> >
> > You have to add a clone of the KeyValue to the Put. So
> > p.add(kv);
> > becomes
> > p.add(kv.clone());
> >
> > If not, I suppose only the last one is added to HBase (but the result is
> > quite weird and should be fixed IMO)
> >
> > Cheers,
> >
> > --
> > Damien
> >
> >
> > 2012/11/9 Damien Hardy <[email protected]>
> >
> > > Hello,
> > >
> > > I am a bit confused here...
> > >
> > > I am trying to run an M/R job to import data into the HBase table
> > > 'Consultation'.
> > >
> > > Running on CDH4.1.2
> > >
> > > The map function emits context.write(ImmutableBytesWritable, KeyValue)
> > >
> > > Conf summary:
> > > job.setOutputFormatClass(TableOutputFormat.class);
> > > job.setInputFormatClass(DataDrivenDBInputFormat.class);
> > > job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "Consultation");
> > > job.setOutputKeyClass(ImmutableBytesWritable.class);
> > > job.setOutputValueClass(KeyValue.class);
> > >
> > >
> > > The reduce class is:
> > >
> > > static class ImportReducer
> > >     extends TableReducer<ImmutableBytesWritable, KeyValue, ImmutableBytesWritable> {
> > >   @Override
> > >   public void reduce(ImmutableBytesWritable row, Iterable<KeyValue> kvs,
> > >       Reducer<ImmutableBytesWritable, KeyValue, ImmutableBytesWritable, Writable>.Context context)
> > >       throws java.io.IOException, InterruptedException {
> > >     Put p = new Put(row.copyBytes());
> > >     int i = 0;
> > >     byte[] rk = null;
> > >     for (KeyValue kv: kvs) {
> > >       p.add(kv);
> > >       if ( Bytes.compareTo(CF_VISITED, 0, CF_VISITED.length,
> > >           kv.getBuffer(), kv.getFamilyOffset(), kv.getFamilyLength() ) == 0 ) {
> > >         i++;
> > >       }
> > >     }
> > >     p.add(CF_COUNTER,QA_COUNTER,Bytes.toBytes(i));
> > >     context.write(new ImmutableBytesWritable(row),p);
> > >   }
> > > }
> > >
> > >
> > > hbase(main):038:0> scan 'Consultation', {COLUMNS=> *'visiting_tl'*, LIMIT => 10 }
> > > ROW  COLUMN+CELL
> > > 00070db1aa26d1906a078a1e03f788cb-\x00\x13\x80\x15  column=*visited_tl:*\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7, timestamp=1266998781000, value=\x00\x00\x00\x00
> > > 001316263fc8b454bbd86dff1587a347-\x00>t\x05  column=*visited_tl:*\x7F\xFF\xFE\xD7\x0F\xB8u_\x00\x08\xE1\xA0, timestamp=1275341540000, value=\x00\x00\x00\x00
> > > 001497e68d7c71a3cd281860484fa6be-\x00/\x0E^  column=*visited_tl:*\x7F\xFF\xFE\xD8\x06\x9B\xB0\xB7\x00(3S, timestamp=1271199453000, value=\x00\x00\x00\x00
> > > 001845aac2462a1c24b36eb90ab698cf-\x00\x04\x1E\xF5  column=*visited_tl:*\x7F\xFF\xFE\xD6\xA8\xB9-\xEF\x002Po, timestamp=1277069546000, value=\x00\x00\x00\x01
> > > 0019cec2c1f38c42b1c540ef7708c6a9-\x00;\xE0\x97  column=*visited_tl:*\x7F\xFF\xFE\xD8\xF9\xC7\x0C_\x00\x02?., timestamp=1267119748000, value=\x00\x00\x00\x00
> > > 001de6b92754b0ef44ee10bf2bdfe3c3-\x00%\x1AV  column=*visited_tl:*\x7F\xFF\xFE\xD6\xE4H\x99\xC7\x00\x0F\x7F9, timestamp=1276070291000, value=\x00\x00\x00\x01
> > > 00217f082f96eb12108c139b99a3ccb7-\x00\x02w\x08  column=*visited_tl:*\x7F\xFF\xFE\xD8\xEB\x1B\x95\xEF\x00\x0A7\x19, timestamp=1267365866000, value=\x00\x00\x00\x00
> > > 0021cbfd559f56dd298e4b4fee7626a9-\x00r\xBF\xFA  column=*visited_tl:*\x7F\xFF\xFE\xD6\xA1\x0B-\x0F\x00\x03\xBC\x8B, timestamp=1277198390000, value=\x00\x00\x00\x02
> > > 00266c02a60f9a6efb5d24317e6032a0-\x00\x0E]+  column=*visited_tl:*\x7F\xFF\xFE\xD6\xBC\x0D\xD1\x7F\x00/ q, timestamp=1276745232000, value=\x00\x00\x00\x01
> > > 0026dbbd6562da5b79f1b09e94e3b973-\x00C[\x93  column=*visited_tl:*\x7F\xFF\xFE\xD7\xB0\xFA\xB7/\x00\x02~\x09, timestamp=1272636066000, value=\x00\x00\x00\x01
> > > 10 row(s) in 2.1130 seconds
> > >
> > >
> > > hbase(main):036:0> get 'Consultation', "00070db1aa26d1906a078a1e03f788cb-\x00\x13\x80\x15"
> > > COLUMN  CELL
> > > *visited_tl:\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7*  timestamp=1266998781000, value=\x00\x00\x00\x00
> > > *visited_tl:\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7*  timestamp=1266998781000, value=\x00\x00\x00\x00
> > > visits_count:_counter  timestamp=1352475456545, value=\x00\x00\x02\xA1
> > > 3 row(s) in 0.3260 seconds
> > >
> > > hbase(main):037:0> get 'Consultation', "00070db1aa26d1906a078a1e03f788cb-\x00\x13\x80\x15", *'visiting_tl:'*
> > > COLUMN  CELL
> > > *visited_tl:*\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7  timestamp=1266998781000, value=\x00\x00\x00\x00
> > > 1 row(s) in 0.1650 seconds
> > >
> > > So I have 3 problems:
> > >
> > > * The table has only 1 VERSION enabled: how can I get the cell
> > >   visited_tl:\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7 twice for a
> > >   single row?
> > > * When I explicitly query for CF 'visiting_tl:', I get a 'visited_tl:'
> > >   cell ... WTF?
> > > * The counter is (int)673 ... where are my 673 visited_tl cells? (673 is
> > >   the correct value according to my source)
> > >
> > > Cheers,
> > >
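
For readers landing on this thread: the p.add(kv.clone()) fix above fits the fact that Hadoop reuses the same value object while a reducer iterates over its values, so storing the reference without copying it leaves every stored entry pointing at the last value read. The sketch below reproduces that effect without any HBase or Hadoop dependency. Cell, ReusingIterable and ValueReuseDemo are hypothetical stand-ins invented for illustration (Cell plays the role of KeyValue, copy() the role of clone()); this is not the actual HBase code path, only a model of the aliasing that would be consistent with the duplicated visited_tl cell shown in the get output.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a mutable value object such as KeyValue. */
final class Cell {
    private String family;
    private String qualifier;

    void set(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    /** Plays the role of kv.clone(): an independent snapshot of the current state. */
    Cell copy() {
        Cell c = new Cell();
        c.set(family, qualifier);
        return c;
    }

    @Override
    public String toString() {
        return family + ":" + qualifier;
    }
}

/** Hands out the SAME Cell instance on every step, refilled in place (models value reuse). */
final class ReusingIterable implements Iterable<Cell> {
    private final String[][] data;
    private final Cell shared = new Cell();

    ReusingIterable(String[][] data) {
        this.data = data;
    }

    @Override
    public Iterator<Cell> iterator() {
        return new Iterator<Cell>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < data.length;
            }

            @Override
            public Cell next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.set(data[i][0], data[i][1]); // overwrite the shared instance
                i++;
                return shared;
            }
        };
    }
}

class ValueReuseDemo {
    /** Keeps every cell seen during iteration, optionally copying it first. */
    static List<String> collect(Iterable<Cell> cells, boolean copy) {
        List<Cell> kept = new ArrayList<>();
        for (Cell c : cells) {
            kept.add(copy ? c.copy() : c);
        }
        // Render only after the loop, as a write happening after iteration would.
        List<String> out = new ArrayList<>();
        for (Cell c : kept) {
            out.add(c.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] data = {{"visiting_tl", "a"}, {"visited_tl", "b"}};
        // Without a copy, both kept references alias the last value read.
        System.out.println(collect(new ReusingIterable(data), false)); // [visited_tl:b, visited_tl:b]
        // With a copy, each entry keeps its own state.
        System.out.println(collect(new ReusingIterable(data), true));  // [visiting_tl:a, visited_tl:b]
    }
}
```

In the reducer from the original message, the same reasoning suggests copying each KeyValue before handing it to the Put, which is what p.add(kv.clone()) does.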
