I apologize if this has been brought up before, but the Scan class acts differently in regular client queries than in MapReduce jobs configured by TableMapReduceUtil. I'm using the 0.20.0 release in standalone mode at the moment for a proof of concept.
1. Startrow/Stoprow

    Scan scan = new Scan( startRow, stopRow );

The "startrow" and "stoprow" arguments don't seem to be honored in a MapReduce job, and the scan turns into a full table scan.

2. Column selection

If you use this instance of Scan...

    Scan scan = new Scan( startRow, stopRow );

...in regular client activity, it will allow selection of attributes in the Result. However, the same instance used in a MapReduce job will produce the following exception:

    Exception in thread "main" java.io.IOException: Expecting at least one column.
        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:281)

The remedy is to call either "addColumn" or "addFamily" on the Scan instance as appropriate, but it's a little odd that the same Scan works in one use case and throws an exception in the other.

Doug Meil
Director of Engineering
doug.m...@explorys.net