Please see the following two constants defined in TableInputFormat:

  /** Column Family to Scan */
  public static final String SCAN_COLUMN_FAMILY =
      "hbase.mapreduce.scan.column.family";

  /** Space delimited list of columns and column families to scan. */
  public static final String SCAN_COLUMNS = "hbase.mapreduce.scan.columns";

CellCounter accepts these parameters, so you can experiment with CellCounter
to see how they behave.
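For Spark's newAPIHadoopRDD, these keys (along with
TableInputFormat.SCAN_BATCHSIZE) are set on the Hadoop Configuration before
creating the RDD. A minimal sketch, assuming HBase and Spark on the classpath
and a running cluster; the table name, family name, and batch size below are
placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class WideRowScan {
  public static JavaPairRDD<ImmutableBytesWritable, Result> scan(JavaSparkContext sc) {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, "my_table");        // placeholder table
    // Restrict the scan to one family, or use SCAN_COLUMNS for a
    // space-delimited list such as "cf1:a cf2:b".
    conf.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf1");
    // Return at most 500 cells per Result; a wide row then arrives as
    // several partial Results, spanning column families in order.
    conf.set(TableInputFormat.SCAN_BATCHSIZE, "500");
    return sc.newAPIHadoopRDD(conf, TableInputFormat.class,
        ImmutableBytesWritable.class, Result.class);
  }
}
```

Note that with batching enabled the same row key can appear in multiple
partial Results, so any per-row aggregation in Spark needs a groupBy/reduce
on the key afterwards.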


FYI

On Mon, Jul 2, 2018 at 4:01 AM, revolutionisme <[email protected]>
wrote:

> Hi,
>
> I am using HBase with Spark, and since I have wide rows (> 10000 columns) I
> wanted to use the "setBatch(num)" option to read the columns of a row in
> batches rather than all at once.
>
> I can create a scan and set the batch size I want with
> TableInputFormat.SCAN_BATCHSIZE, but I am a bit confused about how this
> would work with more than 1 column family.
>
> Any help is appreciated.
>
> PS: Also any documentation or inputs on newAPIHadoopRDD would be really
> appreciated as well.
>
> Thanks & Regards,
> Biplob
>
>
>
> --
> Sent from: http://apache-hbase.679495.n3.nabble.com/HBase-User-f4020416.html
>
