Hi Seshadri,

There are two things you need to configure:

1) Running on the Flink runner. For this you only need to set the runner on your PipelineOptions: options.setRunner(FlinkRunner.class) (plus whatever other options the Flink runner needs - see https://beam.apache.org/documentation/runners/flink/).

2) Reading from HBase. This is not Beam-specific: see the documentation for the HBase Hadoop connector in general; this SO question http://stackoverflow.com/questions/25189527/how-to-process-a-range-of-hbase-rows-using-spark seems relevant - it shows how to create a Configuration for reading from HBase. You should create a Configuration in a similar way and pass it to HadoopInputFormatIO.
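Putting the two together, something like the sketch below (untested - the ZooKeeper quorum host, table name, and class name are placeholders, not from your setup; you may also need withKeyTranslation/withValueTranslation if Beam can't infer coders for the HBase types):

```java
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.InputFormat;

public class HBaseOnFlinkSketch {
  public static void main(String[] args) {
    // 1) Point the pipeline at the Flink runner.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    options.setRunner(FlinkRunner.class);

    // 2) Build a Hadoop Configuration for HBase's TableInputFormat,
    //    along the lines of the SO answer linked above.
    Configuration conf = new Configuration();
    conf.set("hbase.zookeeper.quorum", "zk-host");       // placeholder
    conf.set(TableInputFormat.INPUT_TABLE, "my_table");  // placeholder
    conf.setClass("mapreduce.job.inputformat.class",
        TableInputFormat.class, InputFormat.class);
    conf.setClass("key.class", ImmutableBytesWritable.class, Object.class);
    conf.setClass("value.class", Result.class, Object.class);

    Pipeline p = Pipeline.create(options);
    PCollection<KV<ImmutableBytesWritable, Result>> rows =
        p.apply(HadoopInputFormatIO.<ImmutableBytesWritable, Result>read()
            .withConfiguration(conf));
    // ... apply your transforms to 'rows' here ...
    p.run().waitUntilFinish();
  }
}
```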
Hope this helps. On Wed, May 10, 2017 at 3:43 PM Seshadri Raghunathan <[email protected]> wrote: > Hi, > > I am looking for some sample code / reference on how to read data from > HBase using HadoopInputFormatIO for Flink runner. > > Something similar to > https://github.com/apache/beam/blob/master/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java > but for HBase on Flink runner. > > Appreciate any help in this regard ! > > Thanks, > Seshadri >
