Hi Seshadri,

There are two things you need to configure:

1) Running on the Flink runner. For this you only need to set the runner on your PipelineOptions: options.setRunner(FlinkRunner.class) (plus whatever other options the Flink runner needs - see https://beam.apache.org/documentation/runners/flink/).

2) Reading from HBase. This is not Beam-specific: see the documentation for the HBase Hadoop connector in general; this SO question http://stackoverflow.com/questions/25189527/how-to-process-a-range-of-hbase-rows-using-spark seems relevant - it shows how to create a Configuration for reading from HBase. You should create a Configuration in a similar way and pass it to HadoopInputFormatIO.
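Putting the two together, something like the sketch below (untested - the ZooKeeper quorum host, table name, and class name are placeholders, not from your setup; you may also need withKeyTranslation/withValueTranslation if Beam can't infer coders for the HBase types):

```java
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.InputFormat;

public class HBaseOnFlinkSketch {
  public static void main(String[] args) {
    // 1) Point the pipeline at the Flink runner.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    options.setRunner(FlinkRunner.class);

    // 2) Build a Hadoop Configuration for HBase's TableInputFormat,
    //    along the lines of the SO answer linked above.
    Configuration conf = new Configuration();
    conf.set("hbase.zookeeper.quorum", "zk-host");       // placeholder
    conf.set(TableInputFormat.INPUT_TABLE, "my_table");  // placeholder
    conf.setClass("mapreduce.job.inputformat.class",
        TableInputFormat.class, InputFormat.class);
    conf.setClass("key.class", ImmutableBytesWritable.class, Object.class);
    conf.setClass("value.class", Result.class, Object.class);

    Pipeline p = Pipeline.create(options);
    PCollection<KV<ImmutableBytesWritable, Result>> rows =
        p.apply(HadoopInputFormatIO.<ImmutableBytesWritable, Result>read()
            .withConfiguration(conf));
    // ... apply your transforms to 'rows' here ...
    p.run().waitUntilFinish();
  }
}
```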
Hope this helps. On Wed, May 10, 2017 at 3:43 PM Seshadri Raghunathan <[email protected]> wrote: > Hi, > > I am looking for some sample code / reference on how to read data from > HBase using HadoopInputFormatIO for Flink runner. > > Something similar to > https://github.com/apache/beam/blob/master/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java > but for HBase on Flink runner. > > Appreciate any help in this regard ! > > Thanks, > Seshadri >
