Reading from ORC Files in HDFS

Allan Wilson Sun, 17 Dec 2017 19:18:07 -0800

Hi,

Is there any way to read ORC files from HDFS directly using Apache Beam?


I’m looking at loading up Kafka with data stored in ORC files backing Hive 
tables.

After doing some research, it doesn’t look possible, but I thought I’d ask to
make sure.

It may be possible to use JDBC or HCatalog to query the data out, but I’d
rather scale out by pulling the data straight from the datanodes.

The runner I’m using is Spark 1.6.3 on the HDP 2.6.2 distro.
