Reading from ORC Files in HDFS

2017-12-17 Thread Allan Wilson
Hi, is there any way to read ORC files from HDFS directly using Apache Beam? I’m looking at loading Kafka with data stored in the ORC files backing Hive tables. After doing some research it doesn’t look possible, but I thought I’d ask to make sure. It may be possible to use JDBC or HCatalog to

Use case scenario: Job definition from infrequently changing storage

2017-12-17 Thread Theo Diefenthal
Hi there, I'm currently evaluating Apache Beam as a stream processing engine for my current project and hope you can help me out with some architectural questions. Let me first describe what I want to do: I have lots of IoT sensor data coming in from an Azure EventHub. For each message, I can