Re: Slack request

2018-01-19 Thread Allan Wilson
Can you add me as well?

Allan Wilson
Principal Software Engineer
Pandora Media Inc

Thanks

From: Lukasz Cwik <lc...@google.com>
Reply-To: "user@beam.apache.org" <user@beam.apache.org>
Date: Friday, January 19, 2018 at 12:15 PM
To: "user@beam.apache.org" <user@beam.apache.org>
Subject: Re: Slack request

Invite sent, welcome.

On Fri, Jan 19, 2018 at 10:52 AM, Logan Hennessy 
<logan.henne...@zonarsystems.com> wrote:
Request for invitation to join Slack channel.

Logan Hennessy | Software Dev Engineer
Zonar Systems



Re: Reading from ORC Files in HDFS

2017-12-19 Thread Allan Wilson
Had a feeling that would be the answer, but being new to Beam I wanted to make 
sure I wasn’t missing something. :)


Thanks Ismael



On 12/18/17, 3:07 AM, "Ismaël Mejía" <ieme...@gmail.com> wrote:

>Hello,
>
>There is no support yet for reading ORC files directly in Beam. You can
>track the progress of this issue here:
>https://issues.apache.org/jira/browse/BEAM-1861
>
>You are better off using HCatalogIO than JdbcIO (it should split the read better).
>
>
>
>
>On Mon, Dec 18, 2017 at 4:17 AM, Allan Wilson <awils...@pandora.com> wrote:
>> Hi,
>>
>> Is there any way to read ORC files from HDFS directly using Apache Beam?
>>
>> I’m looking at loading up Kafka with data stored in ORC files backing Hive
>> tables.
>>
>> After doing some research it doesn’t look possible, but I thought I’d ask to
>> make sure.
>>
>> It may be possible to use JDBC or HCatalog to query the data out, but I’d
>> rather scale out by pulling the data straight from the datanodes.
>>
>> The runner I’m using is Spark 1.6.3 on the HDP 2.6.2 distro.
>>
>>
>>
>>


Reading from ORC Files in HDFS

2017-12-17 Thread Allan Wilson
Hi,

Is there any way to read ORC files from HDFS directly using Apache Beam?

I’m looking at loading up Kafka with data stored in ORC files backing Hive 
tables.

After doing some research it doesn’t look possible, but I thought I’d ask to make 
sure.

It may be possible to use JDBC or HCatalog to query the data out, but I’d 
rather scale out by pulling the data straight from the datanodes.

The runner I’m using is Spark 1.6.3 on the HDP 2.6.2 distro.