writing to maprfs?

2018-05-24 Thread Akanksha Sharma B
Hi All, I have a pipeline that writes to HDFS using the "org.apache.beam.sdk.io.hdfs" package. I was hoping that it would work with maprfs as well. However, I have been debugging for some days with no success. I do not provide hdfsConfiguration from the command line, and instead use the configuratio

Re: writing to maprfs?

2018-05-24 Thread Akanksha Sharma B
Hi, Answering my own question 😊 Writing to maprfs worked after I added the following property to core-site.xml: fs.maprfs.impl = com.mapr.fs.MapRFileSystem. Regards, Akanksha From: Akanksha Sharma B Sent: Thursday, May 24, 2018 9:24
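
For readers of the archive, the property named in this message would typically be written as an XML entry in core-site.xml. This is a minimal sketch, assuming the standard Hadoop property layout: only the name and value come from the message, and the enclosing configuration element is an assumption about the rest of the file.

```xml
<!-- core-site.xml (sketch): name and value are taken from the message;
     the enclosing <configuration> element is assumed, as in a standard Hadoop config file -->
<configuration>
  <property>
    <name>fs.maprfs.impl</name>
    <value>com.mapr.fs.MapRFileSystem</value>
  </property>
</configuration>
```

The file needs to be on the classpath of the process running the pipeline for the Hadoop file system lookup to pick it up.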

ParquetIO javadocs

2018-06-20 Thread Akanksha Sharma B
Hi, From the built-in I/O transforms list (https://beam.apache.org/documentation/io/built-in/), I can see that Parquet is supported. However, I could not find its javadocs.

spark streaming

2018-06-29 Thread Akanksha Sharma B
Hi, I just started using an Apache Beam pipeline for processing unbounded data on Spark, so it essentially uses Spark Streaming. However, I came across the following statement in the Spark Runner documentation: "Note: support for the Beam Model in streaming is currently experimental, follow-up in the m

Schema class in 2.5 ?

2018-07-11 Thread Akanksha Sharma B
Hi, Can you please share some documentation about the ongoing changes related to the Schema class? I would like to understand why it is being introduced and how I can use it. I was looking for something like RDD in Beam, i.e. Beam understands the schema of the data internally and thus can handle some convers

Re: Schema class in 2.5 ?

2018-07-11 Thread Akanksha Sharma B
/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc On 11 Jul 2018, at 10:38, Akanksha Sharma B <akanksha.b.sha...@ericsson.com> wrote: Hi, Can you please share some documentation about ongoing changes related to Schema class. I am looking to understand why is it being introduced and how can I use it.

Re: Schema class in 2.5 ?

2018-07-12 Thread Akanksha Sharma B
n and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Regards, Akanksha From: Akanksha Sharma B Sent: Wednesday, July 11, 2018 11:02:37 AM To: user@beam.apache.org Subject: Re: Schema class in 2.5 ? Thanks

Re: Schema class in 2.5 ?

2018-07-12 Thread Akanksha Sharma B
Schema a while ago and the BeamSQL doc seems not to have been updated. Could you create a Jira issue for that? On 12 Jul 2018, at 11:10, Akanksha Sharma B <akanksha.b.sha...@ericsson.com> wrote: Hi, As I see it, in 2.5 BeamSQL was changed to work with Schema. The sample code provided in

Re: Schema class in 2.5 ?

2018-07-17 Thread Akanksha Sharma B
Hi, ParquetIO needs an Avro Schema (org.apache.avro.Schema) to read and write. Will it also be possible to use no Avro Schema at all, or to use Beam's Schema (org.apache.beam.sdk.schemas.Schema) instead? Regards, Akanksha From: Akanksha Sharma B Sent: Thursday, July 12

Re: Schema class in 2.5 ?

2018-07-17 Thread Akanksha Sharma B
beam/sdk/io/parquet/ParquetIO.java#L285 Best regards, Łukasz On Tue, 17 Jul 2018 at 09:52, Akanksha Sharma B <akanksha.b.sha...@ericsson.com> wrote: Hi, ParquetIO needs avro Schema(org.apache.avro.Schema) to read and write. Will it also be possible no

pipeline with parquet and sql

2018-07-24 Thread Akanksha Sharma B
Hi, Please consider the following pipeline: The source is a Parquet file with hundreds of columns. The sink is Parquet. Multiple output Parquet files are generated after applying some SQL joins. The SQL joins to be applied differ for each output Parquet file. Let's assume we have a SQL query generator or

Re: pipeline with parquet and sql

2018-07-31 Thread Akanksha Sharma B
Hi, I am hoping to get some hints/pointers from the experts here. I hope the scenario described below was understandable. I hope it is a valid use-case. Please let me know if I need to explain the scenario better. Regards, Akanksha From: Akanksha Sharma B

Re: pipeline with parquet and sql

2018-08-01 Thread Akanksha Sharma B
the scenario better. Regards, Akanksha From: Akanksha Sharma B Sent: Friday, July 27, 2018 9:44 AM To: d...@beam.apache.org Subject: Re: pipeline with parquet and sql Hi, Please consider following pipeline:- Source is Parquet fi

Schema Aware PCollections

2018-08-08 Thread Akanksha Sharma B
From: Chamikara Jayalath Sent: Wednesday, August 1, 2018 3:57 PM To: user@beam.apache.org Cc: d...@beam.apache.org Subject: Re: pipeline with parquet and sql On Wed, Aug 1, 2018 at 1:12 AM Akanksha Sharma B <akanksha.b.sha...@ericsson.com> wrote: Hi, Thanks. I understood the Parq

Re: Schema Aware PCollections

2018-08-09 Thread Akanksha Sharma B
end-to-end examples yet. Regards, Anton On Wed, Aug 8, 2018 at 5:45 AM Akanksha Sharma B <akanksha.b.sha...@ericsson.com> wrote: Hi, (changed the email subject to make it generic) It is mentioned in the Schema-Aware PCollections design doc (https://docs.goog