Re: writeAsCSV with partitionBy

2016-05-25 Thread Aljoscha Krettek
[1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/connectors/fs/RollingSink.html

Re: writeAsCSV with partitionBy

2016-05-24 Thread KirstiLaurila
rg/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/connectors/fs/RollingSink.html

Re: writeAsCSV with partitionBy

2016-05-24 Thread Juho Autio
https://issues.apache.org/jira/browse/FLINK-3961

Re: writeAsCSV with partitionBy

2016-05-24 Thread Srikanth

Re: writeAsCSV with partitionBy

2016-05-23 Thread KirstiLaurila
Yeah, created this one: https://issues.apache.org/jira/browse/FLINK-3961

Re: writeAsCSV with partitionBy

2016-05-23 Thread Fabian Hueske

Re: writeAsCSV with partitionBy

2016-05-23 Thread KirstiLaurila
Are there any plans to implement this kind of feature (the possibility to write data to partitions specified by the data itself) in the near future?

Re: writeAsCSV with partitionBy

2016-02-16 Thread Fabian Hueske
Yes, you're right. I did not understand your question correctly. Right now, Flink does not feature an output format that writes records to output files depending on a key attribute. You would need to implement such an output format yourself and append it as follows: val data = ... data.partitionB
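The kind of key-dependent writer described above could be sketched roughly as follows. This is a minimal, Flink-free illustration, not an existing Flink class: `KeyedCsvWriter`, its method names, the `date=.../hour=...` directory layout, and the `part-N.csv` file naming are all assumptions made for the example. The `open`/`writeRecord`/`close` lifecycle is chosen to resemble that of Flink's `OutputFormat` interface, so the same logic could be moved into a custom output format.

```scala
import java.io.{BufferedWriter, Closeable}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.collection.mutable

// Hypothetical helper (not part of Flink): routes each record to a file
// chosen by its (date, hour) key, using a Hive-style directory layout.
class KeyedCsvWriter(basePath: Path) extends Closeable {
  // One open writer per key seen so far by this task.
  private val writers = mutable.Map.empty[(String, Int), BufferedWriter]
  // Index of the parallel task; keeps tasks from writing to the same file.
  private var task = 0

  def open(taskNumber: Int): Unit = {
    task = taskNumber
  }

  // Appends one CSV line to the file for (date, hour).
  // Fields are joined as-is; no quoting or escaping is applied.
  def writeRecord(date: String, hour: Int, fields: Seq[String]): Unit = {
    val writer = writers.getOrElseUpdate((date, hour), {
      val dir = basePath.resolve(s"date=$date").resolve(s"hour=$hour")
      Files.createDirectories(dir)
      Files.newBufferedWriter(dir.resolve(s"part-$task.csv"), StandardCharsets.UTF_8)
    })
    writer.write(fields.mkString(","))
    writer.newLine()
  }

  // Flushes and closes every per-key file.
  override def close(): Unit = {
    writers.values.foreach(_.close())
    writers.clear()
  }
}
```

Keeping one writer per key per task means the number of open files grows with the number of distinct keys a task sees, so partitioning the data by the same key first keeps that number small.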

Re: writeAsCSV with partitionBy

2016-02-16 Thread Srikanth
Fabian, Not sure if we are on the same page. If I do something like the code below, it will group by field 0 and each task will write a separate part file in parallel. val sink = data1.join(data2) .where(1).equalTo(0) { ((l,r) => ( l._3, r._3) ) } .partitionByHash(0) .writeAsCsv(pathBa
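The point about one part file per task, rather than one file per key, can be illustrated with a small Flink-free sketch. `HashPartitionSketch` and `partitionFor` are hypothetical names, and the modulo-of-`hashCode` assignment is only a stand-in: Flink's actual hash partitioner uses a different function, but the consequence is the same whenever there are more distinct keys than parallel tasks.

```scala
object HashPartitionSketch {
  // Hypothetical stand-in for a hash partitioner: maps a key to one of
  // `parallelism` tasks. Flink's real hash function differs.
  def partitionFor(key: String, parallelism: Int): Int =
    Math.floorMod(key.hashCode, parallelism)

  // Groups keys by the task (and therefore the part file) they land in.
  def keysPerTask(keys: Seq[String], parallelism: Int): Map[Int, Set[String]] =
    keys.groupBy(partitionFor(_, parallelism)).map { case (task, ks) => task -> ks.toSet }
}
```

With 24 hourly keys and a parallelism of 4, for example, at least one task must receive several keys, so its part file mixes records from different hours.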

Re: writeAsCSV with partitionBy

2016-02-15 Thread Fabian Hueske
Hi Srikanth, DataSet.partitionBy() will partition the data on the declared partition fields. If you append a DataSink with the same parallelism as the partition operator, the data will be written out with the defined partitioning. It should be possible to achieve the behavior you described using D

writeAsCSV with partitionBy

2016-02-12 Thread Srikanth
Hello, Is there a Hive (or Spark DataFrame) partitionBy equivalent in Flink? I'm looking to save output as CSV files partitioned by two columns (date and hour). The partitionBy DataSet API is more for partitioning the data based on a column for further processing. I'm thinking there is no direct