Flink packaging makes life hard for SBT fat jars

2016-02-12 Thread shikhar
Repro at https://github.com/shikhar/flink-sbt-fatjar-troubles, run `sbt assembly`. A fat jar seems like the best way to provide jobs for Flink to execute. I am declaring deps like: "org.apache.flink" %% "flink-clients" % "1.0-SNAPSHOT" % "provided" "org.apache.flink" %% "flink-streaming
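A minimal build.sbt sketch of the pattern the message describes: Flink core marked `"provided"` so sbt-assembly excludes it from the fat jar (the cluster supplies it at runtime), plus a typical merge strategy for duplicate files from transitive dependencies. Versions and the exact artifact list are assumptions; check your Flink release.

```scala
// build.sbt sketch (assumed versions; the cluster provides Flink itself)
libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-clients"         % "1.0-SNAPSHOT" % "provided",
  "org.apache.flink" %% "flink-streaming-scala" % "1.0-SNAPSHOT" % "provided"
)

// common sbt-assembly merge strategy for clashing META-INF entries
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```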

Re: streaming using DeserializationSchema

2016-02-12 Thread Nick Dimiduk
My input file contains newline-delimited JSON records, one per text line. The records on the Kafka topic are JSON blobs encoded as UTF-8 and written as bytes. On Fri, Feb 12, 2016 at 1:41 PM, Martin Neumann wrote: > I'm trying the same thing now. > > I guess you need to read the file as byte arra

Re: streaming using DeserializationSchema

2016-02-12 Thread Martin Neumann
I'm trying the same thing now. I guess you need to read the file as byte arrays somehow to make it work. What read function did you use? The mapper is not hard to write but the byte array stuff gives me a headache. cheers Martin On Fri, Feb 12, 2016 at 9:12 PM, Nick Dimiduk wrote: > Hi Mart

Re: streaming using DeserializationSchema

2016-02-12 Thread Nick Dimiduk
Hi Martin, I have the same use case. I wanted to be able to load from dumps of data in the same format as is on the Kafka queue. I created a new application main; call it the "job" instead of the "flow". I refactored my code a bit for building the flow so all of that can be reused via factory methods.
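A self-contained sketch of the idea discussed in this thread: re-encode each text line as the UTF-8 bytes it would carry on the topic, then run it through the same byte-array deserializer used for the Kafka stream. `RecordDeserializer` here is a hypothetical stand-in for Flink's `DeserializationSchema`; all names are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class FileAsKafkaSource {
    // Stand-in for Flink's DeserializationSchema#deserialize(byte[])
    interface RecordDeserializer<T> {
        T deserialize(byte[] message);
    }

    // Encode each line exactly as it would appear on the topic (UTF-8 bytes),
    // then reuse the Kafka-side deserializer unchanged.
    static <T> List<T> readLines(List<String> lines, RecordDeserializer<T> schema) {
        return lines.stream()
                .map(l -> schema.deserialize(l.getBytes(StandardCharsets.UTF_8)))
                .collect(Collectors.toList());
    }
}
```

In a real job the same schema instance would back both the file-based "job" main and the Kafka-based "flow" main, so the conversion code is written once.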

writeAsCSV with partitionBy

2016-02-12 Thread Srikanth
Hello, Is there a Hive (or Spark DataFrame) partitionBy equivalent in Flink? I'm looking to save output as CSV files partitioned by two columns (date and hour). The partitionBy DataSet API is more about partitioning the data on a column for further processing. I'm thinking there is no direct
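A plain-Java sketch of what a Hive-style output partitioning would do: bucket rows by (date, hour) and derive a `date=…/hour=…` path per bucket. The record layout (date in column 0, hour in column 1) is an assumption; in Flink one would key the dataset on these columns and give each group its own output path, since `writeAsCsv` itself has no partitionBy.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionedCsvSink {
    // Hive-style partition path, e.g. out/date=2016-02-12/hour=05
    static String partitionPath(String base, String date, int hour) {
        return String.format("%s/date=%s/hour=%02d", base, date, hour);
    }

    // Group CSV rows by their target partition path.
    static Map<String, List<String>> bucket(List<String[]> rows, String base) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> partitionPath(base, r[0], Integer.parseInt(r[1])),
                Collectors.mapping(r -> String.join(",", r), Collectors.toList())));
    }
}
```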

Re: Master Thesis [Apache-flink paper references]

2016-02-12 Thread subash basnet
Hello Matthias, Thank you very much :) Best Regards, Subash Basnet On Fri, Feb 12, 2016 at 8:22 PM, Matthias J. Sax wrote: > You might want to check out the Stratosphere project web site: > http://stratosphere.eu/project/publications/ > > -Matthias > > On 02/12/2016 05:52 PM, subash basnet wro

Re: Master Thesis [Apache-flink paper references]

2016-02-12 Thread Matthias J. Sax
You might want to check out the Stratosphere project web site: http://stratosphere.eu/project/publications/ -Matthias On 02/12/2016 05:52 PM, subash basnet wrote: > Hello all, > > I am currently doing master's thesis on Apache-flink. It would be really > helpful to know about the reference paper

Master Thesis [Apache-flink paper references]

2016-02-12 Thread subash basnet
Hello all, I am currently doing my master's thesis on Apache Flink. It would be really helpful to know about the reference papers followed for the development/background of Flink. It would help me build solid background knowledge to analyze Flink. Currently I am reading all the related materials f

Re: How to convert List to flink DataSet

2016-02-12 Thread subash basnet
Hello Fabian, Thank you for the response, but I have been stuck on how to iterate over the DataSet, perform operations, and return a new modified DataSet, similar to a list operation, as shown below. E.g., currently I am doing the following: for (Centroid centroid : centroids.collect()) { for
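The usual fix for a collect()-and-loop pattern like the one quoted is to express the per-element work as a map function so it stays inside the DataSet and runs distributed. Below is a self-contained sketch of the inner logic (nearest-centroid lookup) as a pure function one could wrap in a Flink `MapFunction`; the `Point`/centroid shape is an assumption.

```java
import java.util.List;

public class NearestCentroid {
    record Point(double x, double y) {}

    // Pure per-element logic: index of the closest centroid by squared distance.
    // In Flink this body would live in a MapFunction, with the centroid list
    // supplied via a broadcast set rather than collect().
    static int nearest(Point p, List<Point> centroids) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < centroids.size(); i++) {
            Point c = centroids.get(i);
            double d = (p.x() - c.x()) * (p.x() - c.x())
                     + (p.y() - c.y()) * (p.y() - c.y());
            if (d < bestDist) { bestDist = d; best = i; }
        }
        return best;
    }
}
```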

Re: Merge or minus Dataset API missing

2016-02-12 Thread Flavio Pompermaier
Ah ok, I didn't know about it! Thanks Till and Fabian! On Fri, Feb 12, 2016 at 5:11 PM, Fabian Hueske wrote: > Hi Flavio, > > If I got it right, you can use a FullOuterJoin. > It will give you both elements on a match and otherwise a left or a right > element and null. > > Best, Fabian > > 2016-

Re: Merge or minus Dataset API missing

2016-02-12 Thread Fabian Hueske
Hi Flavio, If I got it right, you can use a FullOuterJoin. It will give you both elements on a match and otherwise a left or a right element and null. Best, Fabian 2016-02-12 16:48 GMT+01:00 Flavio Pompermaier : > Hi to all, > > I have a use case where I have to merge 2 datasets but I can't fin

Re: Merge or minus Dataset API missing

2016-02-12 Thread Till Rohrmann
Why don’t you simply use a fullOuterJoin to do that? Cheers, Till On Fri, Feb 12, 2016 at 4:48 PM, Flavio Pompermaier wrote: > Hi to all, > > I have a use case where I have to merge 2 datasets but I can't find a > direct dataset API to do that. > I want to execute some function when there's a

Merge or minus Dataset API missing

2016-02-12 Thread Flavio Pompermaier
Hi to all, I have a use case where I have to merge 2 datasets but I can't find a direct DataSet API to do that. I want to execute some function when there's a match and otherwise pass on the non-null element. At the moment I can do this in a fairly complicated way (I want to avoid broadcasting becaus
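A self-contained sketch of the merge semantics the thread settles on, mirroring what a keyed full outer join gives you: when a key exists on both sides apply a function, otherwise keep whichever side is present. Plain maps stand in for the two DataSets; in Flink this would be `left.fullOuterJoin(right).where(key).equalTo(key)` with the same logic in the join function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

public class MergeDatasets {
    // On a key match apply f; otherwise the single present value survives,
    // which is exactly the match/else-non-null behavior of a full outer join.
    static <K, V> Map<K, V> merge(Map<K, V> left, Map<K, V> right, BinaryOperator<V> f) {
        Map<K, V> out = new HashMap<>(left);
        right.forEach((k, v) -> out.merge(k, v, f));
        return out;
    }
}
```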

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-12 Thread Cory Monty
Thanks, Stephan. Everything is back to normal for us. Cheers, Cory On Fri, Feb 12, 2016 at 6:54 AM, Stephan Ewen wrote: > Hi Cory! > > We found the problem. There is a development fork of Flink for Stream SQL, > whose CI infrastructure accidentally also deployed snapshots and overwrote > some

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-12 Thread Stephan Ewen
Hi Cory! We found the problem. There is a development fork of Flink for Stream SQL, whose CI infrastructure accidentally also deployed snapshots and overwrote some of the proper master branch snapshots. That's why the snapshots got inconsistent. We fixed that, and newer snapshots should be online

Re: consume kafka stream with flink

2016-02-12 Thread Robert Metzger
Hi Tanguy, I would recommend to refer to the documentation of the specific Flink version you are using. This is the documentation for 1.0-SNAPSHOT: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/kafka.html and this is the doc for 0.10.x: https://ci.apache.org/pro

consume kafka stream with flink

2016-02-12 Thread Tanguy Racinet
Hello, I am currently trying to develop an algorithm for mining frequent itemsets over a data stream. I am using Kafka to generate the stream; however, I cannot manage to link Flink to Kafka. The code presented here works, but only with Flink version 0.9.1 https://github.com/dataArtisans/kafka
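Since the problem here is version mismatch, the usual first check is that the Kafka connector artifact matches the Flink version in the build. A build.sbt dependency sketch (artifact names and versions are assumptions; the connector artifact was renamed between Flink releases, so confirm against the docs for your version):

```scala
// build.sbt sketch: connector version must match the Flink core version
libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-streaming-scala"     % "1.0-SNAPSHOT",
  "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.0-SNAPSHOT"
)
```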

Re: streaming using DeserializationSchema

2016-02-12 Thread Martin Neumann
It's not only about testing; I will also need to run things against different datasets. I want to reuse as much of the code as possible to load the same data from a file instead of Kafka. Is there a simple way of loading the data from a file using the same conversion classes that I would use to tra