Oh yes, even better I’d say. Best,
> On Sep 19, 2016, at 9:48 AM, Jean-Baptiste Onofré <[email protected]> wrote: > > Hi Emanuele, > > +1 to support Unbounded sink, but also, a very convenient function would be a > Window to create a bounded collection as a subset of a unbounded collection. > > Regards > JB > > On 09/19/2016 05:59 PM, Emanuele Cesena wrote: >> Hi, >> >> This is a great insight. Is there any plan to support unbounded sink in Beam? >> >> On the temp kafka->kafka solution, this is exactly what we’re doing (and I >> wish to change). We have production stream pipelines that are kafka->kafka. >> Then we have 2 main use cases: kafka connect to dump into hive and go batch >> from there, and druid for real time reporting. >> >> However this makes prototyping really slow, and I wanted to introduce Beam >> to short cut from kafka to anywhere. >> >> Best, >> >> >>> On Sep 18, 2016, at 10:38 PM, Aljoscha Krettek <[email protected]> wrote: >>> >>> Hi, >>> right now, writing to a Beam "Sink" is only supported for bounded streams, >>> as you discovered. An unbounded stream cannot be transformed to a bounded >>> stream using a window, this will just "chunk" the stream differently but it >>> will still be unbounded. >>> >>> The options you have right now for writing are to write to your external >>> datastore using a DoFn, using KafkaIO to write to a Kafka topic or to use >>> UnboundedFlinkSink to wrap a Flink Sink for use in a Beam pipeline. The >>> latter would allow you to use, for example, BucketingSink or RollingSink >>> from Flink. I'm only mentioning UnboundedFlinkSink for completeness, I >>> would not recommend using it since your program will only work on the Flink >>> runner. The way to go, IMHO, would be to write to Kafka and then take the >>> data from there and ship it to some final location such as HDFS. >>> >>> Cheers, >>> Aljoscha >>> >>> On Sun, 18 Sep 2016 at 23:17 Emanuele Cesena <[email protected]> wrote: >>> Thanks I’ll look into it, even if it’s not really the feature I need >>> (exactly because it will stop execution). >>> >>> >>>> On Sep 18, 2016, at 2:11 PM, Chawla,Sumit <[email protected]> wrote: >>>> >>>> Hi Emanuele >>>> >>>> KafkaIO supports withMaxNumRecords(X) support which will create a bounded >>>> source from Kafka. However, the pipeline will finish once X number of >>>> records are read. >>>> >>>> Regards >>>> Sumit Chawla >>>> >>>> >>>> On Sun, Sep 18, 2016 at 2:00 PM, Emanuele Cesena <[email protected]> >>>> wrote: >>>> Hi, >>>> >>>> Thanks for the hint - I’ll debug better but I thought I did that: >>>> https://github.com/ecesena/beam-starter/blob/master/src/main/java/com/dataradiant/beam/examples/StreamWordCount.java#L140 >>>> >>>> Best, >>>> >>>> >>>>> On Sep 18, 2016, at 1:54 PM, Jean-Baptiste Onofré <[email protected]> >>>>> wrote: >>>>> >>>>> Hi Emanuele >>>>> >>>>> You have to use a window to create a bounded collection from an unbounded >>>>> source. >>>>> >>>>> Regards >>>>> JB >>>>> >>>>> On Sep 18, 2016, at 21:04, Emanuele Cesena <[email protected]> wrote: >>>>> Hi, >>>>> >>>>> I wrote a while ago about a simple example I was building to test KafkaIO: >>>>> https://github.com/ecesena/beam-starter/blob/master/src/main/java/com/dataradiant/beam/examples/StreamWordCount.java >>>>> >>>>> Issues with Flink should be fixed now, and I’m try to run the example on >>>>> master and Flink 1.1.2. >>>>> I’m currently getting: >>>>> Caused by: java.lang.IllegalArgumentException: Write can only be applied >>>>> to a Bounded PCollection >>>>> >>>>> What is the recommended way to go here? >>>>> - is there a way to create a bounded collection from an unbounded one? >>>>> - is there a plat to let TextIO write unbounded collections? >>>>> - is there another recommended “simple sink” to use? >>>>> >>>>> Thank you much! >>>>> >>>>> Best, >>>> >>>> -- >>>> Emanuele Cesena, Data Eng. >>>> http://www.shopkick.com >>>> >>>> Il corpo non ha ideali >>>> >>>> >>>> >>>> >>>> >>> >>> -- >>> Emanuele Cesena, Data Eng. >>> http://www.shopkick.com >>> >>> Il corpo non ha ideali >>> >>> >>> >>> >> > > -- > Jean-Baptiste Onofré > [email protected] > http://blog.nanthrax.net > Talend - http://www.talend.com -- Emanuele Cesena, Data Eng. http://www.shopkick.com Il corpo non ha ideali
