Re: Pico WordCount

2016-12-07 Thread Jean-Baptiste Onofré
Thanks, Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com -- Neelesh Srinivas Salian Customer Operations Engineer -- Jean-Baptiste Onofré jbono...@apache.org http://b

Re: Is there something like HiveIO planned?

2016-11-24 Thread Jean-Baptiste Onofré
about the Jira creation timing. Btw, I only started playing around with Beam, but if there is any way I could help, I would love to contribute to this. Regards, Tim On Fri, Nov 25, 2016 at 3:31 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Hi Tim, We have a HBaseIO and HiveIO i

Re: Is there something like HiveIO planned?

2016-11-24 Thread Jean-Baptiste Onofré
the files directly from HDFS? Regards, Tim -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Reply: how to use key-value storage like redis with PCollection?

2016-11-24 Thread Jean-Baptiste Onofré
-- From: Jean-Baptiste Onofré <j...@nanthrax.net> Sent: Tuesday, November 22, 2016, 03:29 To: user <user@beam.incubator.apache.org> Subject: Re: how to use key-value storage like redis with PCollection? Hi Amir, I'm working on MqttIO right now,

Re: Beam on Kubernetes

2016-11-22 Thread Jean-Baptiste Onofré
with Kubernetes (yet..) but as it becomes popular I'm wondering if/how it could converge with Beam. Amit. On Mon, Nov 21, 2016 at 9:29 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Hi Ben, I did examples with Mesos and Marathon, but not yet wit

ApacheCon: Apache Beam BoF and Beam dinner

2016-11-14 Thread Jean-Baptiste Onofré
at 10:00 PM, Neelesh Salian <nsal...@cloudera.com> wrote: Anyone here today? On Nov 11, 2016 12:01 PM, "Stephan Ewen" <se...@apache.org> wrote: I'll also be in Sevilla Monday and Tuesday morning and happy to meet. Stephan On Fri, Nov 11, 2016 at 11:55 AM, Jean-Baptiste Ono

Re: Has TextIO.Read.named been removed?

2016-11-12 Thread Jean-Baptiste Onofré
the documentation does not match the current Java API; TextIO.Read.named seems to have been removed? -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: PCollection to PCollection Conversion

2016-11-07 Thread Jean-Baptiste Onofré
V (or list types) and put a specific String delimiter. The new transform would take in any type in a PCollection and make it a PCollection using a specific String delimiter. Thanks, Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Strata+Hadoop World

2016-09-29 Thread Jean-Baptiste Onofré
Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-19 Thread Jean-Baptiste Onofré
ataradiant/beam/examples/StreamWordCount.java#L140 Best, On Sep 18, 2016, at 1:54 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Hi Emanuele You have to use a window to create a bounded collection from an unbounded source. Regards JB On Sep 18, 2016, at 21:04, Emanuele Cesena

Re: [INFO] Apache Beam wikipedia page

2016-08-03 Thread Jean-Baptiste Onofré
fights abuse in an almost ‘maniac’ way, the article was marked as candidate for deletion so please feel free to help me improve it and discuss in the talk section the reasons to contest the deletion. -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Introduction to Apache Beam at Shanghai Big Data Streaming 3rd Meetup

2016-07-02 Thread Jean-Baptiste Onofré
et companies are investigating using Beam. Thanks, Manu Zhang -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: A quick demo of Apache Beam with Docker

2016-06-22 Thread Jean-Baptiste Onofré
k https://github.com/ecesena/beam-starter Any feedback is more than welcome! Best, E. -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: KafkaIO Producer Closing

2016-06-22 Thread Jean-Baptiste Onofré
/metrics.sample.window.ms> = 3 partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner send.buffer.bytes = 131072 linger.ms = 0 2016-06-22 11:06:20,743 INFO AppInfoParser:82 - Kafka version : 0.9.0.1 2016-06-22 11:06:20,744 INFO AppInfoParser:83 - Kafka commitId : 23c69d62a0cabf06 2016-06-22 11:06:20,849 INFO KafkaProducer:613 - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. Thanks, Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: KafkaIO Producer Closing

2016-06-22 Thread Jean-Baptiste Onofré
= null ssl.keymanager.algorithm = SunX509 metrics.sample.window.ms = 3 partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner send.buffer.bytes = 131072 linger.ms = 0 2016-06-22 11:06:20,743 INFO AppInfoParser:82 - Kafka version : 0.9.0.1 2016-06-22 11:06:20,744 INFO AppInfoParser:83 - Kafka commitId : 23c69d62a0cabf06 2016-06-22 11:06:20,849 INFO KafkaProducer:613 - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. Thanks, Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Multi-threading implementation equivalence in Beam

2016-06-20 Thread Jean-Baptiste Onofré
:-) Is there a pipeline parallelism/threading model provided by Beam that equates to the multi-threading model in Java, for instance? Any examples if so? Thanks again, Amir -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: How to follow up a write with another ParDo on another PCollection

2016-06-15 Thread Jean-Baptiste Onofré
DeleteInputFiles - will not be run. Is this true in general? Thanks, Frank -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: KafkaIO Writer

2016-06-07 Thread Jean-Baptiste Onofré
refactoring the class to Read and Write like TextIO does. Thanks, Jesse -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: java.lang.UnsupportedOperationException with Kryo

2016-06-03 Thread Jean-Baptiste Onofré
at >>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:349) >>> ... 3 more >>> Caused by: java.lang.UnsupportedOperationException >>> at >>> java.util.Collections$UnmodifiableCollection.add(Collections.java:1055) >>> at >>> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116) >>> at >>> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22) >>> at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679) >>> at >>> com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) >>> >>> >>> I am not able to avoid this "Kryo" Exception, Thanks for any help. >>> >>> Thanks >>> >>> Viswadeep. >>> > > -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Could someone share working bootstrap project with Spark runner configured?

2016-06-02 Thread Jean-Baptiste Onofré
y serve as init: https://github.com/orian/cogroup-wrong-grouping Could someone modify it or share some working example (outside of Beam repo). Cheers, Pawel -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: TextIO with .withoutSharding() writing only one shard of data and ignoring the rest

2016-05-31 Thread Jean-Baptiste Onofré
Yes, just tested, it happens only with the Flink runner. Agree to create a Jira. Regards JB On 06/01/2016 03:41 AM, Davor Bonaci wrote: This will be a runner-specific issue. It would be best to file a JIRA issue for this. On Tue, May 31, 2016 at 9:46 AM, Jean-Baptiste Onofré <

Re: TextIO with .withoutSharding() writing only one shard of data and ignoring the rest

2016-05-31 Thread Jean-Baptiste Onofré
")*.withoutSharding()*); Result: Only part of the data is written to the file. After comparing with the sharded output, it seems to be just one of the shard files. Cheers, Pawel -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Items not grouped correctly - CoGroupByKey - FlinkPipelineRunner

2016-05-31 Thread Jean-Baptiste Onofré
But using Tuple2.of() was just easier than: KeyDay.newBuilder().setKey(...).setTimestampUsec(...).build(). // The original description

Re: [ANNOUNCE] Beam-related talks this week, in and around Strata+Hadoop World

2016-05-30 Thread Jean-Baptiste Onofré
at Google: http://www.meetup.com/Big-things-are-happening-here/events/231274864/ If you're in London, and want some sun - you're welcome! @JB - nice going, you got a conference named after you *JB*CNConf ;) Thanks, Amit On Mon, May 30, 2016 at 9:22 PM Jean-Baptiste Onofré <j...@nanthrax.

Re: [ANNOUNCE] Beam-related talks this week, in and around Strata+Hadoop World

2016-05-30 Thread Jean-Baptiste Onofré
big-data-eu/public/schedule/detail/49592) These are just the talks by folks adjacent to me, so if anyone else is giving a talk or knows about one, please do add to this thread. See you in London! Kenn -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Parallelism

2016-05-24 Thread Jean-Baptiste Onofré
wever, all transforms after a GroupByKey execute in parallel based on the number of available keys. On Tue, May 24, 2016 at 7:43 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Hi David, if you use the InProcessPipelineRunner (the "new"

Re: Parallelism

2016-05-24 Thread Jean-Baptiste Onofré
ne process in parallel is to use SparkPipelineRunner/ FlinkPipelineRunner (meaning I have to use Apache Beam + Spark/ Flink) or make use of Google Cloud Platform? Thanks [1]. https://cloud.google.com/dataflow/pipelines/specifying-exec-params#setting-other-cloud-pipeline-options -- Jean-Baptiste On

Re: Force pipe executions to run on same node

2016-05-21 Thread Jean-Baptiste Onofré
this can be controlled by defining groups (http://docs.spring.io/spring-xd/docs/1.2.0.RELEASE/reference/html/#deployment) and then specifying deployment criteria to match this group. Is this possible with Beam? Best Ben -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend

Re: Problem with Pipeline In Flink Runner

2016-05-20 Thread Jean-Baptiste Onofré
: >>> > appName: TestQ08Task >>> > filesToStage: ... >>> > flinkMaster: [auto] >>> > parallelism: 1 >>> > runner: class org.apache.beam.runners.flink.FlinkPipelineRunner >>> > stableUniqueNames: WARN

Re: Problem with Pipeline In Flink Runner

2016-05-20 Thread Jean-Baptiste Onofré
option to define if it is in batch or stream mode is going to stay for long, can't this be inferred somehow? Thanks, -Ismaël -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Threads and ThreadLocal variables in Dataflow

2016-05-11 Thread Jean-Baptiste Onofré
ateReader(PipelineOptions pipelineOptions) throws IOException { return new SearchIndexReader(this); } @Override public void validate() { } @Override public Coder getDefaultOutputCoder() { return DocumentCoder.of(); } } -- Jean-Baptiste On

Re: KafkaIO Usage & Sample Code

2016-05-02 Thread Jean-Baptiste Onofré
It makes sense. Thanks Dan! Regards JB On 05/02/2016 09:00 AM, Dan Halperin wrote: On Sun, May 1, 2016 at 10:29 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Oh, thanks Frances. I mixed DirectPipelineRunner ("

Re: Questions on Beam Pipeline management, monitoring

2016-04-25 Thread Jean-Baptiste Onofré
-- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: subscribe

2016-03-23 Thread Jean-Baptiste Onofré
e.org>. On Tue, Mar 22, 2016 at 3:55 PM, Dongjoon Hyun <dongj...@apache.org> wrote: -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com

Re: Output from Beam (on Flink) to Kafka

2016-03-21 Thread Jean-Baptiste Onofré
ing that it would be wrapped with an UnboundedFlinkSink, or some such, but that doesn’t seem to exist. Any advice or thoughts on what I’m trying to do? I’m running the latest incubator-beam (as of last night from Github), Flink 1.0.0 in cluster mode and Kafka 0.9.0.1, all on Google Compute Engine