Re: put record to kinesis and then trying consume using flink connector

2017-04-26 Thread Sathi Chowdhury
Thanks Alex. Yes exactly so.I was actually aware of it was challenging to do it in the main method of flink, even though the push appears after the my datastream is attached to kinesis , due to lazy execution, once the stream is connected then my publish did not work. If the publish is done

Re: Data duplication on a High Availability activated cluster after a Task Manager failure recovery

2017-04-26 Thread F.Amara
Hi Gordan, Appreciate your prompt reply. Thanks alot for pointing that out that Kafka Producer has at least once guarantee of message delivery. That seems to be the reason why I encountered duplicated data on a flink failure recovery scenario. -- View this message in context:

Re: CEP join across events

2017-04-26 Thread Elias Levy
You are correct. Apologies for the confusion. Given that ctx.getEventsForPattern returns an iterator instead of a value and that the example in the documentation deals with summing multiple matches, I got the impression that the call would return all previous matches instead of one at a time for

Iterating over keys in state backend

2017-04-26 Thread Ken Krugler
Is there a way to iterate over all of the key/value entries in the state backend, from within the operator that’s making use of the same? E.g. I’ve got a ReducingState, and on a timed interval (inside of the onTimer method) I need to iterate over all KV state and emit the N “best” entries.

Re: [BUG?] Cannot Load User Class on Local Environment

2017-04-26 Thread Matt
Let's wait for Till then, I hope he can figure this out. On Wed, Apr 26, 2017 at 11:03 AM, Stefan Richter < s.rich...@data-artisans.com> wrote: > Ok, now the question is also about what classloaders Ignite is creating > and how they are used, but the relevant code line in Flink is probably in >

Re: UnilateralSortMerger error (again)

2017-04-26 Thread Flavio Pompermaier
I've created a repository with a unit test to reproduce the error at https://github.com/fpompermaier/flink-batch-bug/ blob/master/src/test/java/it/okkam/flink/aci/TestDataInputDeserializer.java (probably this error is related also to FLINK-4719). The exception is thrown only when there are null

Re: Queryable State

2017-04-26 Thread Chet Masterson
After setting the logging to DEBUG on the job manager, I learned four things: (On the message formatting below, I have the Flink logs formatted into JSON so I can import them into Kibana) 1. The appropriate key value state is registered in both parallelism = 1 and parallelism = 3 environments. In

Re: Re-keying / sub-keying a stream without repartitioning

2017-04-26 Thread Elias Levy
On Wed, Apr 26, 2017 at 5:11 AM, Aljoscha Krettek wrote: > I did spend some time thinking about this and we had the idea for a while > now to add an operation like “keyByWithoutPartitioning()” (name not final > ;-) that would allow the user to tell the system that we don’t

Last chance: ApacheCon is just three weeks away

2017-04-26 Thread Rich Bowen
ApacheCon is just three weeks away, in Miami, Florida, May 15th - 18th. http://apachecon.com/ There's still time to register and attend. ApacheCon is the best place to find out about tomorrow's software, today. ApacheCon is the official convention of The Apache Software Foundation, and includes

Re: Flink docs in regards to State

2017-04-26 Thread Sand Stone
To be clear, I like the direction of Flink is going with State: Querytable State, MapState etc. MapState in particular is a great feature and I am trying to find more documentation and/or usage patterns with it before I dive into the deep end of the code. As far as I can tell, the key in MapState

Re: Kafka 0.10 jaas multiple clients

2017-04-26 Thread Timo Walther
Hi Gwenhael, I'm not a Kafka expert but if something is hardcoded that should not, it might be worth opening an issue for it. I loop in somebody who might knows more your problem. Timo Am 26/04/17 um 14:47 schrieb Gwenhael Pasquiers: Hello, Up to now we’ve been using kafka with jaas

Re: Queryable State

2017-04-26 Thread Chet Masterson
Ok...more information. 1. Built a fresh cluster from the ground up. Started testing queryable state at each step.2. When running under any configuration of task managers and job managers were parallelism = 1, the queries execute as expected.3. As soon as I cross over to parallelism = 3 with 3 task

Re: Flink docs in regards to State

2017-04-26 Thread Timo Walther
Hi, you are right. There are some limitation about RichReduceFunctions on windows. Maybe the new AggregateFunction `window.aggregate()` could solve your problem, you can provide an accumulator which is your custom state that you can update for each record. I couldn't find a documentation

Re: UnilateralSortMerger error (again)

2017-04-26 Thread Flavio Pompermaier
After digging into the code and test I think that the problem is almost certainly in the UnilateralSortMerger, there should be a missing synchronization on some shared object somewhere...Right now I'm trying to understand if this section of code creates some shared object (like queues) that are

Re: Multiple consumers on a subpartition

2017-04-26 Thread Albert Jonathan
Thank you for your responses and suggestions. I appreciate it. Albert On Wed, Apr 26, 2017 at 4:19 AM, Ufuk Celebi wrote: > Adding to what Zhijiang said: I think the way to go would be to create > multiple "read views" over the pipelined subpartition. You would have > to make

Re: [BUG?] Cannot Load User Class on Local Environment

2017-04-26 Thread Stefan Richter
Ok, now the question is also about what classloaders Ignite is creating and how they are used, but the relevant code line in Flink is probably in FlinkMiniCluster.scala, line 538 (current master): try { JobClient.submitJobAndWait( clientActorSystem, configuration,

Queries regarding Historical Reprocessing

2017-04-26 Thread vinay patil
Hi Guys, For historical reprocessing , I am reading the avro data from S3 and passing these records to the same pipeline for processing. I have the following queries: 1. I am running this pipeline as a stream application with checkpointing enabled, the records are successfully written to S3,

Kafka 0.10 jaas multiple clients

2017-04-26 Thread Gwenhael Pasquiers
Hello, Up to now we’ve been using kafka with jaas (plain login/password) the following way: - yarnship the jaas file - add the jaas file name into “flink-conf.yaml” using property “env.java.opts” How to support multiple secured kafka 0.10 consumers and producers (with

Lost connection to task manager

2017-04-26 Thread 猎豹移动 李木柯
I run the wordcount example , input data size is 10.9G command: ./bin/flink run -m yarn-cluster -yn 45 -yjm 2048 -ytm 2048 ./examples/batch/WordCount.jar --input /path --output /path1 and finally it throws exceptions as follows Can anyone give me some help?Thanks Caused by:

Re: Re-keying / sub-keying a stream without repartitioning

2017-04-26 Thread Aljoscha Krettek
Hi Elias, sorry for the delay, this must have fallen under the table after Flink Forward. I did spend some time thinking about this and we had the idea for a while now to add an operation like “keyByWithoutPartitioning()” (name not final ;-) that would allow the user to tell the system that we

Re: Why flink 1.2.0 delete flink-connector-redis?

2017-04-26 Thread Timo Walther
Hi, the Flink community decided to have the most important connectors (e.g. Kafka) in the core repository. All other connectors are in Apache Bahir (http://bahir.apache.org/). You can find the flink-connector-redis there. Timo Am 26/04/17 um 12:54 schrieb yunfan123: It exists in 1.1.5.

Why flink 1.2.0 delete flink-connector-redis?

2017-04-26 Thread yunfan123
It exists in 1.1.5. But be deleted in 1.2.0. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-flink-1-2-0-delete-flink-connector-redis-tp12825.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: Multiple consumers on a subpartition

2017-04-26 Thread Ufuk Celebi
Adding to what Zhijiang said: I think the way to go would be to create multiple "read views" over the pipelined subpartition. You would have to make sure that the initial reference count of the partition buffers is incremented accordingly. The producer will be back pressured by both consumers now.

Re: CEP join across events

2017-04-26 Thread Kostas Kloudas
Hi Elias, If I understand correctly your use case, you want for an input: event_1 = (type=1, value_a=K, value_b=X) event_2 = (type=2, value_a=K, value_b=X) event_3 = (type=1, value_a=K, value_b=Y) to get a match: event_1, event_2 and discard event_3, right? In this case, Dawid is correct and

RE: inconsistent behaviour in GenericCsvInputFormat

2017-04-26 Thread JAVIER RODRIGUEZ BENITO
Hi Kurt, I´m using versión 1.2.0. Best, Javi De: Kurt Young [mailto:ykt...@gmail.com] Enviado el: viernes, 21 de abril de 2017 2:56 Para: JAVIER RODRIGUEZ BENITO CC: user@flink.apache.org Asunto: Re: inconsistent behaviour in GenericCsvInputFormat Hi,

Re: about yarn

2017-04-26 Thread lining jing
Hi, All Thanks, JinKui Shi, have told the answer, just add -d is ok 2017-04-26 14:16 GMT+08:00 lining jing : > Hi, all > > I use /bin/flink run -m yarn-cluster > commit my flink job. But, after this, the process which name is CliFrontend > is running. After a duration,

Re: CEP join across events

2017-04-26 Thread Dawid Wysakowicz
Hi Elias, You can do it with 1.3 and IterativeConditions. Method ctx.getEventsForPattern("foo") returns only those events that were matched in "foo" pattern in that particular branch. I mean that for a sequence like (type =1, value_b = X); (type=1, value_b=Y); (type=2, value_b=X) both events of

about yarn

2017-04-26 Thread lining jing
Hi, all I use /bin/flink run -m yarn-cluster commit my flink job. But, after this, the process which name is CliFrontend is running. After a duration, there are many CliFrontend run in my computer which is no need. Has any good idea soft this solution. Thanks!