Re: Enriching a tuple mapped from a datastream with data coming from a JDBC source

2016-10-03 Thread Philipp Bussche
Hi again,
I implemented the RichMapFunction (the open method runs a JDBC query to
populate a HashMap with data), which I am using in the map function.
Now there is another RichMapFunction whose map method would add to the HashMap
that was initialized in the first function.
How would I share the map between the two functions (I am using the
DataStream API)?
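
For reference, the first function looks roughly like the sketch below (simplified; the JDBC URL, query, table and tuple types are placeholders, not my actual code) - open() loads the map via JDBC and map() uses it:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class EnrichmentMapper extends RichMapFunction<Tuple2<String, Integer>, Tuple2<String, String>> {

    private transient Map<String, String> referenceData;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Runs once per parallel instance: load the reference data via JDBC.
        referenceData = new HashMap<>();
        try (Connection conn = DriverManager.getConnection("jdbc:mysql://host:3306/db", "user", "pwd");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM reference_table")) {
            while (rs.next()) {
                referenceData.put(rs.getString("id"), rs.getString("name"));
            }
        }
    }

    @Override
    public Tuple2<String, String> map(Tuple2<String, Integer> value) throws Exception {
        // Enrich the incoming tuple with the value looked up in the map.
        return new Tuple2<>(value.f0, referenceData.get(value.f0));
    }
}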

Thanks
Philipp





Re: Exception while running Flink jobs (1.0.0)

2016-10-03 Thread Flavio Pompermaier
I think you're running into the same exception I face sometimes. I've
opened a JIRA for it [1]. Could you please try to apply that patch and see
if things get better?

https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-4719

Best,
Flavio

On 3 Oct 2016 22:09, "Tarandeep Singh"  wrote:

> Now, when I ran it again (with lower task slots per machine) I got a
> different error-
>
> org.apache.flink.client.program.ProgramInvocationException: The program
> execution failed: Job execution failed.

Re: Exception while running Flink jobs (1.0.0)

2016-10-03 Thread Tarandeep Singh
Now, when I ran it again (with lower task slots per machine) I got a
different error-

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
at org.apache.flink.client.program.Client.runBlocking(Client.java:381)
at org.apache.flink.client.program.Client.runBlocking(Client.java:355)
at org.apache.flink.client.program.Client.runBlocking(Client.java:315)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:60)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:855)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
at org.apache.flink.client.program.Client.runBlocking(Client.java:248)
at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:866)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:333)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1189)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1239)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:714)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:660)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:660)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: javaec40-d994-yteBuffer
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:641)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:752)
at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:228)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:431)
at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:124)
at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:65)
at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34)
at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
at org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:101)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: javaec40-d994-yteBuffer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
... 15 more


-Tarandeep

On Mon, Oct 3, 2016 at 12:49 PM, Tarandeep Singh 
wrote:

> Hi,
>
> I am using flink-1.0.0 and running ETL 

Exception while running Flink jobs (1.0.0)

2016-10-03 Thread Tarandeep Singh
Hi,

I have been using flink-1.0.0 and running ETL (batch) jobs on it for quite some
time (a few months) without any problems. Starting this morning, I have been
getting errors like these:

"Received an event in channel 3 while still having data from a record. This
indicates broken serialization logic. If you are using custom serialization
code (Writable or Value types), check their serialization routines. In the
case of Kryo, check the respective Kryo serializer."

My datasets are in Avro format. The only thing that changed today is that I
moved to a smaller cluster. When I first ran the ETL jobs, they failed with
this error-

"Insufficient number of network buffers: required 20, but only 10
available. The total number of network buffers is currently set to 2.
You can increase this number by setting the configuration key
'taskmanager.network.numberOfBuffers'"

I increased the number of network buffers to 30k and also decreased the number
of slots per machine, as I noticed the load per machine was too high. After that,
when I restarted the jobs, I started getting the above error.

Can someone please help me debug it?

Thank you,
Tarandeep


Re: Regarding Late Elements

2016-10-03 Thread lgfmt
Not yet.
I'm hoping a Flink expert on this mailing list will reply.

- LF



  From: vinay patil 
 To: user@flink.apache.org 
 Sent: Monday, October 3, 2016 8:09 AM
 Subject: Re: Regarding Late Elements
   
Hi LF,
So did you manage to find a workaround for it?

I am using a custom Trigger which is similar to the one in 1.0.3, with a few changes.

Regards,
Vinay Patil
On Mon, Oct 3, 2016 at 10:02 AM, lgfmt [via Apache Flink User Mailing List 
archive.] <[hidden email]> wrote:

 We have the same requirement - we cannot discard any data even if it arrives 
late. 
- LF
 



  From: Vinay Patil <[hidden email]>
 To: [hidden email] 
 Sent: Sunday, October 2, 2016 8:21 PM
 Subject: Regarding Late Elements
  
Hi Guys,
Just wanted to get an idea on why Flink decided to completely discard late
elements in the latest version? This was not the case in 1.0.3.

P.S. In our case the data is critical, so we cannot discard a single record even
if it is late. I have written a custom trigger (as suggested by Aljoscha) to
even accept late elements.

Regards,
Vinay Patil


 

 


   

Re: Regarding Late Elements

2016-10-03 Thread vinay patil
Hi LF,

So did you manage to find a workaround for it?

I am using a custom Trigger which is similar to the one in 1.0.3, with a few changes.
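
For context, the trigger is conceptually along the lines of the sketch below (simplified, assuming event-time TimeWindows; not the exact code I am running, and whether a late element even reaches the trigger still depends on the window operator's lateness handling):

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

// Behaves like the event-time trigger, but FIREs for elements that arrive after
// the watermark has passed the end of the window instead of ignoring them.
public class EventTimeTriggerWithLateFiring extends Trigger<Object, TimeWindow> {

    @Override
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        if (window.maxTimestamp() <= ctx.getCurrentWatermark()) {
            // Late element: fire again for it rather than dropping it.
            return TriggerResult.FIRE;
        }
        ctx.registerEventTimeTimer(window.maxTimestamp());
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.deleteEventTimeTimer(window.maxTimestamp());
    }
}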

Regards,
Vinay Patil

On Mon, Oct 3, 2016 at 10:02 AM, lgfmt [via Apache Flink User Mailing List
archive.]  wrote:

> We have the same requirement - we cannot discard any data even if it
> arrives late.
>
>
> - LF
>
>
>
>
> --
> *From:* Vinay Patil <[hidden email]
> >
> *To:* [hidden email] 
> *Sent:* Sunday, October 2, 2016 8:21 PM
> *Subject:* Regarding Late Elements
>
> Hi Guys,
>
> Just wanted to get an idea on Why Flink decided to completely discard late
> elements in the latest version ?, this was not the case in 1.0.3
>
>
> P.S In our case the data is critical so we cannot discard a single record
> even if it is late, I have written a custom trigger (as suggested by
> Aljoscha) to even accept late elements.
>
>
> Regards,
> Vinay Patil
>
>
>
>





Re: Using Flink

2016-10-03 Thread Govindarajan Srinivasaraghavan
Hi Gordon,

- I'm currently running my flink program on 1.2 SNAPSHOT with kafka source
and I have checkpoint enabled. When I look at the consumer offsets in kafka
it appears to be stagnant and there is a huge lag. But I can see my flink
program is in pace with kafka source in JMX metrics and outputs. Is there a
way to identify why the offsets are not committed to kafka?

On which commit was your Kafka connector built? There was a recent change
to the offset committing for Kafka 0.9 consumer, so identifying the exact
commit will help clarify whether the recent change introduced any new
problems. Also, what is your configured checkpoint interval? When
checkpointing is enabled, the Kafka consumer only commits to Kafka when
checkpoints are completed. So, offsets in Kafka are not updated until the
next checkpoint is triggered.

I am using 1.2-SNAPSHOT
'org.apache.flink', name: 'flink-connector-kafka-base_2.11', version:
'1.2-SNAPSHOT'
'org.apache.flink', name: 'flink-connector-kafka-0.9_2.11', version:
'1.2-SNAPSHOT'

- In my current application we use custom loggers for debugging purposes. Let’s
say we want to find what’s happening for a particular user, we fire an api
request to add the custom logger for that particular user and use it for
logging along the data path. Is there a way to achieve this in flink? Are
there any global mutable parameters that I can use to achieve this
functionality?

I’m not sure if I understand the use case correctly, but it seems like you
will need to change configuration / behaviour of a specific Flink operator
at runtime. To my knowledge, the best way to do this in Flink right now is
to translate your original logging-trigger api requests to a stream of
events fed to Flink. This stream of events will then basically be changes
of your user logger behaviour, and your operators can change its logging
behaviour according to this stream.

I can send the changes as streams, but I need this change for all the
operators in my pipeline. Instead of using coflatmap at each operator to
combine the streams, is there a way to send a change to all the operators?

- Can I pass on state between operators? If I need the state stored on
previous operators, how can I fetch it?
I don’t think this is possible.
Fine, thanks.

Thanks.

On Mon, Oct 3, 2016 at 12:23 AM, Tzu-Li (Gordon) Tai 
wrote:

> Hi!
>
> - I'm currently running my flink program on 1.2 SNAPSHOT with kafka source
> and I have checkpoint enabled. When I look at the consumer offsets in kafka
> it appears to be stagnant and there is a huge lag. But I can see my flink
> program is in pace with kafka source in JMX metrics and outputs. Is there a
> way to identify why the offsets are not committed to kafka?
>
> On which commit was your Kafka connector built? There was a recent change
> to the offset committing for Kafka 0.9 consumer, so identifying the exact
> commit will help clarify whether the recent change introduced any new
> problems. Also, what is your configured checkpoint interval? When
> checkpointing is enabled, the Kafka consumer only commits to Kafka when
> checkpoints are completed. So, offsets in Kafka are not updated until the
> next checkpoint is triggered.
>
> - In my current application we custom loggers for debugging purposes.
> Let’s say we want to find what’s happening for a particular user, we fire
> an api request to add the custom logger for that particular user and use it
> for logging along the data path. Is there a way to achieve this in flink?
> Are there any global mutable parameters that I can use to achieve this
> functionality?
>
> I’m not sure if I understand the use case correctly, but it seems like you
> will need to change configuration / behaviour of a specific Flink operator
> at runtime. To my knowledge, the best way to do this in Flink right now is
> to translate your original logging-trigger api requests to a stream of
> events fed to Flink. This stream of events will then basically be changes
> of your user logger behaviour, and your operators can change its logging
> behaviour according to this stream.
>
> - Can I pass on state between operators? If I need the state stored on
> previous operators, how can I fetch it?
>
> I don’t think this is possible.
>
>
> Best Regards,
> Gordon
>
>
> On October 3, 2016 at 2:08:31 PM, Govindarajan Srinivasaraghavan (
> govindragh...@gmail.com) wrote:
>
> Hi,
>
>
>
> I have few questions on how I need to model my use case in flink. Please
> advise. Thanks for the help.
>
>
>
> - I'm currently running my flink program on 1.2 SNAPSHOT with kafka source
> and I have checkpoint enabled. When I look at the consumer offsets in kafka
> it appears to be stagnant and there is a huge lag. But I can see my flink
> program is in pace with kafka source in JMX metrics and outputs. Is there a
> way to identify why the offsets are not committed to kafka?
>
>
>
> - In my current application we custom loggers for debugging purposes.
> Let’s say we want to find 

Re: Regarding Late Elements

2016-10-03 Thread lgfmt
We have the same requirement - we cannot discard any data even if it arrives 
late. 
- LF
 



  From: Vinay Patil 
 To: user@flink.apache.org 
 Sent: Sunday, October 2, 2016 8:21 PM
 Subject: Regarding Late Elements
   
Hi Guys,
Just wanted to get an idea on why Flink decided to completely discard late
elements in the latest version? This was not the case in 1.0.3.

P.S. In our case the data is critical, so we cannot discard a single record even
if it is late. I have written a custom trigger (as suggested by Aljoscha) to
even accept late elements.

Regards,
Vinay Patil

   

Re: Flink Checkpoint runs slow for low load stream

2016-10-03 Thread Chakravarthy varaga
Hi Stephan,

Is the Async kafka offset commit released in 1.3.1?

Varaga

On Wed, Sep 28, 2016 at 9:49 AM, Chakravarthy varaga <
chakravarth...@gmail.com> wrote:

> Hi Stephan,
>
>  That should be great. Let me know once the fix is done and the
> snapshot version to use, I'll check and revert then.
>  Can you also share the JIRA that tracks the issue?
>
>  With regards to offset commit issue, I'm not sure as to how to
> proceed here. Probably I'll use your fix first and see if the problem
> reoccurs.
>
> Thanks much
> Varaga
>
> On Tue, Sep 27, 2016 at 7:46 PM, Stephan Ewen  wrote:
>
>> @CVP
>>
>> Flink stores in checkpoints in your case only the Kafka offsets (few
>> bytes) and the custom state (e).
>>
>> Here is an illustration of the checkpoint and what is stored (from the
>> Flink docs).
>> https://ci.apache.org/projects/flink/flink-docs-master/
>> internals/stream_checkpointing.html
>>
>>
>> I am quite puzzled why the offset committing problem occurs only for one
>> input, and not for the other.
>> I am preparing a fix for 1.2, possibly going into 1.1.3 as well.
>> Could you try out a snapshot version to see if that fixes your problem?
>>
>> Greetings,
>> Stephan
>>
>>
>>
>> On Tue, Sep 27, 2016 at 2:24 PM, Chakravarthy varaga <
>> chakravarth...@gmail.com> wrote:
>>
>>> Hi Stefan,
>>>
>>>  Thanks a million for your detailed explanation. I appreciate it.
>>>
>>>  -  The zookeeper bundled with kafka 0.9.0.1 was used to start
>>> zookeeper. There is only 1 instance (standalone) of zookeeper running on my
>>> localhost (ubuntu 14.04)
>>>  -  There is only 1 Kafka broker (version: 0.9.0.1)
>>>
>>>  With regards to Flink cluster there's only 1 JM & 2 TMs started
>>> with no HA. I presume this does not use zookeeper anyways as it runs as
>>> standalone cluster.
>>>
>>>
>>>  BTW., The kafka connector version that I use is as suggested in the
>>> flink connectors page
>>>
>>>
>>>
>>>
>>> <dependency>
>>>   <groupId>org.apache.flink</groupId>
>>>   <artifactId>flink-connector-kafka-0.9_2.10</artifactId>
>>>   <version>1.1.1</version>
>>> </dependency>
>>>
>>>  Do you see any issues with versions?
>>>
>>>  1) Do you have benchmarks wrt., to checkpointing in flink?
>>>
>>>  2) There isn't detailed explanation on what states are stored as
>>> part of the checkpointing process. For ex.,  If I have pipeline like
>>> source -> map -> keyBy -> map -> sink, my assumption on what's stored
>>> is:
>>>
>>> a) The source stream's custom watermarked records
>>> b) Intermediate states of each of the transformations in the pipeline
>>> c) Delta of records stored from the previous sink
>>> d) Custom states (say ValueState as in my case) - essentially
>>> this is what I bother about storing.
>>> e) All of my operators
>>>
>>>   Is my understanding right?
>>>
>>>  3) Is there a way in Flink to checkpoint only d) as stated above
>>>
>>>  4) Can you apply checkpointing to only streams and certain
>>> operators (say I wish to store aggregated values part of the transformation)
>>>
>>> Best Regards
>>> CVP
>>>
>>>
>>> On Mon, Sep 26, 2016 at 6:18 PM, Stephan Ewen  wrote:
>>>
 Thanks, the logs were very helpful!

 TL:DR - The offset committing to ZooKeeper is very slow and prevents
 proper starting of checkpoints.

 Here is what is happening in detail:

   - Between the point when the TaskManager receives the "trigger
 checkpoint" message and when the point when the KafkaSource actually starts
 the checkpoint is a long time (many seconds) - for one of the Kafka Inputs
 (the other is fine).
  - The only way this delay can be introduced is if another
 checkpoint related operation (such as trigger() or notifyComplete() ) is
 still in progress when the checkpoint is started. Flink does not perform
 concurrent checkpoint operations on a single operator, to ease the
 concurrency model for users.
   - The operation that is still in progress must be the committing of
 the offsets (to ZooKeeper or Kafka). That also explains why this only
 happens once one side receives the first record. Before that, there is
 nothing to commit.


 What Flink should fix:
   - The KafkaConsumer should run the commit operations asynchronously,
 to not block the "notifyCheckpointComplete()" method.

 What you can fix:
   - Have a look at your Kafka/ZooKeeper setup. One Kafka Input works
 well, the other does not. Do they go against different sets of brokers, or
 different ZooKeepers? Is the metadata for one input bad?
   - In the next Flink version, you may opt-out of committing offsets to
 Kafka/ZooKeeper all together. It is not important for Flink's checkpoints
 anyways.

 Greetings,
 Stephan


 On Mon, Sep 26, 2016 at 5:13 PM, Chakravarthy varaga <
 chakravarth...@gmail.com> 

Side Inputs vs. Connected Streams

2016-10-03 Thread Sameer W
Hi,

I read the Side Inputs design document. How does it compare to using
ConnectedStreams with respect to handling the ordering of streams transparently?

One of the challenges I have with ConnectedStreams is that I need to buffer the
main input if the rules stream has not arrived yet. Does this automatically go
away with Side Inputs? Will the call to

   String sideValue = getRuntimeContext().getSideInput(filterString);

block if the side input is not available yet? And is the reverse also true?

Alternatively, if my rules are not large in number and I want to broadcast
them to all nodes, is the below equivalent to using Side Inputs, where side
inputs are broadcast to all nodes, and does it ensure that the side input is
evaluated before the main input:

DataStream ds4 = ds3.connect(dsSide.broadcast());

Will the above ensure that dsSide is always available before ds3 elements
arrive on the connected stream? Am I correct in assuming that ds2 changes
will continue to be broadcast to ds3 (with no ordering guarantees between
ds3 and dsSide, of course)?
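
To make the buffering I mentioned concrete, this is roughly the kind of CoFlatMapFunction I have in mind (a sketch only - it assumes both streams carry Strings and the rule matching is made up):

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// Buffers main-stream elements until at least one rule has arrived on the
// broadcast side; plain Java fields, not checkpointed state, in this sketch.
public class RuleFilteringFunction implements CoFlatMapFunction<String, String, String> {

    private String currentRule;                              // latest rule seen on the side input
    private final List<String> buffered = new ArrayList<>(); // main elements seen before any rule

    @Override
    public void flatMap1(String element, Collector<String> out) {
        if (currentRule == null) {
            buffered.add(element);          // no rule yet: hold the element back
        } else if (element.contains(currentRule)) {
            out.collect(element);
        }
    }

    @Override
    public void flatMap2(String rule, Collector<String> out) {
        currentRule = rule;
        for (String element : buffered) {   // flush what arrived before the first rule
            if (element.contains(rule)) {
                out.collect(element);
            }
        }
        buffered.clear();
    }
}

// Usage, assuming ds3 and dsSide are both DataStream<String>:
// DataStream<String> ds4 = ds3.connect(dsSide.broadcast()).flatMap(new RuleFilteringFunction());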


Thanks,
Sameer


Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-10-03 Thread Josh
Hi Stefan,

Sorry for the late reply - I was away last week.
I've just got round to retrying my above scenario (run my job, take a
savepoint, restore my job) using the latest Flink 1.2-SNAPSHOT  -- and am
now seeing a different exception when restoring the state:

10/03/2016 11:29:02 Job execution switched to status FAILING.
java.lang.RuntimeException: Could not deserialize NFA.
at org.apache.flink.api.java.typeutils.runtime.JavaSerializer.deserialize(JavaSerializer.java:86)
at org.apache.flink.api.java.typeutils.runtime.JavaSerializer.deserialize(JavaSerializer.java:31)
at org.apache.flink.runtime.state.DefaultOperatorStateBackend.getPartitionableState(DefaultOperatorStateBackend.java:107)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.initializeState(FlinkKafkaConsumerBase.java:323)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:105)
at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:396)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:269)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:608)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition

Any ideas what's going on here? Is the Kafka consumer state management
broken right now in Flink master?

Thanks,
Josh


On Thu, Sep 22, 2016 at 9:28 AM, Stefan Richter  wrote:

> Hi,
>
> to me, this looks like you are running into the problem described under
> [FLINK-4603] : KeyedStateBackend cannot restore user code classes. I have
> opened a pull request (PR 2533) this morning that should fix this behavior
> as soon as it is merged into master.
>
> Best,
> Stefan
>
> Am 21.09.2016 um 23:49 schrieb Josh :
>
> Hi Stephan,
>
> Thanks for the reply. I should have been a bit clearer but actually I was
> not trying to restore a 1.0 savepoint with 1.2. I ran my job on 1.2 from
> scratch (starting with no state), then took a savepoint and tried to
> restart it from the savepoint - and that's when I get this exception. If I
> do this with the same job using an older version of Flink (1.1-SNAPSHOT
> taken in June), the savepoint and restore works fine.
>
> I wanted to use 1.2-SNAPSHOT because it has some changes I'd like to use
> (improvements to Kinesis connector + the bucketing sink). Anyway for now I
> have things working with an older version of Flink - but it would be good
> to know what's changed recently that's causing the restore to break and if
> my job is not going to be compatible with future releases of Flink.
>
> Best,
> Josh
>
> On Wed, Sep 21, 2016 at 8:21 PM, Stephan Ewen  wrote:
>
>> Hi Josh!
>>
>> The state code on 1.2-SNAPSHOT is undergoing pretty heavy changes right
>> now, in order to add the elasticity feature (change parallelism or running
>> jobs and still maintaining exactly once guarantees).
>> At this point, no 1.0 savepoints can be restored in 1.2-SNAPSHOT. We will
>> try and add compatibility towards 1.1 savepoints before the release of
>> version 1.2.
>>
>> I think the exception is probably caused by the fact that old savepoint
>> stored some serialized user code (the new one is not expected to) which
>> cannot be loaded.
>>
>> Adding Aljoscha and Stefan to this, to see if they can add anything.
>> In any case, this should have a much better error message.
>>
>> I hope you understand that the 1.2-SNAPSHOT version is in some parts WIP,
>> so not really recommended for general use.
>>
>> Does version 1.1 not work for you?
>>
>> Greetings,
>> Stephan
>>
>>
>> On Wed, Sep 21, 2016 at 7:44 PM, Josh  wrote:
>>
>>> Hi,
>>>
>>> I have a Flink job which uses the RocksDBStateBackend, which has been
>>> running on a Flink 1.0 cluster.
>>>
>>> The job is written in Scala, and I previously made some changes to the
>>> job to ensure that state could be restored. For example, whenever I call
>>> `map` or `flatMap` on a DataStream, I pass a named class: `.flatMap(new
>>> MyCustomFlatMapper())` instead of an anonymous function.
>>>
>>> I've been testing the job on Flink 1.2-SNAPSHOT, and I am no longer able
>>> to restore state. I'm seeing exceptions which look like this when trying to
>>> restore from a savepoint:
>>>
>>> java.lang.RuntimeException: Could not initialize keyed state backend.
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOpera
>>> tor.open(AbstractStreamOperator.java:148)
>>> Caused by: java.lang.ClassNotFoundException:
>>> com.joshfg.flink.job.MyJob$MyCustomFlatMapper$$anon$4$$anon$2
>>> at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBa
>>> ckend$RocksDBRestoreOperation.restoreKVStateMetaData(RocksDB
>>> KeyedStateBackend.java:653)
>>>
>>> I'm not passing any anonymous functions to `map` or `flatMap` on Flink
>>> DataStreams, so it looks like this exception 

Re: How to stop job through java API

2016-10-03 Thread Dayong
Now I am able to create an instance of the Flink shell front-end (CLI) class. Once I
have this instance, I can launch commands from Java. The CLI has features to connect to a
remote resource manager to do most of the actions.

Thanks,
Will

> On Oct 3, 2016, at 4:31 AM, "LINZ, Arnaud"  wrote:
> 
> Hi,
>  
> I have a similar issue. Here is how I deal with programmatically stopping 
> permanent streaming jobs, and I’m interested in knowing if there is a better 
> way now.
>  
> Currently, I use hand-made streaming sources that periodically check for some 
> flag and end if a stop request was made.  Stopping the source allow the 
> on-going streaming jobs to finish smoothly.
> That way, by using a “live file” presence check in my sources, and creating a 
> “stopped ok file” when the streaming execution ends ok, I’m able to make  “a 
> stopping shell script” that notify the streaming chain of a user cancellation 
> by deleting the “live file” and checking for the “stopped ok file” in a 
> waiting loop before ending.
>  
> This stopping shell script allow me to automate “stop/start” scripts when I 
> need to perform some task (like a reference table refresh) that requires the 
> streaming job to be stopped (without cancelling it from the rest interface, 
> so without relying on the snapshot system to prevent the loss of intermediate 
> data, and allowing me to implement clean-up code if needed).
>  
> The drawback of this method is not being able to use “off-the-shelf” sources 
> directly, I always have to wrap or patch them to implement the live file 
> check.
>  
> Best regards,
> Arnaud
>  
>  
> From: Aljoscha Krettek [mailto:aljos...@apache.org] 
> Sent: Wednesday, September 21, 2016 14:54
> To: user@flink.apache.org; Max Michels 
> Subject: Re: How to stop job through java API
>  
> Hi,
> right now this is not possible, I'm afraid.
>  
> I'm looping in Max who has done some work in that direction. Maybe he's got 
> something to say.
>  
> Cheers,
> Aljoscha
>  
> On Wed, 14 Sep 2016 at 03:54 Will Du  wrote:
> Hi folks,
> How to stop link job given job_id through java API?
> Thanks,
> Will
> 
> 


RE: How to stop job through java API

2016-10-03 Thread LINZ, Arnaud
Hi,

I have a similar issue. Here is how I deal with programmatically stopping 
permanent streaming jobs, and I’m interested in knowing if there is a better 
way now.

Currently, I use hand-made streaming sources that periodically check for some
flag and end if a stop request was made. Stopping the source allows the on-going
streaming jobs to finish smoothly.
That way, by using a “live file” presence check in my sources, and creating a
“stopped ok file” when the streaming execution ends OK, I’m able to make a
“stopping shell script” that notifies the streaming chain of a user cancellation
by deleting the “live file” and checking for the “stopped ok file” in a waiting
loop before ending.

This stopping shell script allows me to automate “stop/start” scripts when I
need to perform some task (like a reference table refresh) that requires the
streaming job to be stopped (without cancelling it from the REST interface, so
without relying on the snapshot system to prevent the loss of intermediate
data, and allowing me to implement clean-up code if needed).

The drawback of this method is not being able to use “off-the-shelf” sources
directly; I always have to wrap or patch them to implement the live-file check.
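
For what it's worth, the wrapping pattern boils down to something like the sketch below (the file path and the emitted records are placeholders; the real sources wrap an actual connector):

import java.io.File;

import org.apache.flink.streaming.api.functions.source.SourceFunction;

// Keeps emitting as long as the "live file" exists and ends cleanly once the
// stopping script deletes it; returning from run() lets the job finish smoothly.
public class LiveFileCheckingSource implements SourceFunction<String> {

    // Placeholder path for the flag file that the stopping script deletes.
    private final String liveFilePath = "/tmp/my-job.live";

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        final File liveFile = new File(liveFilePath);
        while (running && liveFile.exists()) {
            // Poll the real (wrapped) source here; this sketch just emits a heartbeat.
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect("record");
            }
            Thread.sleep(1000);
        }
        // Optionally create the "stopped ok file" here before returning.
    }

    @Override
    public void cancel() {
        running = false;
    }
}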

Best regards,
Arnaud


From: Aljoscha Krettek [mailto:aljos...@apache.org]
Sent: Wednesday, September 21, 2016 14:54
To: user@flink.apache.org; Max Michels 
Subject: Re: How to stop job through java API

Hi,
right now this is not possible, I'm afraid.

I'm looping in Max who has done some work in that direction. Maybe he's got 
something to say.

Cheers,
Aljoscha

On Wed, 14 Sep 2016 at 03:54 Will Du 
> wrote:
Hi folks,
How to stop link job given job_id through java API?
Thanks,
Will





Re: Using Flink

2016-10-03 Thread Tzu-Li (Gordon) Tai
Hi!

- I'm currently running my flink program on 1.2 SNAPSHOT with kafka source and 
I have checkpoint enabled. When I look at the consumer offsets in kafka it 
appears to be stagnant and there is a huge lag. But I can see my flink program 
is in pace with kafka source in JMX metrics and outputs. Is there a way to 
identify why the offsets are not committed to kafka?

On which commit was your Kafka connector built? There was a recent change to 
the offset committing for Kafka 0.9 consumer, so identifying the exact commit 
will help clarify whether the recent change introduced any new problems. Also, 
what is your configured checkpoint interval? When checkpointing is enabled, the 
Kafka consumer only commits to Kafka when checkpoints are completed. So, 
offsets in Kafka are not updated until the next checkpoint is triggered.
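
To make the relationship concrete, here is a minimal sketch (placeholder topic, broker and group id - not your actual job) of a Kafka 0.9 source with checkpointing enabled; the offsets become visible in Kafka only after each checkpoint completes:

import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaCheckpointExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Offsets are committed back to Kafka only when a checkpoint completes,
        // so the lag visible in Kafka can only shrink at this interval.
        env.enableCheckpointing(60000); // checkpoint every 60 seconds

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder broker
        props.setProperty("group.id", "my-consumer-group");    // placeholder group id

        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer09<>("my-topic", new SimpleStringSchema(), props));

        stream.print();
        env.execute("kafka-checkpoint-example");
    }
}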

- In my current application we use custom loggers for debugging purposes. Let’s say 
we want to find what’s happening for a particular user, we fire an api request 
to add the custom logger for that particular user and use it for logging along 
the data path. Is there a way to achieve this in flink? Are there any global 
mutable parameters that I can use to achieve this functionality?

I’m not sure if I understand the use case correctly, but it seems like you will 
need to change configuration / behaviour of a specific Flink operator at 
runtime. To my knowledge, the best way to do this in Flink right now is to 
translate your original logging-trigger api requests to a stream of events fed 
to Flink. This stream of events will then basically be changes of your user 
logger behaviour, and your operators can change its logging behaviour according 
to this stream.

- Can I pass on state between operators? If I need the state stored on previous 
operators, how can I fetch it?

I don’t think this is possible.


Best Regards,
Gordon


On October 3, 2016 at 2:08:31 PM, Govindarajan Srinivasaraghavan 
(govindragh...@gmail.com) wrote:

Hi,

 

I have few questions on how I need to model my use case in flink. Please 
advise. Thanks for the help.

 

- I'm currently running my flink program on 1.2 SNAPSHOT with kafka source and 
I have checkpoint enabled. When I look at the consumer offsets in kafka it 
appears to be stagnant and there is a huge lag. But I can see my flink program 
is in pace with kafka source in JMX metrics and outputs. Is there a way to 
identify why the offsets are not committed to kafka?

 

- In my current application we use custom loggers for debugging purposes. Let’s say 
we want to find what’s happening for a particular user, we fire an api request 
to add the custom logger for that particular user and use it for logging along 
the data path. Is there a way to achieve this in flink? Are there any global 
mutable parameters that I can use to achieve this functionality?

 

- Can I pass on state between operators? If I need the state stored on previous 
operators, how can I fetch it?

 

Thanks

Using Flink

2016-10-03 Thread Govindarajan Srinivasaraghavan
Hi,



I have few questions on how I need to model my use case in flink. Please
advise. Thanks for the help.



- I'm currently running my flink program on 1.2 SNAPSHOT with kafka source
and I have checkpoint enabled. When I look at the consumer offsets in kafka
it appears to be stagnant and there is a huge lag. But I can see my flink
program is in pace with kafka source in JMX metrics and outputs. Is there a
way to identify why the offsets are not committed to kafka?



- In my current application we use custom loggers for debugging purposes. Let’s
say we want to find what’s happening for a particular user, we fire an api
request to add the custom logger for that particular user and use it for
logging along the data path. Is there a way to achieve this in flink? Are
there any global mutable parameters that I can use to achieve this
functionality?



- Can I pass on state between operators? If I need the state stored on
previous operators, how can I fetch it?



Thanks