Re: problem with increase job parallelism

2017-10-20 Thread Fabian Hueske
Hi Lei, setting explicit operator IDs should solve this issue. As far as I know, the auto-generated operator ID also depended on the operator parallelism in previous versions of Flink (I am not sure up to which version). Which version are you running? Best, Fabian 2017-10-17 3:15 GMT+02:00 Lei Chen

Re: HBase config settings go missing within Yarn.

2017-10-20 Thread Piotr Nowojski
Is this /etc/hbase/conf/hbase-site.xml file present on all of the machines? If yes, could you share your code? > On 20 Oct 2017, at 16:29, Niels Basjes wrote: > > I look at the logfiles from the Hadoop Yarn web interface. I.e. actually > looking in the jobmanager.log of the

BucketingSink with disabled checkpointing will never clean up its state

2017-10-20 Thread Rinat
Hi, I have one more small question about BucketingSink with checkpointing disabled. As part of my current task, I’m looking through the sources of BucketingSink, and it seems that I have found an issue in the case when checkpointing is disabled. BucketingSink is a Flink rich function that also

Re: Avoid duplicate messages while restarting a job for an application upgrade

2017-10-20 Thread Antoine Philippot
Hi Piotrek, I am coming back to you with a Jira ticket that I created and a proposal. The ticket: https://issues.apache.org/jira/browse/FLINK-7883 The proposal: https://github.com/aphilippot/flink/commit/9c58c95bb4b68ea337f7c583b7e039d86f3142a6 I am open to any comments or suggestions. Antoine Le

Re:

2017-10-20 Thread Piotr Nowojski
Hi, Only the batch API uses managed memory. If you are using the streaming API, you can do two things: - estimate the max cache size based on, for example, a fraction of the max heap size - use WeakReference to implement your cache In the batch API, you could estimate the max cache size based on: - fraction of
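
The two streaming-API suggestions above can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the thread: the class name WeakValueCache, the method maxEntriesFor, and the per-entry size estimate are all hypothetical, and combining both suggestions in one class is a choice made here for brevity. The entry limit is derived from a fraction of the max heap, and values are held through WeakReference so the garbage collector may reclaim them.

```java
import java.lang.ref.WeakReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: a size-bounded cache with weakly held values.
class WeakValueCache<K, V> {
    private final int maxEntries;
    private final Map<K, WeakReference<V>> entries;

    WeakValueCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // Access-ordered map that drops the least recently used entry once the limit is exceeded.
        this.entries = new LinkedHashMap<K, WeakReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, WeakReference<V>> eldest) {
                return size() > WeakValueCache.this.maxEntries;
            }
        };
    }

    // Estimates an entry limit from a fraction of the max heap.
    // estimatedEntryBytes is an assumption the caller has to supply.
    static int maxEntriesFor(double heapFraction, long estimatedEntryBytes) {
        long budget = (long) (Runtime.getRuntime().maxMemory() * heapFraction);
        return (int) Math.max(1, Math.min(Integer.MAX_VALUE, budget / estimatedEntryBytes));
    }

    void put(K key, V value) {
        entries.put(key, new WeakReference<>(value));
    }

    // Returns the cached value, or null if it was evicted or already garbage collected.
    V get(K key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // the referent was collected; drop the stale entry
        }
        return value;
    }

    int size() {
        return entries.size();
    }
}
```

A cache sized for roughly 10% of the heap with an assumed 1 KiB per entry would be created with new WeakValueCache<>(WeakValueCache.maxEntriesFor(0.1, 1024)). Because values are only weakly referenced, a lookup can return null at any time, so callers need a way to recompute or reload the value.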

Re: Question about configuring Rich Functions

2017-10-20 Thread Michael Kobit
This thread [1] is where I asked about it as well. [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/LocalStreamEnvironment-configuration-doesn-t-seem-to-be-used-in-RichFunction-operators-td15751.html#a15777 On Fri, Oct 13, 2017 at 9:50 PM Tony Wei

Re: flink can't read hdfs namenode logical url

2017-10-20 Thread Piotr Nowojski
Hi, Please double-check the content of the config files in YARN_CONF_DIR and HADOOP_CONF_DIR (the first one takes priority over the latter) and that they are pointing to the correct files. Also check the logs (WARN and INFO) for any relevant entries. Piotrek > On 20 Oct 2017, at 06:07, 邓俊华

Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-20 Thread Piotr Nowojski
That’s good to hear :) Unfortunately, at the moment dependencies can pollute the classpath in both directions (Flink’s can interfere with the user’s application, and the other way around). Cheers, Piotrek > On 20 Oct 2017, at 15:11, r. r. wrote: > > By Ali Baba's beard and his forty

Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-20 Thread r. r.
By Ali Baba's beard and his forty bandits, Piotrek, this worked! My understanding was that I have to prevent Flink from loading the older compress.jar and force the newer one. Once I shade-relocated org.apache.commons.compress for my project, the problem went away. Many thanks! >

Re: Does heap memory get released after computing windows?

2017-10-20 Thread Piotr Nowojski
Hi, Memory used by session windows should be released once the window is triggered (allowedLateness can prolong the window’s life). Unless your code introduces a memory leak (by not releasing references), everything should be garbage collected. Keep in mind that session windows with time gap of 10

Re: HBase config settings go missing within Yarn.

2017-10-20 Thread Piotr Nowojski
Hi, What do you mean by saying: > When I open the logfiles on the Hadoop cluster I see this: The error doesn’t come from Flink? Where do you execute hbaseConfig.addResource(new Path("file:/etc/hbase/conf/hbase-site.xml")); ? To me it seems like it is a problem with misconfigured HBase and

Re: Flink BucketingSink, subscribe on moving of file into final state

2017-10-20 Thread Piotr Nowojski
You’re welcome. Unfortunately, I have not come across such a use case before. Piotrek > On 20 Oct 2017, at 13:47, Rinat wrote: > > Piotrek, thanks for your reply. > > Yes, now I’m looking for the most suitable way to extend BucketingSink > functionality, to handle

Re: Parallelism, registerEventTimeTimer and watermark problem

2017-10-20 Thread Aljoscha Krettek
Hi Fritz, If the watermark is not updating this usually means that one of the input partitions (if you're using Kafka) is not carrying data. In that case, the watermark/timestamp assigner will have no data on which to base an updated watermark. For such use cases I recently implemented a

Re: Custom Sink Checkpointing errors

2017-10-20 Thread vipul singh
Thanks, Stefan, for the answers. The serialization is happening during the creation of the snapshot state. I have added a gist with a larger stack trace (https://gist.github.com/neoeahit/aee5562bf0b8d8d02e2a012f6735d850). I am not using any serializer in the custom sink. We have src.keyBy(m =>

Re: Flink BucketingSink, subscribe on moving of file into final state

2017-10-20 Thread Rinat
Piotrek, thanks for your reply. Yes, now I’m looking for the most suitable way to extend BucketingSink functionality to handle the moment a file is moved into its final state. I thought that maybe someone had already implemented such a thing or knows of other approaches that will help me to not

Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-20 Thread Piotr Nowojski
But you said > this seems to work as mvn dependency:tree -Ddetail=true only shows 1.14 To avoid the error you describe, I think you have to ensure that no 1.14 commons-compress comes from your application, because it can conflict with the 1.4.1 used by the Flink cluster. By shading I meant

Re: HBase config settings go missing within Yarn.

2017-10-20 Thread Niels Basjes
To make it easier for you to help me, I put this test project on GitHub: https://github.com/nielsbasjes/FlinkHBaseConnectProblem Niels Basjes On Fri, Oct 20, 2017 at 1:32 PM, Niels Basjes wrote: > Hi, > > I have a Flink 1.3.2 application that I want to run on a Hadoop YARN >

HBase config settings go missing within Yarn.

2017-10-20 Thread Niels Basjes
Hi, I have a Flink 1.3.2 application that I want to run on a Hadoop YARN cluster where I need to connect to HBase. What I have: In my environment: HADOOP_CONF_DIR=/etc/hadoop/conf/ HBASE_CONF_DIR=/etc/hbase/conf/ HIVE_CONF_DIR=/etc/hive/conf/ YARN_CONF_DIR=/etc/hadoop/conf/ In

Re: Custom Sink Checkpointing errors

2017-10-20 Thread Stefan Richter
Hi, judging from the dump’s trace, the crash looks unrelated to Flink code. Since it happens somewhere in managing a jar file, it might be related to this: https://bugs.openjdk.java.net/browse/JDK-8142508, point (2). Maybe your jar gets overwritten

Re: Flink Streaming example: Kafka010Example.scala doesn't work

2017-10-20 Thread Tzu-Li (Gordon) Tai
Hi, Thanks for reporting back! That’s good to know. Gordon On 20 October 2017 at 3:51:21 PM, Wojtkowski, Michal (michal.wojtkowski@roche.com) wrote: Hi Gordon, Thanks for finding time to write back! I managed to solve the issue and it turned out to be entirely related to Kafka

Flink BucketingSink, subscribe on moving of file into final state

2017-10-20 Thread Rinat
Hi all! I’m trying to create a meta-info file that contains a link to the file created by the Flink BucketingSink. At first I tried to implement my own org.apache.flink.streaming.connectors.fs.Writer that creates a meta-file when its close method is called. But I realized that it’s not completely

Custom Sink Checkpointing errors

2017-10-20 Thread vipul singh
Hello all, I am working on a custom sink implementation, but I am having weird issues with checkpointing. I am using a custom ListState to checkpoint, and it looks like this: private var checkpointMessages: ListState[Bucket] = _ My snapshot function looks like: @throws[IOException] def