Re: Use a round-robin kafka partitioner

2017-10-24 Thread kla
Hi Chesnay, Thanks for your reply. I would like to use the partitioner within the Kafka sink operation. By default, the Kafka sink uses FixedPartitioner: public FlinkKafkaProducer010(String topicId, KeyedSerializationSchema serializationSchema, Properties producerConfig) {

Delta iteration not spilling to disk

2017-10-24 Thread Joshua Griffith
I’m currently using a delta iteration within a batch job and received the following error: java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 11 maxPartition: 24 number of overflow segments: 0 bucketSize: 125 Overall memory: 23232512 Partition

Reading Yarn Application Name in flink

2017-10-24 Thread Navneeth Krishnan
Hi All, Is there a way to read the YARN application ID/name within Flink so that the logs can be sent to an external logging stack like ELK or CloudWatch, merged by application? Thanks, Navneeth

RE: Impersonation support in Flink

2017-10-24 Thread Newport, Billy
Our scenario is to enable a specific Kerberos principal to impersonate any Kerberos principal in a specific group; this is enabled in the HDFS configuration. That principal does not need to be root, just one allowed to impersonate the users in that group. We want the job to access HDFS as the impersonated

Re: Monitoring folder in flink

2017-10-24 Thread kitex
The code I have pasted is all that I have.

Re: Monitoring folder in flink

2017-10-24 Thread रविशंकर नायर
Can you please share the full code? Thanks, RAV On Oct 22, 2017 3:37 AM, "Sugandha Amatya" wrote: I have a folder where new files arrive on a schedule. Why is my Flink readFile not reading new files? I have used both *PROCESS_ONCE* and *PROCESS_CONTINUOUSLY*. When I use

Re: Monitoring folder in flink

2017-10-24 Thread Sugandha Amatya
Hi, I found that Flink polls the directory based on the modification date. On Windows, when I copy files the modification date remains the same. So, PROCESS_CONTINUOUSLY resolved the issue. On Tue, Oct 24, 2017 at 6:09 PM, Fabian Hueske wrote: > Hi, > > with PROCESS_CONTINUOUSLY the application
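
The observation above (a directory monitor that decides by modification date will skip files copied with an old timestamp) can be illustrated with a small sketch. This is a simplified, hypothetical model and not Flink's implementation; the class name `ModTimeMonitor` and its `poll` method are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified monitor. It is NOT Flink's code; it only models
// "a file is new if its modification time is later than anything seen so far".
final class ModTimeMonitor {
    private long lastSeenModTime = Long.MIN_VALUE;

    // Takes a directory listing (file name -> modification time in ms)
    // and returns the files that this model treats as new.
    List<String> poll(Map<String, Long> listing) {
        List<String> fresh = new ArrayList<>();
        long maxSeen = lastSeenModTime;
        for (Map.Entry<String, Long> entry : listing.entrySet()) {
            if (entry.getValue() > lastSeenModTime) {
                fresh.add(entry.getKey());
                maxSeen = Math.max(maxSeen, entry.getValue());
            }
        }
        // Advance the watermark only after the whole listing was scanned.
        lastSeenModTime = maxSeen;
        return fresh;
    }

    public static void main(String[] args) {
        ModTimeMonitor monitor = new ModTimeMonitor();
        Map<String, Long> dir = new LinkedHashMap<>();

        dir.put("a.txt", 1000L);
        System.out.println(monitor.poll(dir)); // a.txt is picked up

        // A file copied with a preserved, older modification time.
        dir.put("b.txt", 500L);
        System.out.println(monitor.poll(dir)); // b.txt is skipped by this model

        dir.put("c.txt", 2000L);
        System.out.println(monitor.poll(dir)); // c.txt is picked up
    }
}
```

Under this model, a copy that preserves the original timestamp never looks new, which matches the behaviour described for files copied on Windows.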

Re: Use a round-robin kafka partitioner

2017-10-24 Thread Chesnay Schepler
Could you expand a bit more on what you want to achieve? (In particular /where/ you want to use this partitioner; as an operation before a sink or within a Kafka sink) On 24.10.2017 09:24, kla wrote: Hey, I would like to use a round-robin Kafka partitioner in Apache Flink. (the default

Re: Flink flick cancel vs stop

2017-10-24 Thread Piotr Nowojski
I would propose implementations of NewSource to be not blocking/asynchronous. For example something like public abstract Future getCurrent(); Which would allow us to perform some certain actions while there are no data available to process (for example flush output buffers). Something like

Re: Avoid duplicate messages while restarting a job for an application upgrade

2017-10-24 Thread Aljoscha Krettek
Hi, Sorry for entering the discussion somewhat late but I wrote on the Issue you created, please have a look. Best, Aljoscha > On 20. Oct 2017, at 16:56, Antoine Philippot > wrote: > > Hi Piotrek, > > I come back to you with a Jira ticket that I created and a

Re: Flink REST API async?

2017-10-24 Thread Aljoscha Krettek
Hi, Unfortunately, the FLIP-6 efforts are taking longer than expected and we won't have those changes to the REST API in the 1.4 release (which should happen in about a month). We are planning to very quickly release 1.5 after that, with the changes to the REST API. The only work-around I

Could not initialize keyed state backend on restart from checkpoint

2017-10-24 Thread Federico D'Ambrosio
Hello everyone, while trying to restart a flink job from an externalized checkpoint I'm getting the following exception: java.lang.IllegalStateException: Could not initialize keyed state backend. at

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
I changed my cluster config (on all nodes) to include the HBase config dir in the classpath. Now everything works as expected. This may very well be a misconfiguration of my cluster. However ... My current assessment: Tools like Pig use the HBase config which has been specified on the LOCAL

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
Minor correction: The HBase jar files are on the classpath, just in a different order. On Tue, Oct 24, 2017 at 11:18 AM, Niels Basjes wrote: > I did some more digging. > > I added extra code to print both the environment variables and the > classpath that is used by the

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
I did some more digging. I added extra code to print both the environment variables and the classpath that is used by the HBaseConfiguration to load the resource files. I call this both locally and during startup of the job (i.e. these logs arrive in the jobmanager.log on the cluster) Summary of

Re: Local combiner on each mapper in Flink

2017-10-24 Thread Kurt Young
I think you can use WindowedStream.aggregate Best, Kurt On Tue, Oct 24, 2017 at 1:45 PM, Le Xu wrote: > Thanks Kurt. Maybe I wasn't clear before, I was wondering if Flink has an > implementation of a combiner in DataStream (to use after keyBy and windowing). > > Thanks again! >
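
The combiner idea discussed in this thread rests on keeping a small, mergeable accumulator instead of buffering every element. The following is a minimal sketch of that pattern in plain Java, assuming an average as the aggregate. `AvgAccumulator` and its methods are hypothetical names written against no Flink interface; the exact shape of Flink's own aggregate function should be taken from the Flink documentation for the version in use.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical accumulator for an average. It is written against no Flink
// interface and only demonstrates incremental, mergeable aggregation.
final class AvgAccumulator {
    long sum;
    long count;

    // Fold one element into the running state (no element buffering).
    AvgAccumulator add(long value) {
        sum += value;
        count++;
        return this;
    }

    // Combine a partial result computed elsewhere into this one.
    AvgAccumulator merge(AvgAccumulator other) {
        sum += other.sum;
        count += other.count;
        return this;
    }

    // Final result; an empty accumulator yields 0.0 by convention here.
    double result() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    // Convenience helper: pre-aggregate a local batch of values.
    static AvgAccumulator of(List<Long> values) {
        AvgAccumulator acc = new AvgAccumulator();
        for (long v : values) {
            acc.add(v);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Two "local" partial aggregates, merged afterwards.
        AvgAccumulator left = AvgAccumulator.of(Arrays.asList(1L, 2L, 3L));
        AvgAccumulator right = AvgAccumulator.of(Arrays.asList(4L, 5L));
        System.out.println(left.merge(right).result());
    }
}
```

Only the pair (sum, count) travels between the partial aggregates, which is what makes this kind of aggregation cheap compared with collecting all window elements first.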

Re: Questions about checkpoints/savepoints

2017-10-24 Thread Aljoscha Krettek
Hi, That distinction with externalised checkpoints is a bit of a pitfall and I'm hoping that we can actually get rid of that distinction in the next version or the version after that. With that change, all checkpoints would always be externalised, since it's not really any noticeable overhead.

Re: Incompatible types of expression and result type.

2017-10-24 Thread Timo Walther
Hi, I found the problem in your implementation. The Table API program is correct. However, the DataStream program that you construct in your TableSource has a wrong type. Whenever you use a Row type, you need to specify the type either by implementing ResultTypeQueryable or in your

Re: How to test new sink

2017-10-24 Thread Timo Walther
Yes, if you think you need better public test utilities, feel free to open an issue for it. Timo Am 10/23/17 um 5:32 PM schrieb Rinat: Timo, thx for your reply. I’m using gradle instead of maven, but I’ll look through the existing similar plugins for it. I don’t think that sharing of

Use a round-robin kafka partitioner

2017-10-24 Thread kla
Hey, I would like to use a round-robin Kafka partitioner in Apache Flink. (the default one) I forked Kafka's code from the DefaultPartitioner class. public class HashPartitioner extends KafkaPartitioner implements Serializable { private final AtomicInteger counter = new
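
The preview above is cut off before the partitioning logic. As a rough sketch of the round-robin selection such a partitioner could delegate to, and not a reconstruction of the poster's code, the counter-based arithmetic looks like this. `RoundRobinSelector` is a hypothetical helper that deliberately depends on neither Flink nor Kafka, since the partitioner hook differs between connector versions.

```java
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Flink or Kafka: it only shows the
// round-robin arithmetic a custom partitioner could delegate to.
final class RoundRobinSelector implements Serializable {
    private static final long serialVersionUID = 1L;

    private final AtomicInteger counter = new AtomicInteger(0);

    // Returns the next partition index in [0, numPartitions).
    int next(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // getAndIncrement wraps to negative values after Integer.MAX_VALUE;
        // floorMod keeps the result non-negative in that case.
        return Math.floorMod(counter.getAndIncrement(), numPartitions);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector();
        for (int i = 0; i < 6; i++) {
            System.out.println("record " + i + " -> partition " + selector.next(3));
        }
    }
}
```

Using `Math.floorMod` rather than `%` is a small design choice: it avoids a negative index once the counter overflows, which a long-running job would eventually hit.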

Re: Questions about checkpoints/savepoints

2017-10-24 Thread vipul singh
Thanks Tony, that was the issue. I was thinking that when we use RocksDB and provide an S3 path, it uses externalized checkpoints by default. Thanks so much! I have one follow-up question. Say in the above case, I terminate the cluster, and since the metadata is on S3, and not on local storage, does

Re: Get EOF from PrometheusReporter in JM

2017-10-24 Thread Tony Wei
Hi Max, Good to know. Thanks very much. Best Regards, Tony Wei 2017-10-24 13:52 GMT+08:00 Maximilian Bode : > Hi Tony, > > thanks for troubleshooting this. I have added a commit to > https://github.com/apache/flink/pull/4586 that should enable you to use > the

Re: Questions about checkpoints/savepoints

2017-10-24 Thread Tony Wei
Hi, Did you enable externalized checkpoints? [1] Best, Tony Wei [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html#externalized-checkpoints 2017-10-24 13:07 GMT+08:00 vipul singh : > Thanks Aljoscha for the answer above. > > I am