Re: CKAN inputFormat (batch)

2018-12-03 Thread vino yang
Hi Flavio, I cannot open the first link [1] you provided. Also, what is your purpose? To introduce your CKAN input format to the community? Thanks, vino. [1]: https://ckan.org/about/instances/ Flavio Pompermaier wrote on Tue, Dec 4, 2018 at 1:09 AM: > Hi to all, > we've just published an example of a simple

Re: If you are an expert in flink sql, then I really need your help...

2018-12-03 Thread clay4444
I have found out that the checkpoint is not triggered. Regarding the IN operation in Flink SQL: this SQL will trigger checkpoints normally. select name, age from user where id in
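
The query in the thread is cut off after the IN clause; a hypothetical query of the same shape (the literal ID list below is only a placeholder to make the statement well-formed, not the author's actual predicate):

```sql
-- Hypothetical completion: a non-correlated IN over a literal list.
-- Flink SQL translates this into a plain filter on the stream.
SELECT name, age
FROM user
WHERE id IN (1, 2, 3);
```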

Re: flink list and flink run commands timeout

2018-12-03 Thread Aneesha Kaushal
Thanks Chesnay! The exception is gone now. On 03-Dec-2018, at 5:22 PM, Chesnay Schepler wrote: Based on the stacktrace the client is not running in legacy mode; please check the client flink-conf.yaml.

Flink Exception - AmazonS3Exception and ExecutionGraph - Error in failover strategy

2018-12-03 Thread Flink Developer
I have a Flink app on 1.5.2 which sources data from Kafka topic (400 partitions) and runs with 400 parallelism. The sink uses bucketing sink to S3 with rocks db. Checkpoint interval is 2 min and checkpoint timeout is 2 min. Checkpoint size is a few mb. After execution for a few days, I see:

CKAN inputFormat (batch)

2018-12-03 Thread Flavio Pompermaier
Hi to all, we've just published an example of a simple CKAN input format that downloads a CKAN resource (in parallel) from a CKAN catalog and produces a DataSet. This can be very helpful in setting up a Flink demo using an OpenData dataset available online (see [1] for a list of available

Re: Stream in loop and not getting to sink (Parquet writer )

2018-12-03 Thread Kostas Kloudas
Hi Avi, If Parquet is not a requirement then you can use the StreamingFileSink and write as plain text, if this is ok for you. In this case, you can set the batch size and specify a custom RollingPolicy in general. For example, I would recommend checking [1], where you have, of course, to adjust

Re: Flink CEP support pattern match involving fields of previous events

2018-12-03 Thread Dawid Wysakowicz
Hi Florin, This feature is supported with IterativeCondition since 1.3.0. For questions about API and what features are supported in general please always have a look into documentation[1] first. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/libs/cep.html On
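
The IterativeCondition Dawid points to lets a condition look back at events already matched for earlier pattern states, which is exactly the "compare against the first event" capability asked for. A minimal sketch against the Flink 1.7 CEP API (the `Event` class with `getTemperature()` is taken from Florin's question below and is hypothetical; this is illustrative, not a tested program):

```java
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.IterativeCondition;

// Match a second event whose temperature equals the first one's.
Pattern<Event, ?> pattern = Pattern.<Event>begin("first")
    .next("second")
    .where(new IterativeCondition<Event>() {
        @Override
        public boolean filter(Event second, Context<Event> ctx) throws Exception {
            // ctx gives access to the event(s) already matched for "first".
            for (Event first : ctx.getEventsForPattern("first")) {
                if (second.getTemperature() == first.getTemperature()) {
                    return true;
                }
            }
            return false;
        }
    });
```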

Re: If you are an expert in flink sql, then I really need your help...

2018-12-03 Thread Timo Walther
Hi, it is very difficult to spot the problem with the little information you gave us. Maybe you can show us a simplified SQL query and the implementation of the `LAST_VALUE` function? An initial guess would be that you are running out of memory such that YARN kills your task manager. If
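
If the guess is right and YARN is killing the container for exceeding its memory limit, the usual first knobs in Flink 1.5–1.7 are the task manager heap size and the containerized heap cutoff in flink-conf.yaml (the values below are placeholders, not recommendations):

```yaml
# flink-conf.yaml (placeholder values)
taskmanager.heap.size: 4096m
# Fraction of the container memory reserved for non-heap usage on YARN;
# raising it leaves more headroom before YARN kills the container.
containerized.heap-cutoff-ratio: 0.25
```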

Flink CEP support pattern match involving fields of previous events

2018-12-03 Thread Spico Florin
Hello! I would like to ask if you have added support in CEP for matches involving fields of previous events, the same as was asked here: "However it would be great to create Pattern out of a capability like: .where(second_evt->evt.getTemperature() == first_evt->evt.getTemperature()"

If you are an expert in flink sql, then I really need your help...

2018-12-03 Thread clay4444
I am using Flink SQL to do some complicated calculations. I have encountered some very difficult problems in this process, so I would like to ask everyone for help. My goal is to build a data stream with a very accurate result, which is also in line with the Streaming System. The core

Re: Stream in loop and not getting to sink (Parquet writer )

2018-12-03 Thread Avi Levi
Thanks Kostas. Ok, got it, so BucketingSink might not be a good choice here. Can you please advise what the best approach would be? I have a heavy load of data that I consume from Kafka, which I want to process and put into a file (it doesn't have to be Parquet). I thought that StreamingFileSink might

Re: Stream in loop and not getting to sink (Parquet writer )

2018-12-03 Thread Kostas Kloudas
Hi Avi, For Bulk Formats like Parquet, unfortunately, we do not support setting the batch size. The part-files roll on every checkpoint. This is a known limitation and there are plans to alleviate it in the future. Setting the batch size (among other things) is supported for RowWise formats.
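
For a RowWise format, the rolling policy Kostas refers to can be configured like this against the Flink 1.6/1.7 StreamingFileSink API (the output path and the size/interval values are placeholders):

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

// Row-wise sink: part files roll on size, age, or inactivity,
// not only on checkpoints (unlike bulk formats such as Parquet).
StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path("s3://my-bucket/output"), new SimpleStringEncoder<String>("UTF-8"))
    .withRollingPolicy(
        DefaultRollingPolicy.create()
            .withMaxPartSize(128 * 1024 * 1024)   // roll at 128 MB
            .withRolloverInterval(15 * 60 * 1000) // or after 15 minutes
            .withInactivityInterval(5 * 60 * 1000)
            .build())
    .build();
```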

Re: Looking for example for bucketingSink / StreamingFileSink

2018-12-03 Thread miki haiat
Hi Avi, I'm assuming that the cause of the "pending" file is that the checkpoint isn't finishing successfully [1]. Can you try to change the checkpoint time to 1 min as well? Thanks, Miki [1]
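
Since these file sinks only finalize part files (move them out of the pending state) when the covering checkpoint completes, enabling checkpointing is a prerequisite. A minimal sketch, using the 1-minute interval Miki suggests:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Files stay in "pending" until the checkpoint that covers them completes.
env.enableCheckpointing(60_000); // 1 minute
```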

Re: flink list and flink run commands timeout

2018-12-03 Thread Chesnay Schepler
Based on the stacktrace the client is not running in legacy mode; please check the client flink-conf.yaml. On 03.12.2018 12:10, Aneesha Kaushal wrote: Hello, I am facing the same Timeout exception, at flink run and flink list commands when I am trying to deploy jobs in Flink 1.6 in "legacy"
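
For reference, in Flink 1.5/1.6 the execution mode is a flink-conf.yaml setting, and the client and the cluster must agree on it, which is what the stacktrace hints at:

```yaml
# flink-conf.yaml — must match on both the client and the cluster side
mode: legacy   # or "new" (the default in 1.6)
```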

Re: Weird behavior in actorSystem shutdown in akka

2018-12-03 Thread Till Rohrmann
Hi Joshua, sorry for getting back to you so late. Personally, I haven't seen this problem before. Without more log context I think I won't be able to help you. This looks a bit more like an Akka problem than a Flink problem to be honest. One cause could be that akka.remote.flush-wait-on-shutdown

not able to join data coming from kafka

2018-12-03 Thread Rakesh Kumar
Hello Team, public class FlinkJoinDataStream { @SuppressWarnings("serial") public static void main(String[] args) { Properties props = new Properties(); props.setProperty("zookeeper.connect", "localhost:2181"); props.setProperty("bootstrap.servers", "localhost:9092");
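
The snippet above cuts off before the join itself. A hedged sketch of a windowed two-stream join with the DataStream API follows; `orders` and `invoices` are hypothetical existing streams of POJOs with a `getId()` key, and the 5-second window is a placeholder. A frequent reason such a join produces no output is that timestamps/watermarks were never assigned, so event-time windows never fire:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// Two streams can only be joined on a common key within a window;
// records that never fall into the same window yield no results.
DataStream<String> joined = orders
    .join(invoices)
    .where(Order::getId)       // key selector for the first stream (hypothetical POJO)
    .equalTo(Invoice::getId)   // key selector for the second stream
    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
    .apply((order, invoice) -> order + "," + invoice);
```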

Re: flink list and flink run commands timeout

2018-12-03 Thread Aneesha Kaushal
Hello, I am facing the same Timeout exception, at flink run and flink list commands when I am trying to deploy jobs in Flink 1.6 in "legacy" mode. We are planning to run in legacy mode because after upgrading from Flink 1.3 to Flink 1.6, flink job was not getting distributed across task

Looking for example for bucketingSink / StreamingFileSink

2018-12-03 Thread Avi Levi
Hi Guys, very new to Flink so my apologies for the newbie questions :) but I am desperately looking for a good example of streaming to a file using BucketingSink / StreamingFileSink. Unfortunately, the examples in the documentation are not even compiling (at least not the ones in scala

Re: Duplicated messages in kafka sink topic using flink cancel-with-savepoint operation

2018-12-03 Thread Piotr Nowojski
Good to hear that :) Duplicated “uncommitted” messages are normal and to be expected. After all, that's what `read_uncommitted` is for - to be able to read messages without waiting until they are committed, and thus even if their transaction was later aborted. Piotrek > On 1 Dec 2018, at
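
Whether a consumer sees those uncommitted messages is controlled by the Kafka consumer's `isolation.level` setting. A minimal sketch of building the relevant properties (the broker address is a placeholder):

```java
import java.util.Properties;

public class IsolationConfig {
    // Build consumer properties; with read_committed, messages from open or
    // aborted transactions are filtered out instead of being delivered early.
    public static Properties consumerProps(String isolationLevel) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("isolation.level", isolationLevel);     // Kafka default: read_uncommitted
        return props;
    }
}
```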