Re: Does Flink SQL "in" operation has length limit?

2018-10-01 Thread Rong Rong
Hi Fabian. Yes I think that was what I missed. I haven't looked into the code but just inferring from the translated plan pasted by Henry. Let me try to take a look and put in a fix for this. Thanks, Rong On Mon, Oct 1, 2018, 7:28 AM Fabian Hueske wrote: > Hi, > > I had a look into the code.

Event timers - metrics

2018-10-01 Thread Alexey Trenikhun
Hello, Are built-in timer metrics? For example number of registered timers, number of triggered timers etc Thanks, Alexey

Flink support for multiple data centers

2018-10-01 Thread Olga Luganska
Hello, Does Flink support multiple data center implementation and failover procedures in case one of the data centers goes down? Another question I have is about data encryption. If application state which needs to be checkpointed contains data elements which are considered to be a

Re: Deserialization of serializer errored

2018-10-01 Thread Elias Levy
Any of the Flink folks seen this before? On Fri, Sep 28, 2018 at 5:23 PM Elias Levy wrote: > I am experiencing a rather odd error. We have a job running on a Flink > 1.4.2 cluster with two Kafka input streams, one of the streams is processed > by an async function, and the output of the async

Re: Streaming to Parquet Files in HDFS

2018-10-01 Thread Biswajit Das
Nice to see this finally! On Mon, Oct 1, 2018 at 1:53 AM Fabian Hueske wrote: > Hi Bill, > > Flink 1.6.0 supports writing Avro records as Parquet files to HDFS via the > previously mentioned StreamingFileSink [1], [2]. > > Best, Fabian > > [1] https://issues.apache.org/jira/browse/FLINK-9753 >

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

2018-10-01 Thread jwatte
It turns out that the latest flink:latest docker image is 5 days old, and thus bug was fixed 4 days ago in the flink-docker github. The problem is that the docker-entrypoint.sh script chains to jobmanager.sh by saying "start-foreground cluster" where the "cluster" argument is obsolete as of Flink

flink:latest container on kubernetes fails to connect taskmanager to jobmanager

2018-10-01 Thread jwatte
I'm using the standard Kubernetes deploy configs for jobmanager and taskmanager deployments, and jobmanager service. However, when the task managers start up, they try to register with the job manager over Akka on port 6123. This fails, because the Akka on the jobmanager discards those messages as

Re: error with flink

2018-10-01 Thread yuvraj singh
i am using 1.6.0 On Mon, Oct 1, 2018 at 8:05 PM Hequn Cheng wrote: > Hi yuvraj, > > It seems a null key has been keyed by. Which Flink version do you use? And > could you show some user code related about keyBy or GroupBy? > > Best, Hequn > > On Mon, Oct 1, 2018 at 9:26 PM yuvraj singh

Re: error with flink

2018-10-01 Thread Hequn Cheng
Hi yuvraj, It seems a null key has been keyed by. Which Flink version do you use? And could you show some user code related about keyBy or GroupBy? Best, Hequn On Mon, Oct 1, 2018 at 9:26 PM yuvraj singh <19yuvrajsing...@gmail.com> wrote: > Hi i am facing this problem with my flink job please

Re: Does Flink SQL "in" operation has length limit?

2018-10-01 Thread Fabian Hueske
Hi, I had a look into the code. From what I saw, we are translating the values into Rows. The problem here is that the IN clause is translated into a join and that the join results contains a time attribute field. This is a safety restriction to ensure that time attributes do not lose their

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread Hequn Cheng
Hi clay, Keyed group by: > SELECT a, SUM(b) as d > FROM Orders > GROUP BY a Non Keyed group by: > SELECT SUM(b) as d > FROM Orders I would like to look into the problem. However, I can't find obvious problems from the sql. It would be great that can provide a minimal example to reproduce

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread Fabian Hueske
Hi Clay, If you do env.setParallelism(1), the query won't be executed in parallel. However, looking at your screenshot the message order does not seem to be the problem here (given that you printed the content of the topic). Are you sure that it is not possible that the result decreases if some

error with flink

2018-10-01 Thread yuvraj singh
Hi i am facing this problem with my flink job please help me with it . java.lang.Exception: An async function call terminated with an exception. Failing the AsyncWaitOperator. at org.apache.flink.streaming.api.operators.async.Emitter.output(Emitter.java:137) at

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread clay4444
hi,Timo I use env.setParallelism(1) in my code, I set the overall degree of parallelism of the program to 1, so that some calculations will still be parallelized? clay, -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread clay4444
hi,Hequn I don't understand you about the group by and non-keyed group by. Can you explain it in a little more detail, or give me an example, thank u . clay, -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink Scheduler Customization

2018-10-01 Thread Hequn Cheng
Hi Ananth, > if it detects significant backlog, skip over and consume from the latest offset, and schedule a separate backfill task for the backlogged 20 minutes? Fabian is right, there is no built-in operators for this. If you don't care about Watermark, I think we can implement it with a

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Till Rohrmann
This is unfortunately the realm of the ASF over which we don't have direct control. We could think about filing an INFRA JIRA ticket to report this problem (if it can be backtracked). On Mon, Oct 1, 2018 at 2:42 PM Gianluca Ortelli < gianl...@mediadistillery.com> wrote: > Hi Till, > > I also

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Gianluca Ortelli
Hi Till, I also believe that it's a problem with a single mirror. It was not a blocking problem for me; I just wanted you to be aware of it, in case you have some policy regarding the management of mirrors. Best, Gianluca On Mon, 1 Oct 2018 at 14:27, Till Rohrmann wrote: > Hi Gianluca, > >

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Till Rohrmann
Hi Gianluca, I've downloaded flink-1.6.1-bin-scala_2.11.tgz from here [1] and verified that the shasum512 and the signature are both correct. The only way I could explain this is that either your downloaded artifacts or the mirror you got the binaries from got corrupted. [1]

Re: Upgrade Flink with newer Java version

2018-10-01 Thread Chesnay Schepler
Please see https://issues.apache.org/jira/browse/FLINK-8033. On 01.10.2018 13:39, Georgi Stoyanov wrote: Hi, Oracle will stop support for Java 8 on Jan 2019. Do you guys plans to upgrade the version? If so, do you have ticket which we can watch for updates? Regards, G. Stoyanov

Upgrade Flink with newer Java version

2018-10-01 Thread Georgi Stoyanov
Hi, Oracle will stop support for Java 8 on Jan 2019. Do you guys plans to upgrade the version? If so, do you have ticket which we can watch for updates? Regards, G. Stoyanov

Re: In-Memory Lookup in Flink Operators

2018-10-01 Thread David Anderson
Hi Chirag, The community is also looking at an approach that involves using Bravo[1][2] to bootstrap state by loading the initial version of the state into a savepoint. [1] https://github.com/king/bravo [2]

Re: Superstep-like synchronization of streaming iteration

2018-10-01 Thread Paris Carbone
Hi Christian, It is great to see use iterative use cases, thanks for sharing your problem! Superstep iterative BSP synchronization for streams is a problem we have been looking into extensively, however this functionality is still not standardised yet on Flink. I think your use case is fully

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Gianluca Ortelli
Hi Fabian, the mirror is https://www.apache.org/dyn/closer.lua/flink/flink-1.6.1/flink-1.6.1-bin-scala_2.11.tgz I just tried a download and the hash is still wrong: it should be

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Fabian Hueske
Hi Gianluca, I tried to validate the issue but hash and signature are OK for me. Do you remember which mirror you used to download the binaries? Best, Fabian Am Sa., 29. Sep. 2018 um 17:24 Uhr schrieb vino yang : > Hi Gianluca, > > This is very strange, Till may be able to give an

Re: I want run flink program in ubuntu x64 Mult Node Cluster what is configuration?

2018-10-01 Thread Fabian Hueske
Hi, these issues are not related to Flink but rather generic Linux / bash issues. Ensure that the start scripts are executable (can be changed with chmod) your user has the right permissions to executed the start scripts. Also, you have to use the right path to the scripts. If you are in the base

Re: Regarding implementation of aggregate function using a ProcessFunction

2018-10-01 Thread Fabian Hueske
Hi, There are basically three options: 1) Use an AggregateFunction and store everything that you would put into state into the Accumulator. This can become quite expensive because the Accumulator is de/serialized for every function call if you use RocksDB. The advantage is that you don't have to

Re: In-Memory Lookup in Flink Operators

2018-10-01 Thread Fabian Hueske
Hi Chirag, Flink 1.5.0 added support for BroadcastState which should address your requirement of replicating the data. [1] The replicated data is stored in the configured state backend which can also be RocksDB. Regarding the reload, I would recommend Lasse's approach of having a custom source

Re: Flink Scheduler Customization

2018-10-01 Thread Fabian Hueske
Hi Ananth, You can certainly do this with Flink, but there are no built-in operators for this. What you probably want to do is to compare the timestamp of the event with the current processing time and drop the record if it is too old. If the timestamp is encoded in the record, you can do this in

Re: Data loss when restoring from savepoint

2018-10-01 Thread Juho Autio
Hi Andrey, To rule out for good any questions about sink behaviour, the job was killed and started with an additional Kafka sink. The same number of ids were missed in both outputs: KafkaSink & BucketingSink. I wonder what would be the next steps in debugging? On Fri, Sep 21, 2018 at 3:49 PM

Re: [DISCUSS] Dropping flink-storm?

2018-10-01 Thread Fabian Hueske
+1 to drop it. Thanks, Fabian Am Sa., 29. Sep. 2018 um 12:05 Uhr schrieb Niels Basjes : > I would drop it. > > Niels Basjes > > On Sat, 29 Sep 2018, 10:38 Kostas Kloudas, > wrote: > > > +1 to drop it as nobody seems to be willing to maintain it and it also > > stands in the way for future

Re: Streaming to Parquet Files in HDFS

2018-10-01 Thread Fabian Hueske
Hi Bill, Flink 1.6.0 supports writing Avro records as Parquet files to HDFS via the previously mentioned StreamingFileSink [1], [2]. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-9753 [2] https://issues.apache.org/jira/browse/FLINK-9750 Am Fr., 28. Sep. 2018 um 23:36 Uhr schrieb

Re: Can't get the FlinkKinesisProducer to work against Kinesalite for tests

2018-10-01 Thread Fabian Hueske
Hi Bruno, Thanks for sharing your approach! Best, Fabian Am Do., 27. Sep. 2018 um 18:11 Uhr schrieb Bruno Aranda : > Hi again, > > We managed at the end to get data into Kinesalite using the > FlinkKinesisProducer, but to do so, we had to use different configuration, > such as ignoring the

Re: Does Flink SQL "in" operation has length limit?

2018-10-01 Thread Timo Walther
Hi, tuple should not be used anywhere in flink-table. @Rong can you point us to the corresponding code? I haven't looked into the code but we should definitely support this query. @Henry feel free to open an issue for it. Regards, Timo Am 28.09.18 um 19:14 schrieb Rong Rong: Yes. Thanks

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread Timo Walther
Hi, you also need to keep the parallelism in mind. If your downstream operator or sink has a parallelism of 1 and your SQL query pipeline has a higher parallelism, the retract results are rebalanced and arrive in a wrong order. For example, if you view the changelog in SQL Client, the