Re: JDBC table source

2017-09-26 Thread Fabian Hueske
Yes, there's no built-in TableSource for that. However, it is certainly possible to implement a custom TableSource for your use case. The code of the JdbcInputFormat should be a good starting point. So you could run a query every n seconds (assuming you can consume the data of the last n seconds
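
The polling idea above (one query every n seconds over the data of the last n seconds) can be sketched in plain Java. This is only an illustration under stated assumptions, not Flink's JdbcInputFormat and not an existing TableSource: the class name PollingWindows, the table "events" and the indexed column "created_at" are all hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: computes non-overlapping query windows of length n. */
public class PollingWindows {

    /** Half-open window [start, end), so consecutive polls never read a row twice. */
    public static final class Window {
        public final Instant start;
        public final Instant end;

        public Window(Instant start, Instant end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the window that ends at 'now' and covers the last 'n'. */
    public static Window lastWindow(Instant now, Duration n) {
        return new Window(now.minus(n), now);
    }

    /** Returns the window that directly follows 'previous'. */
    public static Window next(Window previous, Duration n) {
        return new Window(previous.end, previous.end.plus(n));
    }

    /** Counts the rows in one window; table and column names are assumptions. */
    public static long countRows(Connection conn, Window w) throws SQLException {
        String sql = "SELECT COUNT(*) FROM events WHERE created_at >= ? AND created_at < ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setTimestamp(1, Timestamp.from(w.start));
            ps.setTimestamp(2, Timestamp.from(w.end));
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }
}
```

A custom source would presumably call next() once per poll and run the windowed query against the indexed column; rows that arrive late with an older timestamp would be missed by this scheme.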

Re: Building scala examples

2017-09-26 Thread Michael Fong
Thanks, Nico. I looked again at flink-examples-streaming_2.10-1.4-SNAPSHOT.jar, and it indeed contains both. Originally I was looking at each of the self-contained jars, as I used them as examples to create and run my own streaming program. They only contain compiled Java classes, if I am not mistaken.

Re: StreamCorruptedException

2017-09-26 Thread Sridhar Chellappa
Here is the snippet : public interface Rule { DataStream run(); } public class Rule1 implements Rule { private static final String RULE_ID = "Rule1" @Override public DataStream run() { Pattern MyMessage1Pattern = Pattern.begin("first").

Re: Cannot deploy Flink on YARN

2017-09-26 Thread Sridhar Chellappa
Emily, I did not get a chance to capture the logs on the container. Since I have erased the instances, I have lost access to the logs. I have moved to non-HA mode (single master) and it is running OK. Aljoscha, network connectivity is good. I am able to ssh to 10.200.0.6. Will try the HA mode and

Re: JDBC table source

2017-09-26 Thread Mohit Anchlia
Thanks. Idea was to query for 'x' records in last 'n' seconds using an indexed column. Looks like that is not possible? On Tue, Sep 26, 2017 at 3:24 PM, Fabian Hueske wrote: > Hi Mohit, > > no, a JdbcTableSource does not exist yet. However, since there is a > JdbcInputFormat

Re: JDBC table source

2017-09-26 Thread Fabian Hueske
Hi Mohit, no, a JdbcTableSource does not exist yet. However, since there is a JdbcInputFormat it should not be hard to wrap that in a TableSource. However, this would rather be a batch TableSource in the sense that it would just return the data that the query returns. Once all data is read it

Programmatic configuration with remote JobManager

2017-09-26 Thread Dustin Jenkins
Hello, I’m running a single Flink Job Manager with a Task Manager in Docker containers with Java 8. They are remotely located (flink.example.com). I’m submitting a job from my desktop and passing the job to the Job Manager with -m flink.example.com:6123

Re: How to use ParameterTool.fromPropertiesFile() to get resource file inside my jar

2017-09-26 Thread Ron Crocker
What’s crazy is that I just stumbled on the same issue. Thanks for sharing! Ron — Ron Crocker Principal Engineer & Architect ( ( •)) New Relic rcroc...@newrelic.com M: +1 630 363 8835 > On Sep 15, 2017, at 7:30 AM, Tony Wei wrote: > > Hi Aljoscha, > > Thanks for your

JDBC table source

2017-09-26 Thread Mohit Anchlia
We are looking to stream data from the database. Is there already a jdbc table source available for streaming?

Re: How to clear registered timers for a merged window?

2017-09-26 Thread Yan Zhou [FDS Science]
Hi Aljoscha, Thanks for the information. In my case I didn't clean up the state explicitly in Trigger.onMerge(). Only OnMergeContext.mergePartitionedState(stateDesc) is called within my implementation, with the intention to copy the states from merged windows and re-register the timers. However,

Re: Kinesis connector - Jackson issue

2017-09-26 Thread Tomasz Dobrzycki
Yes I am using quickstart template. I have removed the exclusions for jackson: core, databind and annotations. On 26 September 2017 at 16:36, Tzu-Li (Gordon) Tai wrote: > Ah, I see. > > Are you using the Flink quickstart template to build your application? > I think

Re: Stream Task seems to be blocked after checkpoint timeout

2017-09-26 Thread Tony Wei
Hi Stefan, There is no unknown exception in my full log. The Flink version is 1.3.2. My job is roughly like this. env.addSource(Kafka) .map(ParseKeyFromRecord) .keyBy() .process(CountAndTimeoutWindow) .asyncIO(UploadToS3) .addSink(UpdateDatabase) It seemed all tasks stopped like the

Re: Stream Task seems to be blocked after checkpoint timeout

2017-09-26 Thread Stefan Richter
Hi, that is very strange indeed. I had a look at the logs and there is no error or exception reported. I assume there is also no exception in your full logs? Which version of flink are you using and what operators were running in the task that stopped? If this happens again, would it be

Re: CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured.

2017-09-26 Thread Hao Sun
Thanks, I will try that. On Tue, Sep 26, 2017 at 8:24 AM Aljoscha Krettek wrote: > I'm not sure whether the JM is reading it or not. But you can manually set > the values on the Configuration using the setter methods. > > > On 26. Sep 2017, at 16:58, Hao Sun

Re: Kinesis connector - Jackson issue

2017-09-26 Thread Tzu-Li (Gordon) Tai
Ah, I see. Are you using the Flink quickstart template to build your application? I think exclusion is defined in the pom.xml of that archetype. Just above the exclusion I do see this message: “WARNING: You have to remove these excludes if your code relies on other version of these

Re: How to clear registered timers for a merged window?

2017-09-26 Thread Aljoscha Krettek
Hi, I think you should be able to use state for cleaning up your timers in Trigger.clear(). For this, you have to make sure to not clean up the state in Trigger.onMerge() and instead remove it in Trigger.clear(). I'm not sure whether this will be possible for your use case, though. Best,
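
The bookkeeping suggested above can be modelled in plain Java. The sketch below is not Flink API: TimerLedger and its methods are hypothetical names, and the in-memory set only stands in for whatever per-window trigger state an actual Trigger would keep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/**
 * Plain-Java model (not Flink API) of a trigger that remembers its timers.
 * The class and method names are hypothetical and for illustration only.
 */
public class TimerLedger {
    // Stands in for per-window trigger state holding the registered timestamps.
    private final Set<Long> registered = new TreeSet<>();

    /** Called wherever the trigger registers a timer. */
    public void register(long timestamp) {
        registered.add(timestamp);
    }

    /** Models onMerge(): take over the other window's timers, delete nothing. */
    public void mergeFrom(TimerLedger other) {
        registered.addAll(other.registered);
    }

    /** Models clear(): return every timer to delete, then drop the state. */
    public List<Long> clear() {
        List<Long> toDelete = new ArrayList<>(registered);
        registered.clear();
        return toDelete;
    }
}
```

The point of the model is the ordering: the state survives the merge, so the cleanup step still knows which timers exist when it finally runs.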

Re: Kinesis connector - Jackson issue

2017-09-26 Thread Tomasz Dobrzycki
Hi Gordon, Thanks for your answer. - I've built it with Maven 3.2.5 - I am using Jackson in my application (version 2.7.4) Something that I have noticed when building Kinesis connector is that it excludes jackson: [INFO] Excluding

Re: CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured.

2017-09-26 Thread Aljoscha Krettek
I'm not sure whether the JM is reading it or not. But you can manually set the values on the Configuration using the setter methods. > On 26. Sep 2017, at 16:58, Hao Sun wrote: > > Thanks Aljoscha, I still have questions. > Do I have to parse the yaml to a Configuration
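
On the question of parsing the YAML by hand: a minimal sketch, assuming the file only contains flat "key: value" lines as flink-conf.yaml typically does. FlatYaml is a hypothetical helper, not part of Flink, and it is not a general YAML parser (no nesting, no quoting, and a '#' inside a value is treated as the start of a comment).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads flat "key: value" lines into a map. */
public class FlatYaml {

    /** Parses the lines in order; comments and malformed lines are skipped. */
    public static Map<String, String> parse(Iterable<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String raw : lines) {
            // Drop a trailing comment, then surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            // Split on the first ": " only, so values may contain colons (URLs, paths).
            int sep = line.indexOf(": ");
            if (sep <= 0) {
                continue;
            }
            String key = line.substring(0, sep).trim();
            String value = line.substring(sep + 2).trim();
            if (!value.isEmpty()) {
                result.put(key, value);
            }
        }
        return result;
    }
}
```

Each entry of the resulting map could then be copied onto a Configuration with its string setter, which is the manual route described above.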

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

2017-09-26 Thread Elias Levy
I presume then that the Job Managers and Task Managers are performing service discovery via Zookeeper in HA mode, rather than from the config file or the masters file. Yes? On Mon, Sep 25, 2017 at 11:14 PM, Till Rohrmann wrote: > Because a single port could easily lead to

Re: Kinesis connector - Jackson issue

2017-09-26 Thread Tzu-Li (Gordon) Tai
Hi Tomasz, Yes, dependency clashes may surface when executing actual job runs on clusters. A few things to probably check first: - Have you built Flink or the Kinesis connector with Maven version 3.3 or above? If yes, try using a lower version, as 3.3+ results in some shading issues when used

Kinesis connector - Jackson issue

2017-09-26 Thread Tomasz Dobrzycki
Hi guys, I'm working with Kinesis connector and currently trying to solve a bizarre issue. I had problems with Kinesis and httpcomponents which I was able to solve using steps shown in: https://github.com/apache/flink/pull/4150/commits/9b539470ac308d7af9df9a70792aa1fa8c6995fc That did the trick

Exception in BucketingSink when cancelling Flink job

2017-09-26 Thread wangsan
Hi, We are currently using BucketingSink to save data into HDFS in parquet format. But when the Flink job was cancelled, we always got an exception in BucketingSink's close method. The detailed exception info is as below: [ERROR] [2017-09-26 20:51:58,893]

Re: CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured.

2017-09-26 Thread Aljoscha Krettek
Hi, I think the GlobalConfiguration is not necessarily read by the (local) JobManager. You could try using StreamExecutionEnvironment.createLocalEnvironment(int, Configuration) to manually specify a configuration. Best, Aljoscha > On 26. Sep 2017, at 05:49, Hao Sun wrote:

Re: Cannot deploy Flink on YARN

2017-09-26 Thread Aljoscha Krettek
Is the IP 10.200.0.6 reachable from the machine that runs the JobManager? > On 25. Sep 2017, at 19:58, Emily McMahon wrote: > > What's in the container log for the container that failed? > > On Sep 11, 2017 2:17 AM, "Sridhar Chellappa"

Re: Regarding flink-cassandra-connectors

2017-09-26 Thread Jagadish Gangulli
Sure, I will create a Jira for that. In addition, I would like to confirm: would it be possible to reuse the connection builder object across queries and across jobs? I.e., if I create a singleton class that creates a connection builder instance, could I use it across the queries? I

Re: Regarding flink-cassandra-connectors

2017-09-26 Thread Tzu-Li (Gordon) Tai
Hi Jagadish, Yes, that indeed is something missing. If that is something you’re interested in, could you perhaps open a JIRA for that (AFAIK there isn’t one for the feature yet). Gordon On 26 September 2017 at 2:09:37 PM, Jagadish Gangulli (jagadi...@gmail.com) wrote: Thanks Gordon, Have

Re: Question about Flink Metrics

2017-09-26 Thread Tony Wei
Hi Hai Zhou, It's a good idea to implement my own reporter, but I think it is not the best solution. After all, the reporter needs to be configured when the cluster starts. It is not efficient to update the cluster whenever you have a new metric for a new streaming job. Anyway, it is still a workaround

Re: Regarding flink-cassandra-connectors

2017-09-26 Thread Jagadish Gangulli
Thanks Gordon, I have a few more queries along the same lines: if I have to perform fetches, i.e. select queries, I have to go for batch queries; no streaming support is available. Regards, Jagadisha G On Tue, Sep 26, 2017 at 3:40 PM, Tzu-Li (Gordon) Tai wrote: > Hi Jagadish, >

Re: Question about Flink Metrics

2017-09-26 Thread Hai Zhou
Hi Tony, you can consider implementing a reporter and using a trick to convert Flink's metrics to the structure that suits your needs. This is just my personal practice; I hope it helps. Cheers, Hai Zhou > On 26 Sep 2017, at 17:49, Tony Wei wrote: > > Hi, > > Recently, I

Re: Flink Application Jar file on Docker container

2017-09-26 Thread Rahul Raj
Hi Stefan, Thanks a lot for your answer and for sharing the link https://github.com/mesoshq/flink. I went through it and saw that it is spawning a JobManager and a TaskManager. Now I think this is what should be happening: first, the JobManager will be started on the Flink cluster on one node, then the task manager will be

Re: FLIP-6: Job specific Docker images status?

2017-09-26 Thread Till Rohrmann
Hi Elias, we are currently still working on the Flip-6 building blocks. Once they are in place, it should be quite easy to complete the container support [1]. This will be our top priority since we see more and more interest in running Flink in a containerized environment. The unanswered

Re: java.lang.OutOfMemoryError: Java heap space at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)

2017-09-26 Thread sohimankotia
Hi Stefan , Here is main class code : final String outFile = getOutFileName(backupDir); final Set keys = getAllRedisKeys(parameters); final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(parallelism); env .fromCollection(keys)

Re: java.lang.OutOfMemoryError: Java heap space at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)

2017-09-26 Thread sohi mankotia
Hi Stefan , Here is main class code : final String outFile = getOutFileName(backupDir); final Set keys = getAllRedisKeys(parameters); final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(parallelism); env .fromCollection(keys)

Re: Regarding flink-cassandra-connectors

2017-09-26 Thread Tzu-Li (Gordon) Tai
Ah, sorry I just realized Till also answered your question on your cross-post at dev@. It’s usually fine to post questions to just a single mailing list :) On 26 September 2017 at 12:10:55 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) wrote: Hi Jagadish, Yes, you are right that the Flink

Re: Regarding flink-cassandra-connectors

2017-09-26 Thread Tzu-Li (Gordon) Tai
Hi Jagadish, Yes, you are right that the Flink Cassandra connector uses the Datastax drivers internally, which is also the case for all the other Flink connectors; e.g., the Kafka connector uses the Kafka Java client, the Elasticsearch connector uses the ES Java client, etc. The main advantage

Re: Flink Application Jar file on Docker container

2017-09-26 Thread Stefan Richter
Hi, as in my answer to your previous mail, I suggest taking a look at https://github.com/mesoshq/flink. Unfortunately, there is not yet a lot of documentation about the internals of how this works, so I am also looping in Till who might know more about

Re: Flink Job on Docker on Mesos cluster

2017-09-26 Thread Stefan Richter
Hi, I think what you need to have is a docker image that can spawn task managers as entry points. Please take a look at this project which gives some more detailed explanation: https://github.com/mesoshq/flink In particular, take a look at the shell script

Question about Flink Metrics

2017-09-26 Thread Tony Wei
Hi, Recently, I have been using PrometheusReporter to monitor every metric from Flink. I found that the metric name in Prometheus will map to the identifier from User Scope and System Scope [1], and the labels will map to Variables [2]. To monitor the same metrics from Prometheus, I would like to use

Re: akka timeout

2017-09-26 Thread Till Rohrmann
Alright. Glad to hear that things are now working :-) On Tue, Sep 26, 2017 at 9:55 AM, Steven Wu wrote: > Till, sorry for the confusion. I meant Flink documentation has the correct > info. our code was mistakenly referring to akka.ask.timeout for death watch. > > On Mon,

Re: Questions about checkpoints/savepoints

2017-09-26 Thread Stefan Richter
Hi, I have answered your questions inline: > It seems to me that checkpoints can be treated as flink internal recovery > mechanism, and savepoints act more as user-defined recovery points. Would > that be a correct assumption? You could see it that way, but I would describe savepoints more as

Re: java.lang.OutOfMemoryError: Java heap space at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)

2017-09-26 Thread Stefan Richter
Hi, could you give us some more information like the size of your heap space, information about where and how you implemented access to Redis and how you kept the retrieved data, and most importantly a stack trace or (much better) a log? Best, Stefan > On 26.09.2017 at 06:52,

Re: How to clear registered timers for a merged window?

2017-09-26 Thread Stefan Richter
Hi, I think that it is currently not possible to delete timers that did not trigger, because currently some of the data structures used for timers do not support random deletes efficiently. For the second part of the question about keeping the state of merged windows, I added Aljoscha in CC

Regarding flink-cassandra-connectors

2017-09-26 Thread Jagadish Gangulli
Hi, I have recently started application development with Flink. We are trying to use the Apache Flink connectors to get data in and out of Cassandra. We tried both the Datastax drivers and the flink-cassandra connector. In the process, we felt that the flink-cassandra connector is more of a

RE: Flink kafka consumer that read from two partitions in local mode

2017-09-26 Thread Tzu-Li (Gordon) Tai
Hi, Glad you sorted it out :) AFAIK, the number of created Kafka partitions cannot be specified using the Kafka client. If you want topics to be created with 2 partitions, you’ll have to change the default for that in the broker configurations. Or, simply create the topic with the desired

Re: akka timeout

2017-09-26 Thread Steven Wu
Till, sorry for the confusion. I meant Flink documentation has the correct info. our code was mistakenly referring to akka.ask.timeout for death watch. On Mon, Sep 25, 2017 at 3:52 PM, Till Rohrmann wrote: > Quick question Steven. Where did you find the documentation

RE: Flink kafka consumer that read from two partitions in local mode

2017-09-26 Thread Sofer, Tovi
Hi, The issue was solved. Following your guidance on the producer part, I’ve checked in Kafka and saw that the topic was created with one partition. I’ve re-created it with two partitions manually and it fixed the problem. // update in KAFKA_HOME/config/server.properties : set delete.topic.enable=true

Flink Application Jar file on Docker container

2017-09-26 Thread Rahul Raj
Currently I have a Flink application jar file running on a Mesos cluster. The Flink application simply reads data from Kafka and writes it to HDFS. Now we are planning to create a Docker image to run this application jar file inside Docker containers on the Mesos cluster via Marathon. Below are the

Re: Flink on EMR

2017-09-26 Thread Navneeth Krishnan
Hi, I’m using the default flink package that comes with EMR. I’m facing the issue while running my pipeline. Thanks. On Mon, Sep 25, 2017 at 11:09 PM Jörn Franke wrote: > Amazon EMR has already a Flink package. You just need to check the > checkbox. I would not install it

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

2017-09-26 Thread Till Rohrmann
Because a single port could easily lead to clashes if there is another JobManager running on the same machine with the same port (e.g. due to standby JobManagers). Cheers, Till On Sep 26, 2017 03:20, "Elias Levy" wrote: > Why a range instead of just a single port
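
The reasoning above, that a range avoids clashes between JobManagers on one machine, can be illustrated with plain sockets. This is a hedged sketch and not how Flink itself picks the port: PortRange and bindFirstFree are hypothetical names.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper: binds the first free port in an inclusive range. */
public class PortRange {

    /** Returns a socket bound to the first free port in [from, to]; throws if none is free. */
    public static ServerSocket bindFirstFree(int from, int to) throws IOException {
        IOException last = null;
        for (int port = from; port <= to; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                // Port is taken (for example by another JobManager); try the next one.
                last = e;
            }
        }
        throw new IOException("No free port in range " + from + "-" + to, last);
    }
}
```

With a range of exactly one port, a second process on the same machine fails immediately; with a wider range it simply moves on to the next free port.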

Re: Flink on EMR

2017-09-26 Thread Jörn Franke
Amazon EMR already has a Flink package. You just need to check the checkbox. I would not install it on your own. I think you can find it in the advanced options. > On 26. Sep 2017, at 07:14, Navneeth Krishnan wrote: > > Hello All, > > I'm trying to deploy flink on