Re: Fire and Purge with Idle State

2018-10-12 Thread Hequn Cheng
Hi shkob1, Currently, the idle state retention time is only used for unbounded operators in sql/table-api. The unbounded operators include non-window group by, non-window join, unbounded over, etc. The retention time affects neither sql/table-api window operators nor DataStream operators. Best,

Re: Custom Trigger + SQL Pattern

2018-10-12 Thread Hequn Cheng
Hi shkob1, > while one is time(session inactivity) the other is based on a specific event marked as a "last" event. How about using a session window and an udtf[1] to solve the problem. The session window may output multi `last` elements. However, we can use a udtf to split them into single ones.

Re: Questions in sink exactly once implementation

2018-10-12 Thread Hequn Cheng
Hi Henry, Yes, exactly once using atomic way is heavy for mysql. However, you don't have to buffer data if you choose option 2. You can simply overwrite old records with new ones if result data is idempotent and this way can also achieve exactly once. There is a document about End-to-End

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-12 Thread Abdul Qadeer
We were able to fix it by passing IP address instead of hostname for actor system listen address when starting taskmanager: def runTaskManager( taskManagerHostname: String, resourceID: ResourceID, actorSystemPort: Int,

Re: Not all files are processed? Stream source with ContinuousFileMonitoringFunction

2018-10-12 Thread Fabian Hueske
Hi, Which file system are you reading from? If you are reading from S3, this might be cause by S3's eventual consistency property. Have a look at FLINK-9940 [1] for a more detailed discussion. There is also an open PR [2], that you could try to patch the source operator with. Best, Fabian [1]

RE: org.apache.flink.runtime.rpc.exceptions.FencingTokenException:

2018-10-12 Thread Samir Tusharbhai Chauhan
Hi Till, Can you tell when do I receive below error message? 2018-10-13 03:02:01,337 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagersHandler - Could not retrieve the redirect address. java.util.concurrent.CompletionException:

Not all files are processed? Stream source with ContinuousFileMonitoringFunction

2018-10-12 Thread Juan Miguel Cejuela
Dear flinksters, I'm using the class `ContinuousFileMonitoringFunction` as a source to monitor a folder for new incoming files.* I have the problem that not all the files that are sent to the folder get processed / triggered by the function*. Specific details of my workflow is that I send up to

Fire and Purge with Idle State

2018-10-12 Thread shkob1
Hey Say im aggregating an event stream by sessionId in SQL and im emitting the results once the session is "over", i guess i should be using Fire and Purge - i dont expect to need to session data once over. How should i treat the Idle state retention time - is it needed at all if im using purge?

Custom Trigger + SQL Pattern

2018-10-12 Thread shkob1
Hey! I have a use case in which im grouping a stream by session id - so far pretty standard, note that i need to do it through SQL and not by the table api. In my use case i have 2 trigger conditions though - while one is time (session inactivity) the other is based on a specific event marked as

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey, It seems that You have written Async function that takes *String* and returns *String*. But in execution you expect the result of the function to be the tuple (*String, String).* That's where the mismatch occurs, the function itself is ok :) If you will change *DataStream[(String,String)]

Re: getRuntimeContext(): The runtime context has not been initialized.

2018-10-12 Thread Ahmad Hassan
Any help/pointers on this please ? Thanks. On Thu, 11 Oct 2018 at 10:33, Ahmad Hassan wrote: > Hi All, > > Thanks for the replies. Here is the code snippet of what we want to > achieve: > > We have sliding windows of 24hrs with 5 minutes apart. > > inStream > .filter(Objects::nonNull) >

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Krishna Kalyan
Thanks for the quick reply Dom, I am using flink 1.6.1. [image: image.png] Error: Type Mismatch expected AsyncFunction actual AsyncWeatherAPIRequest On Fri, 12 Oct 2018 at 16:21, Dominik Wosiński wrote: > Hey, > What is the exact issue that you are facing and the Flink version that you >

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey, What is the exact issue that you are facing and the Flink version that you are using ?? Best Regards, Dom. pt., 12 paź 2018 o 16:11 Krishna Kalyan napisał(a): > Hello All, > > I need some help making async API calls. I have tried the following code > below. > > class

Making calls to external API wit Data Streams

2018-10-12 Thread Krishna Kalyan
Hello All, I need some help making async API calls. I have tried the following code below. class AsyncWeatherAPIRequest extends AsyncFunction[String, String] { override def asyncInvoke(input: String, resultFuture: ResultFuture[String]): Unit = { val query = url("") val response =

Re: How do I initialize the window state on first run?

2018-10-12 Thread bupt_ljy
Yes…that’s an option, but it’ll be very complicated because of our storage and business. Now I’m trying to write an handler like the “KvStateHandler” so that I can access(read/write) the state from my client. Original Message Sender:Congxian qiuqcx978132...@gmail.com

Re: How do I initialize the window state on first run?

2018-10-12 Thread Congxian Qiu
IIUC, we can't initialize state at first run, maybe you could store the aggregated data in another place other than use flink's state, then use flink to aggregate the data realtime. bupt_ljy 于2018年10月12日周五 下午3:33写道: > Hi, vivo, > > My Flink program is to aggregate the data of a whole day,

Re: Are savepoints / checkpoints co-ordinated?

2018-10-12 Thread Congxian Qiu
AFAIK, "Cancel Job with Savepoint" will stop checkpointScheduler --> trigger a savepoint, then cancel your job. there will no more checkpoints. 于2018年10月12日周五 上午1:30写道: > Hi, > > > > I had a couple questions about savepoints / checkpoints > > > > When I issue "Cancel Job with Savepoint", how is

Re: Getting NoMethod found error while running job on flink 1.6.1

2018-10-12 Thread Chesnay Schepler
The cause cannot be that flink-metrics-core is not on the classpath as in that case you'd get a ClassNotFoundError. This is a version conflict, either caused by your fat jar bundling an older version of flink-metrics-core but a newer version of the kafka connector, or you upgrade your

Re: Flink leaves a lot RocksDB sst files in tmp directory

2018-10-12 Thread Stefan Richter
Hi, Can you maybe show us what is inside of one of the directory instance? Furthermore, your TM logs show multiple instances of OutOfMemoryErrors, so that might also be a problem. Also how was the job moved? If a TM is killed, of course it cannot cleanup. That is why the data goes to tmp dir

Re: [BucketingSink] notify on moving into pending/ final state

2018-10-12 Thread Kostas Kloudas
Hi Rinat, I have commented on your PR and on the JIRA. Let me know what you think. Cheers, Kostas > On Oct 11, 2018, at 4:45 PM, Dawid Wysakowicz wrote: > > Hi Ribat, > I haven't checked your PR but we introduced a new connector in flink 1.6 > called StreamingFileSink that is supposed to

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-12 Thread Till Rohrmann
It is hard to tell without all logs but it could easily be a K8s setup problem. Also problematic is that you are running a Flink version which is no longer actively supported. Try at least to use the latest bug fix release for 1.4. Cheers, Till On Fri, Oct 12, 2018, 09:43 Abdul Qadeer wrote: >

Re: When does Trigger.clear() get called?

2018-10-12 Thread Fabian Hueske
Hi Andrew, The PURGE action of a window removes the window state (i.e., the collected events or computed aggregate) but the window meta data including the Trigger remain. The Trigger.close() method is called, when the winodw is completely (i.e., all meta data) discarded. This happens, when the

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-12 Thread Abdul Qadeer
Hi Till, A few more data points: In a rerun of the same versions with fresh deployment, I see *log*.debug(*s"RegisterTaskManager: $*msg*"*) in JobManager, however the *AcknowledgeRegistration/AlreadyRegistered *messages are never sent, I have taken tcpdump for the taskmanager which doesn't

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-12 Thread Jörn Franke
Thank you very nice , I fully agree with that. > Am 11.10.2018 um 19:31 schrieb Zhang, Xuefu : > > Hi Jörn, > > Thanks for your feedback. Yes, I think Hive on Flink makes sense and in fact > it is one of the two approaches that I named in the beginning of the thread. > As also pointed out

Re: How do I initialize the window state on first run?

2018-10-12 Thread bupt_ljy
Hi, vivo, My Flink program is to aggregate the data of a whole day, assume we start this program on 6:00 am, the default state in the window should be the aggregated result of 0:00 am to 6:00 am. Original Message Sender:vino yangyanghua1...@gmail.com Recipient:bupt_ljybupt_...@163.com

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-12 Thread Taher Koitawala
Sounds smashing; I think the initial integration will help 60% or so flink sql users and a lot other use cases will emerge when we solve the first one. Thanks, Taher Koitawala On Fri 12 Oct, 2018, 10:13 AM Zhang, Xuefu, wrote: > Hi Taher, > > Thank you for your input. I think you emphasized