Multi-stream question

2018-04-06 Thread Michael Latta
I would like to “join” several streams (>3) in a custom operator. Is this feasible in Flink? Michael
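
Not part of the question, but as an illustration: one common way to combine more than two streams of the same type is DataStream.union, which accepts any number of inputs. A minimal Scala sketch with placeholder sources:

    import org.apache.flink.streaming.api.scala._

    object MultiStreamUnionSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Placeholder sources; in a real job these would be Kafka topics, files, etc.
        val s1 = env.fromElements("a", "b")
        val s2 = env.fromElements("c")
        val s3 = env.fromElements("d")
        val s4 = env.fromElements("e")

        // union takes any number of streams of the same type and produces
        // a single stream that a downstream (custom) operator can consume.
        val all: DataStream[String] = s1.union(s2, s3, s4)

        all.print()
        env.execute("multi-stream union sketch")
      }
    }

For streams of different types, connect can be applied pairwise, or the streams can first be mapped to a common type before the union.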

Re: DFS problem with removing checkpoint

2018-04-06 Thread Szymon Szczypiński
Hi, in my case neither gets deleted. In high-availability.storageDir the number of files of type "completedCheckpoint" keeps growing, and so do the directories under "state.backend.fs.checkpointdir/JobId/check-". In my case I have a Windows DFS filesystem mounted on Linux via the CIFS protocol. Can you give me
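
For context, a sketch of how the two locations mentioned above are typically configured in flink-conf.yaml (the paths are placeholders for the mounted DFS share):

    # flink-conf.yaml -- placeholder paths for the mounted share
    high-availability: zookeeper
    high-availability.storageDir: file:///mnt/dfs/flink/ha/
    state.backend: filesystem
    state.backend.fs.checkpointdir: file:///mnt/dfs/flink/checkpoints/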

Re: Flink Client job submission through SSL

2018-04-06 Thread Shuyi Chen
You need to set the following options in flink-conf.yaml: security.kerberos.login.keytab and security.kerberos.login.principal. Please refer to this link for more detail. Thanks Shuyi On Fri, Apr
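
A minimal flink-conf.yaml sketch of the two options mentioned above (keytab path and principal are placeholders):

    # flink-conf.yaml -- placeholder keytab path and principal
    security.kerberos.login.keytab: /path/to/flink.keytab
    security.kerberos.login.principal: flink-user@EXAMPLE.COM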

Tracking deserialization errors

2018-04-06 Thread Elias Levy
I was wondering how folks are tracking deserialization errors. The AbstractDeserializationSchema interface provides no mechanism for the deserializer to instantiate a metric counter, and "deserialize" must return null instead of raising an exception in case of error if you want your job to
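
To illustrate the constraint described above, a minimal Scala sketch of a schema that returns null on parse errors; the error counter is a plain field for illustration only, since the point of the question is that there is no built-in metrics hook (the exact import path may differ between Flink versions):

    import org.apache.flink.api.common.serialization.AbstractDeserializationSchema

    // Parses each message as a Long; malformed records yield null so the job
    // keeps running. errorCount is only a local field, not a Flink metric.
    class TolerantLongSchema extends AbstractDeserializationSchema[java.lang.Long] {

      @transient private var errorCount: Long = 0L

      override def deserialize(message: Array[Byte]): java.lang.Long = {
        try {
          java.lang.Long.valueOf(new String(message, "UTF-8").trim)
        } catch {
          case _: NumberFormatException =>
            errorCount += 1
            null // returning null (rather than throwing) signals a bad record, as described above
        }
      }
    }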

Lot of data generated in out file

2018-04-06 Thread Ashish Attarde
Hi Flink Team, I am seeing that one of the out files on my task manager is dumping a lot of data. Not sure why this is happening. All the data being dumped in the out file is what the *parsedInput* stream should ideally be getting. Here is the Flink program that is executing:
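
The program itself is truncated above, but for reference: a DataStream.print() call writes every record to the task's standard output, which on a cluster ends up in the TaskManager's *.out file. A minimal sketch (stream name borrowed from the mail, contents made up):

    import org.apache.flink.streaming.api.scala._

    object OutFileSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Placeholder for the real parsed stream from the mail.
        val parsedInput: DataStream[String] = env.fromElements("record-1", "record-2")

        // print() sends each record to stdout, i.e. into the TaskManager's *.out
        // file when running on a cluster; replacing it with a proper sink or a
        // logger keeps the .out files from growing.
        parsedInput.print()

        env.execute("out-file sketch")
      }
    }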

Re: Flink 1.4.2 in Zeppelin Notebook

2018-04-06 Thread kedar mhaswade
Yes. You need to add the two properties for the job manager (I agree, it is confusing because properties named "host" and "port" are already available, but the names of the useful properties are different): Could you please try this and let us know if it works for you? Regards, Kedar On Fri,

Re: Side outputs never getting consumed

2018-04-06 Thread Timo Walther
Hi Julio, thanks for this great example. I could reproduce it on my machine and found the problem. You need to store the newly created branch of your pipeline in a variable like `val test = pipeline.process()` in order to access the side outputs via
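
A minimal Scala sketch of the pattern Timo describes, with a made-up OutputTag and element type: the stream returned by process() is the one that exposes getSideOutput.

    import org.apache.flink.streaming.api.functions.ProcessFunction
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.util.Collector

    object SideOutputSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val pipeline = env.fromElements(1, -2, 3)

        val errors = OutputTag[Int]("errors") // hypothetical tag

        // Keep the result of process(); side outputs are read from this stream.
        val test = pipeline.process(new ProcessFunction[Int, Int] {
          override def processElement(value: Int,
                                      ctx: ProcessFunction[Int, Int]#Context,
                                      out: Collector[Int]): Unit = {
            if (value < 0) ctx.output(errors, value) else out.collect(value)
          }
        })

        test.getSideOutput(errors).print() // consumes the side output
        test.print()                       // consumes the main output

        env.execute("side output sketch")
      }
    }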

Re: Restart strategy defined in flink-conf.yaml is ignored

2018-04-06 Thread Piotr Nowojski
Thanks! > On 6 Apr 2018, at 00:30, Alexander Smirnov > wrote: > > Thanks Piotr, > > I've created a JIRA issue to track it: > https://issues.apache.org/jira/browse/FLINK-9143 > > > Alex > > > On Thu, Apr 5,

Re: Get get file name when reading from files? Or assign timestamps from file creation time?

2018-04-06 Thread Fabian Hueske
Hi Josh, FileInputFormat stores the currently read FileInputSplit in a protected variable FileInputFormat.currentSplit. You can override your FileInputFormat and access the path of the read file from the FileInputSplit in the method that emits the records from the format. Best, Fabian
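
A rough Scala sketch of what Fabian describes, extending TextInputFormat so that each emitted record carries the path of the file it came from (everything except the currentSplit field is made up):

    import org.apache.flink.api.java.io.TextInputFormat
    import org.apache.flink.core.fs.Path

    // Prefixes every line with the path of the file it was read from.
    class FileNameTextInputFormat(filePath: Path) extends TextInputFormat(filePath) {

      override def readRecord(reusable: String,
                              bytes: Array[Byte],
                              offset: Int,
                              numBytes: Int): String = {
        val line = super.readRecord(reusable, bytes, offset, numBytes)
        // currentSplit is the protected FileInputFormat field mentioned above.
        val sourceFile = currentSplit.getPath.toString
        s"$sourceFile,$line"
      }
    }

The format could then be handed to env.readFile, and the date encoded in the file name parsed out of the prefixed path in a later map step.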

Get get file name when reading from files? Or assign timestamps from file creation time?

2018-04-06 Thread Josh Lemer
Hey there, is it possible to somehow read the filename of elements that are read from `env.readFile`? In our case, the date of creation is encoded in the file name. Otherwise, maybe it is possible to assign timestamps somehow by the file's creation time directly? Thanks!

Flink 1.4.2 in Zeppelin Notebook

2018-04-06 Thread Dipl.-Inf. Rico Bergmann
Hi! Has someone successfully integrated Flink 1.4.2 into Zeppelin notebook (using Flink in cluster mode, not local mode)? Best, Rico.

Re: Reg. the checkpointing mechanism

2018-04-06 Thread Nico Kruber
Hi James, The checkpoint coordinator at the JobManager triggers checkpoints by inserting checkpoint barriers into the sources. These travel to the TaskManagers through the same communication channels the data flows on between them. Please refer to [1] for more details. Nico [1]
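
Not from Nico's reply, but for context, the coordinator only triggers checkpoints once checkpointing is enabled on the job; a minimal sketch with an arbitrary interval:

    import org.apache.flink.streaming.api.scala._

    object CheckpointSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // The JobManager's checkpoint coordinator triggers a checkpoint every
        // 10 seconds; barriers are injected at the sources and travel to the
        // TaskManagers along the regular data channels.
        env.enableCheckpointing(10000L)

        env.fromElements(1, 2, 3).print() // placeholder pipeline
        env.execute("checkpoint sketch")
      }
    }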

Re: KeyedSream question

2018-04-06 Thread Shailesh Jain
I have a question related to KeyedStream, asking it here instead of starting a new thread. If I assign timestamps on a keyed stream, the resulting stream is not keyed, so essentially I would need to apply the keyBy operator again after the assign-timestamps operator. Why should assigning
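
A Scala sketch of the situation described above, with made-up types and a bounded-out-of-orderness extractor: keyBy has to be re-applied after assigning timestamps.

    import org.apache.flink.streaming.api.TimeCharacteristic
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.time.Time

    object KeyedTimestampsSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

        // (key, event-time millis) records -- placeholder input
        val events: DataStream[(String, Long)] =
          env.fromElements(("a", 1L), ("b", 2L))

        val result = events
          .keyBy(_._1)
          // assignTimestampsAndWatermarks returns a plain DataStream,
          // so the key information is lost here ...
          .assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor[(String, Long)](Time.seconds(5)) {
              override def extractTimestamp(e: (String, Long)): Long = e._2
            })
          // ... and has to be re-applied before keyed operations or keyed state.
          .keyBy(_._1)
          .reduce((a, b) => if (a._2 > b._2) a else b) // keep the latest event per key

        result.print()
        env.execute("keyed timestamps sketch")
      }
    }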

Re: Flink Client job submission through SSL

2018-04-06 Thread Sampath Bhat
Hi Chen, The link you shared does not speak about Flink job submission and Kerberos interaction. It only speaks about Kerberos support for HDFS, ZooKeeper, Kafka and YARN. Even if Flink supports Kerberos authentication for job submission through the command-line client, how should I pass the

Re: KeyedSream question

2018-04-06 Thread Michael Latta
Yes. It took a bit of digging on the website to find RichFlatMapFunction to get managed state. Michael Sent from my iPad > On Apr 6, 2018, at 3:29 AM, Fabian Hueske wrote: > > Hi, > > I think Flink is exactly doing what you are looking for. > If you use keyed state [1],
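
For reference, a minimal Scala sketch of keyed managed state inside a RichFlatMapFunction (names and types are made up):

    import org.apache.flink.api.common.functions.RichFlatMapFunction
    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.util.Collector

    // Emits a running count per key; the ValueState is scoped to the current key.
    class CountPerKey extends RichFlatMapFunction[(String, Int), (String, Long)] {

      @transient private var count: ValueState[java.lang.Long] = _

      override def open(parameters: Configuration): Unit = {
        count = getRuntimeContext.getState(
          new ValueStateDescriptor[java.lang.Long]("count", classOf[java.lang.Long]))
      }

      override def flatMap(value: (String, Int), out: Collector[(String, Long)]): Unit = {
        // value() returns null if no state exists yet for the current key.
        val current = Option(count.value()).map(_.longValue()).getOrElse(0L) + 1L
        count.update(current)
        out.collect((value._1, current))
      }
    }

It would typically be applied after a keyBy, e.g. stream.keyBy(_._1).flatMap(new CountPerKey), so the state Flink hands back is always the one belonging to the key of the record currently being processed.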

Re: KeyedSream question

2018-04-06 Thread Fabian Hueske
Hi, I think Flink is exactly doing what you are looking for. If you use keyed state [1], Flink will put the state always in the context of the key of the currently processed record. So if you have a MapFunction with keyed state, and the map() method is called with a record that has a key A, the

Re: Restart strategy defined in flink-conf.yaml is ignored

2018-04-06 Thread Alexander Smirnov
Thanks Piotr, I've created a JIRA issue to track it: https://issues.apache.org/jira/browse/FLINK-9143 Alex On Thu, Apr 5, 2018 at 11:28 PM Piotr Nowojski wrote: > Hi, > > Thanks for the details! I can confirm this behaviour. flink-conf.yaml > restart-strategy value

Re: Flink Client job submission through SSL

2018-04-06 Thread Shuyi Chen
Hi Sampath, Yes, Flink supports Kerberos authentication for job submission. You can take a look at the document here for more detail (https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/security-kerberos.html). Also, please make sure to use Flink release 1.4.1 or above, because there

Flink Client job submission through SSL

2018-04-06 Thread Sampath Bhat
Hello, I would like to know whether job submission through the Flink command line, say ./bin/flink run, can be authenticated. For example, if SSL is enabled, will the job submission require SSL certificates? But I don't see any such behavior. A simple flink run is able to submit the job even if SSL is