Re: no log exists in JM and TM when updated to Flink 1.7

2019-01-02 Thread Joshua Fan
Hi Till I found the root cause why log-not-show when use logback, because flink does not include the logback-*.jar in the lib folder. After I put the logback jar file in lib, everything is ok now. I think flink should put the logback jar files into the lib directory, not just the log4j jar file,

Re: Using port ranges to connect with the Flink Client

2019-01-02 Thread Joshua Fan
Hi, Gyula I met a similar situation. We used flink 1.4 before, and everything is ok. Now, we upgrade to flink 1.7 and use non-legacy mode, there seems something not ok, it all refers to that it is impossible get the jobmanagerGateway at client side. When I create a cluster without a job, I

Flink error reading file over network (Windows)

2019-01-02 Thread miki haiat
Hi, Im trying to read a csv file from windows shard drive. I tried numbers option but i failed. I cant find an option to use SMB format, so im assuming that create my own input format is the way to achieve that ? What is the correct way to read file from windows network ?. Thanks, Miki

RE: Are Jobs allowed to be pending when slots are not enough

2019-01-02 Thread 张馨予
Hi Till Thank you for your reply. Just like your suggestion, in our current implements, we periodically check the free slots through REST API, and then submitting jobs once slots are enough. However, since there is a concept of ‘Flink cluster’, why can't we think about ‘cluster scheduling

Re: Is there a better way to debug ClassNotFoundException from FlinkUserCodeClassLoaders?

2019-01-02 Thread Hao Sun
Yep, javap shows the class is there, but FlinkUserCodeClassLoaders somehow could not find it suddenly javap -cp /opt/flink/lib/zendesk-fps-core-assembly-0.1.0.jar 'com.zendesk.fraudprevention.examples.ConnectedStreams$$anon$90$$anon$45' Compiled from "ConnectedStreams.scala" public final class

Re: Is there a better way to debug ClassNotFoundException from FlinkUserCodeClassLoaders?

2019-01-02 Thread qi luo
Hi Hao, Since Flink is using Child-First class loader, you may try search for the class "com.zendesk.fraudprevention.examples.ConnectedStreams$$anon$90$$anon$45” in your fat JAR. Is that an inner class? Best, Qi > On Jan 3, 2019, at 7:01 AM, Hao Sun wrote: > > Hi, > > I am wondering if

Re: same parallelism with different taskmanager and slots, skew occurs

2019-01-02 Thread varuy322
Hi, Till It's very kind of your reply. I got your point, I'm sorry to not make it clear about my issue. I generated data by streaming benchmark just as the link: https://github.com/dataArtisans/databricks-benchmark/blob/master/src/main/scala/com/databricks/benchmark/flink/EventGenerator.scala .

Change Window Size during runtime

2019-01-02 Thread Rad Rad
Hi All, I have one stream is consumed by FlinkKafkaConsumer which will be joined with another stream for defined window size such as Time.milliseconds(1). How can I change window size during runtime to Time.milliseconds(2)? Stream1.join(Stream2)

S3 StreamingFileSink never completes multipart uploads

2019-01-02 Thread Martin, Nick
I'm running on Flink 1.7.0 trying to use the StreamingFileSink with an S3A URI. What I'm seeing is that whenever the RollingPolicy determines that it's time to roll to a new part file, the whole Sink just hangs, and the in progress MultiPart Upload never gets completed. I've looked at the

Is there a better way to debug ClassNotFoundException from FlinkUserCodeClassLoaders?

2019-01-02 Thread Hao Sun
Hi, I am wondering if there are any protips to figure out what class is not found? = Logs org.apache.flink.streaming.runtime.tasks.StreamTaskException: Could not instantiate chained outputs. at org.apache.flink.streaming.api.graph.StreamConfig.getChainedOutputs(StreamConfig.java:324) at

Re: Running Flink on Yarn

2019-01-02 Thread Anil
Hi Andrey. Thanks for the reply. Apologies about the late follow up, I was out of office. Suppose I have 3 TM and each has 3 task slot and each kafka stream has 9 partitions each. Each thread will consumer from stream 1 (a1) and stream 2 (a2). Considering the query, data will need to be buffered

Re: using updating shared data

2019-01-02 Thread Elias Levy
One thing you must be careful of, is that if you are using event time processing, assuming that the control stream will only receive messages sporadically, is that event time will stop moving forward in the operator joining the streams while the control stream is idle. You can get around this by

Re: How to shut down Flink Web Dashboard in detached Yarn session?

2019-01-02 Thread Till Rohrmann
You could also use `jsp` or `ps` to check that no TaskExecutor and StandaloneJobClusterEntrypoint is running. If there are no such processes, then there should not be a Flink cluster running locally. Cheers, Till On Wed, Jan 2, 2019 at 6:31 PM Sai Inampudi wrote: > Hey Till, > > If it is

Re: using updating shared data

2019-01-02 Thread Till Rohrmann
Yes exactly Avi. Cheers, Till On Wed, Jan 2, 2019 at 5:42 PM Avi Levi wrote: > Thanks Till I will defiantly going to check it. just to make sure that I > got you correctly. you are suggesting the the list that I want to broadcast > will be broadcasted via control stream and it will be than be

Re: How to shut down Flink Web Dashboard in detached Yarn session?

2019-01-02 Thread Sai Inampudi
Hey Till, If it is running on a standalone Flink cluster, wouldn't running stop-cluster.sh work? When I run stop-cluster.sh, I get back: No taskexecutor daemon to stop on host . No standalonesession daemon to stop on host . So I assumed that meant that it is not running on a standalone cluster

Problem building 1.7.1 with scala-2.12

2019-01-02 Thread Cliff Resnick
The build fails at flink-connector-kafka-0.9 because _2.12 libraries apparently do not exist for kafka < 0.10. Any help appreciated! -Cliff

Re: Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s environment

2019-01-02 Thread Derek VerLee
I dealt with this issue by making the taskmanagers a statefulset. By itself, this doesn't solve the issue, because the taskmanager's `hostname` will not be a resovable FQDN on its own, you need to append the rest of the FQDN for the statefulset's "serviceName"

Re: using updating shared data

2019-01-02 Thread Avi Levi
Thanks Till I will defiantly going to check it. just to make sure that I got you correctly. you are suggesting the the list that I want to broadcast will be broadcasted via control stream and it will be than be kept in the relevant operator state correct ? and updates (CRUD) on that list will be

Re: Problem with metrics inside Kubernetes

2019-01-02 Thread Derek VerLee
See my reply I just posted to the thread "Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s environment". On 1/2/19 11:19 AM, Steven Nelson wrote: I have been working with Flink under Kubernetes

Problem with metrics inside Kubernetes

2019-01-02 Thread Steven Nelson
I have been working with Flink under Kubernetes recently and I have run into some problems with metrics. I think I have it figured out though. It appears that it's trying to use hostname resolution for the jobmanagers. This causes this error: Association with remote system

Re: Are Jobs allowed to be pending when slots are not enough

2019-01-02 Thread Till Rohrmann
Hi Xinyu, at the moment there is no such functionality in Flink. Whenever you submit a job, Flink will try to execute the job right away. If the job cannot get enough slots, then it will wait until the slot.request.timeout occurs and either fail or retry if you have a RestartStrategy configured.

Re: same parallelism with different taskmanager and slots, skew occurs

2019-01-02 Thread Till Rohrmann
Hi Rui, such a situation can occur if you have data skew in your data set (differently sized partitions if you key by some key). Assume you have 2 TMs with 2 slots each and you key your data by some key x. The partition assignment could look like: TM1: slot_1 = Partition_1, slot_2 = Partition_2

Re: Problem when use kafka sink with EXACTLY_ONCE in IDEA

2019-01-02 Thread Till Rohrmann
Hi Kaibo, which Kafka version are you running locally? When enabling exactly once processing guarantees, you need at least Kafka >= 0.11. The UnsupportedVersionException indicates that this constraint is not fulfilled [1]. [1]

Re: using updating shared data

2019-01-02 Thread Till Rohrmann
Hi Avi, you could use Flink's broadcast state pattern [1]. You would need to use the DataStream API but it allows you to have two streams (input and control stream) where the control stream is broadcasted to all sub tasks. So by ingesting messages into the control stream you can send model

Re: RuntimeException with valve output watermark when using CoGroup

2019-01-02 Thread Till Rohrmann
Thanks for the update Taneli. Glad that you solved the problem. If you should find out more about the more obscure case, let us know. Maybe there is something we can still improve to prevent misleading exceptions in the future. Cheers, Till On Tue, Jan 1, 2019 at 3:01 PM Taneli Saastamoinen <

Re: How to shut down Flink Web Dashboard in detached Yarn session?

2019-01-02 Thread Till Rohrmann
Hi Sai, could you check that the dashboard you are seeing is really running on Yarn and not a standalone Flink cluster which you have running locally? Cheers, Till On Mon, Dec 31, 2018 at 7:40 PM Sai Inampudi wrote: > Hey Gary, thanks for reaching out. > > Executing "yarn application -list"

Re: no log exists in JM and TM when updated to Flink 1.7

2019-01-02 Thread Till Rohrmann
Hi Joshua, could you check the content of the logback.xml. Maybe this file has changed between the versions. Cheers, Till On Wed, Dec 26, 2018 at 11:19 AM Joshua Fan wrote: > Hi, > > It is very weird that there is no log file for JM and TM when run flink > job on yarn after updated flink to

Re: 1.6 UI issues

2019-01-02 Thread Till Rohrmann
Hi Oleksandr, the requestJob call should only take longer if either the `JobMaster` is overloaded and too busy to respond to the request or if the ArchivedExecutionGraph is very large (e.g. very large accumulators) and generating it and sending it over to the RestServerEndpoint takes too long.

Are Jobs allowed to be pending when slots are not enough

2019-01-02 Thread 张馨予
Hi all We submit some batch jobs to a Flink cluster which with 500 slots for example. The parallelism of these jobs may be different, between 1 and 500. Is there any configuration that can make jobs running in submitting order once the cluster has enough slots? If not, could we meet this

Maintaining message input order in streams with keyBy/filter/connect

2019-01-02 Thread Patrick Fial
*Intro* I am using apache flink to build a rather complex network of data streams. The idea is, to implement a rule engine with flink. As a basic description of the application, this is how it is supposed to work: Data is received by a kafka consumer source, and processed with a number of data

Re: Can't list logs or stdout through web console on Flink 1.7 Kubernetes

2019-01-02 Thread Joshua Fan
I found the root cause why log-not-show when use logback, because flink does not include the logback-*.jar in the lib folder. After I put the logback jar file in lib, everything is ok now. On Fri, Dec 28, 2018 at 10:41 PM Chesnay Schepler wrote: > @Steven: Do you happen do know whether a JIRA

Re: 1.6 UI issues

2019-01-02 Thread Oleksandr Nitavskyi
Hello guys. Happy new year! Context: we started to have some troubles with UI after bumping our Flink version from 1.4 to 1.6.3. UI couldn’t render Job details page, so inspecting of the jobs for us has become impossible with the new version. And looks like we have a workaround for our UI

Re: How to migrate Kafka Producer ?

2019-01-02 Thread Piotr Nowojski
Hi Edward, Sorry for coming back so late (because of holiday season). You are unfortunately right. Our FlinkKafkaProducer should have been upgrade-able, but it is not. I have created a bug for this [1]. For the time being, until we fix the issue, you should be able to stick to 0.11 producer

Re: Data loss when restoring from savepoint

2019-01-02 Thread Juho Autio
Bump – does anyone know if Stefan will be available to comment the latest findings? Thanks. On Fri, Dec 21, 2018 at 2:33 PM Juho Autio wrote: > Stefan, I managed to analyze savepoint with bravo. It seems that the data > that's missing from output *is* found in savepoint. > > I simplified my