Hi Sung,
Watermark will be advanced only when records come in if you are using
".assignTimestampsAndWatermarks()".
One way to solve this problem is to call
".assignTimestampsAndWatermarks()" before the condition, so that there
are messages when watermarks are generated.
Best,
Jark
On Thu, 22 Aug 2019 at 13:52, Su
Hello,
By default, the watermark of a connected stream is set to the minimum
of the watermarks of the two streams being connected.
I wrote code to connect two streams, but one of the streams does not
receive any messages because of a condition.
In this situation, the watermark never advances and processing is stuck.
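The stall described above follows from the combination rule for two-input operators: the operator's event-time clock is the minimum of the latest watermark seen on each input. A minimal sketch of that rule in plain Java (not Flink API code; the stream names and values are illustrative):

```java
// Sketch of the watermark combination rule for a two-input operator:
// the operator clock is the minimum of the latest watermark seen on
// each input, so an input that never emits stays at Long.MIN_VALUE
// and pins the combined clock there.
public class WatermarkMinSketch {
    static long combine(long watermarkA, long watermarkB) {
        return Math.min(watermarkA, watermarkB);
    }

    public static void main(String[] args) {
        long streamA = 1_000L;          // stream A has advanced to t=1000
        long streamB = Long.MIN_VALUE;  // stream B never received a record
        // combined clock stays at Long.MIN_VALUE, so timers never fire
        System.out.println(combine(streamA, streamB));
    }
}
```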
Hi Raj,
Have you restarted the cluster? You need to restart the cluster to apply
changes in flink-conf.yaml.
You can also set suspend=y in the debug argument so that task managers
will pause and wait for IntelliJ to connect before proceeding.
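For reference, a flink-conf.yaml fragment along those lines (port 6005 taken from the message below; suspend=y is the change being suggested):

```yaml
# flink-conf.yaml: remote-debug the task managers; suspend=y blocks
# startup until a debugger (e.g. IntelliJ) attaches on port 6005
env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=6005"
```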
Raj, Smriti wrote on Thu, Aug 22, 2019 at 11:06 AM:
> Hello
Hello,
I added this to the command line argument
-Denv.java.opts.taskmanager="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=6005"
and also tried adding the below to the flink-conf.yaml
env.java.opts.taskmanager:
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=600
Besides, would you like to participate in our survey thread [1] on the
user list about "How do you use high-availability services in Flink?"
It would help improve Flink's high-availability support.
Best,
tison.
[1]
https://lists.apache.org/x/thread.html/c0cc07197e6ba30b45d7709cc9e17d8497e5e3f33de504d5
Hi Aleksandar,
based on your log:
taskmanager_1 | 2019-08-22 00:05:03,713 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor- Connecting
to ResourceManager
akka.tcp://flink@jobmanager:6123/user/jobmanager()
.
taskmanager_1 | 2019-08-22 00:05:04
Hi Vishwas,
You can configure "state.checkpoints.num-retained" to specify the
maximum number of checkpoints to retain.
By default it is 1.
Thanks,
Zhu Zhu
Vishwas Siravara wrote on Thu, Aug 22, 2019 at 6:48 AM:
> I am also using exactly once checkpointing mode, I have a kafka source and
> sink so both support transactions
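For example, a flink-conf.yaml fragment (the retention count of 3 is illustrative):

```yaml
# flink-conf.yaml: keep the 3 most recent completed checkpoints
# (the default is 1, which explains seeing a single checkpoint in S3)
state.checkpoints.num-retained: 3
```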
Hi Aleksandar,
The resource manager address is retrieved from the HA services.
Would you check whether your customized HA services are returning the
right LeaderRetrievalService, and whether that LeaderRetrievalService is
really retrieving the right leader's address?
Or is it possible that the stored
I think it depends on the root cause of your job failure. Maybe the
following JVM options could help you get a heap dump.
1. -XX:+HeapDumpOnOutOfMemoryError
2. -XX:+HeapDumpBeforeFullGC
3. -XX:+HeapDumpAfterFullGC
4. -XX:HeapDumpPath=/tmp/heap.dump.1
Use *env.java.opts* to set java opts for
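A sketch of how the options above could be combined in flink-conf.yaml (the dump path is an example):

```yaml
# flink-conf.yaml: dump the heap when the JVM hits an OutOfMemoryError;
# /tmp/heap.dump.1 is an illustrative path
env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heap.dump.1"
```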
Hi
on 2019/8/21 22:46, Robert Metzger wrote:
I would recommend you to do some research yourself (there is plenty of
material online), and then try out the most promising systems yourself.
That's right, thank you.
Regards.
Hi all,
I’m experimenting with using my own implementation of HA services instead of
ZooKeeper that would persist JobManager information on a Kubernetes volume
instead of in ZooKeeper.
I’ve set the high-availability option in flink-conf.yaml to the FQN of my
factory class, and started the dock
I am also using exactly once checkpointing mode, I have a kafka source and
sink so both support transactions which should allow for exactly once
processing. Is this the reason why there is only one checkpoint retained?
Thanks,
Vishwas
On Wed, Aug 21, 2019 at 5:26 PM Vishwas Siravara
wrote:
> H
Hi peeps,
I am externalizing checkpoints in S3 for my Flink job and I retain them on
cancellation. However, when I look into the S3 bucket where the checkpoints
are stored, there is only one checkpoint at any point in time. Is this the
default behavior of Flink, where older checkpoints are deleted when
Hello,
Currently I have 2 streams and I enrich stream 1 with the second stream.
To further enrich stream 1, we are planning to add 2 more streams, for a
total of 4 streams.
1. Stream 1 read from Kafka
2. Stream 2 read from Kafka
3. Stream 3 will be read from Kafka - new
4. Stream 4 will
I'm not sure I fully understand the scenario you envision. Are you
saying you want to have some sort of window that batches (and
deduplicates) up until a downstream map has finished processing the
previous deduplicated batch, and then the window should emit the new
batch?
If that's what you want,
What Watermarks do is to advance the event time clock. You can
consider a Watermark(t) as an assertion about the completeness of the
stream -- it marks a point in the stream and says that at that point,
the stream is (probably) now complete up to time t.
The autoWatermarkInterval determines how of
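The periodic-generation idea can be sketched in plain Java (this mimics what a bounded-out-of-orderness periodic watermark generator does on each autoWatermarkInterval tick; it is not Flink API code, and the fixed lag is an assumption):

```java
import java.util.List;

// Sketch of a bounded-out-of-orderness periodic watermark generator:
// track the maximum event timestamp seen so far, and on each periodic
// tick emit Watermark(maxTimestamp - maxLag), asserting the stream is
// probably complete up to that time.
public class PeriodicWatermarkSketch {
    static long watermark(List<Long> eventTimestamps, long maxLagMillis) {
        long maxSeen = Long.MIN_VALUE;
        for (long t : eventTimestamps) {
            maxSeen = Math.max(maxSeen, t);
        }
        // no records seen yet: the watermark cannot advance
        return maxSeen == Long.MIN_VALUE ? Long.MIN_VALUE : maxSeen - maxLagMillis;
    }

    public static void main(String[] args) {
        // events up to t=1200 seen, 200 ms allowed lateness -> watermark 1000
        System.out.println(watermark(List.of(900L, 1200L, 1100L), 200L));
    }
}
```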
Hi all,
I modified the logback.xml provided by the Flink distribution, so now the
logback.xml file looks like this:
${log.file}
false
%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread]
Hi:
Is there any configuration to get heap dump when job fails in an EMR ?
Thanks
I should add that the behaviour persists, even when I force parallelism to
1.
On Wed, Aug 21, 2019 at 5:19 PM Eric Isling wrote:
> Dear list-members,
>
> I have a question regarding window-firing and element accumulation for a
> sliding window on a DataStream (Flink 1.8.1, Scala 2.12).
>
> My DataStr
Dear list-members,
I have a question regarding window firing and element accumulation for a
sliding window on a DataStream (Flink 1.8.1, Scala 2.12).
My DataStream is derived from a custom SourceFunction, which emits
string sequences of WINDOW size, in a deterministic sequence.
The aim is to create sli
Hi Min,
For your question, the answer is no.
In the standalone case Flink uses an in-memory checkpoint store which
is able to restore the savepoint configured on the command line and
recover state from it.
Besides, stop-with-savepoint and resuming the job from the savepoint
is the standard path to migrate j
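That stop-with-savepoint path can be sketched with the Flink CLI (job id and savepoint paths are placeholders):

```shell
# Stop the job, writing a savepoint to the given directory (paths are examples)
./bin/flink stop -p /tmp/savepoints <jobId>

# Later, resume the (possibly upgraded) job from that savepoint
./bin/flink run -s /tmp/savepoints/savepoint-xxxx myJob.jar
```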
Hey Eliza,
This decision depends on many factors, such as the experience of your team,
your use case, your deployment model, your workload, expected growth etc.
Posting the same question to the user mailing lists of all these systems
won't magically answer the question, because there is no obje
Thanks for the helpful reply.
One more question: does this ZooKeeper/HA requirement apply to a savepoint?
Can I bounce a single-jobmanager cluster and rerun my Flink job from its
previous state with a savepoint directory? e.g.
./bin/flink run myJob.jar -s savepointDirectory
Regards,
Min
Flink 💅💂
---
Oytun Tez
*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com
On Wed, Aug 21, 2019 at 2:42 AM Eliza wrote:
> Hello,
>
> We have all of spark, flink, storm, kafka installed.
> For realtime streaming calculation, which one is the
Ok, no problem.
On Wed, Aug 21, 2019 at 12:22 AM Pei HE wrote:
> Thanks Kali for the information. However, it doesn't work for me, because
> I need features in Flink 1.7.x or later and use managed Amazon MSK.
> --
> Pei
>
>
>
> On Tue, Aug 20, 2019 at 7:17 PM sri hari kali charan Tummala <
> kali
Hi,
I am a little confused about watermarks in Flink.
My application is using EventTime. My sources are calling
ctx.collectWithTimestamp and ctx.emitWatermark. Then I have a
CoProcessFunction which merges the two streams. I have state on this
function and I want to clean this state every time
Hi,
It has taken me quite a bit of time to figure this out.
This is the solution I have now (works on my machine).
Please tell me where I can improve this.
Turns out that the schema you provide for registerDataStream only needs the
'top level' fields of the Avro datastructure.
With only the top
The following code:
val MAILBOX_SET_TYPE_INFO = object : TypeHint<Set<Mailbox>>() {}.typeInfo
val env = StreamExecutionEnvironment.getExecutionEnvironment()
println(MAILBOX_SET_TYPE_INFO.createSerializer(env.config))
prints:
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer@2c39e53
While there
In addition, FLINK-13750 [1] will also likely introduce a breaking change
to high-availability services, so you are highly encouraged to share your
case if you might be affected by the change :-)
Best,
tison.
[1] https://issues.apache.org/jira/browse/FLINK-13750
Zili Chen wrote on Wed, Aug 21, 2019 at 3:32 PM:
> Hi gu
Hi guys,
We want to have an accurate idea of how users actually use
high-availability services in Flink, especially how you customize
high-availability services via HighAvailabilityServicesFactory.
Basically there are the standalone impl., the ZooKeeper impl., the
embedded impl. used in MiniCluster, the YARN impl