from:"Derek VerLee"

Re: Throttling/effective back-pressure on a Kafka sink

2019-05-30 Thread Derek VerLee

Was any progress ever made on this? We have seen the same issue in the past. What I do remember is, whatever I set max.block.ms to, is when the job crashes. I am going to attempt to reproduce the issue again and will report back. On 3/28/19

local disk cleanup after crash

2019-03-07 Thread Derek VerLee

I think that effort is put in to have task managers clean up their folders, however I have noticed that in some cases local folders are not cleaned up and can build up, eventually causing problems due to a full disk. As far as I know this only happens with

Re: Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s environment

2019-01-02 Thread Derek VerLee

I dealt with this issue by making the taskmanagers a statefulset. By itself, this doesn't solve the issue, because the taskmanager's `hostname` will not be a resovable FQDN on its own, you need to append the rest of the FQDN for the statefulset's "serviceName"

Re: Problem with metrics inside Kubernetes

2019-01-02 Thread Derek VerLee

See my reply I just posted to the thread "Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s environment". On 1/2/19 11:19 AM, Steven Nelson wrote: I have been working with Flink under Kubernetes

Re: long lived standalone job session cluster in kubernetes

2018-12-05 Thread Derek VerLee

bernetes. > > Best, > > Dawid > > [1] https://data-artisans.com/platform-overview > > On 30/11/2018 02:10, Derek VerLee wrote: >> >> I'm looking at the job cluster mode, it looks gr

long lived standalone job session cluster in kubernetes

2018-11-29 Thread Derek VerLee

I'm looking at the job cluster mode, it looks great and I and considering migrating our jobs off our "legacy" session cluster and into Kubernetes. I do need to ask some questions because I haven't found a lot of details in the documentation about how it works

strange behavior with jobmanager.rpc.address on standalone HA cluster

2018-05-05 Thread Derek VerLee

Two things: 1. It would be beneficial I think to drop a line somewhere in the docs (probably on the production ready checklist as well as the HA page) explaining that enabling zookeeper "highavailability" allows for your jobs to restart automatically after a

Re: intentional back-pressure (or a poor man's side-input)

2018-05-03 Thread Derek VerLee

d happen to checkpoint barriers if you are blocking one side of the input? Should this block checkpoint barrier as well? Should you cancel checkpoint?). Piotrek On 2 May 2018, at 16:31, Derek VerLee <derekver...@gmail.com> wrote: I was just thinking about about letting a coprocessf

Re: Migration to Flip6 Kubernetes

2018-05-02 Thread Derek VerLee

Is anyone actively working on direct Kubernetes support? I'd be excited to see this get in sooner rather than later, I'd be happy to start a PR. On 3/22/18 10:37 AM, Till Rohrmann wrote: Hi Edward and Eron, you're

intentional back-pressure (or a poor man's side-input)

2018-05-02 Thread Derek VerLee

I was just thinking about about letting a coprocessfunction "block" or cause back pressure on one of it's streams? Has this been discussed as an option? Does anyone know a way to effectively accomplish this? I think I could get a lot of mileage out of something

Clarification on slots and efficiency

2018-04-11 Thread Derek VerLee

From the docs ( https://ci.apache.org/projects/flink/flink-docs-master/concepts/runtime.html ) By adjusting the number of task slots, users can define how subtasks are isolated from each other. Having one slot per TaskManager means each

substantial realistic and idiomatic example applications

2017-12-12 Thread Derek VerLee

We are new to working with Flink and now that we have some basics down, we are looking for some codebases for Flink applications of real-world complexity and size, that could additionally be considered idiomatic and generally good code. Can anyone

Re: Streaming : a way to "key by partition id" without redispatching data

2017-11-10 Thread Derek VerLee

I was about to ask this question myself. I find myself re-keying by the same keys repeatedly. I think in principle you could always just roll more work into one window operation with a more complex series of maps/folds/windowfunctions or processfunction.

Re: Generate watermarks per key in a KeyedStream

2017-11-09 Thread Derek VerLee

We are contending with the same issue, as it happens. We have dozens, and potentially down the line, may need to deal with thousands of different "time systems" as you put it, and may not be know at compile time or job start time. In a practical sense, how

Re: Do timestamps and watermarks exist after window evaluation?

2017-11-09 Thread Derek VerLee

, at 18:54, Derek VerLee <derekver...@gmail.com> wrote: When composing ("chaining") multiple windowing operations on the same stream are watermarks transmitted down stream after w

Do timestamps and watermarks exist after window evaluation?

2017-11-08 Thread Derek VerLee

When composing ("chaining") multiple windowing operations on the same stream are watermarks transmitted down stream after window evaluation, and are the records emitted from WindowFunctions given timestamps? Do I need to or should I always assignTimestampsAndWatermarks to the outputsof window

RocksDB usage for broad slow data

2017-10-14 Thread Derek VerLee

We have a data which is broad and slow; hundreds of thousands of keys, a small number will get an event every few seconds, some get an event every few days, and the vast majority will get an event in a few times an hour. Let's say then that keeping this data

Re: Enriching data from external source with cache

2017-10-02 Thread Derek VerLee

this work. Regards, Timo Am 9/29/17 um 8:39 PM schrieb Derek VerLee: My basic problem will sound familiar I think, I need to enrich incoming data using a REST call to an external system for slowly

Enriching data from external source with cache

2017-09-29 Thread Derek VerLee

My basic problem will sound familiar I think, I need to enrich incoming data using a REST call to an external system for slowly evolving metadata. and some cache based lag is acceptable, so to reduce load on the external system and to process more efficiently, I would

Re: Throttling/effective back-pressure on a Kafka sink

local disk cleanup after crash

Re: Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s environment

Re: Problem with metrics inside Kubernetes

Re: long lived standalone job session cluster in kubernetes

long lived standalone job session cluster in kubernetes

strange behavior with jobmanager.rpc.address on standalone HA cluster

Re: intentional back-pressure (or a poor man's side-input)

Re: Migration to Flip6 Kubernetes

intentional back-pressure (or a poor man's side-input)

Clarification on slots and efficiency

substantial realistic and idiomatic example applications

Re: Streaming : a way to "key by partition id" without redispatching data

Re: Generate watermarks per key in a KeyedStream

Re: Do timestamps and watermarks exist after window evaluation?

Do timestamps and watermarks exist after window evaluation?

RocksDB usage for broad slow data

Re: Enriching data from external source with cache

Enriching data from external source with cache

19 matches

Site Navigation

Mail list logo

Footer information