Re: CVE-2021-44228 - Log4j2 vulnerability

2021-12-15 Thread Richard Deurwaarder
> Folks, what about the Ververica platform. Is there any mitigation around it? > On Fri, Dec 10, 2021 at 3:32 PM Chesnay Schepler <ches...@apache.org> wrote: > …

CVE-2021-44228 - Log4j2 vulnerability

2021-12-10 Thread Richard Deurwaarder
Hello, There has been a log4j2 vulnerability made public https://www.randori.com/blog/cve-2021-44228/ which is making some waves :) This post even explicitly mentions Apache Flink: https://securityonline.info/apache-log4j2-remote-code-execution-vulnerability-alert/ And fortunately, I saw this
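For context on the mitigation discussed in this thread: the widely published stop-gap for CVE-2021-44228 on Log4j 2.10 and later (pending an upgrade to a patched Log4j) was to disable message lookups via a JVM property, which Flink can pass to all processes through `env.java.opts`. A minimal, illustrative flink-conf.yaml sketch (the option name is from the Flink configuration docs; upgrading Log4j remains the actual fix):

```yaml
# flink-conf.yaml -- temporary CVE-2021-44228 mitigation (Log4j >= 2.10).
# Disables JNDI message lookups; does not replace upgrading Log4j itself.
env.java.opts: -Dlog4j2.formatMsgNoLookups=true
```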

What happens when a job is rescaled

2020-11-13 Thread Richard Deurwaarder
Hello, I have a question about what actually happens when a job is started from an existing checkpoint, in particular when the parallelism has changed. *Context:* We have a flink 1.11.2 (DataStream API) job running on Kubernetes (GCP) writing its state to GCS. Normally we run with 12 TMs each 3
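When a job is resubmitted from a checkpoint or savepoint with a different parallelism, Flink redistributes the keyed state across the new number of subtasks by key group. A hedged CLI sketch of such a restore (the state path, jar name, and parallelism are placeholders, not values from this thread):

```sh
# Resubmit the job from a retained checkpoint / savepoint with a new parallelism.
# The gs:// path and my-job.jar are illustrative placeholders.
flink run -s gs://my-bucket/checkpoints/<job-id>/chk-42 -p 24 my-job.jar
```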

Re: Does flink support retries on checkpoint write failures

2020-02-01 Thread Richard Deurwaarder
> (e.g. rocksdb exception / taskmanager crash / etc), there would be no Source rewind to the last successful checkpoint and this record would be lost forever, correct? > On Wed, 29 Jan 2020, 17:51 Richard Deurwaarder wrote: > Hi Till, …

Does flink support retries on checkpoint write failures

2020-01-28 Thread Richard Deurwaarder
Basically, does anyone recognize this behavior? Regards, Richard Deurwaarder [1] We use an HDFS implementation provided by Google https://github.com/GoogleCloudDataproc/bigdata-interop/tree/master/gcs [2] https://cloud.google.com/storage/docs/json_api/v1/status-codes#410_Gone [3] https://github.com

Re: PubSub source throwing grpc errors

2020-01-15 Thread Richard Deurwaarder
Hi Itamar and Till, Yes, this actually looks a lot worse than it is, fortunately. From what I understand this means: something has not released or properly shut down a gRPC client, and the library likes to inform you about this. I would definitely expect to see this if the job crashes at the

Re: Setting environment variables of the taskmanagers (yarn)

2019-09-25 Thread Richard Deurwaarder
TASK_MANAGER_ENV_PREFIX = "containerized.taskmanager.env."; > Best Regards, > Peter Huang > On Tue, Sep 24, 2019 at 8:02 AM Richard Deurwaarder wrote: > Hello, > We have our flink job (1.8.0) running on

Setting environment variables of the taskmanagers (yarn)

2019-09-24 Thread Richard Deurwaarder
Hello, We have our flink job (1.8.0) running on our hadoop 2.7 cluster with yarn. We would like to add the GCS connector to use GCS rather than HDFS. Following the documentation of the GCS connector[1] we have to specify which credentials we want to use and there are two ways of doing this: *
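One of the two ways discussed in this thread is forwarding an environment variable to the YARN containers via Flink's `containerized.*` option prefixes. An illustrative flink-conf.yaml sketch (the prefixes are documented Flink options; the credential file path is a placeholder):

```yaml
# flink-conf.yaml -- forward an environment variable to each YARN container.
# The key file path is illustrative; it must exist on the container hosts.
containerized.taskmanager.env.GOOGLE_APPLICATION_CREDENTIALS: /etc/secrets/gcs-key.json
containerized.master.env.GOOGLE_APPLICATION_CREDENTIALS: /etc/secrets/gcs-key.json
```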

Re: Job Manager becomes unresponsive if the size of the session cluster grows

2019-07-26 Thread Richard Deurwaarder
Hello, We ran into the same problem. We've done most of the same steps/observations: - increase memory - increase cpu - no noticeable increase in GC activity - little network IO. Our current setup has the liveness probe disabled and we've increased (akka) timeouts; this seems to help

Re: Flink Zookeeper HA: FileNotFoundException blob - Jobmanager not starting up

2019-07-23 Thread Richard Deurwaarder
wrong. > Best, Fabian > On Wed, 17 Jul 2019 at 19:50, Richard Deurwaarder <rich...@xeli.eu> wrote: > Hello, > I've got a problem with our flink cluster where the jobmanager is not starting up anymore, because it tries to download

Flink Zookeeper HA: FileNotFoundException blob - Jobmanager not starting up

2019-07-17 Thread Richard Deurwaarder
Hello, I've got a problem with our flink cluster where the jobmanager is not starting up anymore, because it tries to download a nonexistent (blob) file from the ZooKeeper storage dir. We're running Flink 1.8.0 on a Kubernetes cluster and use the Google Storage connector [1] to store checkpoints,

Re: BigQuery source ?

2019-06-04 Thread Richard Deurwaarder
I looked into this briefly a while ago out of interest and read about how Beam handles this. I've never actually implemented it, but the concept sounds reasonable to me. What I read from their code is that Beam exports the BigQuery data to Google Storage. This export shards the data in files with

Re: [ANNOUNCE] Apache Flink 1.8.0 released

2019-04-11 Thread Richard Deurwaarder
Very nice! Thanks Aljoscha and all contributors! I have one question, will the docker image for 1.8.0 be released soon as well? https://hub.docker.com/_/flink has the versions up to 1.7.2. Regards, Richard On Wed, Apr 10, 2019 at 4:54 PM Rong Rong wrote: > Congrats! Thanks Aljoscha for being

Re: Submitting job to Flink on yarn times out on flip-6 1.5.x

2019-02-26 Thread Richard Deurwaarder
> Best, > Gary > [1] https://flink.apache.org/news/2018/11/30/release-1.7.0.html > [2] https://issues.apache.org/jira/browse/FLINK-10392 > [3] https://issues.apache.org/jira/browse/FLINK-11713 > On Mon, Feb 18, 2019 at 12:00 PM Richard Deurwaarder wrote: > Hello,

Re: Share broadcast state between multiple operators

2019-02-26 Thread Richard Deurwaarder
> Till > On Mon, Feb 25, 2019 at 11:45 AM Richard Deurwaarder wrote: > Hi All, > Due to the way our code is structured, we would like to use the broadcast state at multiple points of our pipeline. So not only share it between multiple

Share broadcast state between multiple operators

2019-02-25 Thread Richard Deurwaarder
Hi All, Due to the way our code is structured, we would like to use the broadcast state at multiple points of our pipeline. So not only share it between multiple instances of the same operator but also between multiple operators. See the image below for a simplified example. Flink does not seem

Submitting job to Flink on yarn times out on flip-6 1.5.x

2019-02-18 Thread Richard Deurwaarder
Hello, I am trying to upgrade our job from flink 1.4.2 to 1.7.1 but I keep running into timeouts after submitting the job. The flink job runs on our hadoop cluster and starts using Yarn. Relevant config options seem to be: jobmanager.rpc.port: 55501 recovery.jobmanager.port: 55502

Implementation error: Unhandled exception - "Implementation error: Unhandled exception."

2018-11-07 Thread Richard Deurwaarder
Hello, We have a flink job / cluster running in kubernetes. Flink 1.6.2 (but the same happens in 1.6.0 and 1.6.1) To upgrade our job we use the REST API. Every so often the jobmanager seems to be stuck in a crashing state and the logs show me this stack trace: 2018-11-07 18:43:05,815
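The upgrade-via-REST flow this thread refers to is typically "trigger a savepoint with cancellation, then resubmit the new jar from that savepoint". A hedged curl sketch against Flink's documented REST endpoints (host, port, job id, jar id, and paths are placeholders; endpoint shapes follow later Flink versions' REST API, not necessarily 1.6.x exactly):

```sh
# 1) Trigger a savepoint and cancel the job (returns a trigger id to poll).
curl -X POST http://jobmanager:8081/jobs/<job-id>/savepoints \
  -H 'Content-Type: application/json' \
  -d '{"target-directory": "gs://my-bucket/savepoints", "cancel-job": true}'

# 2) Poll the savepoint operation until its status is COMPLETED.
curl http://jobmanager:8081/jobs/<job-id>/savepoints/<trigger-id>

# 3) Run the newly uploaded jar, restoring from the completed savepoint.
curl -X POST http://jobmanager:8081/jars/<jar-id>/run \
  -H 'Content-Type: application/json' \
  -d '{"savepointPath": "gs://my-bucket/savepoints/savepoint-xxxx"}'
```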