Re: Connection refused error when writing to socket?

2017-01-31 Thread Li Peng
Yes I did open a socket with netcat. It turns out my first error was due to a stream without a sink triggering the socket connect (I thought that without a sink the stream wouldn't affect anything, so I didn't comment it out, and I didn't open the socket for that port). However I did play with it

Connection refused error when writing to socket?

2017-01-30 Thread Li Peng
Hi there, I'm trying to test a couple of things by having my stream write to a socket, but it keeps failing to connect (I'm trying to have a stream write to a socket, and have another stream read from that socket). Caused by: java.net.ConnectException: Connection refused (Connection refused) at
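The error in this thread is the classic symptom of connecting before anything is listening on the target port. A minimal stdlib sketch of the ordering (no Flink involved; the port is chosen dynamically and all names are illustrative):

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketOrder {
    // Returns true when both observations hold: connecting with no listener
    // is refused, and connecting after a listener is up succeeds.
    static boolean demo() {
        try {
            // Grab a free port, then close it so nothing is listening there.
            int port;
            try (ServerSocket probe = new ServerSocket(0)) {
                port = probe.getLocalPort();
            }
            boolean refusedFirst = false;
            try (Socket s = new Socket("localhost", port)) {
                // nothing listens on this port, so we should not get here
            } catch (ConnectException e) {
                refusedFirst = true; // the "Connection refused" from the thread
            }
            // Start a listener first (the role `nc -l <port>` plays), then connect.
            try (ServerSocket listener = new ServerSocket(port);
                 Socket s = new Socket("localhost", port)) {
                return refusedFirst && s.isConnected();
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // expected: true on a typical host
    }
}
```

The same ordering applies to Flink's socket sources and sinks: the process on the other end of the socket (here, netcat) must be up before the job that connects to it starts.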

Proper ways to write iterative DataSets with dependencies

2017-01-26 Thread Li Peng
Hi there, I just started investigating Flink and I'm curious whether I'm approaching my issue in the right way. My current use case is modeling a series of transformations, where I start with some transformations, which when done can yield another transformation, or a result to output to some sink, or
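Stripped of the Flink machinery, the dependency shape described here, a transformation that can yield either another transformation or a result for a sink, is a worklist. A self-contained sketch (the doubling step and the threshold are made up for illustration); in Flink's DataSet API this shape maps onto bulk iterations with a termination criterion:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class Worklist {
    // Each work item is just an int here; a "transformation" doubles it.
    // Items below the threshold go back on the queue (they yield a further
    // transformation); items at or above it are emitted to the "sink".
    static List<Integer> run(List<Integer> inputs, int threshold) {
        Deque<Integer> queue = new ArrayDeque<>(inputs);
        List<Integer> sink = new ArrayList<>();
        while (!queue.isEmpty()) {
            int item = queue.poll();
            if (item >= threshold) {
                sink.add(item);      // terminal result
            } else {
                queue.add(item * 2); // yields a further transformation
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(3, 50), 40)); // [50, 48]
    }
}
```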

Re: Streaming data to Segment

2019-11-21 Thread Li Peng
which allow you to initialize and dispose resources > properly. > > On Thu, 21 Nov 2019, 5:23 Li Peng, wrote: > >> Hey folks, I'm interested in streaming some data to Segment >> <https://segment.com/docs/sources/server/java/>, using their existing >> java lib
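The quoted advice is Flink's rich-function lifecycle: build the client once per parallel instance in open() and release it in close(), instead of per record. Sketched below with a minimal stand-in for RichSinkFunction and a fake client so it runs standalone; the real base class is org.apache.flink.streaming.api.functions.sink.RichSinkFunction, and the real Segment client has its own flush/shutdown API:

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentSinkSketch {
    // Minimal stand-in for the RichSinkFunction lifecycle (open/invoke/close).
    abstract static class RichSinkStandIn<T> {
        abstract void open();          // once per parallel operator instance
        abstract void invoke(T value); // once per record
        abstract void close();         // once, on shutdown
    }

    // Hypothetical stand-in for the Segment java client.
    static class FakeSegmentClient {
        final List<String> sent = new ArrayList<>();
        void enqueue(String event) { sent.add(event); }
    }

    static class SegmentSink extends RichSinkStandIn<String> {
        private FakeSegmentClient client; // one client per parallel instance

        @Override void open() { client = new FakeSegmentClient(); }
        @Override void invoke(String value) { client.enqueue(value); }
        @Override void close() { client = null; } // real client: flush + shutdown

        int sentCount() { return client == null ? 0 : client.sent.size(); }
    }

    public static void main(String[] args) {
        SegmentSink sink = new SegmentSink();
        sink.open();            // the framework calls this before any records
        sink.invoke("event-1");
        sink.invoke("event-2");
        System.out.println(sink.sentCount()); // 2
        sink.close();
    }
}
```

With one client per parallel instance, total client count equals the operator's parallelism, which is usually what you want for a high-throughput stream.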

What S3 Permissions does StreamingFileSink need?

2019-12-04 Thread Li Peng
Hey folks, I'm trying to use StreamingFileSink with s3, using IAM roles for auth. Does anyone know what permissions the role should have for the specified s3 bucket to work properly? I've been getting some auth errors, and I suspect I'm missing some permissions: data "aws_iam_policy_document"
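For reference, a hedged sketch in the same aws_iam_policy_document style as the thread; the exact action list should be checked against the Flink S3 filesystem docs for your version, and bucket-name is a placeholder:

```hcl
# Sketch only -- verify against your Flink version's filesystem docs.
data "aws_iam_policy_document" "flink_streaming_file_sink" {
  statement {
    actions   = ["s3:ListBucket", "s3:GetBucketLocation"]
    resources = ["arn:aws:s3:::bucket-name"]
  }
  statement {
    actions = [
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject",
      # StreamingFileSink writes via S3 multipart uploads:
      "s3:AbortMultipartUpload",
      "s3:ListMultipartUploadParts",
    ]
    resources = ["arn:aws:s3:::bucket-name/*"]
  }
}
```

Note the split: bucket-level actions (ListBucket) attach to the bucket ARN, object-level actions to the `/*` ARN; mixing them up is a common source of the auth errors described here.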

Re: StreamingFileSink doesn't close multipart uploads to s3?

2019-12-06 Thread Li Peng
based on this experience, StreamingFileSink.forRowFormat() requires it too! Is this the intended behavior? If so, the docs should probably be updated. Thanks, Li On Fri, Dec 6, 2019 at 2:01 PM Li Peng wrote: > Hey folks, I'm trying to get StreamingFileSink to write to s3 every > minute, with
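The behavior in this thread follows from how StreamingFileSink commits: in-progress part files (and the underlying S3 multipart uploads) are only finalized when a checkpoint completes, so with checkpointing disabled nothing is ever closed, regardless of the rolling policy. A sketch of the relevant setting; the yaml key only exists in newer Flink releases, and older ones call env.enableCheckpointing(60000) in code instead:

```yaml
# flink-conf.yaml (sketch; key available in newer Flink releases):
execution.checkpointing.interval: 60 s
```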

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Li Peng
! Li On Mon, Dec 9, 2019 at 7:10 PM Yang Wang wrote: > Hi Li Peng, > > You are running standalone session cluster or per-job cluster on > kubernetes. Right? > If so, i think you need to check your log4j.properties in the image, not > local. The log is > stored to /opt/fli

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Li Peng
mounted as a volume from the configmap. > > > > Best > > Yun Tang > > > > *From: *Li Peng > *Date: *Wednesday, December 11, 2019 at 9:37 AM > *To: *Yang Wang > *Cc: *vino yang , user > *Subject: *Re: Flink on Kubernetes seems to ignore log4j.properties > > &g

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread Li Peng
the logs under /opt/flink/log/jobmanager.log? If not, > please share the > commands the JobManager and TaskManager are using? If the command is > correct > and the log4j under /opt/flink/conf is expected, it is so curious why we > could not get the logs. > > > Best, > Yang >

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
: > @Li Peng >I found your problems. Your start cmd use args “start-foreground”, It > will run “exec "${FLINK_BIN_DIR}"/flink-console.sh ${ENTRY_POINT_NAME} "$ > {ARGS[@]}””, and In ' flink-console.sh’, the code is “log_setting=( > "-Dlog4j.configu
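In other words, with "start-foreground" the entrypoint goes through flink-console.sh, which points -Dlog4j.configuration at log4j-console.properties rather than log4j.properties, so that is the file to edit (or mount) in the image. A sketch of what that file typically contains in log4j 1.x era Flink:

```properties
# conf/log4j-console.properties -- the file flink-console.sh actually loads
# when the cluster is started with "start-foreground" (sketch).
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```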

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
, Dec 12, 2019 at 12:09 PM Li Peng wrote: > Hey ouywl, interesting, I figured something like that would happen. I > actually replaced all the log4j-x files with the same config I originally > posted, including log4j-console, but that didn't change the behavior either. > > Hey Yang,

Re: What S3 Permissions does StreamingFileSink need?

2019-12-05 Thread Li Peng
? resources = [ "arn:aws:s3:::bucket-name", "arn:aws:s3:::bucket-name/", "arn:aws:s3:::bucket-name/*" ] Thanks, Li On Thu, Dec 5, 2019 at 6:11 AM r_khachatryan wrote: > Hi Li, > > Could you please list the permissions you see and the error message you > recei

Re: What S3 Permissions does StreamingFileSink need?

2019-12-06 Thread Li Peng
Maybe I need to add the directory level as a resource? > You don't have to. > > If it's possible in your setup, you can debug by granting all s3 > permissions to all objects, like this: > actions = ["s3:*"] > resources = ["*"] > > Regards, > Roma

StreamingFileSink doesn't close multipart uploads to s3?

2019-12-06 Thread Li Peng
Hey folks, I'm trying to get StreamingFileSink to write to s3 every minute, with flink-s3-fs-hadoop, and based on the default rolling policy, which is configured to "roll" every 60 seconds, I thought that would be automatic (I interpreted rolling to mean actually close a multipart upload to s3).

Streaming Files to S3

2019-11-25 Thread Li Peng
Hey folks, I'm trying to stream large volumes of data and write them as csv files to S3, and one of the restrictions is to keep the files below 100MB (compressed) and write one file per minute. I wanted to verify my understanding of StreamingFileSink: 1. From the

Streaming data to Segment

2019-11-20 Thread Li Peng
Hey folks, I'm interested in streaming some data to Segment, using their existing java library. This is a pretty high throughput stream, so I wanted each parallel operator to have its own instance of the Segment client. From what I could tell,

Task-manager kubernetes pods take a long time to terminate

2020-01-29 Thread Li Peng
Hey folks, I'm deploying a Flink cluster via kubernetes, and starting each task manager with taskmanager.sh. I noticed that when I tell kubectl to delete the deployment, the job-manager pod usually terminates very quickly, but any task-manager that doesn't get terminated before the job-manager,

Re: Task-manager kubernetes pods take a long time to terminate

2020-02-04 Thread Li Peng
leave them there if you want to reuse them by the next Flink > cluster deploy. > > What's the status of taskmanager pod when you delete it and get stuck? > > > Best, > Yang > > Li Peng 于2020年1月31日周五 上午4:51写道: > >> Hi Yun, >> >> I'm currently specifying that

Re: Task-manager kubernetes pods take a long time to terminate

2020-01-30 Thread Li Peng
able/ops/deployment/kubernetes.html#session-cluster-resource-definitions > [2] > https://github.com/apache/flink/blob/7e1a0f446e018681cb537dd936ae54388b5a7523/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java#L158 > > Best > Yun Tang > > --

Re: Task-manager kubernetes pods take a long time to terminate

2020-02-04 Thread Li Peng
My yml files follow most of the instructions here: http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/ What command did you use to delete the deployments? I use: helm --tiller-namespace prod delete --purge my-deployment I noticed that for environments without much data
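One knob worth checking for slow-terminating task-manager pods is the pod's grace period: if the task-manager doesn't exit on SIGTERM (for example because it is busy retrying its connection to an already-deleted job-manager), Kubernetes waits out terminationGracePeriodSeconds before sending SIGKILL. A deployment fragment (sketch; names illustrative):

```yaml
# Deployment spec fragment (sketch): cap how long Kubernetes waits for the
# task-manager to exit after SIGTERM before it sends SIGKILL.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30   # default is 30s; lower it to kill faster
      containers:
        - name: taskmanager
          # ...
```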

Re: Best way to set max heap size via env variables or program arguments?

2020-01-02 Thread Li Peng
heap size is always slightly smaller than the configured > 'taskmanager.heap.size'. > > > Thank you~ > > Xintong Song > > > > On Wed, Jan 1, 2020 at 3:10 AM Li Peng wrote: > >> Hey folks, we've been running a k8 flink application, using the >> taskmanager.sh scr

Best way to set max heap size via env variables or program arguments?

2019-12-31 Thread Li Peng
Hey folks, we've been running a k8s flink application, using the taskmanager.sh script and passing in the -Djobmanager.heap.size=9000m and -Dtaskmanager.heap.size=7000m options to the script. I noticed from the logs that the maximum heap size logged completely ignores these arguments, and just
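For the pre-1.10 memory model, the settings the startup scripts reliably honor are the flink-conf.yaml keys (or the FLINK_JM_HEAP/FLINK_TM_HEAP environment variables that config.sh reads); a sketch, assuming the values from the thread:

```yaml
# flink-conf.yaml (pre-1.10 memory model); values from the thread above:
jobmanager.heap.size: 9000m
taskmanager.heap.size: 7000m
```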

Flink 1.10.1 not using FLINK_TM_HEAP for TaskManager JVM Heap size correctly?

2020-06-12 Thread Li Peng
Hey folks, we recently migrated from Flink 1.9.x to 1.10.1, and we noticed some wonky behavior in how the JVM is configured: 1. We add FLINK_JM_HEAP=5000m and FLINK_TM_HEAP=1400m variables to the environment 2. The JobManager allocates the right heap size as expected 3. However, the TaskManager
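A plausible explanation, hedged: Flink 1.10's FLIP-49 memory model deprecates FLINK_TM_HEAP, and in standalone setups the value is interpreted as total Flink memory, from which the network and managed-memory budgets are carved out before the heap is sized, so the resulting JVM heap comes out smaller than the variable suggests. The 1.10-native way to pin sizes is via flink-conf.yaml:

```yaml
# flink-conf.yaml (FLIP-49 memory model, Flink 1.10+); values illustrative:
taskmanager.memory.task.heap.size: 1024m
taskmanager.memory.managed.size: 256m
# or size the whole process and let Flink derive the components:
# taskmanager.memory.process.size: 1728m
```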

Re: How to gracefully handle job recovery failures

2021-06-15 Thread Li Peng
e jobs (but still recover others). >> The only way I can think of is to remove the corresponding nodes in >> ZooKeeper which is not very safe. >> >> I'm pulling in Robert and Till who might know better. >> >> Regards, >> Roman >> >> >> On Thu, J

How to gracefully handle job recovery failures

2021-06-09 Thread Li Peng
Hey folks, we have a cluster with HA mode enabled, and recently after a zookeeper restart, our Flink cluster (Flink v. 1.11.3, Scala v. 2.12) crashed and was stuck in a crash loop, with the following error: 2021-06-10 02:14:52.123 [cluster-io-thread-1] ERROR

Re: How to gracefully handle job recovery failures

2021-06-10 Thread Li Peng
On Thu, Jun 10, 2021 at 9:50 AM Roman Khachatryan wrote: > Hi Li, > > The missing file is a serialized job graph and the job recovery can't > proceed without it. > Unfortunately, the cluster can't proceed if one of the jobs can't recover. > > Regards, > Roman > > On T

How do I increase number of db connections of the Flink JDBC Connector?

2021-02-18 Thread Li Peng
Hey folks, I'm trying to use flink to write high throughput incoming data to a SQL db using the JDBC Connector as described here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/jdbc.html However, after enabling this, my data consumption rate slowed down to a crawl. After

Re: How do I increase number of db connections of the Flink JDBC Connector?

2021-02-19 Thread Li Peng
Ah got it, thanks! On Thu, Feb 18, 2021 at 10:53 PM Chesnay Schepler wrote: > Every worker uses exactly one connection, so in order to increase the > number of connections you must indeed increase the worker parallelism. > > On 2/19/2021 6:51 AM, Li Peng wrote: > > Hey folks,
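Putting the quoted answer into code: each JDBC sink subtask holds exactly one connection, so the connection count is simply the sink's parallelism. A fragment, not runnable as-is (insertSql, statementBuilder, and connectionOptions stand for your own values):

```java
// Sketch (Flink JDBC connector, as in the docs linked in the thread).
// Each sink subtask holds exactly one JDBC connection, so the number of
// database connections equals the sink's parallelism:
stream
    .addSink(JdbcSink.sink(insertSql, statementBuilder, connectionOptions))
    .setParallelism(8); // 8 subtasks -> 8 connections
```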