Re: Flink 1.17 upgrade issue when using azure storage account for checkpoints/savepoints

2023-03-25 Thread
lar error with Google Cloud Storage, and there is a workaround in the Slack thread https://apache-flink.slack.com/archives/C03G7LJTS2G/p1679320815257449 -- ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org http://czchen.info/ Key fingerprint = BA04 346D C2E1 FE63 C790 8793 CC65 B0CD EC27 5D5B

Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-07 Thread
abind.JsonMappingException: Scala module 2.11.3 requires Jackson Databind version >= 2.11.0 and < 2.12.0 We solved the problem by moving the plugins into the correct plugins directory. Thanks for the help from Slack. -- ChangZhuo Chen (陳昌倬)
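The fix mentioned in this message, moving plugins into the correct plugins directory, relies on Flink's convention that each plugin lives in its own subfolder of `plugins/` rather than in `lib/`. The following is a minimal Python sketch of a layout check, not part of the original thread. It assumes a Flink home such as `/opt/flink`, and the helper name `misplaced_plugin_jars` and the jar name pattern are hypothetical choices made for illustration.

```python
from pathlib import Path
import fnmatch

# Assumption: filesystem plugin jars are named flink-<scheme>-fs-<impl>-<version>.jar,
# for example flink-gs-fs-hadoop-1.15.0.jar. Adjust the pattern for other plugins.
PLUGIN_JAR_PATTERN = "flink-*-fs-*.jar"


def misplaced_plugin_jars(flink_home):
    """Return plugin jars that are not inside plugins/<plugin-name>/.

    Hypothetical helper: it flags matching jars found in lib/ and jars
    placed directly in plugins/ without a subfolder of their own.
    """
    home = Path(flink_home)
    misplaced = []
    for folder in (home / "lib", home / "plugins"):
        if not folder.is_dir():
            continue
        # Only direct children are inspected, so jars that already sit in
        # plugins/<plugin-name>/ are never reported.
        for entry in sorted(folder.iterdir()):
            if entry.is_file() and fnmatch.fnmatch(entry.name, PLUGIN_JAR_PATTERN):
                misplaced.append(entry)
    return misplaced
```

Calling `misplaced_plugin_jars("/opt/flink")` inside the container image would list jars such as a GCS filesystem jar left in `lib/`, which is the kind of placement this thread suspects as the cause.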

Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-04 Thread
-2.2.2-shaded.jar In 1.15, we add flink-gs-fs-hadoop-1.15.0.jar to /opt/flink/lib to support GCS. Maybe this difference causes the problem? -- ChangZhuo Chen (陳昌倬)

Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-01 Thread
[] - Shutting KubernetesApplicationClusterEntrypoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally.. > > Best regards, > > Qingsheng > > > On Jun 2, 2022, at 09:08, ChangZhuo Chen (陳昌倬) wrote: > > > > Hi,

Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-01 Thread
?] We found a similar issue in Spark [0]. However, we are not sure whether it is related and, if it is, how we can apply the fix. Any help is welcome. [0] https://issues.apache.org/jira/browse/SPARK-9206 -- ChangZhuo Chen (陳昌倬)

Re: Prometheus metrics does not work in 1.15.0 taskmanager

2022-05-04 Thread
On Wed, May 04, 2022 at 01:53:01PM +0200, Chesnay Schepler wrote: > Disabling the kafka metrics _should_ work. Is there any way to disable Kafka metrics when using the low-level process function? -- ChangZhuo Chen (陳昌倬)

Re: Prometheus metrics does not work in 1.15.0 taskmanager

2022-05-03 Thread
QL statements in one job. We are running a streaming application that uses the low-level API, deployed as a FlinkDeployment through the Kubernetes operator. -- ChangZhuo Chen (陳昌倬)

Re: Prometheus metrics does not work in 1.15.0 taskmanager

2022-05-03 Thread
On Tue, May 03, 2022 at 10:28:18AM +0200, Chesnay Schepler wrote: > Is there any warning in the logs containing "Error while handling metric"? No, we did not find any "Error while handling metric" warning. -- ChangZhuo Chen (陳昌倬)

Re: Prometheus metrics does not work in 1.15.0 taskmanager

2022-05-03 Thread
eporting metrics for reporter prom of type org.apache.flink.metrics.prometheus.PrometheusReporter. -- ChangZhuo Chen (陳昌倬)

Prometheus metrics does not work in 1.15.0 taskmanager

2022-05-02 Thread
, Netty, Input] ... -- ChangZhuo Chen (陳昌倬)

flink operator sometimes cannot start jobmanager after upgrading

2022-04-29 Thread
] JobManager deployment does not exist 2022-04-29 10:00:55,863 o.a.f.k.o.c.FlinkDeploymentController [INFO ][namespace/flink-deployment-name] Reconciliation successfully completed [0] https://github.com/apache/flink-kubernetes-operator -- ChangZhuo Chen (陳昌倬)

Re: how to setup working dir in Flink operator

2022-04-25 Thread
: /srv/working-dir result: flink-io-*, localState/ are in /tmp. All other configurations are the same. > nit: if the TaskManager pod crashed and was deleted too fast, you could > kill the JobManager first, then you will have enough time to get the logs > and yamls. Thanks for the tip. -

how to setup working dir in Flink operator

2022-04-24 Thread
/tmp to store its data. - Setting `process.taskmanager.working-dir` does not work. Flink still uses /tmp to store its data. -- ChangZhuo Chen (陳昌倬)

Re: Questions about checkpoint retention

2022-02-05 Thread
nfigured on gs://flink-checkpoints. Is there any way to configure retention safely for Flink? * We don't use DELETE_ON_CANCELLATION, to avoid deleting state data by accident. -- ChangZhuo Chen (陳昌倬)

Re: Time different between checkpoint and savepoint restoration in GCS

2021-10-25 Thread
nto underlying state backend. > [1] > https://www.ververica.com/blog/differences-between-savepoints-and-checkpoints-in-flink > [2] > https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/state/savepoints/#what-is-a-savepoint-how-is-a-savepoint-different-from-a-checkpoint Thanks

Time different between checkpoint and savepoint restoration in GCS

2021-10-25 Thread
GCS buckets, not sure if this will affect the throughput of GCS. -- ChangZhuo Chen (陳昌倬)

Re: Flink 1.14.0 reactive mode cannot rescale

2021-10-19 Thread
* Yes, we do have broadcast streams for configuration. We can switch to aligned checkpoints to see if that works. * [0] is marked as fixed in version 1.14.0, so maybe there are other parts that need to be fixed? [0] https://issues.apache.org/jira/browse/FLINK-22815 -- ChangZhuo Chen (陳

Flink 1.14.0 reactive mode cannot rescale

2021-10-18 Thread
[] - Shutting down rest endpoint. -- ChangZhuo Chen (陳昌倬)

Re: Inconsistent parallelism in web UI when using reactive mode

2021-10-12 Thread
te yet for when > it will be fixed. Thanks for the information. -- ChangZhuo Chen (陳昌倬)

Inconsistent parallelism in web UI when using reactive mode

2021-10-11 Thread
to fix this inconsistency so that it does not confuse engineers when deploying Flink applications. -- ChangZhuo Chen (陳昌倬)

Re: 1.13.1 jobmanager annotations by pod template does not work

2021-06-15 Thread
ked > TaskManager pods are managed by Flink ResourceManager. > This is the root cause which makes the difference. Thanks for the clarification. -- ChangZhuo Chen (陳昌倬)

Re: 1.13.1 jobmanager annotations by pod template does not work

2021-06-15 Thread
etween jobmanager and taskmanager when handling annotations and labels [0]. [0] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/#pod-template -- ChangZhuo Chen (陳昌倬)

Re: 1.13.1 jobmanager annotations by pod template does not work

2021-06-11 Thread
template-hash=55846fd8f7 type=flink-native-kubernetes Annotations: -- ChangZhuo Chen (陳昌倬)

Re: NPE when restoring from savepoint in Flink 1.13.1 application

2021-06-11 Thread
On Thu, Jun 10, 2021 at 07:10:45PM +0200, Roman Khachatryan wrote: > Hi ChangZhuo, > > Thanks for reporting, it looks like a bug. > I've opened a ticket for that [1]. > > [1] > https://issues.apache.org/jira/browse/FLINK-22966 Thanks for the help. -- ChangZhuo Chen

1.13.1 jobmanager annotations by pod template does not work

2021-06-11 Thread
/#kubernetes-jobmanager-annotations [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/#pod-template -- ChangZhuo Chen (陳昌倬)

NPE when restoring from savepoint in Flink 1.13.1 application

2021-06-09 Thread
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler [] - Exception occurred in REST handler: Could not execute application. -- ChangZhuo Chen (陳昌倬)

Re: Customer operator in BATCH execution mode

2021-05-26 Thread
Serialize keyed states into JSON. b. Output to Kafka. c. The streaming application consumes the data from Kafka and updates its keyed states accordingly. We hope that in this way we can rebuild our states with almost the same code as in streaming. -- ChangZhuo Chen (陳昌倬)
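The plan in this message (serialize keyed state to JSON, ship it through Kafka, replay it into the streaming job) can be illustrated without Flink or Kafka. The following is a minimal Python sketch under stated assumptions: keyed state is modeled as a plain mapping from key to value, an in-memory list stands in for the Kafka topic, and `export_state` and `rebuild_state` are hypothetical names rather than Flink APIs.

```python
import json


def export_state(keyed_state):
    """Serialize each (key, value) pair of the keyed state into one JSON record."""
    return [json.dumps({"key": key, "value": value}, sort_keys=True)
            for key, value in keyed_state.items()]


def rebuild_state(records):
    """Replay JSON records in order and rebuild the keyed state (last write wins)."""
    state = {}
    for record in records:
        event = json.loads(record)
        state[event["key"]] = event["value"]
    return state


# Stand-in for the Kafka topic: an in-memory list of JSON strings.
topic = export_state({"user-1": {"count": 3}, "user-2": {"count": 7}})
restored = rebuild_state(topic)
```

In a real job, the replay step would run inside the streaming application's keyed operator, so the update logic is shared between the normal event path and the state rebuild.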

Customer operator in BATCH execution mode

2021-05-25 Thread
ream/execution_mode/ -- ChangZhuo Chen (陳昌倬)

Re: Flink 1.13.0 reactive mode: Job stop and cannot restore from checkpoint

2021-05-17 Thread
gh for debugging, thanks. [0] https://issues.apache.org/jira/browse/FLINK-22686 -- ChangZhuo Chen (陳昌倬)

Re: Flink-pod-template-issue

2021-05-17 Thread
org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/#pod-template -- ChangZhuo Chen (陳昌倬)

Re: How to setup HA properly with Kubernetes Standalone Application Cluster

2021-05-14 Thread
- --fromSavepoint - gs://a-savepoint - --job-classname - com.example.my.application -- ChangZhuo Chen (陳昌倬)

How to setup HA properly with Kubernetes Standalone Application Cluster

2021-05-14 Thread
it is jobmanager failure in Kubernetes Standalone Application Cluster. [0] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster -- ChangZhuo Chen (陳昌倬)

Flink 1.13.0 reactive mode: Job stop and cannot restore from checkpoint

2021-05-13 Thread
. -- ChangZhuo Chen (陳昌倬)

Re: How to specific key serializer

2021-03-31 Thread
m TypeInformation, you should let it return the > correct serializer. Hi Gordon, Thanks for the tip. We have solved the problem by specifying TypeInformation in readKeyedState. -- ChangZhuo Chen (陳昌倬)

How to specific key serializer

2021-03-29 Thread
/julianpeeters/sbt-avrohugger -- ChangZhuo Chen (陳昌倬)

Re: Checkpoint fail due to timeout

2021-03-17 Thread
(at least takes > longer than 3 hours). You can use an aligned checkpoint to scale your job. Just restarting from the checkpoint with the same jar file and a new parallelism should do the trick. -- ChangZhuo Chen (陳昌倬)

Re: Checkpoint fail due to timeout

2021-03-16 Thread
ing blocked on > different Objects. Hi, This call stack is similar to our case as described in [0]. Maybe they are the same issue? [0] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-debug-checkpoint-savepoint-stuck-in-Flink-1-12-2-td42103.html -- ChangZhuo Che

Re: How to debug checkpoint/savepoint stuck in Flink 1.12.2

2021-03-11 Thread
ad.java:834) ps: * The original UIDs are redacted and replaced by their underlying types. * It looks like the subtask ID in the UI is off by one compared with the stack trace. -- ChangZhuo Chen (陳昌倬)

How to debug checkpoint/savepoint stuck in Flink 1.12.2

2021-03-10 Thread
data. Since these operators do not have much data to store in the checkpoint/savepoint, we wonder how we can debug this problem. -- ChangZhuo Chen (陳昌倬)

Re: Cannot connect to queryable state proxy

2021-02-07 Thread
On Thu, Feb 04, 2021 at 04:26:42PM +0800, ChangZhuo Chen (陳昌倬) wrote: > Hi, > > We have problem connecting to queryable state client proxy as described > in [0]. Any help is appreciated. > > * The port 6125 is opened in taskmanager pod. > > ``` > root@-654

Cannot connect to queryable state proxy

2021-02-04 Thread
mp# nc -vz localhost 6125 nc: connect to localhost port 6125 (tcp) failed: Connection refused nc: connect to localhost port 6125 (tcp) failed: Cannot assign requested address ``` [0] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html -- ChangZhuo
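The `nc -vz localhost 6125` check in this message tests whether anything accepts TCP connections on the queryable state proxy port. The same probe can be scripted, which helps when `nc` is not installed in the pod image. This is a minimal Python sketch using only the standard library; the function name `port_is_open` is a hypothetical choice, and the port 6125 comes from the message above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        # create_connection resolves the host and tries each address in turn,
        # similar to how nc reports one failure per resolved address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable addresses.
        return False
```

For the case above, `port_is_open("localhost", 6125)` returning `False` from inside the taskmanager pod would match the "Connection refused" output. That would suggest nothing is listening on the loopback interface, even though the port is exposed on the pod.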

Cannot start from savepoint using Flink 1.12 in standalone Kubernetes + Kubernetes HA

2020-12-29 Thread
Chen (陳昌倬)