Re: Custom Prometheus metrics disappeared in 1.16.2 => 1.17.1 upgrade

2023-12-04 Thread Javier Vegas
at functionality was internal to the PrometheusReporter > implementation and your use case should've continued working if it had > depended on Flink's metric API. > > Best, > Mason > > On Thu, Sep 28, 2023 at 2:51 AM Javier Vegas wrote: >> >> Thanks! I saw

Re: Custom Prometheus metrics disappeared in 1.16.2 => 1.17.1 upgrade

2023-09-28 Thread Javier Vegas
it may break code that indirectly interacts with > the reporter via the singleton instance (e.g., a test trying to assert what > metrics are reported). > > > > On Wed, Sep 27, 2023 at 11:11 AM Javier Vegas wrote: >> >> I implemented some custom Prometheus m

Re: Custom Prometheus metrics disappeared in 1.16.2 => 1.17.1 upgrade

2023-09-27 Thread Javier Vegas
invisible. Do you have any suggestion so my metrics work as in 1.16.2? Thanks again, Javier Vegas On Tue, Sep 26, 2023 at 19:42, Javier Vegas wrote: > > I implemented some custom Prometheus metrics that were working on > 1.16.2, with my configuration > > metrics.re

Custom Prometheus metrics disappeared in 1.16.2 => 1.17.1 upgrade

2023-09-26 Thread Javier Vegas
xplain the missing metrics? Thanks, Javier Vegas

Re: Error upgrading operator CRD

2023-07-07 Thread Javier Vegas
-helm.tgz : 404 Not Found Not sure why helm wants to find 1.0.1 because I have 1.3.1 installed (but that would have resulted in a 404 too, since that download site only has versions 1.4.0 and 1.5.0 of the operator). On Fri, Jul 7, 2023 at 10:59, Javier Vegas wrote: > Somehow I was able in the p

Error upgrading operator CRD

2023-07-07 Thread Javier Vegas
flink-kubernetes-operator/crds/flinksessionjobs.flink.apache.org-v1.yml" does not exist Do I need to pass more arguments to kubectl for it to find the path? How can I verify the CRD path? Thanks, Javier Vegas

Re: DuplicateJobSubmissionException on restart after taskmanagers crash

2023-01-20 Thread Javier Vegas
cause these entries won't be reliably cleaned up when encountering the situation described by FLINK-21928 <https://issues.apache.org/jira/browse/FLINK-21928>." so I guess I need to do some manual cleanup of my S3 HA data before restarting. On Fri, Jan 20, 2023 at 4:58, Javier Vega

DuplicateJobSubmissionException on restart after taskmanagers crash

2023-01-20 Thread Javier Vegas
ion so the Flink app can be restarted without having to uninstall it first? Thanks, Javier Vegas org.apache.flink.util.FlinkException: Failed to execute job at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2203)

"Exec Failure java.io.EOFException null" message before taskmanagers crash

2022-10-25 Thread Javier Vegas
ptions because there are no more taskmanagers around. Any idea what that "Exec Failure java.io.EOFException null" message is about, or what I can do to debug it if it happens again? Thanks, Javier Vegas message from jobmanager Source: event-activity (2/4)#0 (090ef433d97011f8f595885a9bb3

Re: HA not working in standalone mode for operator 1.2

2022-10-13 Thread Javier Vegas
ing to standalone? > > The native mode should work well in almost any setup. > > Gyula > > On Thu, 13 Oct 2022 at 21:41, Javier Vegas wrote: > >> Hi, I have an S3 HA Flink app that works as expected deployed via >> operator 1.2 in native mode, but I am seeing errors when s

Re: Validation error trying to use standalone mode with operator 1.2.0

2022-10-13 Thread Javier Vegas
. > > Cheers > Gyula > > On Sat, 8 Oct 2022 at 00:00, Javier Vegas wrote: > >> >> I am following the operator quickstart >> >> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.2/docs/try-flink-kubernetes-operator/quick-start/ >

HA not working in standalone mode for operator 1.2

2022-10-13 Thread Javier Vegas
Hi, I have an S3 HA Flink app that works as expected deployed via operator 1.2 in native mode, but I am seeing errors when switching to standalone mode (which I want to do mostly to save me having to set jarURI explicitly). I can see the job manager writes the JobGraph in S3, and in the web UI I can

Validation error trying to use standalone mode with operator 1.2.0

2022-10-07 Thread Javier Vegas
I am following the operator quickstart https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.2/docs/try-flink-kubernetes-operator/quick-start/ kubectl create -f https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-1.2/examples/basic.yaml works fine, bu

Missing logback-console.xml when submitting job via operator

2022-09-29 Thread Javier Vegas
My Flink app uses logback for logging. When submitting it from the operator, I get this error: ERROR in ch.qos.logback.classic.joran.JoranConfigurator@7364985f - Could not open URL [file:/opt/flink/conf/logback-console.xml]. java.io.FileNotFoundException: /opt/flink/conf/logback-console.xml (No suc

Re: Classloading issues with Flink Operator / Kubernetes Native

2022-09-21 Thread Javier Vegas
at 4:04 PM Javier Vegas wrote: > >> >> jarURI: local:///opt/flink/lib/MYJARNAME.jar >> >> On Tue, Sep 20, 2022 at 0:25, Yaroslav Tkachenko (< >> yaros...@goldsky.com>) wrote: >> >>> Hi Javier, >>> >>> What do you specify as

Re: Classloading issues with Flink Operator / Kubernetes Native

2022-09-20 Thread Javier Vegas
jarURI: local:///opt/flink/lib/MYJARNAME.jar On Tue, Sep 20, 2022 at 0:25, Yaroslav Tkachenko wrote: > Hi Javier, > > What do you specify as a jarURI? > > On Mon, Sep 19, 2022 at 3:56 PM Javier Vegas wrote: > >> I am doing the same thing (migrating from s

Re: Classloading issues with Flink Operator / Kubernetes Native

2022-09-19 Thread Javier Vegas
I am doing the same thing (migrating from standalone to operator in native mode) and also have my jar in /opt/flink/lib, but for me it works fine, no class loading errors on app startup. On Fri, Sep 16, 2022 at 9:28, Yaroslav Tkachenko wrote: > Application mode. I've done a bit more resea

Re: serviceAccount permissions issue for high availability in operator 1.1

2022-09-11 Thread Javier Vegas
2]. > > [1]. > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/ > [2]. > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/operations/rbac/ > > Best, > Yang > > Javier Vegas wrote on 2022-09

serviceAccount permissions issue for high availability in operator 1.1

2022-09-06 Thread Javier Vegas
itional permissions besides configMap edit to be able to run HA using the operator? Thanks, Javier Vegas

Re: How to open a Prometheus metrics port on the rest service when using the Kubernetes operator?

2022-09-05 Thread Javier Vegas
rator templates in the Helm chart. Any way I can modify the ports in the Flink rest service? Thanks, Javier Vegas On Sun, Sep 4, 2022 at 1:59, Javier Vegas wrote: > Hi, Biao! > > Thanks for the fast response! Setting that in the podTemplate opens the > metrics port

Re: How to open a Prometheus metrics port on the rest service when using the Kubernetes operator?

2022-09-04 Thread Javier Vegas
emptyDir: {} > > > > The bold lines are about how to specify the metric reporter and expose the > metric. The annotations are not required if you use PodMonitor or > ServiceMonitor. Hope it can help! > > > > Best, > > Biao Geng > > > > *From: *Javier

How to open a Prometheus metrics port on the rest service when using the Kubernetes operator?

2022-09-03 Thread Javier Vegas
service to make my job metrics available to Prometheus. Thanks, Javier Vegas

Re: NodePort conflict for multiple HA application-mode standalone Kubernetes deploys in same namespace

2022-07-24 Thread Javier Vegas
the second job, the first job taskmanagers start executing tasks sent by the second job jobmanager, and the second job taskmanagers execute jobs from both jobmanagers. On Fri, Jul 22, 2022 at 12:03, Javier Vegas wrote: > > I am deploying a high-availability Flink job to Kuberne

NodePort conflict for multiple HA application-mode standalone Kubernetes deploys in same namespace

2022-07-22 Thread Javier Vegas
of NodePort in jobmanager-rest-service.yaml?) besides hard-coding different nodePorts for different jobs running in the same namespace? Thanks, Javier Vegas

standalone mode support in the kubernetes operator (FLIP-25)

2022-07-13 Thread Javier Vegas
bernetes/> deployments yet" and mentions https://cwiki.apache.org/confluence/display/FLINK/FLIP-225%3A+Implement+standalone+mode+support+in+the+kubernetes+operator as a "what's next" step. Is there a timeline for that to be released? Thanks, Javier Vegas

Re: IllegalArgumentException: URI is not hierarchical error when initializating jobmanager in cluster

2022-02-02 Thread Javier Vegas
: > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/debugging/debugging_classloading/ > > > > On Wed, Jan 26, 2022 at 4:13 AM Javier Vegas wrote: > >> I am porting a Scala service to Flink in order to make it more scalable >> via running it in a cluster.

Mesos deploy starts Mesos framework but does not start job managers

2021-10-20 Thread Javier Vegas
am attaching all the relevant lines in the syslog. Any ideas what the problem could be or what else I could check to see what is happening? Thanks, Javier Vegas syslog Description: Binary data

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Javier Vegas
ache.org/projects/flink/flink-docs-master/docs/deployment/config/#jobmanager-rpc-port-1 > > On Wed, Sep 29, 2021 at 10:11 AM Javier Vegas wrote: > >> Matthias, thanks for the suggestion! I changed my jobmanager.rpc.address >> param from $HOSTNAME to $HOST:$PORT0 which in the log

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Javier Vegas
n the recipes section of the Marathon > docs [1], HOST was used as well. > > Matthias > > [1] > https://mesosphere.github.io/marathon/docs/recipes.html#command-executor-health-checks > > On Wed, Sep 29, 2021 at 3:37 AM Javier Vegas wrote: > >> Another update: Looki

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
d to 8081 on the container is a random port that I cannot know beforehand. Does the Mesos master try to reach Flink using that Web UI setting? Could this be the issue causing my connection problem, or is this a red herring and the problem is a different one? Thanks, Javier Vegas On Tue, Sep

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
HADOOP_CLASSPATH=$(hadoop classpath) >> export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so >> (as per the documentation you linked) >> >> Regards, >> Roman >> >> On Mon, Sep 27, 2021 at 7:38 PM Javier Vegas wrote: >> > >> > I am trying

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
/lib/libmesos.so > (as per the documentation you linked) > > Regards, > Roman > > On Mon, Sep 27, 2021 at 7:38 PM Javier Vegas wrote: > > > > I am trying to start Flink 1.13.2 on Mesos following the instructions in > https://nightlies.apache.org/flink/flink-docs-release-1.13/do

Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-27 Thread Javier Vegas
m not sure if I need to have HADOOP_HOME set or not, but I don't see anything about HADOOP_HOME in the Flink docs. Any tips on how I can fix my Docker+Marathon+Mesos environment so Flink can connect to my Mesos master? Thanks, Javier Vegas flink--mesos-appmaster-6c49aa87e1d4.log Description: Binary data