Re: Easiest way to do a batch outer join

2023-08-09 Thread Flavio Pompermaier
in terms of capabilities, I think you can try it with GlobalWindow. Another possible solution is to convert the DataStream to a table [1] first and then try it with a join on the Table API. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tableapi/
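A minimal sketch of the Table API suggestion, applied to the use case in the original post below (keep only documents that are not yet indexed); it assumes Flink 1.14+, and the field names docId/indexedId and the sample data are made up:

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

    import static org.apache.flink.table.api.Expressions.$;

    public class BatchOuterJoinSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setRuntimeMode(RuntimeExecutionMode.BATCH); // bounded execution
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

            // Made-up bounded inputs: all documents vs. documents that are already indexed.
            DataStream<String> allDocs = env.fromElements("a", "b", "c");
            DataStream<String> indexedDocs = env.fromElements("b");

            Table all = tEnv.fromDataStream(allDocs).select($("f0").as("docId"));
            Table indexed = tEnv.fromDataStream(indexedDocs).select($("f0").as("indexedId"));

            // Left outer join + IS NULL filter keeps only the not-yet-indexed documents.
            Table notIndexed = all
                    .leftOuterJoin(indexed, $("docId").isEqual($("indexedId")))
                    .where($("indexedId").isNull())
                    .select($("docId"));

            tEnv.toDataStream(notIndexed).print();
            env.execute("anti-join sketch");
        }
    }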

Easiest way to do a batch outer join

2023-08-07 Thread Flavio Pompermaier
Hello everybody, I have a use case where I need to exclude from a DataStream (that is technically a DataSet since I work in batch mode) all already-indexed documents. My idea is to perform an outer join but I didn't find any simple example of a DataStream working in batch mode..I've tried using

Re: TCP Socket stream scalability

2023-07-17 Thread Flavio Pompermaier
I had a similar situation with my Elasticsearch source, where you don't know before executing the query (via the scroll API, for example) how many splits you will find. How should you handle those situations with the new Source API? On Mon, Jul 17, 2023 at 10:09 AM Martijn Visser wrote: > Hi Kamal, > > It

Re: State bootstrapping for Flink SQL / Table API jobs

2023-04-26 Thread Flavio Pompermaier
This feature would be an awesome addition! I'm looking forward to it On Mon, Apr 24, 2023 at 3:59 PM Илья Соин wrote: > Thank you, Shammon FY > > -- > *Sincerely,* > *Ilya Soin* > > On 24 Apr 2023, at 15:19, Shammon FY wrote: > >  > Thanks Илья, there's already a FLIP [1] and discussion

Re: Kryo EOFException: No more bytes left

2021-12-22 Thread Flavio Pompermaier
Hi Dan, in my experience this kind of error is caused by some other problem that's not immediately obvious (like a serialization, memory or RocksDB problem). Could it be that an Avro field cannot be null, or vice versa? On Tue, Dec 21, 2021 at 7:21 PM Dan Hill wrote: > I was not able to

Views support in PostgresCatalog

2021-11-26 Thread Flavio Pompermaier
Hi to all, I was trying to use a view of my Postgres database through the PostgresCatalog but at the moment it seems that the current implementation ignores views. Probably this is caused by the fact that there's no way to avoid INSERT statements in Flink. However, the thrown error is somehow

Re: Avro SpecificRecordBase question

2021-08-05 Thread Flavio Pompermaier
Hi Kirill, as far as I know SpecificRecordBase should work in Flink, I don't know if there's any limitation in StateFun. It seems that the typeClass passed to the generateFieldsFromAvroSchema from the PravegaDeserializationSchema.. Maybe the pravega.LoadsSource does not bind correctly the Avro

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-23 Thread Flavio Pompermaier
Could this be related to https://issues.apache.org/jira/browse/FLINK-22414? On Thu, Jul 22, 2021 at 3:53 PM Timo Walther wrote: > Thanks, this should definitely work with the pre-packaged connectors of > Ververica platform. > > I guess we have to investigate what is going on. Until then, a >

Re: Re: Re: Official flink java client

2021-04-23 Thread Flavio Pompermaier
be careful when > upgrading. > > Best, > Yun > > > --Original Mail ------ > *Sender:*Flavio Pompermaier > *Send Date:*Fri Apr 23 16:10:05 2021 > *Recipients:*Yun Gao > *CC:*gaurav kulkarni , User < > user@flink.apache.org> > *Subjec

Re: Re: Official flink java client

2021-04-23 Thread Flavio Pompermaier
submit jar jobs > and query > job status, and they might be able to help. > > Best, > Yun > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/ops/rest_api.html > > --Original Mail ------ > *Sender:*Flavio Pompermaier > *Sen

Re: Official flink java client

2021-04-23 Thread Flavio Pompermaier
I also interface to Flink clusters using REST in order to avoid many annoying problems (due to dependency conflicts, classpath or env variables). I use an extended version of the RestClusterClient that you can reuse if you want to. It is available at [1] and it add some missing methods to the
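A minimal sketch of driving the documented REST endpoints directly with the JDK HTTP client; the host, jar id, job id and entry class are placeholders, and the jar is assumed to have been uploaded beforehand via POST /jars/upload:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class FlinkRestSketch {
        public static void main(String[] args) throws Exception {
            HttpClient http = HttpClient.newHttpClient();
            String base = "http://jobmanager-host:8081"; // placeholder JobManager address

            // Run a jar previously uploaded via POST /jars/upload (the id is a placeholder).
            String jarId = "<uploaded-jar-id>";
            HttpRequest run = HttpRequest.newBuilder()
                    .uri(URI.create(base + "/jars/" + jarId + "/run"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(
                            "{\"entryClass\":\"com.example.MyJob\",\"parallelism\":2}"))
                    .build();
            System.out.println(http.send(run, HttpResponse.BodyHandlers.ofString()).body());

            // Query the status of a job by id (placeholder id).
            String jobId = "<job-id>";
            HttpRequest status = HttpRequest.newBuilder()
                    .uri(URI.create(base + "/jobs/" + jobId))
                    .GET()
                    .build();
            System.out.println(http.send(status, HttpResponse.BodyHandlers.ofString()).body());
        }
    }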

Re: Flink Hadoop config on docker-compose

2021-04-22 Thread Flavio Pompermaier
>>> However, in "HadoopUtils"[2] we do not support getting the hadoop >>> configuration from classpath. >>> >>> >>> [1]. >>> https://github.com/apache/flink/blob/release-1.11/flink-dist/src/main/flink-bin/bin/config.sh#L256 >>> [2]

Re: Flink Hadoop config on docker-compose

2021-04-15 Thread Flavio Pompermaier
,415 INFO > org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - > Initializing cluster services. > > Here's my code: > > https://gist.github.com/rmetzger/0cf4ba081d685d26478525bf69c7bd39 > > Hope this helps! > > On Wed, Apr 14, 2021 at 5:37 PM Flavio Pompermaier > wrote: &g

Flink Hadoop config on docker-compose

2021-04-14 Thread Flavio Pompermaier
Hi everybody, I'm trying to set up reading from HDFS using docker-compose and Flink 1.11.3. If I pass 'env.hadoop.conf.dir' and 'env.yarn.conf.dir' using FLINK_PROPERTIES (under environment section of the docker-compose service) I see in the logs the following line: "Could not find Hadoop

Re: Flink docker 1.11.3 actually runs 1.11.2

2021-04-13 Thread Flavio Pompermaier
hing wrong in the dockerfiles (they reference the > correct release url), and the referenced release correctly identifies > itself as 1.11.3 . > I also started a container with the image, started a jobmanager, and the > logs show 1.11.3 like they are supposed to do. > > On 4/13/2021 6

Flink docker 1.11.3 actually runs 1.11.2

2021-04-13 Thread Flavio Pompermaier
Hi to all, I've just built a Docker image that uses flink:1.11.3-scala_2.12-java11 but the web UI (and the logs too) displays Flink 1.11.2 (Commit: fe36135). Was there an error with the release? Best, Flavio

Re: Flink 1.13 and CSV (batch) writing

2021-04-11 Thread Flavio Pompermaier
Kurt > > > On Fri, Apr 9, 2021 at 7:41 PM Flavio Pompermaier > wrote: > >> That's absolutely useful. IMHO also join should work without >> windows/triggers and left/right outer joins should be easier in order to >> really migrate legacy code. >> Also reduceGro

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
at 12:38 PM Kurt Young wrote: > Converting from table to DataStream in batch mode is indeed a problem now. > But I think this will > be improved soon. > > Best, > Kurt > > > On Fri, Apr 9, 2021 at 6:14 PM Flavio Pompermaier > wrote: > >> In my real CSV I

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
nt design, > it seems not easy to do. > > Regarding null values, I'm not sure if I understand the issue you had. > What do you mean by > using ',bye' to test null Long values? > > [1] https://issues.apache.org/jira/browse/FLINK-22178 > > Best, > Kurt > > > O

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
And another thing: in my csv I added ',bye' (to test null Long values) but I get a parse error..if I add 'csv.null-literal' = '' it seems to work..is that the right way to solve this problem? On Fri, Apr 9, 2021 at 10:13 AM Flavio Pompermaier wrote: > Thanks Kurt, now it works. However I ca
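A minimal sketch of the 'csv.null-literal' workaround discussed here, assuming a filesystem CSV source; the path and schema are made up:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CsvNullLiteralSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inBatchMode().build());

            // Made-up CSV source where an empty field should be read as a NULL BIGINT.
            tEnv.executeSql(
                "CREATE TABLE users (" +
                "  id BIGINT," +
                "  name STRING" +
                ") WITH (" +
                "  'connector' = 'filesystem'," +
                "  'path' = '/tmp/users.csv'," +      // placeholder path
                "  'format' = 'csv'," +
                "  'csv.null-literal' = ''" +         // empty field <-> SQL NULL
                ")");

            tEnv.executeSql("SELECT * FROM users").print();
        }
    }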

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
>> +------+------+ >> | id | name | >> +------+------+ >> | 4 | d | >> +

Re: Flink 1.13 and CSV (batch) writing

2021-04-08 Thread Flavio Pompermaier
; reproduce the problem. > > Concretely I am running this code: > > final EnvironmentSettings envSettings = EnvironmentSettings.newInstance() > .useBlinkPlanner() > .inStreamingMode() > .build(); > final TableEnvironment tableEnv = TableEnvironment.create(envSett

Re: Flink 1.13 and CSV (batch) writing

2021-04-08 Thread Flavio Pompermaier
Any help here? Moreover if I use the DataStream APIs there's no left/right outer join yet..are those meant to be added in Flink 1.13 or 1.14? On Wed, Apr 7, 2021 at 12:27 PM Flavio Pompermaier wrote: > Hi to all, > I'm testing writing to a CSV using Flink 1.13 and I get the following &

Flink 1.13 and CSV (batch) writing

2021-04-07 Thread Flavio Pompermaier
Hi to all, I'm testing writing to a CSV using Flink 1.13 and I get the following error: The matching candidates: org.apache.flink.table.sinks.CsvBatchTableSinkFactory Unsupported property keys: format.quote-character I create the table env using this: final EnvironmentSettings envSettings =

Re: Flink + Hive + Compaction + Parquet?

2021-03-15 Thread Flavio Pompermaier
What about using Apache Hudi or Apache Iceberg? On Thu, Mar 4, 2021 at 10:15 AM Dawid Wysakowicz wrote: > Hi, > > I know Jingsong worked on Flink/Hive filesystem integration in the > Table/SQL API. Maybe he can shed some light on your questions. > > Best, > > Dawid > On 02/03/2021 21:03, Theo

Re: Integration with Apache AirFlow

2021-02-02 Thread Flavio Pompermaier
rry, but aren't these questions better suited for the Airflow mailing > lists? > > On 2/2/2021 12:35 PM, Flavio Pompermaier wrote: > > Thank you all for the hints. However, looking at the REST API [1] of AirFlow > 2.0 I can't find how to set up my DAG (if this is the right concept). > D

Re: Integration with Apache AirFlow

2021-02-02 Thread Flavio Pompermaier
, but it is mainly for the streaming scenario, which means the >> job won’t stop. In your case, which is all batch jobs, it doesn’t help much. >> Hope this helps. >> >> Regards, >> Xin >> >> >> On Feb 2, 2021, at 4:30 PM, Flavio Pompermaier wrote: >> >> Hi Xin,

Re: Integration with Apache AirFlow

2021-02-02 Thread Flavio Pompermaier
o define two Airflow operators to submit dependent Flink jobs, as > long as the first one can reach the end. > > Regards, > Xin > > On Feb 1, 2021, at 6:43 PM, Flavio Pompermaier wrote: > > Any advice here? > > On Wed, Jan 27, 2021 at 9:49 PM Flavio Pompermaier > wrote: > >>

Re: Integration with Apache AirFlow

2021-02-01 Thread Flavio Pompermaier
Any advice here? On Wed, Jan 27, 2021 at 9:49 PM Flavio Pompermaier wrote: > Hello everybody, > is there any suggested way/pointer to schedule Flink jobs using Apache > AirFlow? > What I'd like to achieve is the submission (using the REST API of AirFlow) > of 2 jobs, where the

Integration with Apache AirFlow

2021-01-27 Thread Flavio Pompermaier
Hello everybody, is there any suggested way/pointer to schedule Flink jobs using Apache AirFlow? What I'd like to achieve is the submission (using the REST API of AirFlow) of 2 jobs, where the second one can be executed only if the first one succeeds. Thanks in advance, Flavio

Re: How to fix deprecation on registerTableSink

2021-01-25 Thread Flavio Pompermaier
en't dropped > the registerTableSink yet. It is also fine to continue using it for now. > > Regards, > Timo > > On 25.01.21 09:40, Flavio Pompermaier wrote: > > Any advice on how to fix those problems? > > > > Best, > > Flavio > > > >

Re: How to fix deprecation on registerTableSink

2021-01-25 Thread Flavio Pompermaier
Any advice on how to fix those problems? Best, Flavio On Thu, Jan 21, 2021 at 4:03 PM Flavio Pompermaier wrote: > Hello everybody, > I was trying to get rid of the deprecation warnings about > using BatchTableEnvironment.registerTableSink() but I don't know how to > proceed. &g

How to fix deprecation on registerTableSink

2021-01-21 Thread Flavio Pompermaier
Hello everybody, I was trying to get rid of the deprecation warnings about using BatchTableEnvironment.registerTableSink() but I don't know how to proceed. My current code does the following: BatchTableEnvironment benv = BatchTableEnvironment.create(env); benv.registerTableSink("outUsers",
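One possible migration path (a sketch, not necessarily the only one) is to declare the sink with DDL and use executeInsert() on the unified TableEnvironment; the connector options below are illustrative:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class RegisterTableSinkMigrationSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inBatchMode().build());

            // Illustrative source (datagen) standing in for the real input.
            tEnv.executeSql(
                "CREATE TABLE users (id BIGINT, name STRING) " +
                "WITH ('connector' = 'datagen', 'number-of-rows' = '10')");

            // Instead of registerTableSink(...): declare the sink with DDL...
            tEnv.executeSql(
                "CREATE TABLE outUsers (id BIGINT, name STRING) " +
                "WITH ('connector' = 'filesystem', 'path' = '/tmp/out-users', 'format' = 'csv')");

            // ...and write to it with executeInsert(); this call submits the job directly,
            // no separate execute() is needed.
            tEnv.from("users").executeInsert("outUsers");
        }
    }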

Re: Distribute Parallelism/Tasks within RichOutputFormat?

2020-12-23 Thread Flavio Pompermaier
I'm not an expert on the streaming APIs but you could try to do something like this: DataStream ds = null; DataStream ds1 = ds.filter(...).setParallelism(3); DataStream ds2 = ds.filter(...).setParallelism(7); Could it fit your needs? Best, Flavio On Wed, Dec 23, 2020 at 3:54 AM Hailu, Andreas
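A runnable version of the sketch above (the element type and filter predicates are made up):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SplitParallelismSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Integer> ds = env.fromElements(1, 2, 3, 4, 5, 6);

            // Two branches of the same stream, each filter with its own parallelism.
            DataStream<Integer> ds1 = ds.filter(v -> v % 2 == 0).setParallelism(3);
            DataStream<Integer> ds2 = ds.filter(v -> v % 2 != 0).setParallelism(7);

            ds1.print();
            ds2.print();
            env.execute("split parallelism sketch");
        }
    }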

Re: How to debug a Flink Exception that does not have a stack trace?

2020-12-10 Thread Flavio Pompermaier
It looks like the problem is in reading a null value in the AvroRowDataDeserializationSchema (see below for the snippet of code from Flink 1.11.1). The root cause is that the source is badly typed, so the call to createConverter() within the

Re: Error while connecting with MSSQL server

2020-12-09 Thread Flavio Pompermaier
I issued a PR some time ago at https://github.com/apache/flink/pull/12038 but the Flink committers were busy refactoring that part..I don't know if it is still required to have that part in the Flink JDBC connector code or if, using the new factories (that use the Java services), you could register

Batch compressed file output

2020-11-27 Thread Flavio Pompermaier
Hello guys, I have to write my batch data (DataSet) to a file format. What I need to do is: 1. split the data if it exceeds some size threshold (by line count or max MB) 2. compress the output data (possibly without converting to the Hadoop format) Are there any suggestions

Re: Logs of JobExecutionListener

2020-11-25 Thread Flavio Pompermaier
ons, so flink-runtime has > barely no annotations. > > The reason why some classes in non-public-facing packages have > annotations is just that at some point someone decided to make something > consciously @Public or @Internal. > > On 24.11.20 12:25, Flavio Pompermaier wrote: &

Re: Jdbc input format and system properties

2020-11-24 Thread Flavio Pompermaier
ini > cluster? Could you set a breakpoint to the static initializer of > AbandonedConnectionCleanupThread and check what's going on there? > > > > On Fri, Nov 20, 2020 at 12:58 PM Flavio Pompermaier > wrote: > >> Yes, that's what is surprising..I already did a remote de

Re: Logs of JobExecutionListener

2020-11-24 Thread Flavio Pompermaier
; Long story short, I think the easiest solution would be to build yourself > an utility class which offers the required methods. The second best option > in my opinion would be to add these methods to the RestClusterClient w/o > giving guarantees for their stability. > > Cheers, > Ti

JobListener weird behaviour

2020-11-24 Thread Flavio Pompermaier
Hello everybody, these days I have been trying to use the JobListener to implement a simple piece of logic in our platform that consists in calling an external service to signal that the job has ended and, in case of failure, saving the error cause. After some problems getting it to work when starting a job

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
ogram/rest/RestClusterClientExtended.java On Mon, Nov 23, 2020 at 4:38 PM Flavio Pompermaier wrote: > I don't know if they need to be added also to the ClusterClient but for > sure they are missing in the RestClusterClient > > On Mon, Nov 23, 2020 at 4:31 PM Aljoscha Krettek > wrot

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
I don't know if they need to be added also to the ClusterClient but for sure they are missing in the RestClusterClient On Mon, Nov 23, 2020 at 4:31 PM Aljoscha Krettek wrote: > On 23.11.20 16:26, Flavio Pompermaier wrote: > > Thank you Aljosha,.now that's more clear! > >

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
); - public boolean isJobRunning(JobID fjid) - public JarUploadResponseBody uploadJar(Path uploadedFile); and I was also going to add jarRun.. Let me know, Flavio On Mon, Nov 23, 2020 at 3:57 PM Aljoscha Krettek wrote: > On 20.11.20 22:09, Flavio Pompermaier wrote: > > To achiev

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
support, Flavio On Fri, Nov 20, 2020 at 10:09 PM Flavio Pompermaier wrote: > I think that the problem is that my REST service submits the job to > the Flink standalone cluster and responds to the client with the > submitted job ID. > To achieve this, I was using the > RestCluste

Re: Logs of JobExecutionListener

2020-11-20 Thread Flavio Pompermaier
is: is there a simple way to achieve my goal? Am I doing something wrong? At the moment I had to implement a job-status polling thread after the line (1) but this looks like a workaround to me.. Best, Flavio On Thu, Nov 19, 2020 at 4:28 PM Flavio Pompermaier wrote: > > You're right..I removed my

Re: Jdbc input format and system properties

2020-11-20 Thread Flavio Pompermaier
t's in the project, > mysql can access it. > > On Fri, Nov 20, 2020 at 10:46 AM Flavio Pompermaier > wrote: > >> I've just tested the following code in a java class and the property >> (-Dcom.mysql.cj.disableAbandonedConnectionCleanup=true) is read correctly >>

Re: Jdbc input format and system properties

2020-11-20 Thread Flavio Pompermaier
roject is: mysql mysql-connector-java 8.0.22 provided On Fri, Nov 20, 2020 at 10:07 AM Flavio Pompermaier wrote: > no no I didn't relocate any class related to jdbc > > On Fri, Nov 20, 2020 at 10:02 Arvid Heise wrote: > >> I was particularly asking if you relocat

Re: Jdbc input format and system properties

2020-11-20 Thread Flavio Pompermaier
the value of > PropertyDefinitions.SYSP_disableAbandonedConnectionCleanup in your final > jar? > > On Fri, Nov 20, 2020 at 9:35 AM Flavio Pompermaier > wrote: > >> the mysql connector is put in the client classpath and in the Flink lib >> dir. When i debugged remotely the AbandonedCon

Re: Jdbc input format and system properties

2020-11-20 Thread Flavio Pompermaier
haded then. You could decompile your jar to be sure. Have you verified > that this is working as intended without Flink? > > On Thu, Nov 19, 2020 at 9:19 PM Flavio Pompermaier > wrote: > >> the properties arrives to the task manager because I can see them in the >> java process

Re: Jdbc input format and system properties

2020-11-19 Thread Flavio Pompermaier
perties arrived at the task manager in the > remote debugger session? For example, you could check the JVisualVM > Overview tab. > > On Thu, Nov 19, 2020 at 8:38 PM Flavio Pompermaier > wrote: > >> At the moment I use a standalone cluster, isn't using env.java.opts the >

Re: Jdbc input format and system properties

2020-11-19 Thread Flavio Pompermaier
want to have it on the > task managers. > > The specific options to pass it to the task managers depend on the way you > deploy. -yD for yarn for example. For docker or k8s, you would use env. > > On Wed, Nov 18, 2020 at 10:20 PM Flavio Pompermaier > wrote: > >>

Re: Logs of JobExecutionListener

2020-11-19 Thread Flavio Pompermaier
>> Andrey >> >> >> >> On Thu, Nov 19, 2020 at 2:40 PM Aljoscha Krettek >> >> wrote: >> >> >> >>> JobListener.onJobExecuted() is only invoked in >> >>> ExecutionEnvironment.execute() and ContextEnvironment.execute(). If

Re: Logs of JobExecutionListener

2020-11-19 Thread Flavio Pompermaier
() and ContextEnvironment.execute(). If none >> of these is still in the call chain with that setup then the listener >> will not be invoked. >> >> Also, this would only happen on the client, not on the broker (in your >> case) or the server (JobManager). >> >>

Re: Logs of JobExecutionListener

2020-11-19 Thread Flavio Pompermaier
rote: > >>> Hi Flavio, > >>> > >>> I think I can reproduce what you are reporting (assuming you also pass > >>> '--output' to 'flink run'). > >>> I am not sure why it behaves like this. I would suggest filing a Jira > >>> ticket

Re: Logs of JobExecutionListener

2020-11-19 Thread Flavio Pompermaier
uld suggest filing a Jira > > ticket for this. > > > > Best, > > Andrey > > > > On Wed, Nov 18, 2020 at 9:45 AM Flavio Pompermaier > > > wrote: > > > >> is this a bug or is it a documentation problem...? > >> > >>

Jdbc input format and system properties

2020-11-18 Thread Flavio Pompermaier
Hi to all, while trying to solve a leak with dynamic class loading I found out that the MySQL connector creates an AbandonedConnectionCleanupThread that is retained in the ChildFirstClassLoader..from version 8.0.22 there's the possibility to inhibit this thread by passing the system property

Re: Logs of JobExecutionListener

2020-11-17 Thread Flavio Pompermaier
is this a bug or is it a documentation problem...? On Sat, Nov 14, 2020 at 18:44 Flavio Pompermaier wrote: > I've also verified that the problem persists also when using a modified version > of the WordCount class. > If you add the code pasted at the end of this email at the end of its main

Re: [External Sender] Re: Random Task executor shutdown (java.lang.OutOfMemoryError: Metaspace)

2020-11-17 Thread Flavio Pompermaier
-first / parent-first classloading refactoring and at that time that was the way to go..but now it can cause this kind of problem when using the child-first policy. On Mon, Nov 16, 2020 at 8:44 PM Flavio Pompermaier wrote: > Thank you Kye for your insights...in my mind, if the job runs without > pr

Re: [External Sender] Re: Random Task executor shutdown (java.lang.OutOfMemoryError: Metaspace)

2020-11-16 Thread Flavio Pompermaier
l be) different in your case. The problem is that if >>>> user-code registers something in some (static) storage located in class >>>> loaded with parent (TaskTracker) classloader, then its associated classes >>>> will never be GC'd and Metaspace will grow. A good

Re: Random Task executor shutdown (java.lang.OutOfMemoryError: Metaspace)

2020-11-16 Thread Flavio Pompermaier
16, 2020 at 3:29 PM Jan Lukavský wrote: > Yes, that could definitely cause this. You should probably avoid using > these flink-internal shaded classes and ship your own versions (not shaded). > > Best, > > Jan > On 11/16/20 3:22 PM, Flavio Pompermaier wrote: > > Than

Re: Random Task executor shutdown (java.lang.OutOfMemoryError: Metaspace)

2020-11-16 Thread Flavio Pompermaier
heap (in general), but to look at where the 15k > objects of type Class are referenced from. That might help you figure this > out. I'm not sure if there is something that can be done in general to > prevent this type of leaks. That would be probably question on dev@ > mailing li

Random Task executor shutdown (java.lang.OutOfMemoryError: Metaspace)

2020-11-16 Thread Flavio Pompermaier
Hello everybody, I was writing this email when a similar thread on this mailing list appeared.. The difference is that the other problem seems to be related to Flink 1.10 on YARN and does not output anything helpful for debugging the cause of the problem. Indeed, in my use case I use Flink

Re: Logs of JobExecutionListener

2020-11-14 Thread Flavio Pompermaier
System.out.println(" EXECUTED"); } }); env.execute("WordCount Example"); } else { System.out.println("Printing result to stdout. Use --output to specify output path."); counts.print(); } On Fri, Nov 13, 2020 at 4:25 PM Flavio Pompermaier

Re: Logs of JobExecutionListener

2020-11-13 Thread Flavio Pompermaier
unfortunately.. > Best, > Matthias > > On Thu, Nov 12, 2020 at 7:48 PM Flavio Pompermaier > wrote: > >> Actually what I'm experiencing is that the JobListener is executed >> successfully if I run my main class from the IDE, while the job listener is >> not fired at

Re: Logs of JobExecutionListener

2020-11-12 Thread Flavio Pompermaier
with the env.execute() and I do env.registerJobListener() when I create the Execution env via ExecutionEnvironment.getExecutionEnvironment(). Thanks in advance for any help, Flavio On Thu, Nov 12, 2020 at 2:13 PM Flavio Pompermaier wrote: > Hello everybody, > I'm trying to use the JobLi

Logs of JobExecutionListener

2020-11-12 Thread Flavio Pompermaier
Hello everybody, I'm trying to use the JobListener to track when a job finishes (with Flink 1.11.0). It works great but I have the problem that logs inside the onJobExecuted are not logged anywhere..is it normal? Best, Flavio
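For reference, a minimal JobListener sketch (Flink 1.11+); note that both callbacks run in the client process, which is where their log output ends up:

    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.core.execution.JobClient;
    import org.apache.flink.core.execution.JobListener;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class JobListenerSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.registerJobListener(new JobListener() {
                @Override
                public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
                    // Runs on the client once the job has been submitted (or submission failed).
                    System.out.println("submitted: "
                            + (jobClient != null ? jobClient.getJobID() : throwable));
                }

                @Override
                public void onJobExecuted(JobExecutionResult result, Throwable throwable) {
                    // Runs on the client when execute() returns, i.e. only if the client
                    // actually waits for the job instead of submitting it detached.
                    System.out.println("executed: "
                            + (throwable == null ? "OK" : throwable.getMessage()));
                }
            });

            env.fromElements(1, 2, 3).print();
            env.execute("job-listener sketch");
        }
    }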

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2020-11-06 Thread Flavio Pompermaier
his https://issues.apache.org/jira/browse/FLINK-20020 help? > > Cheers, > Kostas > > On Thu, Nov 5, 2020 at 9:39 PM Flavio Pompermaier > wrote: > > > > Hi everybody, > > I was trying to use the JobListener in my job but onJobExecuted() on > Flink 1.11.0 but I can't under

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2020-11-05 Thread Flavio Pompermaier
https://issues.apache.org/jira/browse/FLINK-12214 > > > > -- > Sent from: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ > -- Flavio Pompermaier Development Department OKKAM S.r.l. Tel. +(39) 0461 041809

Re: Flink job percentage

2020-11-05 Thread Flavio Pompermaier
D", "FAILED", "RECONCILING" ] > > Naturally, for your code you only want to check for the lattern. > > The documentation is hence correct. FYI, we directly access the > corresponding enums to generate this list, so it _cannot_ be out-of-sync. > > On 11

Re: Flink job percentage

2020-11-05 Thread Flavio Pompermaier
"FAILED", "RECONCILING" ] > > Naturally, for your code you only want to check for the latter. > > The documentation is hence correct. FYI, we directly access the > corresponding enums to generate this list, so it _cannot_ be out-of-sync. > > On 11/5/2020 11:16

Re: Flink job percentage

2020-11-05 Thread Flavio Pompermaier
AM Robert Metzger > wrote: > >> Hi Flavio, >> >> I'm not aware of such a heuristic being implemented anywhere. You need to >> come up with something yourself. >> >> On Fri, Aug 7, 2020 at 12:55 PM Flavio Pompermaier >> wrote: >> >>>

Re: Missing help about run-application action in Flink CLI client

2020-11-04 Thread Flavio Pompermaier
Here it is: https://issues.apache.org/jira/browse/FLINK-19969 Best, Flavio On Wed, Nov 4, 2020 at 11:08 AM Kostas Kloudas wrote: > Could you also post the ticket here @Flavio Pompermaier and we will > have a look before the upcoming release. > > Thanks, > Kostas > > On W

Missing help about run-application action in Flink CLI client

2020-11-04 Thread Flavio Pompermaier
Hello everybody, I was looking into the currently supported application modes when submitting a Flink job so I tried to use the CLI help (I'm using Flink 1.11.0) but I can't find any help about the action "run-application" at the moment...am I wrong? Is there any JIRA to address this missing

Re: RestClusterClient and classpath

2020-10-30 Thread Flavio Pompermaier
is thread, 2 and 3 days ago > respectively. We could've saved some time here had you checked whether the > jar actually contains the class. > > On 10/30/2020 12:24 PM, Flavio Pompermaier wrote: > > I just discovered that I was using the "slim" jar instead of the "fat&

Re: RestClusterClient and classpath

2020-10-30 Thread Flavio Pompermaier
sing transitive dependencies in static fields IIRC). > > > Actually I was able to use the REST API without creating the JobGraph > > I'm not debating that, and pointed that out myself. > > [without a job graph you] cannot use the REST API *(outside of > uploading j

Re: RestClusterClient and classpath

2020-10-30 Thread Flavio Pompermaier
class references. > > On 10/30/2020 10:48 AM, Flavio Pompermaier wrote: > > For "REST only client" I mean using only the REST API to interact with the > Flink cluster, i.e. without creating any PackagedProgram and thus incurring > into classpath problems. > I'v

Re: RestClusterClient and classpath

2020-10-30 Thread Flavio Pompermaier
? Do you mean a plain http client, > not something that Flink provides? > > On 10/30/2020 10:02 AM, Flavio Pompermaier wrote: > > Nothing to do also with IntelliJ..do you have any sample project I > can reuse to test the job submission to a cluster? > I can't really unders

Re: RestClusterClient and classpath

2020-10-30 Thread Flavio Pompermaier
job (for example I can count the number of completed vertices wrt the total count of vertices). Is there any suggested way to do that apart from polling? Best, Flavio On Wed, Oct 28, 2020 at 12:19 PM Flavio Pompermaier wrote: > I'm running the code from Eclipse, the jar exists and it conta

Re: RestClusterClient and classpath

2020-10-28 Thread Flavio Pompermaier
I think that we are setting the correct classloader > > during jobgraph creation [1]. Is that what you mean? > > > > Cheers, > > Kostas > > > > [1] > https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/

Re: RestClusterClient and classpath

2020-10-28 Thread Flavio Pompermaier
extClassLoader = Thread.currentThread().getContextClassLoader(); try { Thread.currentThread().setContextClassLoader(packagedProgram.getUserCodeClassLoader()); // do stuff } finally { Thread.currentThread().setContextClassLoader(contextClassLoader); } > > On

Re: RestClusterClient and classpath

2020-10-28 Thread Flavio Pompermaier
Any help here? How can I understand why the classes inside the jar are not found when creating the PackagedProgram? On Tue, Oct 27, 2020 at 11:04 AM Flavio Pompermaier wrote: > In the logs I see that the jar is the classpath (I'm trying to debug the > program from the IDE)..

Re: RestClusterClient and classpath

2020-10-27 Thread Flavio Pompermaier
0/27/2020 10:36 AM, Chesnay Schepler wrote: > > Well it happens on the client before you even hit the RestClusterClient, > so I assume that either your jar is not packaged correctly or you your > JobExecutor is putting it on the classpath. > > On 10/27/2020 9:42 AM, Flavio Pomper

Re: RestClusterClient and classpath

2020-10-27 Thread Flavio Pompermaier
eing? I'm wondering if the > error happens on the client or server side (among other questions I have). > > On Mon, Oct 26, 2020 at 5:58 PM Flavio Pompermaier > wrote: > >> Hi to all, >> I was trying to use the RestClusterClient to submit my job to the Flink >> cluste

RestClusterClient and classpath

2020-10-26 Thread Flavio Pompermaier
Hi to all, I was trying to use the RestClusterClient to submit my job to the Flink cluster. However when I submit the job Flink cannot find the classes contained in the "fat" jar..what should I do? Am I missing something in my code? This is the current client code I'm testing: public static void
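A hedged sketch of this submission path with flink-clients (host, jar path and entry class are placeholders); as the rest of the thread shows, the jar passed to PackagedProgram must be the fat/shaded artifact:

    import java.io.File;

    import org.apache.flink.client.deployment.StandaloneClusterId;
    import org.apache.flink.client.program.PackagedProgram;
    import org.apache.flink.client.program.PackagedProgramUtils;
    import org.apache.flink.client.program.rest.RestClusterClient;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.RestOptions;
    import org.apache.flink.runtime.jobgraph.JobGraph;

    public class RestSubmitSketch {
        public static void main(String[] args) throws Exception {
            Configuration config = new Configuration();
            config.setString(RestOptions.ADDRESS, "jobmanager-host"); // placeholder
            config.setInteger(RestOptions.PORT, 8081);

            // The jar must be the fat/shaded artifact so that all user classes are inside it.
            PackagedProgram program = PackagedProgram.newBuilder()
                    .setJarFile(new File("/path/to/my-job-fat.jar")) // placeholder
                    .setEntryPointClassName("com.example.MyJob")     // placeholder
                    .setConfiguration(config)
                    .build();

            JobGraph jobGraph = PackagedProgramUtils.createJobGraph(program, config, 2, false);

            RestClusterClient<StandaloneClusterId> client =
                    new RestClusterClient<>(config, StandaloneClusterId.getInstance());
            try {
                System.out.println("Submitted job " + client.submitJob(jobGraph).get());
            } finally {
                client.close();
            }
        }
    }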

Re: Packaging multiple Flink jobs from a single IntelliJ project

2020-09-01 Thread Flavio Pompermaier
Yes, the recommended way to proceed in your use case is to put all the classes in a single JAR and to tell the Flink client which main class to run. Best, Flavio

Re: Client's documentation for deploy and run remotely.

2020-08-19 Thread Flavio Pompermaier
I agree with you that that part of the docs is quite outdated.. On Thu, Aug 13, 2020 at 4:55 PM Jacek Grzebyta wrote: > It seems the documentation might be outdated. Probably I found what I > wanted in different request: >

Flink job percentage

2020-08-07 Thread Flavio Pompermaier
Hi to all, one of our customers asked us to show a percentage of completion of a Flink batch job. Is there any already-implemented heuristic I can use to compute it? Will this also be possible once the DataSet API migrates to DataStream..? Thanks in advance, Flavio
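One rough heuristic (a sketch, assuming the documented /jobs/:jobid REST response with its 'vertices' array, and the same idea mentioned in the RestClusterClient thread above) is the ratio of FINISHED job vertices to the total number of vertices:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class JobProgressSketch {
        public static void main(String[] args) throws Exception {
            String base = "http://jobmanager-host:8081"; // placeholder
            String jobId = "<job-id>";                   // placeholder

            HttpResponse<String> resp = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder().uri(URI.create(base + "/jobs/" + jobId)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());

            JsonNode vertices = new ObjectMapper().readTree(resp.body()).get("vertices");
            long finished = 0;
            for (JsonNode v : vertices) {
                if ("FINISHED".equals(v.get("status").asText())) {
                    finished++;
                }
            }
            // Very coarse: every job vertex counts the same regardless of its actual workload.
            System.out.printf("~%.0f%% of vertices finished%n", 100.0 * finished / vertices.size());
        }
    }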

Re: Submit Flink 1.11 job from java

2020-08-07 Thread Flavio Pompermaier
s the job and returns a JobClient. > > Best, > Godfrey > > On Thu, Aug 6, 2020 at 9:45 PM Flavio Pompermaier wrote: > >> Hi to all, >> in my current job server I submit jobs to the cluster setting up an SSH >> session with the JobManager host and running the bin/flink
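The method alluded to here is presumably executeAsync(), which submits the job and returns a JobClient (Flink 1.11+); a minimal sketch:

    import org.apache.flink.core.execution.JobClient;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExecuteAsyncSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.fromElements("a", "b", "c").print();

            // Non-blocking submission: returns as soon as the job has been submitted.
            JobClient jobClient = env.executeAsync("submit-from-java sketch");

            System.out.println("Job id: " + jobClient.getJobID());
            System.out.println("Status: " + jobClient.getJobStatus().get());
        }
    }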

Submit Flink 1.11 job from java

2020-08-06 Thread Flavio Pompermaier
Hi to all, in my current job server I submit jobs to the cluster by setting up an SSH session with the JobManager host and running the bin/flink run command remotely (since the jar is put in the flink-web-upload directory). Unfortunately, this approach makes it very difficult to capture all exceptions

Re: JDBCOutputFormat dependency loading error

2020-08-03 Thread Flavio Pompermaier
Yes, the problem was indeed mine (two different connectors for MariaDB: both mysql and mariadb-client). Sorry for the confusion. On Mon, Aug 3, 2020 at 12:26 PM Till Rohrmann wrote: > Glad to hear it! > > On Mon, Aug 3, 2020 at 11:59 AM Flavio Pompermaier > wrote: >> Yes

Re: JDBCOutputFormat dependency loading error

2020-08-03 Thread Flavio Pompermaier
e dependencies in your user jar to > make your job run. Please check from where com.mysql.cj.jdbc.Driver is > being loaded when running the job from the IDE. > > Cheers, > Till > > On Fri, Jul 31, 2020 at 4:55 PM Flavio Pompermaier > wrote: > >> Hi to all, >> I

JDBCOutputFormat dependency loading error

2020-07-31 Thread Flavio Pompermaier
Hi to all, I'm trying to run my DataSet job on Flink 1.11.0 and I'm connecting to MariaDB in my code. I've put the mariadb-java-client-2.6.0.jar in the lib directory and in the pom.xml I set that dependency as provided. The code runs successfully from the IDE but when I try to run the code on

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
ing > `runCustomOperation()`. Out of curiosity, what are you using that for? > > We have definitely thought about the first two points you mentioned, > though. Especially processing-time will make it tricky to define unified > execution semantics. > > Best, > Aljoscha > > On 30.07.2

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
hat it works > well on bounded input. > > On Thu, Jul 30, 2020 at 8:49 AM Flavio Pompermaier > wrote: > >> Just to contribute to the discussion, when we tried to do the migration we >> faced some problems that could make migration quite difficult. >> 1

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
Just to contribute to the discussion, when we tried to do the migration we faced some problems that could make migration quite difficult. 1 - It's difficult to test because of https://issues.apache.org/jira/browse/FLINK-18647 2 - missing mapPartition 3 - missing DataSet

Re: Customization of execution environment

2020-07-30 Thread Flavio Pompermaier
theory it would be nicer if the configuration returned >> was editable, but the handling of configs in Flink is pretty involved >> already. >> >> >> On Tue, Jul 28, 2020 at 10:13 AM Flavio Pompermaier >> wrote: >> >>> Hi to all, >>>

Customization of execution environment

2020-07-28 Thread Flavio Pompermaier
Hi to all, migrating to Flink 1.11 I've tried to customize the exec env in this way: ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); BatchTableEnvironment bte = BatchTableEnvironment.create(env); final Configuration conf = bte.getConfig().getConfiguration();

Re: AllwindowStream and RichReduceFunction

2020-07-28 Thread Flavio Pompermaier
uld work with an aggregate() instead of reduce(). > > Best, > Aljoscha > > On 24.07.20 17:02, Flavio Pompermaier wrote: > > In my reduce function I want to compute some aggregation on the > sub-results > > of a map-partition (that I tried to migrate from DataSet
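A sketch of the aggregate() alternative mentioned here, using an illustrative count window and Long elements; unlike reduce(), an AggregateFunction keeps a separate accumulator type:

    import org.apache.flink.api.common.functions.AggregateFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class WindowAllAggregateSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromElements(1L, 2L, 3L, 4L)
               .countWindowAll(4) // illustrative window; other window assigners work the same way
               .aggregate(new AggregateFunction<Long, Long, Long>() {
                   @Override public Long createAccumulator()       { return 0L; }
                   @Override public Long add(Long value, Long acc) { return acc + value; }
                   @Override public Long getResult(Long acc)       { return acc; }
                   @Override public Long merge(Long a, Long b)     { return a + b; }
               })
               .print();

            env.execute("windowAll aggregate sketch");
        }
    }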

Re: Parquet batch table sink in Flink 1.11

2020-07-27 Thread Flavio Pompermaier
urce/sink for `AvroParquetOutputFormat`, because > the data structure is always Row or RowData, should not be a avro object. > > Best, > Jingsong > > On Tue, Jul 21, 2020 at 8:09 PM Flavio Pompermaier > wrote: > >> This is what I actually do but I was hoping to be able to get rid of the
