Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
() and finalizeGlobal() to set the outputCommitter to FileOutputCommitter(), or keep it as the default in case of no specific assignment. saluti, Stefano 2015-07-22 16:48 GMT+02:00 Stefano Bortoli s.bort...@gmail.com: In fact, on close() of the HadoopOutputFormat the fileOutputCommitter returns false
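
For context, a minimal sketch of how mongo-hadoop's MongoOutputFormat is typically wired into a Flink batch job (assuming mongo-hadoop 1.4 and the mapreduce API; the URI and the dataset are placeholders):

    import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import com.mongodb.hadoop.MongoOutputFormat;
    import com.mongodb.hadoop.io.BSONWritable;

    Job job = Job.getInstance();
    // tell mongo-hadoop where to write; the URI is a placeholder
    job.getConfiguration().set("mongo.output.uri",
        "mongodb://localhost:27017/mydb.mycollection");

    // wrap the Hadoop output format so Flink drives its committer
    HadoopOutputFormat<Text, BSONWritable> mongoOut =
        new HadoopOutputFormat<>(new MongoOutputFormat<Text, BSONWritable>(), job);

    resultSet.output(mongoOut); // resultSet: DataSet<Tuple2<Text, BSONWritable>>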

Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
Debugging, it seems the commitTask method of the MongoOutputCommitter is never called. Is it possible that this 'bulk' approach of mongo-hadoop 1.4 does not fit the task execution model of Flink? Any idea? Thanks a lot in advance. saluti, Stefano Stefano Bortoli, PhD *ENS Technical Director
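
For reference, the task-side commit in Flink's Hadoop wrapper roughly follows this pattern, so if the committer's needsTaskCommit() answers false, commitTask() is never invoked (a simplified sketch, not the actual Flink source):

    // simplified sketch of the close() path in the Hadoop output format wrapper
    public void close() throws IOException {
        try {
            recordWriter.close(taskAttemptContext);
            if (outputCommitter.needsTaskCommit(taskAttemptContext)) {
                // a bulk-oriented committer like mongo-hadoop 1.4's may answer
                // false here, in which case buffered writes are never flushed
                outputCommitter.commitTask(taskAttemptContext);
            }
        } catch (InterruptedException e) {
            throw new IOException("Task commit interrupted", e);
        }
    }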

Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
2015-07-22 15:53 GMT+02:00 Stefano Bortoli bort...@okkam.it: Debugging, it seems the commitTask method of the MongoOutputCommitter is never called. Is it possible that this 'bulk' approach of mongo-hadoop 1.4 does not fit the task execution model of Flink? Any idea? Thanks a lot in advance

Re: kryo exception due to race condition

2015-10-07 Thread Stefano Bortoli
Hi Stefano, > > we'll definitely look into it once Flink Forward is over and we've > finished the current release work. Thanks for reporting the issue. > > Cheers, > Till > > On Tue, Oct 6, 2015 at 9:21 AM, Stefano Bortoli <bort...@okkam.it> wrote: > >> Hi guys, I co

Re: kryo exception due to race condition

2015-10-06 Thread Stefano Bortoli
Hi guys, I managed to complete the process by crossing byte arrays that I deserialize within the group function. However, I think this workaround is feasible only for relatively simple processes. Any idea/plan about fixing the serialization problem? saluti, Stefano Stefano Bortoli, PhD *ENS Technical Director
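
A rough sketch of that workaround, assuming the POJOs implement java.io.Serializable (the helper names are hypothetical):

    import java.io.*;

    // hypothetical helpers: pre-serialize POJOs so the cross operates on
    // byte[] and Kryo never touches the records concurrently
    public static byte[] toBytes(Serializable pojo) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(pojo);
        }
        return bos.toByteArray();
    }

    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        // called inside the group/cross function, after the shuffle
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }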

kryo exception due to race condition

2015-10-01 Thread Stefano Bortoli
Hi guys, I hit a Kryo exception while running a process 'crossing' POJO datasets. I am using the 0.10-milestone-1. Checking the serializer: org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:210) I have noticed that the Kryo instance is reused along
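
For background, Kryo instances are not thread-safe, so any accidental sharing across threads can corrupt reads exactly like this; outside Flink, the usual defensive pattern is one instance per thread (a generic sketch, not Flink's internal fix):

    import com.esotericsoftware.kryo.Kryo;

    // one Kryo per thread; a single shared instance must never be used concurrently
    private static final ThreadLocal<Kryo> KRYO = ThreadLocal.withInitial(Kryo::new);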

Re: data flow example on cluster

2015-10-02 Thread Stefano Bortoli
I had problems running a Flink job with Maven; probably there is some classloading issue. What worked for me was to run a simple java command with the uberjar. So I build the jar using Maven, and then run it this way: java -Xmx2g -cp target/youruberjar.jar yourclass arg1 arg2 hope it helps, Stefano

Re: kryo exception due to race condition

2015-10-02 Thread Stefano Bortoli
r this? > > On Thu, Oct 1, 2015 at 6:46 PM, Stefano Bortoli <s.bort...@gmail.com> > wrote: > >> Hi guys, >> >> I hit a Kryo exception while running a process 'crossing' POJOs datasets. >>

Re: threads, parallelism and task managers

2016-04-13 Thread Stefano Bortoli
>>> On my laptop I have 8 cores, and if I set parallelism to 16 I expect to see 16 calls to the connection pool (i.e. 'CREATING NEW CONNECTION!'), while I see only 8 (up to my maximum number of cores). The num

Joda DateTimeSerializer

2016-04-08 Thread Stefano Bortoli
Hi to all, we've just upgraded to Flink 1.0.0 and we had some problems with Joda DateTime serialization. The problem was caused by FLINK-3305, which removed the JavaKaffee dependency. We had to re-add that dependency in our application and then register the DateTime serializer in the environment:
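
The registration we ended up with looks roughly like this (assuming the de.javakaffee:kryo-serializers dependency is back on the classpath):

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.joda.time.DateTime;
    import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // route Joda DateTime through the JavaKaffee serializer instead of Kryo's default
    env.getConfig().registerTypeWithKryoSerializer(DateTime.class, JodaDateTimeSerializer.class);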

threads, parallelism and task managers

2016-03-23 Thread Stefano Bortoli
Hi guys, I am trying to test a job that should run a number of tasks to read from an RDBMS using an improved JDBC connector. The connection and the reading run smoothly, but I cannot seem to get above the limit of 8 concurrent threads running. 8 is of course the number of cores of my
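
For anyone hitting the same wall: a local environment defaults its parallelism to the number of cores, so a higher degree has to be requested explicitly (a sketch; 16 is an arbitrary value):

    import org.apache.flink.api.java.ExecutionEnvironment;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(16); // raise the job-wide default above the core count

    // or per operator, e.g. on the source:
    // env.createInput(jdbcInputFormat).setParallelism(16);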

Re: Oracle 11g number serialization: classcast problem

2016-03-23 Thread Stefano Bortoli
Flink. We are testing it on an Oracle table of 11 billion records, but we have not yet got through a complete run. We are just at the first-prototype level, so there is surely some work to do. :-) saluti, Stefano 2016-03-23 10:38 GMT+01:00 Chesnay Schepler <ches...@apache.org>: > On 23.03.2016 10:

Re: threads, parallelism and task managers

2016-03-30 Thread Stefano Bortoli
Ufuk Celebi <u...@apache.org>: > Do you have the code somewhere online? Maybe someone can have a quick > look over it later. I'm pretty sure that it is indeed a problem with the > custom input format. > > – Ufuk > > On Tue, Mar 29, 2016 at 3:50 PM, Stefano Bortoli

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
with > val executionContext = ExecutionContext.fromExecutor(new ForkJoinPool()) > and thus the number of concurrently running threads is limited to the number of cores (using the default constructor of the ForkJoinPool). > What
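
That matches the ForkJoinPool default: the no-arg constructor sizes the pool to the number of available processors, which would explain the observed cap of 8 on an 8-core machine (a minimal illustration):

    import java.util.concurrent.ForkJoinPool;

    ForkJoinPool pool = new ForkJoinPool();
    // parallelism == Runtime.getRuntime().availableProcessors()
    System.out.println(pool.getParallelism()); // prints 8 on an 8-core laptop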

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
ExecutionContext? That should actually be > something which you shouldn't be concerned with since it is only used > internally by the runtime. > > Cheers, > Till > > On Tue, Mar 29, 2016 at 12:09 PM, Stefano Bortoli <s.bort...@gmail.com> > wrote: > >> Well, in the

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
you should have 32 > threads running. > > On Tue, Mar 29, 2016 at 12:29 PM, Stefano Bortoli <s.bort...@gmail.com> > wrote: > >> In fact, I don't use it. I just had to crawl back through the runtime >> implementation to get to the point where parallelism was switching from 32

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
there would be millions of queries before the statistics are collected. Perhaps we are doing something wrong; we still have to figure out what. :-/ Thanks a lot for your help. saluti, Stefano 2016-03-29 13:30 GMT+02:00 Stefano Bortoli <s.bort...@gmail.com>: > That is exactly my point. I shoul

Re: Oracle 11g number serialization: classcast problem

2016-03-23 Thread Stefano Bortoli
found when I wrote the formats. > > Is the INT returned as a double as well? > > Note: the (runtime) output type is in no way connected to the TypeInfo you > pass when constructing the format. > > On 21.03.2016 14:16, Stefano Bortoli wrote: > >> Hi squirrels, >>
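
For context, Oracle's JDBC driver surfaces NUMBER columns as java.math.BigDecimal through getObject(), so a blind cast to Integer or Double fails at runtime; reading the column explicitly sidesteps the class cast (a sketch; the column name is hypothetical):

    import java.math.BigDecimal;
    import java.sql.ResultSet;

    // NUMBER arrives as BigDecimal; convert explicitly instead of casting
    BigDecimal raw = resultSet.getBigDecimal("amount"); // "amount" is a placeholder
    long asLong = raw.longValueExact();  // throws ArithmeticException on fractions
    double asDouble = raw.doubleValue();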

Re: Requesting the next InputSplit failed

2016-04-29 Thread Stefano Bortoli
- parallelism.default:24
- env.java.home=/usr/lib/jvm/java-8-oracle/
- taskmanager.network.numberOfBuffers:16384
The job just reads a window of max 100k elements and then writes a

Re: Weird Kryo exception (Unable to find class: java.ttil.HashSet)

2016-05-24 Thread Stefano Bortoli
Till mentioned that 'spilling to disk' was managed through an exception catch. The last serialization error was related to bad management of the Kryo buffer, which was not cleaned after spilling triggered the exception handling. Is it possible we are dealing with an issue similar to this, but caused by
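
A generic sketch of the pattern under discussion — resetting the Kryo output buffer when serialization aborts mid-record, so a retry after spilling does not see stale bytes (not the actual Flink patch):

    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.KryoException;
    import com.esotericsoftware.kryo.io.Output;

    try {
        kryo.writeClassAndObject(output, record);
        output.flush();
    } catch (KryoException e) {
        output.clear(); // drop the partially written bytes before any retry
        throw e;
    }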

Re: Requesting the next InputSplit failed

2016-04-28 Thread Stefano Bortoli
I had this type of exception when trying to build and test Flink on a "small machine". I worked around it by increasing the Akka timeout for the tests.
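
Concretely, that meant raising the Akka ask timeout in flink-conf.yaml, e.g. (the value is arbitrary; the default at the time was 10 s):

    akka.ask.timeout: 100 s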

Re: Requesting the next InputSplit failed

2016-04-28 Thread Stefano Bortoli
, it is not clear why it should refuse a connection to itself after 40 minutes of running. We'll try to figure out possible environment issues. It's a fresh installation, therefore we may have left out some configuration. saluti, Stefano 2016-04-28 9:22 GMT+02:00 Stefano Bortoli <s.bort...@gmail.com>:

Re: JDBC sink in flink

2016-07-05 Thread Stefano Bortoli
The connection will be managed by the splitManager; there is no need to use a pool. However, if you had to, you should probably look into the establishConnection() method of the JDBCInputFormat. 2016-07-05 10:52 GMT+02:00 Flavio Pompermaier : > why do you need a connection pool? >

Re: JDBC sink in flink

2016-07-05 Thread Stefano Bortoli
As Chesnay said, it is not necessary to use a pool, as the connection is reused across splits. However, if you had to customize it for some reason, you can do it starting from the JDBC input and output formats. cheers! 2016-07-05 13:27 GMT+02:00 Harikrishnan S : > Awesome !
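
For reference, the builder-style setup of the batch JDBC formats in flink-jdbc looks roughly like this (driver, URL, and query are placeholders):

    import org.apache.flink.api.java.io.jdbc.JDBCOutputFormat;

    JDBCOutputFormat out = JDBCOutputFormat.buildJDBCOutputFormat()
        .setDrivername("org.postgresql.Driver")        // placeholder driver
        .setDBUrl("jdbc:postgresql://localhost/mydb")  // placeholder URL
        .setQuery("INSERT INTO t (id, name) VALUES (?, ?)")
        .finish();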

GROUP BY TUMBLE on ROW range

2017-10-17 Thread Stefano Bortoli
Hi all, is there a way to use a tumbling window group-by with a row range in stream SQL? I mean something like this: "SELECT COUNT(*) FROM T1 GROUP BY TUMBLE(rowtime, INTERVAL '2' ROWS PRECEDING)" However, even looking at the tests and at the "row

RE: GROUP BY TUMBLE on ROW range

2017-10-18 Thread Stefano Bortoli
(maybe partitioned on some key) sorted on time, with a ROW_NUMBER function that assigns increasing numbers to rows; then do a group-by on the row number modulo the window size. Btw, count windows are supported by the Table API. Best, Fabian 2017-10-17 17:16 GMT+02:00 Stefano Bortoli <ste
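
For the record, the count-window route through the Table API looks roughly like this in the Flink releases of that era (a sketch; the table and field names are placeholders, and the Tumble import package varies across 1.x versions):

    // tumbling count window of 100 rows over processing time
    Table result = table
        .window(Tumble.over("100.rows").on("proctime").as("w"))
        .groupBy("w")
        .select("id.count"); // "id" is a placeholder field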