[Structured streaming, V2] commit on ContinuousReader

2018-05-03 Thread Jiří Syrový
Version: 2.3, DataSourceV2, ContinuousReader Hi, We're creating a new data source to fetch data from a streaming source that requires committing received data, and we would like to commit data once in a while, after it has been retrieved and correctly processed, and then fetch more. One option could b

Re: Dependency Injection and Microservice development with Spark

2017-01-04 Thread Jiří Syrový
Hi, another nice approach is to use the Reader monad instead, together with a framework that supports this approach (e.g. Grafter - https://github.com/zalando/grafter). It's lightweight and helps a bit with dependency issues. 2016-12-28 22:55 GMT+01:00 Lars Albertsson : > Do you really need dependency

Re: NegativeArraySizeException / segfault

2016-05-30 Thread Jiří Syrový
I think I saw this one already as the first indication that something is wrong and it was related to https://issues.apache.org/jira/browse/SPARK-13516 2016-05-28 1:34 GMT+02:00 Koert Kuipers : > it seemed to be related to an Aggregator, so for tests we replaced it with > an ordinary Dataset.reduc

Fwd: Aggregation + Adding static column + Union + Projection = Problem

2016-02-26 Thread Jiří Syrový
Hi, I've recently noticed a bug in Spark (branch 1.6) that appears if you do the following. Let's have some DataFrame called df. 1) Aggregate multiple columns of the DataFrame df and store the result as result_agg_1 2) Do another aggregation of multiple columns, but on one less grouping columns

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Jiří Syrový
+1 Tested in standalone mode and so far seems to be fairly stable. 2015-12-16 22:32 GMT+01:00 Michael Armbrust : > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until Saturday, December 19, 2015 at 18:00 UTC and > passes if a majority of at