Re: HDFS streaming source concerns

2022-04-19 Thread Adrian Bednarz
Hello, We are actually working on a similar problem against S3. The point about checkpointing got me wondering whether the checkpoint would indeed succeed with a large backlog of files. I always imagined that SplitEnumerator lists all available files and SourceReader is responsible for reading those files aft

OVER IGNORE NULLS support

2021-10-08 Thread Adrian Bednarz
Hi, we've been trying to run a query similar to: SELECT id, type, LAG(id) IGNORE NULLS OVER (PARTITION BY type ORDER BY ts) AS lastId FROM Events. A query without the IGNORE NULLS clause executes just fine. This syntax is supported by Calcite and our clients expect it to work. Our platform uses Flink
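For readers unfamiliar with the clause, here is a minimal Python sketch of what LAG(id) IGNORE NULLS OVER (PARTITION BY type ORDER BY ts) is expected to return under standard SQL semantics. It only illustrates the semantics and is not how Flink or Calcite evaluates the query; the function name `lag_ignore_nulls` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Optional


def lag_ignore_nulls(rows: List[Dict[str, Any]]) -> List[Optional[Any]]:
    """Return lastId for each row, with rows processed in ts order.

    For every row, lastId is the closest earlier non-null id within the
    same `type` partition, or None if no such id exists.
    """
    last_non_null: Dict[Any, Any] = {}  # partition key -> latest non-null id
    result: List[Optional[Any]] = []
    for row in sorted(rows, key=lambda r: r["ts"]):  # ORDER BY ts
        # Look only at rows strictly before the current one.
        result.append(last_non_null.get(row["type"]))
        # Null ids are skipped, so they never overwrite the remembered value.
        if row["id"] is not None:
            last_non_null[row["type"]] = row["id"]
    return result


# Hypothetical sample data standing in for the Events table.
events = [
    {"id": 1, "type": "a", "ts": 1},
    {"id": None, "type": "a", "ts": 2},
    {"id": 3, "type": "a", "ts": 3},
    {"id": 4, "type": "b", "ts": 4},
    {"id": 5, "type": "a", "ts": 5},
]
last_ids = lag_ignore_nulls(events)
```

The row at ts = 3 shows the difference from a plain LAG: its lastId is 1, because the null id at ts = 2 is skipped, whereas LAG(id) without IGNORE NULLS would return NULL for that row.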

Re: Subpar performance of temporal joins with RocksDB backend

2021-07-19 Thread Adrian Bednarz
options: > - state.backend.rocksdb.predefined-options = > SPINNING_DISK_OPTIMIZED_HIGH_MEM > - state.backend.rocksdb.memory.partitioned-index-filters = true > > Regards, > Maciek > > Sat, 10 Jul 2021 at 08:54 Adrian Bednarz > wrote: > > > > I didn’t tweak any RocksDB knobs.

Re: Subpar performance of temporal joins with RocksDB backend

2021-07-09 Thread Adrian Bednarz
On Fri, 9 Jul 2021 at 20:43, Maciej Bryński wrote: > Hi Adrian, > Could you share your state backend configuration? > > Regards, > Maciek > > Fri, 9 Jul 2021 at 19:09 Adrian Bednarz > wrote: > > > > Hello, > > > > We are experimenting with l

Subpar performance of temporal joins with RocksDB backend

2021-07-09 Thread Adrian Bednarz
Hello, We are experimenting with lookup joins in Flink 1.13.0. Unfortunately, we unexpectedly hit a significant performance degradation after switching the state backend to RocksDB. We performed tests with two tables: fact table TXN and dimension table CUSTOMER with the following schemas: TXN: |--