Re: [PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-05-23 Thread Elias Levy
I would suggest you consider an alternative data structure: a Cuckoo Filter or a Golomb Compressed Sequence. The GCS data structure was introduced in Cache-, Hash- and Space-Efficient Bloom Filters by F. Putze, P. Sanders,

[jira] [Created] (FLINK-9429) Quickstart E2E not working locally

2018-05-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9429: Summary: Quickstart E2E not working locally Key: FLINK-9429 URL: https://issues.apache.org/jira/browse/FLINK-9429 Project: Flink Issue Type: Bug

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Till Rohrmann
-1 Piotr just found a race condition between the TM registration at the RM and slot requests coming from the SlotManager/RM [1]. This is a release blocker since it affects all deployments. Consequently I have to cancel this RC :-( [1] https://issues.apache.org/jira/browse/FLINK-9427 On Wed, May

[jira] [Created] (FLINK-9428) Allow operators to flush data on checkpoint pre-barrier

2018-05-23 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-9428: Summary: Allow operators to flush data on checkpoint pre-barrier Key: FLINK-9428 URL: https://issues.apache.org/jira/browse/FLINK-9428 Project: Flink Issue

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Gary Yao
+1 (non-binding) I have run all examples (batch & streaming), and only found a non-blocking issue with TPCHQuery3 [1] which was introduced a year ago. I have also deployed a cluster with HA enabled on YARN (Hadoop 2.8.3) without problems. [1]

[jira] [Created] (FLINK-9427) Cannot download from BlobServer, because the server address is unknown.

2018-05-23 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-9427: Summary: Cannot download from BlobServer, because the server address is unknown. Key: FLINK-9427 URL: https://issues.apache.org/jira/browse/FLINK-9427 Project:

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Ted Yu
+1 Checked signatures Ran test suite Due to FLINK-9340 and FLINK-9091, I had to run tests in multiple rounds. Cheers On Wed, May 23, 2018 at 7:39 AM, Fabian Hueske wrote: > +1 (binding) > > - checked hashes and signatures > - checked source archive and didn't find

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Fabian Hueske
+1 (binding) - checked hashes and signatures - checked source archive and didn't find unexpected binary files - built from source archive skipping the tests (mvn -DskipTests clean install), started a local cluster, and ran an example program. Thanks, Fabian 2018-05-23 15:39 GMT+02:00 Till

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Till Rohrmann
Fabian pointed me to the updated ASF release policy [1] and the changes it implies for the checksum files. New releases should no longer provide an MD5 checksum file, and the SHA checksum file should have the proper file name extension `sha512` instead of `sha`. I've updated the release artifacts [2]
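
For readers unfamiliar with the checksum change, the following is a minimal Java sketch of producing a SHA-512 digest and the matching `.sha512` file name for an artifact. The class and method names (`Sha512ChecksumSketch`, `sha512Hex`, `checksumFileName`) are hypothetical and are not part of the Flink release scripts; only `java.security.MessageDigest` is a real JDK API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not taken from the Flink release tooling. */
final class Sha512ChecksumSketch {

    /** Returns the lowercase hex SHA-512 digest of the given bytes. */
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Checksum file name using the `sha512` extension instead of `sha`. */
    static String checksumFileName(String artifactName) {
        return artifactName + ".sha512";
    }

    /** Convenience overload for text input (UTF-8 encoded). */
    static String sha512Hex(String text) {
        return sha512Hex(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

In a real release the input would be the artifact's bytes read from disk; the string overload is only there to keep the sketch easy to try out.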

[jira] [Created] (FLINK-9426) Harden RocksDBWriteBatchPerformanceTest.benchMark()

2018-05-23 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-9426: Summary: Harden RocksDBWriteBatchPerformanceTest.benchMark() Key: FLINK-9426 URL: https://issues.apache.org/jira/browse/FLINK-9426 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-9425) Make release scripts compliant with ASF release policy

2018-05-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9425: Summary: Make release scripts compliant with ASF release policy Key: FLINK-9425 URL: https://issues.apache.org/jira/browse/FLINK-9425 Project: Flink Issue

[jira] [Created] (FLINK-9424) BlobClientSslTest does not work in all environments

2018-05-23 Thread Timo Walther (JIRA)
Timo Walther created FLINK-9424: Summary: BlobClientSslTest does not work in all environments Key: FLINK-9424 URL: https://issues.apache.org/jira/browse/FLINK-9424 Project: Flink Issue Type:

Re: [VOTE] Enable GitBox integration (#2)

2018-05-23 Thread Stefan Richter
+1 > On 23.05.2018 at 14:31, Stephan Ewen wrote: > > +1 > > On Wed, May 23, 2018 at 11:11 AM, Fabian Hueske wrote: > >> +1 >> >> 2018-05-23 8:49 GMT+02:00 Aljoscha Krettek : >> >>> +1 >>> On 22. May 2018, at 15:36, Thomas

Re: [VOTE] Enable GitBox integration (#2)

2018-05-23 Thread Stephan Ewen
+1 On Wed, May 23, 2018 at 11:11 AM, Fabian Hueske wrote: > +1 > > 2018-05-23 8:49 GMT+02:00 Aljoscha Krettek : > > > +1 > > > > > On 22. May 2018, at 15:36, Thomas Weise wrote: > > > > > > +1 > > > > > > > > > On Tue, May 22, 2018 at

Re: [PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-05-23 Thread sihua zhou
Thanks for your reply @Fabian and @Stefan @Fabian: The Bloom filter state I propose would be "elastic" and "lazily allocated": each key group holds a (shrinkable) list of Bloom filter nodes, every Bloom filter node has its own capacity, and we allocate a new one only when the
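
The "elastic" and "lazy" behaviour described in this message can be illustrated with a small Java sketch. It is not the proposed Flink implementation: the class name `ElasticBloomFilterSketch`, its parameters, and the hashing scheme are all hypothetical, and the sketch assumes that a new node is allocated once the most recent node has reached its capacity (the snippet above is cut off before it states the exact condition). Shrinking is not modelled.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of a growable list of fixed-capacity Bloom filter nodes. */
final class ElasticBloomFilterSketch {

    /** One fixed-capacity Bloom filter. */
    private static final class Node {
        final BitSet bits;
        final int numBits;
        final int numHashes;
        final int capacity;
        int size; // number of keys added to this node

        Node(int capacity, int bitsPerKey, int numHashes) {
            this.capacity = capacity;
            this.numBits = capacity * bitsPerKey;
            this.numHashes = numHashes;
            this.bits = new BitSet(numBits);
        }

        // Double hashing: the i-th probe position derived from two base hashes.
        int position(String key, int i) {
            int h1 = key.hashCode();
            int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
            return Math.floorMod(h1 + i * h2, numBits);
        }

        void add(String key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(position(key, i));
            }
            size++;
        }

        boolean mightContain(String key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    private final List<Node> nodes = new ArrayList<>(); // empty until the first add (lazy)
    private final int nodeCapacity;
    private final int bitsPerKey;
    private final int numHashes;

    ElasticBloomFilterSketch(int nodeCapacity, int bitsPerKey, int numHashes) {
        this.nodeCapacity = nodeCapacity;
        this.bitsPerKey = bitsPerKey;
        this.numHashes = numHashes;
    }

    /** Adds a key, allocating a new node only when the newest node is full (assumption). */
    void add(String key) {
        Node last = nodes.isEmpty() ? null : nodes.get(nodes.size() - 1);
        if (last == null || last.size >= last.capacity) {
            last = new Node(nodeCapacity, bitsPerKey, numHashes);
            nodes.add(last);
        }
        last.add(key);
    }

    /** True if any node may contain the key; false means the key was never added. */
    boolean mightContain(String key) {
        for (Node node : nodes) {
            if (node.mightContain(key)) {
                return true;
            }
        }
        return false;
    }

    int nodeCount() {
        return nodes.size();
    }
}
```

With a node capacity of 10, adding 25 keys allocates three nodes in this sketch, and no memory is reserved before the first key arrives.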

[jira] [Created] (FLINK-9422) Dedicated operator for UNION on streaming tables with time attributes

2018-05-23 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-9422: Summary: Dedicated operator for UNION on streaming tables with time attributes Key: FLINK-9422 URL: https://issues.apache.org/jira/browse/FLINK-9422 Project: Flink

[jira] [Created] (FLINK-9421) RunningJobsRegistry entries are not cleaned up after job termination

2018-05-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9421: Summary: RunningJobsRegistry entries are not cleaned up after job termination Key: FLINK-9421 URL: https://issues.apache.org/jira/browse/FLINK-9421 Project: Flink

[jira] [Created] (FLINK-9420) Add tests for SQL IN sub-query operator in streaming

2018-05-23 Thread Timo Walther (JIRA)
Timo Walther created FLINK-9420: Summary: Add tests for SQL IN sub-query operator in streaming Key: FLINK-9420 URL: https://issues.apache.org/jira/browse/FLINK-9420 Project: Flink Issue Type:

[jira] [Created] (FLINK-9419) UNION should not be treated as retraction producing operator

2018-05-23 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-9419: Summary: UNION should not be treated as retraction producing operator Key: FLINK-9419 URL: https://issues.apache.org/jira/browse/FLINK-9419 Project: Flink

Re: [PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-05-23 Thread Stefan Richter
Hi, In general, I like the proposal as well. We should try to integrate all forms of keyed state with the backend, to avoid the problems that we are currently facing with the timer service. We should discuss which exact implementation of Bloom filters is the best fit. @Fabian: There are also

Re: [PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-05-23 Thread Fabian Hueske
Thanks for the proposal Sihua! Let me try to summarize the motivation / scope of this proposal. You are proposing to add support for a special Bloom Filter state per KeyGroup and reduce the number of key accesses by checking the Bloom Filter first. This would be a rather generic feature that
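
The idea of "checking the Bloom Filter first" can be sketched in Java as follows. This is only an illustration under stated assumptions: `BloomGuardedStore` is a hypothetical class, the `HashMap` stands in for an expensive keyed state backend, and nothing here reflects Flink's actual state API.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a Bloom filter consulted before an expensive keyed lookup. */
final class BloomGuardedStore {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;
    // Stand-in for a slow state backend (assumption for illustration only).
    private final Map<String, String> backingStore = new HashMap<>();
    private int backingReads = 0;

    BloomGuardedStore(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Double hashing: the i-th probe position derived from two base hashes.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void put(String key, String value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
        backingStore.put(key, value);
    }

    /** Returns the value, or null; skips the backing store when the filter rules the key out. */
    String get(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return null; // definitely never written: no backing-store access needed
            }
        }
        backingReads++; // possible hit (or a false positive): pay for the real lookup
        return backingStore.get(key);
    }

    int backingReads() {
        return backingReads;
    }
}
```

A negative answer from the filter is always correct, so lookups for keys that were never written avoid the backing store entirely; a false positive only costs one extra lookup that returns null.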

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Piotr Nowojski
+1 from me. Additionally for this RC5 I did some manual tests to double check backward compatibility of the bug fix: https://issues.apache.org/jira/browse/FLINK-9295 The issue with this bug fix was that it was merged after 1.5.0 RC4 but just

Re: [VOTE] Enable GitBox integration (#2)

2018-05-23 Thread Fabian Hueske
+1 2018-05-23 8:49 GMT+02:00 Aljoscha Krettek : > +1 > > > On 22. May 2018, at 15:36, Thomas Weise wrote: > > > > +1 > > > > > > On Tue, May 22, 2018 at 2:37 AM, Timo Walther > wrote: > > > >> +1 > >> > >> Am 22.05.18 um 10:49 schrieb

[jira] [Created] (FLINK-9418) Migrate SharedBuffer to use MapState

2018-05-23 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-9418: Summary: Migrate SharedBuffer to use MapState Key: FLINK-9418 URL: https://issues.apache.org/jira/browse/FLINK-9418 Project: Flink Issue Type:

[PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-05-23 Thread sihua zhou
Hi Devs! I propose to introduce an "Elastic Bloom Filter" for Flink. The reason for this proposal is that it has helped us a lot in production: it lets us improve performance while reducing resource consumption. Here is a brief description of the motivation and of why it's so powerful, more

[jira] [Created] (FLINK-9417) Send heartbeat requests from RPC endpoint's main thread

2018-05-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9417: Summary: Send heartbeat requests from RPC endpoint's main thread Key: FLINK-9417 URL: https://issues.apache.org/jira/browse/FLINK-9417 Project: Flink Issue

[jira] [Created] (FLINK-9416) Make job submission retriable operation in case of a ongoing leader election

2018-05-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9416: Summary: Make job submission retriable operation in case of a ongoing leader election Key: FLINK-9416 URL: https://issues.apache.org/jira/browse/FLINK-9416 Project:

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Till Rohrmann
Thanks for the pointer Sihua, I've properly closed FLINK-9070. On Wed, May 23, 2018 at 4:49 AM, sihua zhou wrote: > > Hi, > just one minor thing, I found the JIRA release notes seem a bit > inconsistent with this RC. For example, https://issues.apache.org/ >

Re: [VOTE] Release 1.5.0, release candidate #5

2018-05-23 Thread Aljoscha Krettek
We need to triage our Jira issues before the release: Issues that are actually merged/fixed should be marked as done and have 1.5.0 as fixVersion, all issues that are not closed/done/fixed should not have 1.5.0 as release version. I.e. there should be no open issues with 1.5.0 as fixVersion,

Re: [VOTE] Enable GitBox integration (#2)

2018-05-23 Thread Aljoscha Krettek
+1 > On 22. May 2018, at 15:36, Thomas Weise wrote: > > +1 > > > On Tue, May 22, 2018 at 2:37 AM, Timo Walther wrote: > >> +1 >> >> On 22.05.18 at 10:49, Ted Yu wrote: >> >> +1 >>> Original message From: Chesnay Schepler < >>>