Re: Breaking changes in the Streaming API
big +1 on reworking the timestamp extractor. Too many users were stumbling over this; I also misunderstood it the first time I tried it out.

On Tue, Feb 9, 2016 at 4:54 PM, Stephan Ewen wrote:
> Hi everyone!
>
> There are two remaining issues pending for 1.0 that will cause breaking
> API changes in the Streaming API.
>
>
> 1) [FLINK-3379]
>
> The Timestamp Extractor needs to be changed. That really seems necessary
> based on user feedback, because a lot of people mentioned that they get
> confused by the TimestampExtractor's mixed two-way system of generating
> watermarks.
>
> The issue suggests pulling the two different modes of generating
> watermarks into two different classes.
>
>
> 2) [FLINK-3371]
>
> This makes the "Trigger" an abstract class (currently an interface) and
> moves the "TriggerResult" to a dedicated class. This is necessary to
> avoid breaking changes in the future, after the release.
>
> The reason for these changes is "aligned windows", which have one
> Trigger for the entire window across all keys
> (https://issues.apache.org/jira/browse/FLINK-3370).
>
> Aligned windows are, for example, most sliding/tumbling time windows,
> while unaligned windows (with a trigger per key) are, for example,
> session and count windows. For aligned windows, we can implement an
> optimized representation that uses less memory and is more lightweight
> to checkpoint.
>
> Also, the Trigger class may evolve a bit, and with an abstract class we
> can add methods without breaking user-defined Triggers in the future.
>
>
> Greetings,
> Stephan
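To see why an abstract class makes future evolution safer than an interface, here is a simplified, hypothetical sketch (not Flink's actual Trigger API; all names are illustrative): a method added later with a default body does not break existing subclasses, whereas adding a method to a plain interface would.

```java
// Simplified illustration of interface-vs-abstract-class evolution.
// Not Flink's real Trigger API; names and signatures are assumed.
abstract class Trigger<T> {

    // Method that existed when the user wrote their trigger.
    public abstract boolean onElement(T element);

    // A method added in a later release. Because it has a default body,
    // existing user-defined subclasses still compile and run unchanged.
    // Adding it to a plain interface would break every implementor.
    public void onMerge() {
        // no-op by default
    }
}

// A user-defined trigger written before onMerge() existed; it still works.
class CountTrigger extends Trigger<Long> {
    private final long threshold;
    private long count = 0;

    CountTrigger(long threshold) {
        this.threshold = threshold;
    }

    @Override
    public boolean onElement(Long element) {
        // Fire once the count reaches the threshold.
        return ++count >= threshold;
    }
}
```

(Java 8 default interface methods solve a similar problem, but Flink at the time targeted Java 7, so an abstract class was the available mechanism.)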
[jira] [Created] (FLINK-3391) Typo in SavepointITCase
Sebastian Klemke created FLINK-3391:
---------------------------------------
Summary: Typo in SavepointITCase
Key: FLINK-3391
URL: https://issues.apache.org/jira/browse/FLINK-3391
Project: Flink
Issue Type: Bug
Reporter: Sebastian Klemke
Priority: Trivial

{code}
--- flink-tests/src/test/java/org/apache/flink/test/checkpointing/SavepointITCase.java (revision 8ccd7544edb25e82cc8a898809cc7c8bb7893620)
+++ flink-tests/src/test/java/org/apache/flink/test/checkpointing/SavepointITCase.java (revision )
@@ -441,7 +441,7 @@
 LOG.info("Created temporary savepoint directory: " + savepointDir + ".");
 config.setString(ConfigConstants.STATE_BACKEND, "filesystem");
-config.setString(SavepointStoreFactory.SAVEPOINT_BACKEND_KEY, "fileystem");
+config.setString(SavepointStoreFactory.SAVEPOINT_BACKEND_KEY, "filesystem");
 config.setString(FsStateBackendFactory.CHECKPOINT_DIRECTORY_URI_CONF_KEY, checkpointDir.toURI().toString());
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3389) Add Pre-defined Options settings for RocksDB State backend
Stephan Ewen created FLINK-3389:
---------------------------------------
Summary: Add Pre-defined Options settings for RocksDB State backend
Key: FLINK-3389
URL: https://issues.apache.org/jira/browse/FLINK-3389
Project: Flink
Issue Type: Improvement
Components: state backends
Affects Versions: 1.0.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Fix For: 1.0.0

The RocksDB state backend's performance can be optimized for certain setups (for example, running on disk or SSD) with certain options. Since these options are hard for users to tune, we should add a set of predefined options for common setups.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
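One way such predefined profiles could look is an enum that bundles a few tuning knobs per setup. This is a hypothetical sketch only; the names, knobs, and values below are illustrative assumptions, not Flink's or RocksDB's actual settings.

```java
// Hypothetical sketch of predefined tuning profiles for a state backend.
// Profile names and values are illustrative, not Flink's actual API.
enum PredefinedOptions {
    // Conservative defaults.
    DEFAULT(64, 2),
    // Larger write buffers, fewer background jobs: favors sequential I/O.
    SPINNING_DISK_OPTIMIZED(256, 4),
    // More background jobs: SSDs handle parallel random I/O well.
    FLASH_SSD_OPTIMIZED(128, 8);

    private final int writeBufferSizeMb;
    private final int maxBackgroundJobs;

    PredefinedOptions(int writeBufferSizeMb, int maxBackgroundJobs) {
        this.writeBufferSizeMb = writeBufferSizeMb;
        this.maxBackgroundJobs = maxBackgroundJobs;
    }

    int writeBufferSizeMb() { return writeBufferSizeMb; }
    int maxBackgroundJobs() { return maxBackgroundJobs; }
}
```

The point of the enum is that a user picks one profile by name instead of tuning individual RocksDB options, which is exactly the "hard to tune" problem the issue describes.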
[jira] [Created] (FLINK-3390) Savepoint resume is not retried
Sebastian Klemke created FLINK-3390:
---------------------------------------
Summary: Savepoint resume is not retried
Key: FLINK-3390
URL: https://issues.apache.org/jira/browse/FLINK-3390
Project: Flink
Issue Type: Bug
Reporter: Sebastian Klemke

When restoreState fails for a task node while resuming from a savepoint, the job is retried, but the retry does not attempt to resume from the savepoint state again. This leads to the job being restarted with empty state.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
StateBackend
Hi,

In Flink it is possible to have different backends for operator state. I am wondering what the best approach for different state backends would be. Let's assume the backend is a database server. The following questions arise:

- Should the database server be started manually by the user, or can Flink start the server automatically when it is used? (This seems to be the approach for RocksDB as an embedded server.)
- Should each job use the same backend server or an individual one (or maybe a mix of both)?

I personally think that Flink should not start up a backend server, but should assume that it is available when the job is submitted. This also allows the user to start up multiple instances of the backend server and choose which one to use for each job individually.

What do you think about it? I ask because of the current PR for Redis as a StateBackend: https://github.com/apache/flink/pull/1617 -- there is no embedded mode for Redis as there is for RocksDB.

-Matthias
[jira] [Created] (FLINK-3392) Unprotected access to elements in ClosableBlockingQueue#size()
Ted Yu created FLINK-3392:
---------------------------------------
Summary: Unprotected access to elements in ClosableBlockingQueue#size()
Key: FLINK-3392
URL: https://issues.apache.org/jira/browse/FLINK-3392
Project: Flink
Issue Type: Bug
Reporter: Ted Yu
Priority: Minor

Here is the related code:

{code}
public int size() {
    return elements.size();
}
{code}

Access to elements should be protected by a lock.lock() / lock.unlock() pair.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
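A minimal sketch of the suggested fix, guarding size() with the same lock that protects mutations (the surrounding class and field names here are assumed for illustration, not copied from Flink's actual ClosableBlockingQueue):

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: guard every access to `elements`, including
// size(), with the same lock, so readers never see a torn/stale view.
class ClosableBlockingQueueSketch<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private final ArrayDeque<E> elements = new ArrayDeque<>();

    public void add(E element) {
        lock.lock();
        try {
            elements.add(element);
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        // Without the lock, this read races with concurrent mutations.
        lock.lock();
        try {
            return elements.size();
        } finally {
            lock.unlock();
        }
    }
}
```

The try/finally around each unlock is the standard ReentrantLock idiom: it guarantees the lock is released even if the guarded code throws.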
[jira] [Created] (FLINK-3386) Kafka consumers should not necessarily fail on expiring data
Gyula Fora created FLINK-3386:
---------------------------------------
Summary: Kafka consumers should not necessarily fail on expiring data
Key: FLINK-3386
URL: https://issues.apache.org/jira/browse/FLINK-3386
Project: Flink
Issue Type: Improvement
Components: Streaming Connectors
Affects Versions: 1.0.0
Reporter: Gyula Fora

Currently, if the data in a Kafka topic expires while it is being read, this causes an unrecoverable failure, as subsequent retries will also fail on the now-invalid offsets. While this might be the desired behaviour under some circumstances, in most cases it would probably be better to automatically jump to the earliest valid offset.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
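The proposed fallback policy can be sketched independently of the real Kafka client API. In this illustrative sketch (class and method names are assumptions, not Flink or Kafka code), an offset that has expired from the topic's retention is clamped to the earliest offset still available instead of being retried forever:

```java
// Illustrative sketch of the proposed fallback policy; not actual
// Flink connector or Kafka client code.
class OffsetFallback {

    /**
     * Decide where to resume reading. If the requested offset points at
     * data that has already expired from the topic (i.e. it is below the
     * earliest offset the broker still retains), jump to the earliest
     * valid offset instead of failing the job on every retry.
     */
    static long resolveStartOffset(long requestedOffset, long earliestValidOffset) {
        if (requestedOffset < earliestValidOffset) {
            // The requested data is gone; resume at the oldest retained record.
            return earliestValidOffset;
        }
        return requestedOffset;
    }
}
```

Whether to clamp silently or surface a warning (or keep failing, for pipelines where data loss must be explicit) is exactly the "desired behaviour under some circumstances" trade-off the issue mentions, so this would plausibly be made configurable.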