[jira] [Created] (FLINK-7936) Lack of synchronization w.r.t. taskManagers in MetricStore#add()
Ted Yu created FLINK-7936: - Summary: Lack of synchronization w.r.t. taskManagers in MetricStore#add() Key: FLINK-7936 URL: https://issues.apache.org/jira/browse/FLINK-7936 Project: Flink Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} String tmID = ((QueryScopeInfo.TaskManagerQueryScopeInfo) info).taskManagerID; tm = taskManagers.computeIfAbsent(tmID, k -> new TaskManagerMetricStore()); {code} In other places, access to taskManagers is protected by lock on MetricStore.this -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7935) Metrics with user supplied scope variables
Elias Levy created FLINK-7935: - Summary: Metrics with user supplied scope variables Key: FLINK-7935 URL: https://issues.apache.org/jira/browse/FLINK-7935 Project: Flink Issue Type: Improvement Affects Versions: 1.3.2 Reporter: Elias Levy We use DataDog for metrics. DD and Flink differ somewhat in how they track metrics. Flink names and scopes metrics together, at least by default. E.g. by default the System scope for operator metrics is {{.taskmanager}}. The scope variables become part of the metric's full name. In DD the metric would be named something generic, e.g. {{taskmanager.job.operator}}, and they would be distinguished by their tag values, e.g. {{tm_id=foo}}, {{job_name=var}}, {{operator_name=baz}}. Flink allows you to configure the format string for system scopes, so it is possible to set the operator scope format to {{taskmanager.job.operator}}. We do this for all scopes: {code} metrics.scope.jm: jobmanager metrics.scope.jm.job: jobmanager.job metrics.scope.tm: taskmanager metrics.scope.tm.job: taskmanager.job metrics.scope.task: taskmanager.job.task metrics.scope.operator: taskmanager.job.operator {code} This seems to work. The DataDog Flink metric's plugin submits all scope variables as tags, even if they are not used within the scope format. And it appears internally this does not lead to metrics conflicting with each other. We would like to extend this to user defined metrics, but you can define variables/scopes when adding a metric group or metric with the user API, so that in DD we have a single metric with a tag with many different values, rather than hundreds of metrics to just the one value we want to measure across different event types. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7934) Upgrade Calcite dependency to 1.15
Rong Rong created FLINK-7934: Summary: Upgrade Calcite dependency to 1.15 Key: FLINK-7934 URL: https://issues.apache.org/jira/browse/FLINK-7934 Project: Flink Issue Type: Bug Reporter: Rong Rong Umbrella issue for all related issues for Apache Calcite 1.15 release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7933) Test instability PrometheusReporterTest
Kostas Kloudas created FLINK-7933: - Summary: Test instability PrometheusReporterTest Key: FLINK-7933 URL: https://issues.apache.org/jira/browse/FLINK-7933 Project: Flink Issue Type: Bug Components: Metrics Affects Versions: 1.4.0 Reporter: Kostas Kloudas Travis log: https://travis-ci.org/kl0u/flink/jobs/293220196 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7931) FlinkKinesisProducer violates at-least-once guarantees (1.3 branch)
Aljoscha Krettek created FLINK-7931: --- Summary: FlinkKinesisProducer violates at-least-once guarantees (1.3 branch) Key: FLINK-7931 URL: https://issues.apache.org/jira/browse/FLINK-7931 Project: Flink Issue Type: Bug Components: Kinesis Connector Reporter: Tzu-Li (Gordon) Tai Assignee: Tzu-Li (Gordon) Tai Priority: Blocker Fix For: 1.4.0, 1.3.3 Currently, there is no flushing of KPL outstanding records on checkpoints in the {{FlinkKinesisProducer}}. Likewise to the at-least-once issue on the Flink Kafka producer before, this may lead to data loss if there are asynchronous failing records after a checkpoint which the records was part of was completed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7930) Support periodic jobs with state that gets restored and persisted in a savepoint
Flavio Pompermaier created FLINK-7930: - Summary: Support periodic jobs with state that gets restored and persisted in a savepoint Key: FLINK-7930 URL: https://issues.apache.org/jira/browse/FLINK-7930 Project: Flink Issue Type: New Feature Components: DataStream API Reporter: Flavio Pompermaier As discussed in http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/State-snapshotting-when-source-is-finite-td16398.html, it could be useful to support the use case of periodic jobs with state that gets restored and persisted in a savepoint (in order to avoid the need of an external sink) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7929) Add unit/integration tests for states backed by RocksDB
Bowen Li created FLINK-7929: --- Summary: Add unit/integration tests for states backed by RocksDB Key: FLINK-7929 URL: https://issues.apache.org/jira/browse/FLINK-7929 Project: Flink Issue Type: Test Components: State Backends, Checkpointing Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.5.0 While exploring how to implement FLINK-7475, I didn't find any existing unit tests (or there are but I didn't find them...) that I can easily run to test if {{RocksDB(Value/List/Map/...)State}} works. We should add unit/integration tests for {{RocksDB(Value/List/Map/...)State}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)