[jira] [Updated] (FLINK-3814) Update code style guide regarding Preconditions

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3814:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Update code style guide regarding Preconditions
> ---
>
> Key: FLINK-3814
> URL: https://issues.apache.org/jira/browse/FLINK-3814
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: auto-deprioritized-major
>
> We recently added a Preconditions class to replace Guava where possible. 
> However, the Code Style Guide still suggests using Guava when possible 
> (explicitly naming the ported checkNotNull) and never mentions our own 
> Preconditions class.
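A hedged sketch of the usage an updated style guide could recommend. The minimal `checkNotNull` below only imitates Flink's `org.apache.flink.util.Preconditions` so the example compiles stand-alone; actual Flink code would import the real class.

```java
// Minimal stand-in for org.apache.flink.util.Preconditions.checkNotNull, kept
// here only so the example compiles without Flink on the classpath.
final class Preconditions {
    private Preconditions() {}

    /** Returns the reference if it is non-null, otherwise throws NPE. */
    static <T> T checkNotNull(T reference, String errorMessage) {
        if (reference == null) {
            throw new NullPointerException(errorMessage);
        }
        return reference;
    }
}

public class PreconditionsExample {
    private final String jobName;

    PreconditionsExample(String jobName) {
        // Eager argument validation, the pattern the guide could point to.
        this.jobName = Preconditions.checkNotNull(jobName, "jobName must not be null");
    }

    public static void main(String[] args) {
        System.out.println(new PreconditionsExample("my-job").jobName);
    }
}
```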



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5676) Grouping on nested fields does not work

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5676:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Grouping on nested fields does not work
> ---
>
> Key: FLINK-5676
> URL: https://issues.apache.org/jira/browse/FLINK-5676
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Major
>  Labels: auto-deprioritized-major
>
> {code}
> tEnv
>   .fromDataSet(pojoWithinnerPojo)
>   .groupBy("innerPojo.get('line')")
>   .select("innerPojo.get('line')")
> {code}
> fails with 
> {code}
> ValidationException: Cannot resolve [innerPojo] given input 
> ['innerPojo.get(line)].
> {code}
> I don't know if we want to support that, but the exception should be more 
> helpful in any case.





[jira] [Updated] (FLINK-4620) Automatically creating savepoints

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-4620:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Automatically creating savepoints
> -
>
> Key: FLINK-4620
> URL: https://issues.apache.org/jira/browse/FLINK-4620
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / State Backends
>Affects Versions: 1.1.2
>Reporter: Niels Basjes
>Priority: Major
>  Labels: auto-deprioritized-major
>
> In current versions of Flink you can run an external command to persist a 
> savepoint in a durable location.
> Feature request: make this a lot more automatic and easier to use.
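For context, the manual route referred to above looks roughly like this; the job ID and savepoint path are placeholders:

```shell
# Trigger a savepoint for a running job (job ID is a placeholder).
bin/flink savepoint <jobId> hdfs:///tmp/applicationState/savepoints

# Resume a job from a previously created savepoint (path is a placeholder).
bin/flink run -s hdfs:///tmp/applicationState/savepoints/<savepoint> myJob.jar
```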
> _Proposed workflow_
> # In my application I do something like this:
> {code}
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setStateBackend(new FsStateBackend("hdfs:///tmp/applicationState"));
> env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
> env.enableAutomaticSavePoints(30);
> env.enableAutomaticSavePointCleaner(10);
> {code}
> # When I start the application for the first time the state backend is 
> 'empty', and I expect the system to start in a clean state. After the 
> configured interval a savepoint is created and stored.
> # When I stop and start the topology again, it will automatically restore the 
> last available savepoint.
> Things to think about:
> * Note that this feature still means the manual version is useful!!
> * What to do on startup if the state is incompatible with the topology? Fail 
> the startup?
> * How many automatic savepoints do we keep? Only the last one?
> * Perhaps the API should allow multiple automatic savepoints at different 
> intervals in different locations.
> {code}
> // Make every 10 minutes and keep the last 10
> env.enableAutomaticSavePoints(30, new 
> FsStateBackend("hdfs:///tmp/applicationState"), 10);
> // Make every 24 hours and keep the last 30
> // Useful for being able to reproduce a problem a few days later
> env.enableAutomaticSavePoints(8640, new 
> FsStateBackend("hdfs:///tmp/applicationDailyStateSnapshot"), 30);
> {code}





[jira] [Updated] (FLINK-3273) Remove Scala dependency from flink-streaming-java

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3273:
--
Priority: Minor  (was: Major)

> Remove Scala dependency from flink-streaming-java
> -
>
> Key: FLINK-3273
> URL: https://issues.apache.org/jira/browse/FLINK-3273
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Maximilian Michels
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> {{flink-streaming-java}} depends on Scala through {{flink-clients}}, 
> {{flink-runtime}}, and {{flink-testing-utils}}. We should get rid of the 
> Scala dependency just like we did for {{flink-java}}. Integration tests and 
> utilities which depend on Scala should be moved to {{flink-tests}}.





[jira] [Commented] (FLINK-13847) Update release scripts to also update docs/_config.yml

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336436#comment-17336436
 ] 

Flink Jira Bot commented on FLINK-13847:


This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Update release scripts to also update docs/_config.yml
> --
>
> Key: FLINK-13847
> URL: https://issues.apache.org/jira/browse/FLINK-13847
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Release System
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: stale-major
>
> During the 1.9.0 release process, we missed quite a few configuration updates 
> in {{docs/_config.yml}} related to Flink versions. This should be done 
> automatically via the release scripts.
> The settings in that file that need to be touched on every major release 
> include:
> * version
> * version_title
> * github_branch
> * baseurl
> * stable_baseurl
> * javadocs_baseurl
> * pythondocs_baseurl
> * is_stable
> * Add new link to previous_docs
> This can probably be done via the 
> {{tools/releasing/create_release_branch.sh}} script, which is used for every 
> major release.
> We should also update the release guide in the project wiki to cover checking 
> that file as an item in checklists.
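A rough sketch of how the release tooling could rewrite those settings. The sample file contents, version values, and the script itself are illustrative; a real change would live in {{tools/releasing/create_release_branch.sh}} and cover all the keys listed above.

```shell
#!/usr/bin/env bash
# Sketch: bump version-related fields in a docs/_config.yml-style file.
set -euo pipefail

NEW_VERSION="1.10"
NEW_BRANCH="release-1.10"
CONFIG="config_sketch.yml"

# Stand-in for docs/_config.yml with a few of the fields named in the ticket.
cat > "$CONFIG" <<'EOF'
version: "1.9"
version_title: "1.9"
github_branch: "release-1.9"
is_stable: false
EOF

# Rewrite each setting in place (sed keeps a .bak copy of the original).
sed -i.bak \
    -e "s/^version: .*/version: \"$NEW_VERSION\"/" \
    -e "s/^version_title: .*/version_title: \"$NEW_VERSION\"/" \
    -e "s/^github_branch: .*/github_branch: \"$NEW_BRANCH\"/" \
    "$CONFIG"

grep '^version:' "$CONFIG"
```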





[jira] [Commented] (FLINK-2838) Inconsistent use of URL and Path classes for resources

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336904#comment-17336904
 ] 

Flink Jira Bot commented on FLINK-2838:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Inconsistent use of URL and Path classes for resources
> --
>
> Key: FLINK-2838
> URL: https://issues.apache.org/jira/browse/FLINK-2838
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Priority: Major
>  Labels: stale-major
>
> Flink uses either Path or URL to point to resources. For example,  
> {{JobGraph.addJar}} expects a Path while {{JobWithJars}} expects URLs for 
> JARs.
> The URL class requires an explicit file scheme (e.g. file://, hdfs://) while 
> the Path class accepts any kind of well-formed path. We should make clear 
> which one to use for resources.
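The scheme requirement can be illustrated with plain JDK types (java.nio.file.Path and java.net.URL rather than Flink's own Path class); the jar path below is arbitrary:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Paths;

public class ResourceRefExample {

    // A well-formed path string becomes a URL only once a scheme is chosen;
    // Path#toUri selects file:// for local paths.
    static String toUrlString(String path) {
        try {
            return Paths.get(path).toUri().toURL().toString();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    // A bare path string is not a valid URL, because it carries no scheme.
    static boolean isValidUrl(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrlString("/tmp/jobs/wordcount.jar"));
        System.out.println(isValidUrl("/tmp/jobs/wordcount.jar"));        // no scheme
        System.out.println(isValidUrl("file:///tmp/jobs/wordcount.jar")); // has scheme
    }
}
```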





[jira] [Commented] (FLINK-4947) Make all configuration possible via flink-conf.yaml and CLI.

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336841#comment-17336841
 ] 

Flink Jira Bot commented on FLINK-4947:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Make all configuration possible via flink-conf.yaml and CLI.
> 
>
> Key: FLINK-4947
> URL: https://issues.apache.org/jira/browse/FLINK-4947
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Jamie Grier
>Priority: Major
>  Labels: stale-major
>
> I think it's important to make all configuration possible via the 
> flink-conf.yaml and the command line.
> As an example:  To enable "externalizedCheckpoints" you must actually call 
> the StreamExecutionEnvironment#enableExternalizedCheckpoints() method from 
> your Flink program.
> Another example of this would be configuring the RocksDB state backend.
> I think it is important to make deployment flexible and easy to build tools 
> around. For example, the infrastructure teams that make these configuration 
> decisions and provide tools for deploying Flink apps will be different from 
> the teams deploying the apps. The teams writing apps should not have to set 
> all of this lower-level configuration up in their programs.
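As an illustration of the kind of flink-conf.yaml entries being asked for. `state.checkpoints.dir` is a real option in later Flink releases; the retention key is shown only as an example of the style of setting, not a guaranteed option name.

```yaml
# Hypothetical flink-conf.yaml counterparts to the programmatic calls above.
state.checkpoints.dir: hdfs:///flink/externalized-checkpoints
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```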





[jira] [Commented] (FLINK-3642) Disentangle ExecutionConfig

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336883#comment-17336883
 ] 

Flink Jira Bot commented on FLINK-3642:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Disentangle ExecutionConfig
> ---
>
> Key: FLINK-3642
> URL: https://issues.apache.org/jira/browse/FLINK-3642
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataSet, API / DataStream
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: stale-major
>
> Initially, the {{ExecutionConfig}} started out being a configuration to 
> configure the behaviour of the system with respect to the associated job. As 
> such it stored information about the restart strategy, registered types and 
> the parallelism of the job. Over time, however, the {{ExecutionConfig}} 
> has become more of an easy entry point for passing information into the system. 
> As such, the user can now set arbitrary information as part of the 
> {{GlobalJobParameters}} in the {{ExecutionConfig}} which is piped to all 
> kinds of different locations in the system, e.g. the serializers, JM, 
> ExecutionGraph, TM, etc. 
> This mixture of user code classes with system parameters makes it really 
> cumbersome to send system information around, because you always need a user 
> code class loader to deserialize it. Furthermore, there are different means 
> by which the {{ExecutionConfig}} is passed to the system. One is giving it to the 
> {{Serializers}} created in the JavaAPIPostPass and another is giving it 
> directly to the {{JobGraph}}, for example. The problem is that the 
> {{ExecutionConfig}} contains information which is required at different 
> stages of a program execution.
> I think it would be beneficial to disentangle the {{ExecutionConfig}} a 
> little bit along the lines of the different concerns for which the 
> {{ExecutionConfig}} is used currently. 
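A very rough sketch of the proposed split; all type and field names below are invented for illustration and are not Flink APIs:

```java
import java.io.Serializable;
import java.util.Map;

// Invented types sketching the separation: system-level settings the runtime
// can deserialize without a user-code classloader, kept apart from arbitrary
// user-provided parameters that may reference user classes.
final class SystemExecutionConfig implements Serializable {
    final int parallelism;
    final String restartStrategy;

    SystemExecutionConfig(int parallelism, String restartStrategy) {
        this.parallelism = parallelism;
        this.restartStrategy = restartStrategy;
    }
}

final class UserJobParameters implements Serializable {
    // Deserializing this part would still need the user classloader.
    final Map<String, String> params;

    UserJobParameters(Map<String, String> params) {
        this.params = params;
    }
}

public class ConfigSplitSketch {
    public static void main(String[] args) {
        SystemExecutionConfig sys = new SystemExecutionConfig(4, "fixed-delay");
        UserJobParameters user = new UserJobParameters(Map.of("input", "hdfs:///data"));
        System.out.println(sys.parallelism + " " + user.params.get("input"));
    }
}
```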





[jira] [Updated] (FLINK-10616) Jepsen test fails while tearing down Hadoop

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-10616:
---
Priority: Minor  (was: Major)

> Jepsen test fails while tearing down Hadoop
> ---
>
> Key: FLINK-10616
> URL: https://issues.apache.org/jira/browse/FLINK-10616
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.1, 1.7.0
>Reporter: Gary Yao
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> While tearing down Hadoop, the tests sporadically fail with the exception 
> below:
> {noformat}
> Caused by: java.lang.RuntimeException: sudo -S -u root bash -c "cd /; ps aux 
> | grep hadoop | grep -v grep | awk \"\{print \\\$2}\" | xargs kill -9" 
> returned non-zero exit status 123 on 172.31.39.235. STDOUT:
> STDERR:
>     at jepsen.control$throw_on_nonzero_exit.invokeStatic(control.clj:129) 
> ~[jepsen-0.1.10.jar:na]
>     at jepsen.control$throw_on_nonzero_exit.invoke(control.clj:122) 
> ~[jepsen-0.1.10.jar:na]
>     at jepsen.control$exec_STAR_.invokeStatic(control.clj:166) 
> ~[jepsen-0.1.10.jar:na]
>     at jepsen.control$exec_STAR_.doInvoke(control.clj:163) 
> ~[jepsen-0.1.10.jar:na]
>     at clojure.lang.RestFn.applyTo(RestFn.java:137) [clojure-1.9.0.jar:na]
>     at clojure.core$apply.invokeStatic(core.clj:657) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$apply.invoke(core.clj:652) ~[clojure-1.9.0.jar:na]
>     at jepsen.control$exec.invokeStatic(control.clj:182) 
> ~[jepsen-0.1.10.jar:na]
>     at jepsen.control$exec.doInvoke(control.clj:176) 
> ~[jepsen-0.1.10.jar:na]
>     at clojure.lang.RestFn.invoke(RestFn.java:2088) [clojure-1.9.0.jar:na]
>     at jepsen.control.util$grepkill_BANG_.invokeStatic(util.clj:197) 
> ~[classes/:na]
>     at jepsen.control.util$grepkill_BANG_.invoke(util.clj:191) 
> ~[classes/:na]
>     at jepsen.control.util$grepkill_BANG_.invokeStatic(util.clj:194) 
> ~[classes/:na]
>     at jepsen.control.util$grepkill_BANG_.invoke(util.clj:191) 
> ~[classes/:na]
>     at jepsen.flink.hadoop$db$reify__3102.teardown_BANG_(hadoop.clj:128) 
> ~[classes/:na]
>     at jepsen.flink.db$combined_db$reify__217$fn__220.invoke(db.clj:119) 
> ~[na:na]
>     at clojure.core$map$fn__5587.invoke(core.clj:2745) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.lang.LazySeq.sval(LazySeq.java:40) ~[clojure-1.9.0.jar:na]
>     at clojure.lang.LazySeq.seq(LazySeq.java:49) ~[clojure-1.9.0.jar:na]
>     at clojure.lang.RT.seq(RT.java:528) ~[clojure-1.9.0.jar:na]
>     at clojure.core$seq__5124.invokeStatic(core.clj:137) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$dorun.invokeStatic(core.clj:3125) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$doall.invokeStatic(core.clj:3140) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$doall.invoke(core.clj:3140) ~[clojure-1.9.0.jar:na]
>     at jepsen.flink.db$combined_db$reify__217.teardown_BANG_(db.clj:119) 
> ~[na:na]
>     at jepsen.db$fn__2137$G__2133__2141.invoke(db.clj:8) 
> ~[jepsen-0.1.10.jar:na]
>     at jepsen.db$fn__2137$G__2132__2146.invoke(db.clj:8) 
> ~[jepsen-0.1.10.jar:na]
>     at clojure.core$partial$fn__5561.invoke(core.clj:2617) 
> ~[clojure-1.9.0.jar:na]
>     at jepsen.control$on_nodes$fn__2116.invoke(control.clj:372) 
> ~[jepsen-0.1.10.jar:na]
>     at clojure.lang.AFn.applyToHelper(AFn.java:154) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.9.0.jar:na]
>     at clojure.core$apply.invokeStatic(core.clj:657) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1965) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1965) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.lang.RestFn.applyTo(RestFn.java:142) [clojure-1.9.0.jar:na]
>     at clojure.core$apply.invokeStatic(core.clj:661) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.core$bound_fn_STAR_$fn__5471.doInvoke(core.clj:1995) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.lang.RestFn.invoke(RestFn.java:408) [clojure-1.9.0.jar:na]
>     at jepsen.util$real_pmap$launcher__1168$fn__1169.invoke(util.clj:49) 
> ~[jepsen-0.1.10.jar:na]
>     at clojure.core$binding_conveyor_fn$fn__5476.invoke(core.clj:2022) 
> ~[clojure-1.9.0.jar:na]
>     at clojure.lang.AFn.call(AFn.java:18) ~[clojure-1.9.0.jar:na]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_171]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[na:1.8.0_171]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[na:1.8.0_171]
>     at 

[jira] [Commented] (FLINK-11942) Flink kinesis connector throws kinesis producer daemon fatalError

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336507#comment-17336507
 ] 

Flink Jira Bot commented on FLINK-11942:


This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Flink kinesis connector throws kinesis producer daemon fatalError
> -
>
> Key: FLINK-11942
> URL: https://issues.apache.org/jira/browse/FLINK-11942
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.7.2
>Reporter: indraneel r
>Priority: Major
>  Labels: stale-major
>
> The Flink connector crashes repeatedly with the following error:
> {quote}437062 [kpl-callback-pool-28-thread-0] WARN 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer - An 
> exception occurred while processing a record
>  java.lang.RuntimeException: Unexpected error
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:533)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:513)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:183)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:536)
>      at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
>      at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
>      at 
> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction$$anonfun$process$2.apply(SessionProcessingFunction.scala:37)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction$$anonfun$process$2.apply(SessionProcessingFunction.scala:33)
>      at scala.collection.immutable.Stream.foreach(Stream.scala:594)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction.process(SessionProcessingFunction.scala:33)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction.process(SessionProcessingFunction.scala:13)
>      at 
> org.apache.flink.streaming.api.scala.function.util.ScalaProcessWindowFunctionWrapper.process(ScalaProcessWindowFunctionWrapper.scala:63)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:50)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:32)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.emitWindowContents(WindowOperator.java:546)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:454)
>      at 
> org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:251)
>      at 
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:128)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:775)
>      at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:262)
>      at 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)
>      at 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)
>      at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:184)
>      at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
>      at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>      at 

[jira] [Commented] (FLINK-13695) Integrate checkpoint notifications into StreamTask's lifecycle

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336972#comment-17336972
 ] 

Flink Jira Bot commented on FLINK-13695:


This issue was labeled "stale-critical" 7 days ago and has not received any updates, 
so it is being deprioritized. If this ticket is actually Critical, please raise 
the priority and ask a committer to assign you the issue or revive the public 
discussion.


> Integrate checkpoint notifications into StreamTask's lifecycle
> --
>
> Key: FLINK-13695
> URL: https://issues.apache.org/jira/browse/FLINK-13695
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: stale-critical
>
> The {{StreamTask's}} asynchronous checkpoint notifications are decoupled from 
> the {{StreamTask's}} lifecycle. Consequently, it can happen that a 
> {{StreamTask}} is terminating/cancelling and still sends asynchronous 
> checkpoint notifications (e.g. acknowledge/decline checkpoint notifications). 
> This is problematic because a cancelling/terminating {{StreamTask}} might 
> cause an asynchronous checkpoint to fail which is expected and should not be 
> reported to the {{JobMaster}}. 
> Hence, the checkpoint notifications should be coupled with the 
> {{StreamTask's}} lifecycle (e.g. notifications must be sent from the 
> {{StreamTask's}} main thread) and if not valid, then they need to be filtered 
> out/suppressed.
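A minimal sketch of the proposed coupling, with all names invented: notifications are funneled through a main-thread executor and dropped once the task leaves its running state.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Invented sketch: a cancelling/terminating task silently suppresses any
// checkpoint notifications instead of reporting them.
public class LifecycleGuardSketch {
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final Executor mainThreadExecutor;

    public LifecycleGuardSketch(Executor mainThreadExecutor) {
        this.mainThreadExecutor = mainThreadExecutor;
    }

    /** Called when the task terminates or is cancelled. */
    public void cancel() {
        running.set(false);
    }

    /** Delivers the notification on the main thread, only while running. */
    public void notifyCheckpoint(Runnable notification) {
        mainThreadExecutor.execute(() -> {
            if (running.get()) {
                notification.run();
            }
            // else: suppressed because the task is terminating/cancelling
        });
    }

    public static void main(String[] args) {
        // Runnable::run stands in for the real main-thread executor.
        LifecycleGuardSketch task = new LifecycleGuardSketch(Runnable::run);
        task.notifyCheckpoint(() -> System.out.println("acknowledged"));
        task.cancel();
        task.notifyCheckpoint(() -> System.out.println("never printed"));
    }
}
```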





[jira] [Commented] (FLINK-10297) PostVersionedIOReadableWritable ignores result of InputStream.read(...)

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336980#comment-17336980
 ] 

Flink Jira Bot commented on FLINK-10297:


This issue was labeled "stale-critical" 7 days ago and has not received any updates, 
so it is being deprioritized. If this ticket is actually Critical, please raise 
the priority and ask a committer to assign you the issue or revive the public 
discussion.


> PostVersionedIOReadableWritable ignores result of InputStream.read(...)
> ---
>
> Key: FLINK-10297
> URL: https://issues.apache.org/jira/browse/FLINK-10297
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.4.2, 1.5.3, 1.6.0
>Reporter: Stefan Richter
>Priority: Critical
>  Labels: stale-critical
>
> {{PostVersionedIOReadableWritable}} ignores the result of 
> {{InputStream.read(...)}}. Probably the intention was to invoke 
> {{readFully}}. As it is now, this can lead to corrupted deserialization.
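The difference can be demonstrated with plain JDK streams; the chunked stream below just emulates the short reads that `InputStream.read` is allowed to produce:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ReadFullyExample {

    // Stream that hands out at most 3 bytes per read() call, emulating the
    // short reads a network or chunked stream is allowed to produce.
    static InputStream chunked(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    // InputStream.read(buf) may return fewer bytes than requested; a caller
    // that ignores the return value keeps a silently half-filled buffer.
    static int readOnce(byte[] data) {
        try {
            byte[] buf = new byte[data.length];
            return chunked(data).read(buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // DataInputStream.readFully either fills the whole buffer or throws,
    // which is the behavior the ticket suggests was intended.
    static byte[] readAll(byte[] data) {
        try {
            byte[] buf = new byte[data.length];
            new DataInputStream(chunked(data)).readFully(buf);
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        // A single read() call fills only part of the 8-byte buffer here.
        System.out.println("read() returned " + readOnce(data) + " of " + data.length);
        System.out.println("readFully filled " + readAll(data).length + " bytes");
    }
}
```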





[jira] [Updated] (FLINK-10409) Collection data sink does not propagate exceptions

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-10409:
---
Labels: auto-deprioritized-major  (was: stale-major)

> Collection data sink does not propagate exceptions
> --
>
> Key: FLINK-10409
> URL: https://issues.apache.org/jira/browse/FLINK-10409
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: auto-deprioritized-major
>
> I would assume that this test should fail with {{RuntimeException}}, but it 
> actually runs just fine.
> {code}
> @Test
> public void testA() throws Exception {
>   StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>   List<String> resultList = new ArrayList<>();
>   SingleOutputStreamOperator<String> result = 
> env.fromElements("A").map(obj -> {
>   throw new RuntimeException();
>   });
>   DataStreamUtils.collect(result).forEachRemaining(resultList::add);
> }
> {code}





[jira] [Updated] (FLINK-3240) Remove or document DataStream(.global|.forward)

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3240:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Remove or document DataStream(.global|.forward)
> ---
>
> Key: FLINK-3240
> URL: https://issues.apache.org/jira/browse/FLINK-3240
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Robert Metzger
>Priority: Major
>  Labels: auto-deprioritized-major
>
> It seems that DataStream.global() and DataStream.forward() are not documented.
> From the JavaDocs, I don't really get why we need them.
> For DataStream.global(), users can just set the parallelism of the following 
> operator to p=1.
> Forward is the default behavior anyway.





[jira] [Commented] (FLINK-6814) Store information about whether or not a registered state is queryable in checkpoints

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336742#comment-17336742
 ] 

Flink Jira Bot commented on FLINK-6814:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Store information about whether or not a registered state is queryable in 
> checkpoints
> -
>
> Key: FLINK-6814
> URL: https://issues.apache.org/jira/browse/FLINK-6814
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Queryable State
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: stale-major
>
> Currently, we store almost all information that comes with the registered 
> state's {{StateDescriptor}}s (state name, state serializer, etc.) in 
> checkpoints, except for information about whether or not the state is 
> queryable. I propose to add that as well.





[jira] [Commented] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336529#comment-17336529
 ] 

Flink Jira Bot commented on FLINK-11402:


This issue was labeled "stale-major" 7 days ago and has not received any updates, so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Task
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
>  Labels: stale-major, starter
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)

[jira] [Commented] (FLINK-12118) Documentation in left navigation is not in step with its content

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336498#comment-17336498
 ] 

Flink Jira Bot commented on FLINK-12118:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Documentation in left navigation is not in step with its content
> 
>
> Key: FLINK-12118
> URL: https://issues.apache.org/jira/browse/FLINK-12118
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Yu Haidong
>Priority: Major
>  Labels: stale-major
> Attachments: 图片 11.png, 图片 12.png
>
>
> Hi Team:
>     A little bug, I think.
>     The left navigation of the official website documentation says:
>     "1.8(snapshot)",
>     but when you click it, you will see:
>      "1.9-SNAPSHOT"
>       in the content. Maybe it is a little bug.
>     Thank you!
>            
>                          Haidong
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-2197) Scala API is not working when using batch and streaming API in the same program

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-2197:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Scala API is not working when using batch and streaming API in the same 
> program
> ---
>
> Key: FLINK-2197
> URL: https://issues.apache.org/jira/browse/FLINK-2197
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet, API / DataStream, API / Scala
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: auto-deprioritized-major
>
> If one uses the Scala batch and streaming API from within the same 
> program and imports both corresponding package objects, then the Scala API no 
> longer works because it is lacking the implicit {{TypeInformation}} values. 
> The reason for this is that both package objects contain an implicit function 
> {{createTypeInformation}}. This creates an ambiguity which is not possible 
> for the Scala compiler to resolve.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13113) Introduce range partition in blink

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-13113:
---
Priority: Minor  (was: Major)

> Introduce range partition in blink
> --
>
> Key: FLINK-13113
> URL: https://issues.apache.org/jira/browse/FLINK-13113
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Runtime
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: auto-deprioritized-major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5351) Make the TypeExtractor support functions with more than 2 inputs

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5351:
--
Priority: Minor  (was: Major)

> Make the TypeExtractor support functions with more than 2 inputs
> 
>
> Key: FLINK-5351
> URL: https://issues.apache.org/jira/browse/FLINK-5351
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Type Serialization System, Library / Graph 
> Processing (Gelly)
>Reporter: Vasia Kalavri
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, the TypeExtractor doesn't support functions with more than 2 
> inputs. We found that adding such support would be a useful feature for Gelly 
> in FLINK-5097.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-6626) Unifying lifecycle management of SubmittedJobGraph- and CompletedCheckpointStore

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336749#comment-17336749
 ] 

Flink Jira Bot commented on FLINK-6626:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Unifying lifecycle management of SubmittedJobGraph- and 
> CompletedCheckpointStore
> 
>
> Key: FLINK-6626
> URL: https://issues.apache.org/jira/browse/FLINK-6626
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: stale-major
>
> Currently, Flink uses the {{SubmittedJobGraphStore}} to persist {{JobGraphs}} 
> such that they can be recovered in case of failures. The 
> {{SubmittedJobGraphStore}} is managed by the {{JobManager}}. Additionally, 
> Flink has the {{CompletedCheckpointStore}} which stores checkpoints for a 
> given {{ExecutionGraph}}/job. The {{CompletedCheckpointStore}} is managed by 
> the {{CheckpointCoordinator}}.
> The {{SubmittedJobGraphStore}} and the {{CompletedCheckpointStore}} are 
> somewhat related because in the latter we store checkpoints for jobs 
> contained in the former. I think it would be nice wrt lifecycle management to 
> let the {{SubmittedJobGraphStore}} manage the lifecycle of the 
> {{CompletedCheckpointStore}}, because often it does not make much sense to 
> keep only checkpoints without a job or a job without checkpoints. 
> An idea would be when we register a job with the {{SubmittedJobGraphStore}} 
> then it returns a {{CompletedCheckpointStore}}. This store can then be given 
> to the {{CheckpointCoordinator}} to store the checkpoints. When a job enters 
> a terminal state it could be the responsibility of the 
> {{SubmittedJobGraphStore}} to decide what to do with the job data 
> ({{JobGraph}} and {{Checkpoints}}), e.g. keeping it or cleaning it up.
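The coupling proposed in this ticket can be sketched as follows. All class and method names mirror the ticket's wording, but the API itself is illustrative, not actual Flink code: registering a job returns the checkpoint store, and the job-graph store owns cleanup when the job terminates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative-only interfaces; not the real Flink classes of the same names.
interface CompletedCheckpointStore {
    void addCheckpoint(String checkpointId);
    List<String> allCheckpoints();
}

class SubmittedJobGraphStore {
    private final Map<String, List<String>> checkpointsByJob = new HashMap<>();

    // Registering a job hands back the checkpoint store for that job.
    CompletedCheckpointStore register(String jobId) {
        List<String> cps = checkpointsByJob.computeIfAbsent(jobId, k -> new ArrayList<>());
        return new CompletedCheckpointStore() {
            public void addCheckpoint(String checkpointId) { cps.add(checkpointId); }
            public List<String> allCheckpoints() { return cps; }
        };
    }

    // On a terminal state, the job-graph store decides whether to keep or
    // clean up the job graph and its checkpoints together.
    void jobTerminated(String jobId, boolean keepData) {
        if (!keepData) checkpointsByJob.remove(jobId);
    }

    boolean hasCheckpoints(String jobId) {
        List<String> cps = checkpointsByJob.get(jobId);
        return cps != null && !cps.isEmpty();
    }
}

public class LifecycleDemo {
    public static void main(String[] args) {
        SubmittedJobGraphStore jobs = new SubmittedJobGraphStore();
        CompletedCheckpointStore cps = jobs.register("job-1");
        cps.addCheckpoint("chk-42");
        jobs.jobTerminated("job-1", false);  // job done, data not kept
        System.out.println(jobs.hasCheckpoints("job-1")); // false
    }
}
```

The point of the design is visible in `jobTerminated`: checkpoints never outlive their job graph, and vice versa.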



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10106) Include test name in temp directory of e2e test

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336597#comment-17336597
 ] 

Flink Jira Bot commented on FLINK-10106:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Include test name in temp directory of e2e test
> ---
>
> Key: FLINK-10106
> URL: https://issues.apache.org/jira/browse/FLINK-10106
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available, stale-major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For better debuggability it would help to include the name of the e2e test in 
> the created temporary testing directory 
> {{temp-test-directory--UUID}}.
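A minimal sketch of what the ticket proposes; the exact naming pattern in the ticket text is garbled, so the layout below (test name between the prefix and the UUID) is an assumption, not Flink's actual scheme.

```java
import java.util.UUID;

public class TempDirName {
    // Hypothetical: embed the e2e test name in the temp directory name so a
    // leftover directory can be traced back to the test that created it.
    static String tempDirName(String testName) {
        return "temp-test-directory-" + testName + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(tempDirName("resume_externalized_checkpoints"));
    }
}
```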



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9413) Tasks can fail with PartitionNotFoundException if consumer deployment takes too long

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336983#comment-17336983
 ] 

Flink Jira Bot commented on FLINK-9413:
---

This issue was labeled "stale-critical" 7 days ago and has not received any updates 
so it is being deprioritized. If this ticket is actually Critical, please raise 
the priority and ask a committer to assign you the issue or revive the public 
discussion.


> Tasks can fail with PartitionNotFoundException if consumer deployment takes 
> too long
> 
>
> Key: FLINK-9413
> URL: https://issues.apache.org/jira/browse/FLINK-9413
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.4.0, 1.5.0, 1.6.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available, stale-critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{Tasks}} can fail with a {{PartitionNotFoundException}} if the deployment of 
> the producer takes too long. More specifically, if it takes longer than the 
> {{taskmanager.network.request-backoff.max}}, then the {{Task}} will give up 
> and fail.
> The problem is that we calculate the {{InputGateDeploymentDescriptor}} for a 
> consuming task once the producer has been assigned a slot but we do not wait 
> until it is actually running. The problem should be fixed if we wait until 
> the task is in state {{RUNNING}} before assigning the result partition to the 
> consumer.
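The backoff behavior described above can be sketched as a capped exponential sequence; the consumer retries the partition request with growing delays and gives up once the cap is exhausted. The helper below is illustrative, not Flink's implementation (Flink's documented defaults are 100 ms initial and 10 s maximum backoff).

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionRequestBackoff {
    // Returns the retry delays: doubling from `initial`, capped at `max`.
    // Assumes 0 < initial <= max.
    static long[] backoffs(long initial, long max) {
        List<Long> out = new ArrayList<>();
        for (long b = initial; ; b = Math.min(b * 2, max)) {
            out.add(b);
            if (b == max) break;  // last retry before the task fails with
                                  // PartitionNotFoundException
        }
        return out.stream().mapToLong(Long::longValue).toArray();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(backoffs(100, 10_000)));
        // [100, 200, 400, 800, 1600, 3200, 6400, 10000]
    }
}
```

If producer deployment takes longer than the sum of these delays, the consumer fails exactly as described in the ticket.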



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-8379) Improve type checking for DataView

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336671#comment-17336671
 ] 

Flink Jira Bot commented on FLINK-8379:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Improve type checking for DataView
> --
>
> Key: FLINK-8379
> URL: https://issues.apache.org/jira/browse/FLINK-8379
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Major
>  Labels: stale-major
>
> At the moment an accumulator with no proper type information is a valid 
> accumulator.
> {code}
>   public static class CountDistinctAccum {
>   public MapView map;
>   public long count;
>   }
> {code}
> I quickly looked into the code, and it seems that a MapView's type 
> information for key and value can be null. We should add a null check at the 
> correct position to inform the user about the non-existing type information. 
> We should also add the type check added with FLINK-8139 for the key type of 
> MapView.
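The kind of check the ticket asks for can be illustrated with plain Java reflection, without the Flink classes: a raw field carries no generic type arguments, which is exactly the "no proper type information" case. This is a hypothetical illustration, not the actual Flink validation code.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;

// A raw field (no key/value types) vs. a properly parameterized one.
class Acc { java.util.Map map; long count; }
class TypedAcc { java.util.Map<String, Long> map; }

public class RawTypeCheck {
    // True only if the field declares generic type arguments.
    static boolean hasTypeArgs(Field f) {
        return f.getGenericType() instanceof ParameterizedType;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hasTypeArgs(Acc.class.getDeclaredField("map")));      // false
        System.out.println(hasTypeArgs(TypedAcc.class.getDeclaredField("map"))); // true
    }
}
```

A validator could reject the first accumulator with a clear error instead of silently accepting it.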



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16616) Drop BucketingSink

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-16616:
---
Labels: auto-deprioritized-major  (was: stale-major)

> Drop BucketingSink
> --
>
> Key: FLINK-16616
> URL: https://issues.apache.org/jira/browse/FLINK-16616
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Robert Metzger
>Priority: Major
>  Labels: auto-deprioritized-major
>
> (See this discussion for context: 
> https://lists.apache.org/thread.html/r799be74658bc7e169238cc8c1e479e961a9e85ccea19089290940ff0%40%3Cdev.flink.apache.org%3E)
>  
> The bucketing sink has been deprecated in the 1.9 release [2], because we 
> have had the new StreamingFileSink [3] for quite a while.
> *The purpose of this ticket is to track all dependent tickets: If you think 
> something needs to be implemented before we can drop the BucketingSink, add 
> it as a "depends on" ticket*
> [2] https://issues.apache.org/jira/browse/FLINK-13396 
> [3] https://issues.apache.org/jira/browse/FLINK-9749



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4902) Flink Task Chain not getting input in a distributed manner

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-4902:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Flink Task Chain not getting input in a distributed manner
> --
>
> Key: FLINK-4902
> URL: https://issues.apache.org/jira/browse/FLINK-4902
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet
>Affects Versions: 1.1.0
> Environment: RHEL 6.6
>Reporter: Sajeev Ramakrishnan
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Dear Team,
>   I have the following tasks chained as a single subtask.
> left outer join -> filter -> map -> flatMap.
> The input to this would be two streams 
> memberPlan - 22 million
> groupPlan - 1 million.
> I am running the entire job with parallelism 16. Before this task chain, I am 
> doing two left outer joins.
> The problem is that one slot is getting 22-plus million records (including some 
> from groupPlan) while the remaining 15 slots are getting their input from groupPlan.
> This is making the entire execution very slow, probably 4 hours slower.
> Can you please throw some light on this.
> Regards,
> Sajeev
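The skew described in this report is what hash partitioning produces when one join key dominates: every record with the hot key hashes to the same slot. A hedged, self-contained illustration (the key values and counts are made up, not from the report):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkewDemo {
    // Count how many records each of `parallelism` slots receives when
    // records are distributed by key hash, as a hash join/partition would.
    static long[] partitionCounts(List<String> keys, int parallelism) {
        long[] perSlot = new long[parallelism];
        for (String k : keys) perSlot[Math.floorMod(k.hashCode(), parallelism)]++;
        return perSlot;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add("planA");  // one hot key
        for (int i = 0; i < 1000; i++) keys.add("k" + i);  // well-spread keys
        long[] perSlot = partitionCounts(keys, 16);
        // All 1000 "planA" records land in a single slot.
        System.out.println(Arrays.stream(perSlot).max().getAsLong() >= 1000); // true
    }
}
```

With a hot key, raising parallelism does not help; mitigations usually involve salting the key or a broadcast join for the small side.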



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-6272) Rolling file sink saves incomplete lines on failure

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6272:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Rolling file sink saves incomplete lines on failure
> ---
>
> Key: FLINK-6272
> URL: https://issues.apache.org/jira/browse/FLINK-6272
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common, Connectors / FileSystem
>Affects Versions: 1.2.0
> Environment: Flink 1.2.0, Scala 2.11, Debian GNU/Linux 8.7 (jessie), 
> CDH 5.8, YARN
>Reporter: Jakub Nowacki
>Priority: Major
>  Labels: auto-deprioritized-major
>
> We have simple pipeline with Kafka source (0.9), which transforms data and 
> writes to Rolling File Sink, which runs on YARN. The sink is a plain HDFS 
> sink with StringWriter configured as follows:
> {code:java}
> val fileSink = new BucketingSink[String]("some_path")
> fileSink.setBucketer(new DateTimeBucketer[String]("-MM-dd"))
> fileSink.setWriter(new StringWriter())
> fileSink.setBatchSize(1024 * 1024 * 1024) // this is 1 GB
> {code}
> Checkpoint is on. Both Kafka source and File sink are in theory with 
> [exactly-once 
> guarantee|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html].
> On failure, in some files which seem to be complete (not {{in_progress}} 
> files or something, but under 1 GB and confirmed to be created on failure), 
> it turns out that the last line is cut. In our case it shows because we save 
> the data in line-by-line JSON, and this creates an invalid JSON line. This 
> does not always happen, but I noticed at least 3 incidents like that.
> Also, I am not sure if it is a separate bug, but we see some data duplication 
> in this case coming from Kafka. I.e., after the pipeline is restarted, some 
> number of messages come out from the Kafka source which have already been 
> saved in the previous file. We can check that the messages are duplicated as 
> they have the same data but different timestamps, which are added within the 
> Flink pipeline. This should not happen in theory, as the sink and source have 
> an [exactly-once 
> guarantee|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html].
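One way to cope with a partially written final record on recovery is to truncate the file back to the last complete line. This is a hedged sketch of that idea only, not how BucketingSink actually handles recovery (it uses byte-offset valid-length metadata, not newline scanning):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TruncateToLastNewline {
    // Returns the prefix of `data` up to and including the last '\n',
    // dropping any trailing partial line. Empty if no newline exists.
    static byte[] validPrefix(byte[] data) {
        int last = -1;
        for (int i = 0; i < data.length; i++) if (data[i] == '\n') last = i;
        return Arrays.copyOf(data, last + 1);
    }

    public static void main(String[] args) {
        // Simulates a crash mid-record: the third JSON line is incomplete.
        byte[] d = "{\"a\":1}\n{\"b\":2}\n{\"c\":".getBytes(StandardCharsets.UTF_8);
        System.out.print(new String(validPrefix(d), StandardCharsets.UTF_8));
    }
}
```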



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3983) Allow users to set any (relevant) configuration parameter of the KinesisProducerConfiguration

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3983:
--
Priority: Minor  (was: Major)

> Allow users to set any (relevant) configuration parameter of the 
> KinesisProducerConfiguration
> -
>
> Key: FLINK-3983
> URL: https://issues.apache.org/jira/browse/FLINK-3983
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, users can only set some of the configuration parameters in the 
> {{KinesisProducerConfiguration}} through Properties.
> It would be good to introduce configuration keys for these parameters so that 
> users can change the producer configuration.
> I think these and most of the other variables in the 
> KinesisProducerConfiguration should be exposed via properties:
> - aggregationEnabled
> - collectionMaxCount
> - collectionMaxSize
> - connectTimeout
> - credentialsRefreshDelay
> - failIfThrottled
> - logLevel
> - metricsGranularity
> - metricsLevel
> - metricsNamespace
> - metricsUploadDelay
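What exposing these via Properties could look like; the key names below are purely illustrative placeholders, not confirmed Flink configuration keys.

```java
import java.util.Properties;

public class KinesisProducerProps {
    // Hypothetical keys: read each producer setting from Properties with a
    // default, so users can override only what they need.
    static boolean aggregationEnabled(Properties p) {
        return Boolean.parseBoolean(p.getProperty("aws.producer.aggregationEnabled", "true"));
    }

    static long collectionMaxCount(Properties p) {
        return Long.parseLong(p.getProperty("aws.producer.collectionMaxCount", "500"));
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("aws.producer.collectionMaxCount", "200");
        // aggregationEnabled falls back to its default; collectionMaxCount is overridden.
        System.out.println(aggregationEnabled(p) + " " + collectionMaxCount(p)); // true 200
    }
}
```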



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10224) Web frontend does not handle scientific notation properly

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336588#comment-17336588
 ] 

Flink Jira Bot commented on FLINK-10224:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Web frontend does not handle scientific notation properly
> -
>
> Key: FLINK-10224
> URL: https://issues.apache.org/jira/browse/FLINK-10224
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.5.3
>Reporter: Paul Lin
>Priority: Major
>  Labels: stale-major
> Attachments: task metrics panel.png
>
>
> Task metrics can be large numbers, which are converted into scientific 
> notation by the REST server, but the web frontend does not handle them properly.
> For instance, below is a panel for Kafka consumer metric "records-lag-max", 
> which is actually "9.1839984E7". 
> !task metrics panel.png!
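A hedged sketch of what the frontend would need to do (shown in Java for consistency with the other samples, though the actual fix would live in the web UI's JavaScript): parse a value that may arrive in scientific notation and render it as a plain grouped number, passing non-numeric values through untouched.

```java
import java.util.Locale;

public class MetricFormat {
    static String formatMetric(String raw) {
        try {
            double n = Double.parseDouble(raw);          // handles "9.1839984E7"
            return String.format(Locale.US, "%,.0f", n); // grouped, no exponent
        } catch (NumberFormatException | NullPointerException e) {
            return raw;                                  // leave "n/a" etc. as-is
        }
    }

    public static void main(String[] args) {
        System.out.println(formatMetric("9.1839984E7")); // 91,839,984
    }
}
```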



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8107) UNNEST causes cyclic type checking exception

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8107:
--
Priority: Minor  (was: Major)

> UNNEST causes cyclic type checking exception
> 
>
> Key: FLINK-8107
> URL: https://issues.apache.org/jira/browse/FLINK-8107
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.4.0
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> The following query causes an assertion error:
> {code}
>   def main(args: Array[String]): Unit = {
> val env = ExecutionEnvironment.getExecutionEnvironment
> val tEnv = TableEnvironment.getTableEnvironment(env)
> val input2 = env.fromElements(
>   WC("hello", 1, Array(1, 2, 3)),
>   WC("hello", 1, Array(1, 2, 3)),
>   WC("ciao", 1, Array(1, 2, 3))
> )
> tEnv.registerDataSet("entity", input2)
> tEnv.registerDataSet("product", input2, 'product)
> val table = tEnv.sqlQuery("SELECT t.item.* FROM product, 
> UNNEST(entity.myarr) AS t (item)")
> table.toDataSet[Row].print()
>   }
>   case class WC(word: String, frequency: Long, myarr: Array[Int])
> {code}
> It leads to:
> {code}
> Exception in thread "main" java.lang.AssertionError: Cycle detected during 
> type-checking
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:93)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:945)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.getRowType(AbstractNamespace.java:115)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.getRowTypeSansSystemColumns(AbstractNamespace.java:122)
>   at 
> org.apache.calcite.sql.validate.AliasNamespace.validateImpl(AliasNamespace.java:71)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:945)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.getRowType(AbstractNamespace.java:115)
>   at 
> org.apache.calcite.sql.validate.AliasNamespace.getRowType(AliasNamespace.java:41)
>   at 
> org.apache.calcite.sql.validate.DelegatingScope.resolveInNamespace(DelegatingScope.java:101)
>   at org.apache.calcite.sql.validate.ListScope.resolve(ListScope.java:191)
>   at 
> org.apache.calcite.sql.validate.ListScope.findQualifyingTableNames(ListScope.java:156)
>   at 
> org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:326)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateIdentifier(SqlValidatorImpl.java:2785)
>   at 
> org.apache.calcite.sql.SqlIdentifier.validateExpr(SqlIdentifier.java:324)
>   at org.apache.calcite.sql.SqlOperator.validateCall(SqlOperator.java:407)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateCall(SqlValidatorImpl.java:5084)
>   at 
> org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:52)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:945)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:926)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2961)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2946)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:2998)
>   at 
> org.apache.flink.table.calcite.FlinkCalciteSqlValidator.validateJoin(FlinkCalciteSqlValidator.scala:67)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2955)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3206)
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:945)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:926)
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:226)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:901)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:611)

[jira] [Commented] (FLINK-9689) Flink consumer deserialization example

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336616#comment-17336616
 ] 

Flink Jira Bot commented on FLINK-9689:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Flink consumer deserialization example
> --
>
> Key: FLINK-9689
> URL: https://issues.apache.org/jira/browse/FLINK-9689
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Satheesh
>Priority: Major
>  Labels: stale-major
>
> It's hard to find a relevant custom deserialization example for the Flink Kafka 
> consumer. It would be much more useful to add a sample program implementing 
> custom deserialization in the flink-examples folder.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9469) Add tests that cover PatternStream#flatSelect

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9469:
--
Labels: auto-deprioritized-major pull-request-available  (was: 
pull-request-available stale-major)

> Add tests that cover PatternStream#flatSelect
> -
>
> Key: FLINK-9469
> URL: https://issues.apache.org/jira/browse/FLINK-9469
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: auto-deprioritized-major, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8294) Missing examples/links in Data Sink docs

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8294:
--
Priority: Minor  (was: Major)

> Missing examples/links in Data Sink docs
> 
>
> Key: FLINK-8294
> URL: https://issues.apache.org/jira/browse/FLINK-8294
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.4.0
>Reporter: Julio Biason
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> In the [Data 
> Sink|https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/datastream_api.html#data-sinks]
>  documentation, there is no example on how to use said functions -- even if 
> they are only intended for debugging (which is exactly what I want to do right 
> now).
> While {{print}} is quite simple, what I need is to get the resulting 
> processing, so I'd probably need some of the {{write}} functions (since 
> FLINK-8285 mentions that iterators are out).
> I'd either suggest adding the examples or link the listed functions to their 
> documentation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4785) Flink string parser doesn't handle string fields containing two consecutive double quotes

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-4785:
--
Labels: auto-deprioritized-major csv  (was: csv stale-major)

> Flink string parser doesn't handle string fields containing two consecutive 
> double quotes
> -
>
> Key: FLINK-4785
> URL: https://issues.apache.org/jira/browse/FLINK-4785
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataSet
>Affects Versions: 1.1.2
>Reporter: Flavio Pompermaier
>Priority: Major
>  Labels: auto-deprioritized-major, csv
>
> To reproduce the error run 
> https://github.com/okkam-it/flink-examples/blob/master/src/main/java/it/okkam/flink/Csv2RowExample.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3984) Event time of stream transformations is undocumented

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3984:
--
Priority: Minor  (was: Major)

> Event time of stream transformations is undocumented
> 
>
> Key: FLINK-3984
> URL: https://issues.apache.org/jira/browse/FLINK-3984
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> The Event Time, Windowing, and DataStream Transformation documentation 
> sections fail to state what event time, if any, the output of transformations 
> has on a stream that is configured to use event time and that has timestamp 
> assigners.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-6866) ClosureCleaner.clean fails for scala's JavaConverters wrapper classes

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6866:
--
Priority: Minor  (was: Major)

> ClosureCleaner.clean fails for scala's JavaConverters wrapper classes
> -
>
> Key: FLINK-6866
> URL: https://issues.apache.org/jira/browse/FLINK-6866
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, API / Scala
>Affects Versions: 1.2.0, 1.3.0
> Environment: Scala 2.10.6, Scala 2.11.11
> Does not appear using Scala 2.12
>Reporter: SmedbergM
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> MWE: https://github.com/SmedbergM/ClosureCleanerBug
> MWE console output: 
> https://gist.github.com/SmedbergM/ce969e6e8540da5b59c7dd921a496dc5



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-5719) Let LatencyMarkers completely bypass operators / chains

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336806#comment-17336806
 ] 

Flink Jira Bot commented on FLINK-5719:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Let LatencyMarkers completely bypass operators / chains
> ---
>
> Key: FLINK-5719
> URL: https://issues.apache.org/jira/browse/FLINK-5719
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: stale-major
>
> Currently, {{LatencyMarker}} s are forwarded through operators via the 
> operator interfaces and methods, i.e. 
> {{AbstractStreamOperator#processLatencyMarker()}},  
> {{Output#emitLatencyMarker()}}, 
> {{OneInputStreamOperator#processLatencyMarker()}} etc.
> The main issue with this is that {{LatencyMarker}} s are essentially internal 
> elements, and the implementation on how to handle them should be final. 
> Exposing them through operator interfaces will allow the user to override the 
> implementation, and also makes the user interface for operators 
> over-complicated.
> [~aljoscha] suggested to bypass such internal stream elements from the 
> operator to keep the operator interfaces minimal, in FLINK-5017.
> We propose a similar approach here for {{LatencyMarker}} as well. Since the 
> chaining output calls contribute very little to the measured latency and can 
> be ignored, instead of passing it through operator chains, latency markers 
> can simply be passed downstream once tasks receive them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-4001) Add event time support to filesystem connector

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336866#comment-17336866
 ] 

Flink Jira Bot commented on FLINK-4001:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Add event time support to filesystem connector
> --
>
> Key: FLINK-4001
> URL: https://issues.apache.org/jira/browse/FLINK-4001
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common, Connectors / FileSystem
>Reporter: Robert Metzger
>Priority: Major
>  Labels: stale-major
>
> Currently, the file system connector (rolling file sink) does not respect the 
> event time of records.
> For full reprocessing capabilities, we need to make the sink aware of the 
> event time.





[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336733#comment-17336733
 ] 

Flink Jira Bot commented on FLINK-7001:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Improve performance of Sliding Time Window with pane optimization
> -
>
> Key: FLINK-7001
> URL: https://issues.apache.org/jira/browse/FLINK-7001
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Jark Wu
>Priority: Major
>  Labels: stale-major
>
> Currently, the implementation of time-based sliding windows treats each 
> window individually and replicates records to each window. For a window of 
> 10-minute size that slides by 1 second the data is replicated 600-fold (10 
> minutes / 1 second). We can optimize sliding windows by dividing them into 
> panes (aligned with the slide), so that we can avoid record duplication and 
> leverage the checkpoint.
> I will attach a more detailed design doc to the issue.
> The following issues are similar to this issue: FLINK-5387, FLINK-6990
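The pane idea above can be sketched in plain Java, outside Flink, as a minimal sum aggregate (class and method names are illustrative assumptions, timestamps are assumed to arrive in order, and this is not the design-doc implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of the pane optimization: each record is added to exactly one pane
 *  aligned with the slide, and a sliding-window result aggregates the panes it
 *  covers instead of duplicating the record into every overlapping window. */
public class PaneSlidingSum {
    private final long paneSizeMs;          // pane size = slide
    private final int panesPerWindow;       // = windowSize / slide
    private final Deque<long[]> panes = new ArrayDeque<>(); // {paneStart, sum}

    public PaneSlidingSum(long slideMs, long windowMs) {
        this.paneSizeMs = slideMs;
        this.panesPerWindow = (int) (windowMs / slideMs);
    }

    /** Add a value to the single pane its timestamp falls into. */
    public void add(long timestampMs, long value) {
        long paneStart = timestampMs - (timestampMs % paneSizeMs);
        long[] last = panes.peekLast();
        if (last == null || last[0] != paneStart) {
            panes.addLast(new long[] {paneStart, 0});
            while (panes.size() > panesPerWindow) {
                panes.removeFirst(); // this pane slid out of every window
            }
        }
        panes.peekLast()[1] += value;
    }

    /** Result of the window ending at the latest pane: a sum over its panes. */
    public long windowSum() {
        long sum = 0;
        for (long[] pane : panes) {
            sum += pane[1];
        }
        return sum;
    }
}
```

With this layout, a 10-minute window sliding by 1 second keeps 600 pane aggregates instead of 600 copies of every record.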





[jira] [Updated] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5094:
--
Priority: Minor  (was: Major)

> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.1.3, 1.2.0
>Reporter: Fabian Hueske
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.
> The main problem is that we do not want to provide the full power of 
> {{RichFunction}} for incremental aggregation functions, such as defining own 
> operator state. This could be achieved by providing some kind of 
> {{RestrictedRuntimeContext}}.





[jira] [Commented] (FLINK-13395) Add source and sink connector for Alibaba Log Service

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336458#comment-17336458
 ] 

Flink Jira Bot commented on FLINK-13395:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Add source and sink connector for Alibaba Log Service
> -
>
> Key: FLINK-13395
> URL: https://issues.apache.org/jira/browse/FLINK-13395
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Ke Li
>Priority: Major
>  Labels: stale-major
>
> Alibaba Log Service is a big data service which has been widely used in 
> Alibaba Group and thousands of customers of Alibaba Cloud. The core storage 
> engine of Log Service is named Loghub which is a large scale distributed 
> storage system which provides producer and consumer to push and pull data 
> like Kafka, AWS Kinesis and Azure Eventhub does. 
> Log Service provides a complete solution to help users collect data from both 
> on-premise and cloud data sources. More than 10 PB of data is sent to and 
> consumed from Loghub every day, and hundreds of thousands of users have 
> implemented their DevOps and big data systems based on Log Service.
> Log Service and Flink/Blink have become the de facto standard of big data 
> architecture for unified data processing in Alibaba Group and for more users 
> of Alibaba Cloud.
>  





[jira] [Updated] (FLINK-7271) ExpressionReducer does not optimize string-to-time conversion

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-7271:
--
Labels: auto-deprioritized-major  (was: stale-major)

> ExpressionReducer does not optimize string-to-time conversion
> -
>
> Key: FLINK-7271
> URL: https://issues.apache.org/jira/browse/FLINK-7271
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.3.1
>Reporter: Timo Walther
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Expressions like {{"1996-11-10".toDate}} or {{"1996-11-10 
> 12:12:12".toTimestamp}} are not recognized by the ExpressionReducer and are 
> evaluated during runtime instead of pre-flight phase. In order to optimize 
> the runtime we should allow constant expression reduction here.





[jira] [Commented] (FLINK-16881) use Catalog's total size info in planner

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336250#comment-17336250
 ] 

Flink Jira Bot commented on FLINK-16881:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> use Catalog's total size info in planner
> 
>
> Key: FLINK-16881
> URL: https://issues.apache.org/jira/browse/FLINK-16881
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Reporter: godfrey he
>Priority: Major
>  Labels: stale-major
>
> In some cases, a {{Catalog}} only contains {{totalSize}} and the row count is 
> unknown. We can also use {{totalSize}} to infer the row count, or even use 
> {{totalSize}} to decide whether a join should be a broadcast join.
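A minimal sketch of the two inferences described above; the average-row-size fallback and broadcast threshold are illustrative assumptions, not planner defaults:

```java
/** Hedged sketch: derive a row-count estimate from a catalog's totalSize when
 *  the row count is unknown, and use totalSize against a size threshold to
 *  decide whether a broadcast join is feasible. */
public class SizeBasedStats {
    /** Estimate rows as totalSize / avgRowSize; -1 signals "unknown". */
    public static long inferRowCount(long totalSizeBytes, double avgRowSizeBytes) {
        if (totalSizeBytes <= 0 || avgRowSizeBytes <= 0) {
            return -1; // statistics unavailable
        }
        return Math.max(1, (long) (totalSizeBytes / avgRowSizeBytes));
    }

    /** A table can be broadcast if its total size fits under the threshold. */
    public static boolean canBroadcast(long totalSizeBytes, long thresholdBytes) {
        return totalSizeBytes > 0 && totalSizeBytes <= thresholdBytes;
    }
}
```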





[jira] [Updated] (FLINK-20427) Remove CheckpointConfig.setPreferCheckpointForRecovery because it can lead to data loss

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-20427:
---
Priority: Major  (was: Critical)

> Remove CheckpointConfig.setPreferCheckpointForRecovery because it can lead to 
> data loss
> ---
>
> Key: FLINK-20427
> URL: https://issues.apache.org/jira/browse/FLINK-20427
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Runtime / Checkpointing
>Affects Versions: 1.12.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: auto-deprioritized-critical
> Fix For: 1.14.0
>
>
> The {{CheckpointConfig.setPreferCheckpointForRecovery}} allows configuring 
> whether Flink prefers checkpoints for recovery if the 
> {{CompletedCheckpointStore}} contains both savepoints and checkpoints. This is 
> problematic because, due to this feature, Flink might prefer older checkpoints 
> over newer savepoints for recovery. Since some components expect that always 
> the latest checkpoint/savepoint is used (e.g. the {{SourceCoordinator}}), it 
> breaks assumptions and can lead to {{SourceSplits}} which are not read. This 
> effectively means that the system loses data. Similarly, this behaviour can 
> cause exactly-once sinks to output results multiple times, which violates the 
> processing guarantees. Hence, I believe that we should remove this setting 
> because it changes Flink's behaviour in a very significant way, potentially 
> without the user noticing.





[jira] [Updated] (FLINK-9123) Scala version of ProcessFunction doesn't work

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9123:
--
Priority: Minor  (was: Major)

> Scala version of ProcessFunction doesn't work
> -
>
> Key: FLINK-9123
> URL: https://issues.apache.org/jira/browse/FLINK-9123
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, API / Scala, Documentation
>Reporter: Julio Biason
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> The source code example of ProcessFunction doesn't compile:
>  
> {code:java}
> value Context is not a member of object 
> org.apache.flink.streaming.api.functions.ProcessFunction
> [error] import 
> org.apache.flink.streaming.api.functions.ProcessFunction.Context
> {code}





[jira] [Updated] (FLINK-5086) Clean dead snapshot files produced by the tasks failing to acknowledge checkpoints

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5086:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Clean dead snapshot files produced by the tasks failing to acknowledge 
> checkpoints
> --
>
> Key: FLINK-5086
> URL: https://issues.apache.org/jira/browse/FLINK-5086
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Reporter: Xiaogang Shi
>Priority: Major
>  Labels: auto-deprioritized-major
>
> A task may fail when performing checkpoints. In that case, the task may have 
> already copied some data to external storage. But since the task fails to 
> send the state handle to the {{CheckpointCoordinator}}, the copied data will not 
> be deleted by {{CheckpointCoordinator}}. 
> I think we must find a method to clean such dead snapshot data to avoid 
> unlimited usage of external storage. 
> One possible method is to clean these dead files when the task recovers. When 
> a task recovers, the {{CheckpointCoordinator}} will tell the task all the 
> retained checkpoints. The task can then scan the external storage and delete 
> all the snapshots not in these retained checkpoints.
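The cleanup step described above reduces to a set difference between what the storage contains and what the retained checkpoints reference. A hedged sketch with the storage listing and deletion abstracted away as plain sets (names are illustrative, not Flink APIs):

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of orphan-snapshot detection on recovery: everything found in
 *  external storage that no retained checkpoint references is dead data
 *  and can be deleted. */
public class OrphanSnapshotCleaner {
    /** Returns the snapshot paths that should be deleted. */
    public static Set<String> findOrphans(Set<String> snapshotsInStorage,
                                          Set<String> retainedSnapshots) {
        Set<String> orphans = new HashSet<>(snapshotsInStorage);
        orphans.removeAll(retainedSnapshots);
        return orphans;
    }
}
```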





[jira] [Updated] (FLINK-12294) Kafka connector, work with grouping partitions

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-12294:
---
Labels: auto-deprioritized-major performance  (was: performance stale-major)

> Kafka connector, work with grouping partitions
> --
>
> Key: FLINK-12294
> URL: https://issues.apache.org/jira/browse/FLINK-12294
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream, Connectors / Kafka, Runtime / Task
>Reporter: Sergey
>Priority: Major
>  Labels: auto-deprioritized-major, performance
> Attachments: KeyGroupAssigner.java, KeyGroupRangeAssignment.java
>
>
> Add a flag (default false) controlling whether topic partitions are already 
> grouped by the key, to exclude the unnecessary shuffle/resorting operation 
> when this parameter is set to true. As an example, say we have clients' 
> payment transactions in a Kafka topic. We group by clientId (transactions 
> with the same clientId go to one Kafka topic partition) and the task is to 
> find the max transaction per client in sliding windows. In terms of 
> map/reduce there is no need to shuffle data between all topic consumers; it 
> may only be worth doing within each consumer to gain some speedup by 
> increasing the number of executors within each partition's data. With N 
> operations per partition instead of N*ln(N) (the current implementation with 
> shuffle/resorting), windows with thousands of events would see a tenfold gain 
> in execution speed.





[jira] [Updated] (FLINK-11942) Flink kinesis connector throws kinesis producer daemon fatalError

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-11942:
---
Priority: Minor  (was: Major)

> Flink kinesis connector throws kinesis producer daemon fatalError
> -
>
> Key: FLINK-11942
> URL: https://issues.apache.org/jira/browse/FLINK-11942
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.7.2
>Reporter: indraneel r
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Flink connector crashes repeatedly with following error:
> {quote}437062 [kpl-callback-pool-28-thread-0] WARN 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer - An 
> exception occurred while processing a record
>  java.lang.RuntimeException: Unexpected error
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:533)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:513)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:183)
>      at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:536)
>      at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
>      at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
>      at 
> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction$$anonfun$process$2.apply(SessionProcessingFunction.scala:37)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction$$anonfun$process$2.apply(SessionProcessingFunction.scala:33)
>      at scala.collection.immutable.Stream.foreach(Stream.scala:594)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction.process(SessionProcessingFunction.scala:33)
>      at 
> com.myorg.bi.web.sessionization.windowing.SessionProcessingFunction.process(SessionProcessingFunction.scala:13)
>      at 
> org.apache.flink.streaming.api.scala.function.util.ScalaProcessWindowFunctionWrapper.process(ScalaProcessWindowFunctionWrapper.scala:63)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:50)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:32)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.emitWindowContents(WindowOperator.java:546)
>      at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:454)
>      at 
> org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:251)
>      at 
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:128)
>      at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:775)
>      at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:262)
>      at 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)
>      at 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)
>      at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:184)
>      at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
>      at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>      at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
>      at java.lang.Thread.run(Thread.java:748)
>  Caused by: java.lang.InterruptedException
>      at 
> 

[jira] [Commented] (FLINK-2147) Approximate calculation of frequencies in data streams

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336921#comment-17336921
 ] 

Flink Jira Bot commented on FLINK-2147:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Approximate calculation of frequencies in data streams
> --
>
> Key: FLINK-2147
> URL: https://issues.apache.org/jira/browse/FLINK-2147
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Reporter: Gábor Gévay
>Priority: Major
>  Labels: approximate, stale-major, statistics
>
> Count-Min sketch is a hashing-based algorithm for approximately keeping track 
> of the frequencies of elements in a data stream. It is described by Cormode 
> et al. in the following paper:
> http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf
> Note that this algorithm can be conveniently implemented in a distributed 
> way, as described in section 3.2 of the paper.
> The paper
> http://www.vldb.org/conf/2002/S10P03.pdf
> also describes algorithms for approximately keeping track of frequencies, but 
> here the user can specify a threshold below which she is not interested in 
> the frequency of an element. The error bounds are also different from those 
> of the Count-Min sketch algorithm.
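A minimal single-node Count-Min sketch, following the structure in the Cormode et al. paper (d hash rows of width w; updates increment one counter per row; a point query takes the minimum, so it only ever over-counts). The hash construction here is an illustrative assumption, not the paper's pairwise-independent family:

```java
import java.util.Random;

/** Minimal Count-Min sketch for approximate frequency counting. */
public class CountMinSketch {
    private final long[][] table; // depth rows x width counters
    private final int[] seeds;    // one multiplier per row (assumed hashing)

    public CountMinSketch(int depth, int width, long seed) {
        table = new long[depth][width];
        seeds = new int[depth];
        Random rnd = new Random(seed);
        for (int i = 0; i < depth; i++) {
            seeds[i] = rnd.nextInt() | 1; // odd multiplier
        }
    }

    private int bucket(int row, Object item) {
        int h = item.hashCode() * seeds[row];
        return Math.floorMod(h ^ (h >>> 16), table[row].length);
    }

    /** Record `count` occurrences of `item` in every row. */
    public void add(Object item, long count) {
        for (int i = 0; i < table.length; i++) {
            table[i][bucket(i, item)] += count;
        }
    }

    /** Point query: the minimum over rows never under-estimates. */
    public long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < table.length; i++) {
            min = Math.min(min, table[i][bucket(i, item)]);
        }
        return min;
    }
}
```

The distributed variant from section 3.2 of the paper follows from this structure: sketches are mergeable by element-wise addition of their tables.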





[jira] [Commented] (FLINK-8470) DelayTrigger and DelayAndCountTrigger in Flink Streaming Window API

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336664#comment-17336664
 ] 

Flink Jira Bot commented on FLINK-8470:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> DelayTrigger and DelayAndCountTrigger in Flink Streaming Window API
> ---
>
> Key: FLINK-8470
> URL: https://issues.apache.org/jira/browse/FLINK-8470
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Affects Versions: 2.0.0
>Reporter: Vijay Kansal
>Priority: Major
>  Labels: stale-major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In the Flink streaming API, we do not have any built-in window triggers 
> available for the below use cases:
>  1. DelayTrigger: Window function should trigger in case the 1st element 
> belonging to this window arrived more than maxDelay(ms) before the current 
> processing time.
> 2. DelayAndCountTrigger: Window function should trigger in case the 1st 
> element belonging to this window arrived more than maxDelay(ms) before the 
> current processing time or there are more than maxCount elements in the 
> window.
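The firing conditions of the two proposed triggers can be sketched as plain predicates, outside Flink's Trigger API (class and method names are illustrative assumptions):

```java
/** Sketch of the proposed firing rules: fire once the window's first element
 *  is older than maxDelay, or (count variant) once more than maxCount
 *  elements have arrived. */
public class DelayTriggerLogic {
    /** DelayTrigger: first element arrived more than maxDelay ms ago. */
    public static boolean shouldFireDelay(long firstElementTimeMs,
                                          long nowMs, long maxDelayMs) {
        return nowMs - firstElementTimeMs > maxDelayMs;
    }

    /** DelayAndCountTrigger: the delay rule, or more than maxCount elements. */
    public static boolean shouldFireDelayAndCount(long firstElementTimeMs,
                                                  long nowMs, long maxDelayMs,
                                                  long count, long maxCount) {
        return shouldFireDelay(firstElementTimeMs, nowMs, maxDelayMs)
                || count > maxCount;
    }
}
```

In a real implementation these predicates would be evaluated from a Trigger's element and timer callbacks, with the first-element timestamp and count kept in partitioned trigger state.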





[jira] [Updated] (FLINK-5901) DAG can not show properly in IE

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5901:
--
Priority: Major  (was: Critical)

> DAG can not show properly in IE
> ---
>
> Key: FLINK-5901
> URL: https://issues.apache.org/jira/browse/FLINK-5901
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
> Environment: IE 11
>Reporter: Tao Wang
>Priority: Major
>  Labels: auto-deprioritized-critical
> Attachments: using IE.png, using chrom(same job).png
>
>
> The DAG of running jobs does not show properly in IE11 (I am using 
> 11.0.9600.18059, but assuming it is the same with IE9). The description of 
> the task is not shown within the rectangle.
> Chrome works well. I pasted the screenshots under IE and Chrome below.





[jira] [Updated] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5094:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.1.3, 1.2.0
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.
> The main problem is that we do not want to provide the full power of 
> {{RichFunction}} for incremental aggregation functions, such as defining own 
> operator state. This could be achieved by providing some kind of 
> {{RestrictedRuntimeContext}}.





[jira] [Updated] (FLINK-10031) Support handling late events in Table API & SQL

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-10031:
---
Priority: Minor  (was: Major)

> Support handling late events in Table API & SQL
> ---
>
> Key: FLINK-10031
> URL: https://issues.apache.org/jira/browse/FLINK-10031
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> The Table API & SQL drop late events right now. We should offer something 
> like a side channel that allows capturing late events for separate 
> processing. For example, this could either be simply a table sink or a 
> special table (in Table API) derived from the original table that allows 
> further processing. The exact design needs to be discussed.





[jira] [Updated] (FLINK-5683) Fix RedisConnector documentation (Java)

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5683:
--
Labels: auto-deprioritized-major redis  (was: redis stale-major)

> Fix RedisConnector documentation (Java)
> ---
>
> Key: FLINK-5683
> URL: https://issues.apache.org/jira/browse/FLINK-5683
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Documentation
>Affects Versions: 1.1.3, 1.1.4
>Reporter: Vikram Rawat
>Priority: Major
>  Labels: auto-deprioritized-major, redis
>
> The RedisConnector documentation for Java can be improved.
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/connectors/redis.html
>  
> The code snippet on the site is:
> DataStream<String> stream = ...;
> stream.addSink(new RedisSink<Tuple2<String, String>>(conf, new 
> RedisExampleMapper()));
> This gives a compile-time error in the IDE. It should be changed to:
> DataStream<Tuple2<String, String>> stream = ...;
> stream.addSink(new RedisSink<Tuple2<String, String>>(conf, new 
> RedisExampleMapper()));
> The code snippets on the project website must be accurate, so I recommend 
> this change. It will only take a few minutes.





[jira] [Updated] (FLINK-13695) Integrate checkpoint notifications into StreamTask's lifecycle

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-13695:
---
Priority: Major  (was: Critical)

> Integrate checkpoint notifications into StreamTask's lifecycle
> --
>
> Key: FLINK-13695
> URL: https://issues.apache.org/jira/browse/FLINK-13695
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: auto-deprioritized-critical
>
> The {{StreamTask's}} asynchronous checkpoint notifications are decoupled from 
> the {{StreamTask's}} lifecycle. Consequently, it can happen that a 
> {{StreamTask}} is terminating/cancelling and still sends asynchronous 
> checkpoint notifications (e.g. acknowledge/decline checkpoint notifications). 
> This is problematic because a cancelling/terminating {{StreamTask}} might 
> cause an asynchronous checkpoint to fail which is expected and should not be 
> reported to the {{JobMaster}}. 
> Hence, the checkpoint notifications should be coupled with the 
> {{StreamTask's}} lifecycle (e.g. notifications must be sent from the 
> {{StreamTask's}} main thread) and if not valid, then they need to be filtered 
> out/suppressed.





[jira] [Updated] (FLINK-3588) Add a streaming (exactly-once) JDBC connector

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3588:
--
Priority: Minor  (was: Major)

> Add a streaming (exactly-once) JDBC connector
> -
>
> Key: FLINK-3588
> URL: https://issues.apache.org/jira/browse/FLINK-3588
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.0.0
>Reporter: Chesnay Schepler
>Priority: Minor
>  Labels: auto-deprioritized-major
>






[jira] [Updated] (FLINK-4621) Improve decimal literals of SQL API

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-4621:
--
Priority: Minor  (was: Major)

> Improve decimal literals of SQL API
> ---
>
> Key: FLINK-4621
> URL: https://issues.apache.org/jira/browse/FLINK-4621
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, all SQL {{DECIMAL}} types are converted to BigDecimals internally. 
> By default, the SQL parser creates {{DECIMAL}} literals of any number, e.g. 
> {{SELECT 1.0, 12, -0.5 FROM x}}. I think it would be better if these simple 
> numbers would be represented as Java primitives instead of objects.





[jira] [Updated] (FLINK-9802) Harden End-to-end tests against download failures

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9802:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Harden End-to-end tests against download failures
> -
>
> Key: FLINK-9802
> URL: https://issues.apache.org/jira/browse/FLINK-9802
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Several end-to-end tests download libraries (kafka, zookeeper, elasticsearch) 
> to set them up locally for testing purposes. Currently, (at least for the 
> elasticsearch test), we do not guard against failed downloads.
> We should do a sweep over all tests and harden them against download 
> failures, by retrying the download on failure, and explicitly exiting if the 
> download did not succeed after N attempts.





[jira] [Commented] (FLINK-7624) Add kafka-topic for "KafkaProducer" metrics

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336707#comment-17336707
 ] 

Flink Jira Bot commented on FLINK-7624:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Add kafka-topic for "KafkaProducer" metrics
> ---
>
> Key: FLINK-7624
> URL: https://issues.apache.org/jira/browse/FLINK-7624
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Runtime / Metrics
>Reporter: Hai Zhou
>Priority: Major
>  Labels: stale-major
>
> Currently, metric in "KafkaProducer" MetricGroup, Such as:
> {code:java}
> localhost.taskmanager.dc4092a96ea4e54ecdbd13b9a5c209b2.Flink Streaming 
> Job.Sink--MTKafkaProducer08.0.KafkaProducer.record-queue-time-avg
> {code}
> The metric name in the "KafkaProducer" group does not have a kafka-topic name 
> part, so if the job writes data to two different Kafka sinks, these metrics 
> cannot be distinguished.
> I propose to modify the above metric name as follows:
> {code:java}
> localhost.taskmanager.dc4092a96ea4e54ecdbd13b9a5c209b2.Flink Streaming 
> Job.Sink--MTKafkaProducer08.0.KafkaProducer.<kafka 
> topic>.record-queue-time-avg
> {code}
> Best,
> Hai Zhou





[jira] [Updated] (FLINK-6626) Unifying lifecycle management of SubmittedJobGraph- and CompletedCheckpointStore

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6626:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Unifying lifecycle management of SubmittedJobGraph- and 
> CompletedCheckpointStore
> 
>
> Key: FLINK-6626
> URL: https://issues.apache.org/jira/browse/FLINK-6626
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Currently, Flink uses the {{SubmittedJobGraphStore}} to persist {{JobGraphs}} 
> such that they can be recovered in case of failures. The 
> {{SubmittedJobGraphStore}} is managed by the {{JobManager}}. Additionally, 
> Flink has the {{CompletedCheckpointStore}} which stores checkpoints for a 
> given {{ExecutionGraph}}/job. The {{CompletedCheckpointStore}} is managed by 
> the {{CheckpointCoordinator}}.
> The {{SubmittedJobGraphStore}} and the {{CompletedCheckpointStore}} are 
> somewhat related because in the latter we store checkpoints for jobs 
> contained in the former. I think it would be nice wrt lifecycle management to 
> let the {{SubmittedJobGraphStore}} manage the lifecycle of the 
> {{CompletedCheckpointStore}}, because often it does not make much sense to 
> keep only checkpoints without a job or a job without checkpoints. 
> An idea would be when we register a job with the {{SubmittedJobGraphStore}} 
> then it returns a {{CompletedCheckpointStore}}. This store can then be given 
> to the {{CheckpointCoordinator}} to store the checkpoints. When a job enters 
> a terminal state it could be the responsibility of the 
> {{SubmittedJobGraphStore}} to decide what to do with the job data 
> ({{JobGraph}} and {{Checkpoints}}), e.g. keeping it or cleaning it up.
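The proposed coupling can be sketched in plain Java. The names below are illustrative only, not Flink's actual interfaces: registering a job yields the checkpoint store for that job, and the store decides the fate of both when the job reaches a terminal state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: the job-graph store owns the per-job checkpoint
// store, so job data and checkpoints are created and cleaned up together.
public class StoreLifecycleSketch {
    static class CheckpointStore {
        final List<String> checkpoints = new ArrayList<>();
    }

    private final Map<String, CheckpointStore> perJob = new HashMap<>();

    // registering a job hands back its checkpoint store
    CheckpointStore registerJob(String jobId) {
        return perJob.computeIfAbsent(jobId, id -> new CheckpointStore());
    }

    // on a terminal state, keep or drop graph and checkpoints as one unit
    void jobReachedTerminalState(String jobId, boolean keepData) {
        if (!keepData) {
            perJob.remove(jobId);
        }
    }

    boolean hasJob(String jobId) {
        return perJob.containsKey(jobId);
    }

    public static void main(String[] args) {
        StoreLifecycleSketch store = new StoreLifecycleSketch();
        StoreLifecycleSketch.CheckpointStore cs = store.registerJob("job-1");
        cs.checkpoints.add("chk-1");
        store.jobReachedTerminalState("job-1", false);
        assert !store.hasJob("job-1");  // checkpoints cleaned up with the job
    }
}
```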





[jira] [Updated] (FLINK-2899) The groupReduceOn* methods which take types as a parameter fail with TypeErasure

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-2899:
--
Labels: auto-deprioritized-major  (was: stale-major)

> The groupReduceOn* methods which take types as a parameter fail with 
> TypeErasure
> 
>
> Key: FLINK-2899
> URL: https://issues.apache.org/jira/browse/FLINK-2899
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Graph Processing (Gelly)
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Priority: Major
>  Labels: auto-deprioritized-major
>
> I tried calling groupReduceOnEdges(EdgesFunctionWithVertexValue<K, VV, EV, T> 
> edgesFunction, EdgeDirection direction, TypeInformation<T> typeInfo) in 
> order to make the vertex-centric version of the Triangle Count library method 
> applicable to any kind of key, and I got a type erasure exception. 
> After doing a bit of debugging (see the hack in 
> https://github.com/andralungu/flink/tree/trianglecount-vertexcentric), I saw 
> that actually the call to 
> TypeExtractor.createTypeInfo(NeighborsFunctionWithVertexValue.class,  in 
> ApplyNeighborCoGroupFunction does not work properly, i.e. it returns null. 
> From what I see, the coGroup in groupReduceOnNeighbors tries to infer a type 
> before "returns" is called. 
> I may be missing something, but that particular feature (groupReduceOn with 
> types) is not documented or tested so we would also need some tests for that. 





[jira] [Updated] (FLINK-12304) AvroInputFormat should support schema evolution

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-12304:
---
Labels: auto-deprioritized-major  (was: stale-major)

> AvroInputFormat should support schema evolution
> ---
>
> Key: FLINK-12304
> URL: https://issues.apache.org/jira/browse/FLINK-12304
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.8.0
>Reporter: John
>Priority: Major
>  Labels: auto-deprioritized-major
>
> From the avro spec:
> _A reader of Avro data, whether from an RPC or a file, can always parse that 
> data because its schema is provided. But that schema may not be exactly the 
> schema that was expected. For example, if the data was written with a 
> different version of the software than it is read, then records may have had 
> fields added or removed._
> The AvroInputFormat should allow the application to supply a reader's schema 
> to support cases where data was written with an old version of a schema and 
> needs to be read with a newer version. The reader's schema can have additional 
> fields with defaults so that the old schema can be adapted to the new. The 
> underlying Avro Java library supports schema resolution, so adding support in 
> AvroInputFormat should be straightforward.
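A minimal sketch of the schema-resolution idea, using plain maps in place of Avro records (the real resolution is done by the Avro library itself): fields missing from the written data are filled in from the reader schema's defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of reader-schema resolution: a record written with
// an old schema is adapted to a newer reader schema, with new fields
// taking the reader's defaults. Maps stand in for Avro records.
public class SchemaResolutionSketch {
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        // start from the reader's defaults, then let the writer's values win
        Map<String, Object> record = new HashMap<>(readerDefaults);
        record.putAll(written);
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> old = new HashMap<>();
        old.put("id", 1L);                      // only field in the old schema
        Map<String, Object> defaults = new HashMap<>();
        defaults.put("id", 0L);
        defaults.put("region", "unknown");      // field added in the new schema
        Map<String, Object> resolved = resolve(old, defaults);
        assert resolved.get("id").equals(1L);            // writer's value kept
        assert resolved.get("region").equals("unknown"); // default filled in
    }
}
```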





[jira] [Updated] (FLINK-5260) Add docs about how to react to cancellation

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5260:
--
Priority: Minor  (was: Major)

> Add docs about how to react to cancellation
> ---
>
> Key: FLINK-5260
> URL: https://issues.apache.org/jira/browse/FLINK-5260
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Ufuk Celebi
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Task cancellation with operators that work against external services can be a 
> source of confusion for users. Since users need to cooperate for this, I 
> would like to see some docs added about how the user can do this (which 
> callbacks are available, how to react to interrupts, etc.)





[jira] [Updated] (FLINK-8297) RocksDBListState stores whole list in single byte[]

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8297:
--
Priority: Minor  (was: Major)

> RocksDBListState stores whole list in single byte[]
> ---
>
> Key: FLINK-8297
> URL: https://issues.apache.org/jira/browse/FLINK-8297
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Affects Versions: 1.3.2, 1.4.0
>Reporter: Jan Lukavský
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> RocksDBListState currently keeps whole list of data in single RocksDB 
> key-value pair, which implies that the list actually must fit into memory. 
> Larger lists are not supported and end up with OOME or other error. The 
> RocksDBListState could be modified so that individual items in list are 
> stored in separate keys in RocksDB and can then be iterated over. A simple 
> implementation could reuse existing RocksDBMapState, with key as index to the 
> list and a single RocksDBValueState keeping track of how many items have 
> already been added to the list. Because this implementation might be less 
> efficient in some cases, it would be good to make it opt-in via a construct 
> like
> {{new RocksDBStateBackend().enableLargeListsPerKey()}}
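The proposed layout can be sketched with a TreeMap standing in for RocksDB; class and method names here are illustrative only, not Flink's API. Each element lives under its own index key, and a separate counter plays the role of the single RocksDBValueState.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch: one key per list element instead of one blob for
// the whole list, so elements can be iterated without materializing
// everything at once. A TreeMap stands in for the RocksDB column family.
public class IndexedListStateSketch {
    private final TreeMap<Long, byte[]> entries = new TreeMap<>();
    private long nextIndex = 0;  // the "RocksDBValueState" of the sketch

    void add(byte[] element) {
        entries.put(nextIndex++, element);
    }

    // in RocksDB this would be a prefix iterator, not a copy
    Iterable<byte[]> get() {
        return new ArrayList<>(entries.values());
    }

    public static void main(String[] args) {
        IndexedListStateSketch state = new IndexedListStateSketch();
        state.add(new byte[] {1});
        state.add(new byte[] {2});
        int n = 0;
        for (byte[] e : state.get()) {
            n++;
        }
        assert n == 2;  // both elements visible, each under its own key
    }
}
```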





[jira] [Updated] (FLINK-8236) Allow to set the parallelism of table queries

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8236:
--
Priority: Minor  (was: Major)

> Allow to set the parallelism of table queries
> -
>
> Key: FLINK-8236
> URL: https://issues.apache.org/jira/browse/FLINK-8236
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.4.0
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Right now the parallelism of a table program is determined by the parallelism 
> of the stream/batch environment. E.g., by default, tumbling window operators 
> use the default parallelism of the environment. Simple project and select 
> operations have the same parallelism as the inputs they are applied on.
> While we cannot change forwarding operations because this would change the 
> results when using retractions, it should be possible to change the 
> parallelism for operators after shuffling operations.
> It should be possible to specify the default parallelism of a table program 
> in the {{TableConfig}} and/or {{QueryConfig}}. The configuration per query 
> has higher precedence than the configuration per table environment.





[jira] [Updated] (FLINK-2220) Join on Pojo without hashCode() silently fails

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-2220:
--
Priority: Minor  (was: Major)

> Join on Pojo without hashCode() silently fails
> --
>
> Key: FLINK-2220
> URL: https://issues.apache.org/jira/browse/FLINK-2220
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet
>Affects Versions: 0.9, 0.8.1
>Reporter: Marcus Leich
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> I need to perform a join using a complete Pojo as join key.
> With DOP > 1 this only works if the Pojo comes with a meaningful hashCode() 
> implementation, as otherwise equal objects will get hashed to different 
> partitions based on their memory address and not on the content.
> I guess it's fine if users are required to implement hashCode() themselves, 
> but it would be nice if the documentation, or better yet Flink itself, alerted 
> users that this is a requirement, similar to how Comparable is required for 
> keys.
> Use the following code to reproduce the issue:
> {code:java}
> public class Pojo implements Comparable<Pojo> {
>     public byte[] data;
>
>     public Pojo() {
>     }
>
>     public Pojo(byte[] data) {
>         this.data = data;
>     }
>
>     @Override
>     public int compareTo(Pojo o) {
>         return UnsignedBytes.lexicographicalComparator().compare(data, o.data);
>     }
>
>     // uncomment me for making the join work
>     /* @Override
>     public int hashCode() {
>         return Arrays.hashCode(data);
>     } */
> }
>
> public void testJoin() throws Exception {
>     final ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment();
>     env.setParallelism(4);
>     DataSet<Tuple2<Pojo, String>> left = env.fromElements(
>         new Tuple2<>(new Pojo(new byte[] {0, 24, 23, 1, 3}), "black"),
>         new Tuple2<>(new Pojo(new byte[] {0, 14, 13, 14, 13}), "red"),
>         new Tuple2<>(new Pojo(new byte[] {1}), "Spark"),
>         new Tuple2<>(new Pojo(new byte[] {2}), "good"),
>         new Tuple2<>(new Pojo(new byte[] {5}), "bug"));
>     DataSet<Tuple2<Pojo, String>> right = env.fromElements(
>         new Tuple2<>(new Pojo(new byte[] {0, 24, 23, 1, 3}), "white"),
>         new Tuple2<>(new Pojo(new byte[] {0, 14, 13, 14, 13}), "green"),
>         new Tuple2<>(new Pojo(new byte[] {1}), "Flink"),
>         new Tuple2<>(new Pojo(new byte[] {2}), "evil"),
>         new Tuple2<>(new Pojo(new byte[] {5}), "fix"));
>     // will not print anything unless Pojo has a real hashCode() implementation
>     left.join(right).where(0).equalTo(0).projectFirst(1).projectSecond(1).print();
> }
> {code}
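Why the join silently returns nothing can be seen in a few lines of plain Java: hash partitioning routes each record by its key's hashCode modulo the parallelism, and without a content-based hashCode() equal Pojos fall back to identity hashes and can land on different partitions.

```java
import java.util.Arrays;

// Demonstrates the partitioning failure mode: equal contents hashed by
// content always land on the same partition; identity hashes usually do not.
public class HashPartitioningDemo {
    static int partition(int hash, int parallelism) {
        return Math.floorMod(hash, parallelism);
    }

    public static void main(String[] args) {
        byte[] left = {0, 24, 23, 1, 3};
        byte[] right = {0, 24, 23, 1, 3};
        // content-based hash: equal arrays -> same partition, join can match
        assert partition(Arrays.hashCode(left), 4)
            == partition(Arrays.hashCode(right), 4);
        // identity hash (what Object.hashCode falls back to): typically
        // differs per instance, so equal keys may never meet in the join
        System.out.println(partition(System.identityHashCode(left), 4)
                + " vs " + partition(System.identityHashCode(right), 4));
    }
}
```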





[jira] [Commented] (FLINK-19156) Migration of transactionIdHint in Kafka is never applied

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336951#comment-17336951
 ] 

Flink Jira Bot commented on FLINK-19156:


This issue was labeled "stale-critical" 7 days ago and has not received any updates 
so it is being deprioritized. If this ticket is actually Critical, please raise 
the priority and ask a committer to assign you the issue or revive the public 
discussion.


> Migration of transactionIdHint in Kafka is never applied
> 
>
> Key: FLINK-19156
> URL: https://issues.apache.org/jira/browse/FLINK-19156
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Dawid Wysakowicz
>Priority: Critical
>  Labels: stale-critical
>
> The code that checks if we should migrate the transaction id is as follows:
> {code}
> @Deprecated
> private static final ListStateDescriptor<NextTransactionalIdHint>
>     NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR =
>         new ListStateDescriptor<>("next-transactional-id-hint",
>             TypeInformation.of(NextTransactionalIdHint.class));
>
> if (context.getOperatorStateStore().getRegisteredStateNames()
>         .contains(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR)) {
>     migrateNextTransactionalIdHindState(context);
> }
> {code}
> The condition in the if statement is never met because it checks whether a 
> {{Set<String>}} contains an object of type {{ListStateDescriptor}}.
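The bug reduces to a one-line type mismatch, reproducible with plain collections (`Descriptor` below is a stand-in for the descriptor class): a set of strings can never contain the descriptor object, only its name.

```java
import java.util.HashSet;
import java.util.Set;

// Demonstrates the buggy check: contains(Object) compiles against a
// Set<String> even when passed a non-String, and is simply always false.
public class ContainsBugDemo {
    static class Descriptor {
        final String name;
        Descriptor(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Set<String> registered = new HashSet<>();
        registered.add("next-transactional-id-hint");
        Descriptor d = new Descriptor("next-transactional-id-hint");
        assert !registered.contains(d);      // buggy check: never true
        assert registered.contains(d.name);  // intended check: the state name
    }
}
```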





[jira] [Updated] (FLINK-3627) Task stuck on lock in StreamSource when cancelling

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3627:
--
Priority: Minor  (was: Major)

> Task stuck on lock in StreamSource when cancelling
> --
>
> Key: FLINK-3627
> URL: https://issues.apache.org/jira/browse/FLINK-3627
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Jamie Grier
>Priority: Minor
>  Labels: auto-deprioritized-major, hang
>
> I've seen this occur a couple of times when the # of network buffers is set 
> too low.  The job fails with an appropriate message indicating that the 
> user should increase the # of network buffers.  However, some of the task 
> threads then hang with a stack trace similar to the following.
> 2016-03-16 13:38:54,017 WARN  org.apache.flink.runtime.taskmanager.Task   
>   - Task 'Source: EventGenerator -> (Flat Map, blah -> Filter -> 
> Projection -> Flat Map -> Timestamps/Watermarks -> Map) (46/144)' did not 
> react to cancelling signal, but is stuck in method:
>  
> org.apache.flink.streaming.api.operators.StreamSource$ManualWatermarkContext.collect(StreamSource.java:317)
> flink.benchmark.generator.LoadGeneratorSource.run(LoadGeneratorSource.java:38)
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> java.lang.Thread.run(Thread.java:745)





[jira] [Updated] (FLINK-11783) Deadlock during Join operation

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-11783:
---
Labels: auto-deprioritized-major  (was: stale-major)

> Deadlock during Join operation
> --
>
> Key: FLINK-11783
> URL: https://issues.apache.org/jira/browse/FLINK-11783
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet
>Affects Versions: 1.7.2
>Reporter: Julien Nioche
>Priority: Major
>  Labels: auto-deprioritized-major
> Attachments: flink_is_stuck.png
>
>
> I am running a filtering job on a large dataset with Flink running in 
> distributed mode. Most tasks in the Join operation have completed a while ago 
> and only the tasks from a particular TaskManager are still running. These 
> tasks make progress but extremely slowly.
> When logging onto the machine running this TM I can see that all threads are 
> TIMED_WAITING .
> Could there be a synchronization problem?
> See attachment for a screenshot of the Flink UI and the stack below.
>  
> *{{$ jstack 9183 | grep -A 15 "DataSetFilterJob"}}*
> {{"CHAIN Join (Join at with(JoinOperator.java:543)) -> Map (Map at 
> (DataSetFilterJob.java:67)) (66/150)" #155 prio=5 os_prio=0 
> tid=0x7faa5c01c000 nid=0x248c waiting on condition [0x7fa9d15d5000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x0007bfa89578> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)}}
> {{ at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:98)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:43)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.nextSegment(ChannelReaderInputView.java:228)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.advance(AbstractPagedInputView.java:158)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readByte(AbstractPagedInputView.java:271)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readUnsignedByte(AbstractPagedInputView.java:278)}}
> {{ at org.apache.flink.types.StringValue.readString(StringValue.java:746)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:75)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:80)}}
> {{--}}
> {{"CHAIN Join (Join at with(JoinOperator.java:543)) -> Map (Map at 
> (DataSetFilterJob.java:67)) (65/150)" #154 prio=5 os_prio=0 
> tid=0x7faa5c01b000 nid=0x248b waiting on condition [0x7fa9d14d4000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x0007b8e0eb50> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)}}
> {{ at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:98)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:43)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.nextSegment(ChannelReaderInputView.java:228)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.advance(AbstractPagedInputView.java:158)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readByte(AbstractPagedInputView.java:271)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readUnsignedByte(AbstractPagedInputView.java:278)}}
> {{ at org.apache.flink.types.StringValue.readString(StringValue.java:746)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:75)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:80)}}
> {{--}}
> {{"CHAIN Join (Join at with(JoinOperator.java:543)) -> Map (Map at 
> (DataSetFilterJob.java:67)) (68/150)" #153 prio=5 os_prio=0 
> tid=0x7faa5c019800 nid=0x248a 

[jira] [Commented] (FLINK-14521) CoLocationGroup is not set into JobVertex if the stream node is chained with others

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336395#comment-17336395
 ] 

Flink Jira Bot commented on FLINK-14521:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> CoLocationGroup is not set into JobVertex if the stream node is chained with 
> others
> ---
>
> Key: FLINK-14521
> URL: https://issues.apache.org/jira/browse/FLINK-14521
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.9.0, 1.9.1
>Reporter: Yun Gao
>Priority: Major
>  Labels: pull-request-available, stale-major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> StreamingJobGraphGenerator.isChainable does not consider the coLocationGroup, 
> if A -> B is chained, the coLocationGroup of the corresponding JobVertex will 
> be set with that of the head node, namely A. Therefore, if B has declared 
> coLocationGroup but A does not, then the coLocationGroup of B will be ignored.





[jira] [Updated] (FLINK-8416) Kinesis consumer doc examples should demonstrate preferred default credentials provider

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8416:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Kinesis consumer doc examples should demonstrate preferred default 
> credentials provider
> ---
>
> Key: FLINK-8416
> URL: https://issues.apache.org/jira/browse/FLINK-8416
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis, Documentation
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: auto-deprioritized-major
>
> The Kinesis consumer docs 
> [here](https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/connectors/kinesis.html#kinesis-consumer)
>  demonstrate providing credentials by explicitly supplying the AWS Access ID 
> and Key.
> The preferred approach for AWS, unless running locally, is always to 
> automatically fetch credentials from the AWS environment.
> That is actually the default behaviour of the Kinesis consumer, so the docs 
> should demonstrate that more clearly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-2824) Iteration feedback partitioning does not work as expected

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336905#comment-17336905
 ] 

Flink Jira Bot commented on FLINK-2824:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Iteration feedback partitioning does not work as expected
> -
>
> Key: FLINK-2824
> URL: https://issues.apache.org/jira/browse/FLINK-2824
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 0.10.0
>Reporter: Gyula Fora
>Priority: Major
>  Labels: stale-major
>
> Iteration feedback partitioning is not handled transparently and can cause 
> serious issues if the user does not know the specific implementation details 
> of streaming iterations (which is not a realistic expectation).
> Example:
> IterativeStream it = ... (parallelism 1)
> DataStream mapped = it.map(...) (parallelism 2)
> // this does not work as the feedback has parallelism 2 != 1
> // it.closeWith(mapped.partitionByHash(someField))
> // so we need rebalance the data
> it.closeWith(mapped.map(NoOpMap).setParallelism(1).partitionByHash(someField))
> This program will execute but the feedback will not be partitioned by hash to 
> the mapper instances:
> The partitioning will be set from the noOpMap to the iteration sink which has 
> parallelism different from the mapper (1 vs 2) and then the iteration source 
> forwards the element to the mapper (always to 0).
> So the problem is basically that the iteration source/sink pair gets the 
> parallelism of the input stream (p=1) not the head operator (p = 2) which 
> leads to incorrect partitioning.
> Workaround:
> Set input parallelism to the same as the head operator
> Suggested solution:
> The iteration construction should be reworked to set the parallelism of the 
> source/sink to the parallelism of the head operator (and validate that all 
> heads have the same parallelism)
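The described mismatch can be sketched with back-of-the-envelope arithmetic: the channel is picked against the iteration sink's parallelism (inherited from the input, here 1) rather than the head mapper's (2), so every feedback record collapses onto channel 0.

```java
// Illustrative sketch of the parallelism mismatch: hash partitioning with
// the wrong parallelism degenerates to sending everything to one channel.
public class IterationPartitionSketch {
    static int channel(int keyHash, int parallelism) {
        return Math.floorMod(keyHash, parallelism);
    }

    public static void main(String[] args) {
        int sinkParallelism = 1;  // taken from the input stream
        int headParallelism = 2;  // what the user actually keyed against
        for (int keyHash : new int[] {7, 12, 31}) {
            // with parallelism 1, every key collapses onto channel 0
            assert channel(keyHash, sinkParallelism) == 0;
        }
        // with the head's parallelism, keys would actually spread out
        assert channel(7, headParallelism) != channel(12, headParallelism);
    }
}
```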





[jira] [Updated] (FLINK-6199) Single outstanding Async I/O operation per key

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6199:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Single outstanding Async I/O operation per key
> --
>
> Key: FLINK-6199
> URL: https://issues.apache.org/jira/browse/FLINK-6199
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Jamie Grier
>Priority: Major
>  Labels: auto-deprioritized-major
>
> I would like to propose we extend the Async I/O semantics a bit such that a 
> user can guarantee a single outstanding async request per key.
> This would allow a user to order async requests per key while still achieving 
> the throughput benefits of using Async I/O in the first place.
> This is essential for operations where stream order is important but we still 
> need to use Async operations to interact with an external system in a 
> performant way.
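The proposed semantics can be sketched with CompletableFuture chaining; this is an illustration of the idea, not Flink's AsyncFunction API. Each key's next request is chained onto the previous one's future, so same-key requests stay ordered while different keys can still overlap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: at most one in-flight request per key, enforced by
// keeping the tail of each key's request chain and appending to it.
public class PerKeyAsyncSketch {
    private final Map<String, CompletableFuture<Void>> tails = new HashMap<>();
    final StringBuilder log = new StringBuilder();

    synchronized CompletableFuture<Void> submit(String key, String request) {
        CompletableFuture<Void> tail =
            tails.getOrDefault(key, CompletableFuture.completedFuture(null));
        // the new request starts only after the previous one for this key
        CompletableFuture<Void> next = tail.thenRun(() -> {
            synchronized (log) {
                log.append(request);  // stands in for the async I/O call
            }
        });
        tails.put(key, next);
        return next;
    }

    public static void main(String[] args) {
        PerKeyAsyncSketch op = new PerKeyAsyncSketch();
        op.submit("k", "a");
        op.submit("k", "b");
        op.submit("k", "c").join();
        assert op.log.toString().equals("abc");  // per-key order preserved
    }
}
```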





[jira] [Commented] (FLINK-7640) Dashboard should display information about JobManager cluster in HA mode

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336706#comment-17336706
 ] 

Flink Jira Bot commented on FLINK-7640:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Dashboard should display information about JobManager cluster in HA mode
> 
>
> Key: FLINK-7640
> URL: https://issues.apache.org/jira/browse/FLINK-7640
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.3.2
>Reporter: Elias Levy
>Priority: Major
>  Labels: stale-major
>
> Currently the dashboard provides no information about the status of a cluster 
> of JobManagers configured in high-availability mode.  
> The dashboard should display the status and membership of a JM cluster in the 
> Overview and Job Manager sections.





[jira] [Updated] (FLINK-9052) Flink complains when scala.Option is used inside POJO

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9052:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Flink complains when scala.Option is used inside POJO
> -
>
> Key: FLINK-9052
> URL: https://issues.apache.org/jira/browse/FLINK-9052
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Affects Versions: 1.4.2
>Reporter: Truong Duc Kien
>Priority: Major
>  Labels: auto-deprioritized-major
> Attachments: TypeInfomationTest.scala
>
>
> According to the documentation, Flink has a specialized serializer for Option 
> type. However, when an Option field is used inside a POJO, the following 
> WARNING appears in TaskManagers' log.
>  
> {{No fields detected for class scala.Option. Cannot be used as a PojoType. 
> Will be handled as GenericType}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-7129) Support dynamically changing CEP patterns

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336730#comment-17336730
 ] 

Flink Jira Bot commented on FLINK-7129:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: stale-major
>
> An umbrella task for introducing a mechanism for injecting patterns through a 
> coStream



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8844) Export job jar file name or job version property via REST API

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8844:
--
Priority: Minor  (was: Major)

> Export job jar file name or job version property via REST API
> -
>
> Key: FLINK-8844
> URL: https://issues.apache.org/jira/browse/FLINK-8844
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST
>Affects Versions: 1.4.3
>Reporter: Elias Levy
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> To aid automated deployment of jobs, it would be useful if the REST API 
> exposed either a running job's jar filename or a version property the job 
> could set, similar to how it sets the job name.
> As it is now there is no standard mechanism to determine what version of a 
> job is running in a cluster.





[jira] [Commented] (FLINK-11030) Cannot use Avro logical types with ConfluentRegistryAvroDeserializationSchema

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336546#comment-17336546
 ] 

Flink Jira Bot commented on FLINK-11030:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Cannot use Avro logical types with ConfluentRegistryAvroDeserializationSchema
> -
>
> Key: FLINK-11030
> URL: https://issues.apache.org/jira/browse/FLINK-11030
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.6.2
>Reporter: Maciej Bryński
>Priority: Major
>  Labels: pull-request-available, stale-major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I created Specific class for Kafka topic. 
>  Avro schema includes logicalTypes.
>  Then I want to read data using following code:
> {code:scala}
> val deserializationSchema = 
> ConfluentRegistryAvroDeserializationSchema.forSpecific(classOf[mySpecificClass],
>  schemaRegistryUrl)
> val kafkaStream = env.addSource(
>   new FlinkKafkaConsumer011(topic, deserializationSchema, kafkaProperties)
> )
> kafkaStream.print()
> {code}
>  Result:
>  {code}
>  Exception in thread "main" 
> org.apache.flink.runtime.client.JobExecutionException: 
> java.lang.ClassCastException: java.lang.Long cannot be cast to 
> org.joda.time.DateTime
>  at 
> org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:623)
>  at 
> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
>  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1511)
>  at 
> org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:645)
>  at TransactionEnrichment$.main(TransactionEnrichment.scala:50)
>  at TransactionEnrichment.main(TransactionEnrichment.scala)
>  Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> org.joda.time.DateTime
>  at platform_tbl_game_transactions_v1.Value.put(Value.java:222)
>  at org.apache.avro.generic.GenericData.setField(GenericData.java:690)
>  at 
> org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:119)
>  at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>  at 
> org.apache.flink.formats.avro.RegistryAvroDeserializationSchema.deserialize(RegistryAvroDeserializationSchema.java:74)
>  at 
> org.apache.flink.streaming.util.serialization.KeyedDeserializationSchemaWrapper.deserialize(KeyedDeserializationSchemaWrapper.java:44)
>  at 
> org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:142)
>  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:738)
>  at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>  at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
>  at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>  at java.lang.Thread.run(Thread.java:748)
>  {code}
>  When using the Kafka Consumer there was a hack for this: using LogicalConverters.
>  Unfortunately it's not working in Flink.
>  {code}
>  SpecificData.get.addLogicalTypeConversion(new 
> TimeConversions.TimestampConversion)
>  {code}
> The problem is probably caused by the fact that we create our own instance of 
> SpecificData:
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L145
> No logical conversions are added there.
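To make the role of a logical-type conversion concrete, here is a minimal plain-Python sketch (not the Avro library, and not Flink's code): the deserializer yields a raw long for a timestamp-millis field, and a registered conversion turns it into the date-time object the generated specific class expects. The `conversions` map and `apply_logical_type` helper are illustrative names, not real Avro API.

```python
# Hedged sketch: what registering a logical-type conversion accomplishes.
# Without an entry in `conversions`, the raw long would be passed through
# unchanged and later fail the cast to a date-time type.
from datetime import datetime, timezone

conversions = {
    "timestamp-millis": lambda v: datetime.fromtimestamp(v / 1000, tz=timezone.utc),
}

def apply_logical_type(raw_value, logical_type):
    """Convert a raw decoded value if a conversion is registered for its logical type."""
    convert = conversions.get(logical_type)
    return convert(raw_value) if convert else raw_value

ts = apply_logical_type(1543622400000, "timestamp-millis")
print(ts.year)  # → 2018
```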



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-6473) Add OVER window support for batch tables

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6473:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Add OVER window support for batch tables
> 
>
> Key: FLINK-6473
> URL: https://issues.apache.org/jira/browse/FLINK-6473
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Add support for OVER windows for batch tables. 
> Since OVER windows are supported for streaming tables, this issue is not 
> about the API (which is available) but about adding the execution strategies 
> and translation for OVER windows on batch tables.
> The feature could be implemented using the following plans
> *UNBOUNDED OVER*
> {code}
> DataSet[Row] input = ...
> DataSet[Row] result = input
>   .groupBy(partitionKeys)
>   .sortGroup(orderByKeys)
>   .reduceGroup(computeAggregates)
> {code}
> This implementation is quite straightforward because we don't need to retract 
> rows.
> *BOUNDED OVER*
> A bit more challenging are BOUNDED OVER windows, because we need to retract 
> values from aggregates and we don't want to store rows temporarily on the 
> heap.
> {code}
> DataSet[Row] input = ...
> DataSet[Row] sorted = input
>   .partitionByHash(partitionKey)
>   .sortPartition(partitionKeys, orderByKeys)
> DataSet[Row] result = sorted.coGroup(sorted)
>   .where(partitionKey).equalTo(partitionKey)
>   .with(computeAggregates)
> {code}
> With this, the data set should be partitioned and sorted once. The sorted 
> {{DataSet}} would be consumed twice (the optimizer should inject a temp 
> barrier on one of the inputs to avoid a consumption deadlock). The 
> {{CoGroupFunction}} would accumulate new rows into the aggregates from one 
> input and retract them from the other. Since both input streams are properly 
> sorted, this can happen in a zigzag fashion. We need to verify that the 
> generated plan is what we want it to be.
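The accumulate/retract pattern behind a bounded OVER window can be sketched outside Flink in a few lines (a minimal Python sketch under the assumption of a ROWS window over one already-sorted partition, not Flink's implementation):

```python
# Hedged sketch: SUM(x) OVER (ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW).
# Rows of one partition arrive sorted; we accumulate the incoming row and
# retract the row that falls out of the window, so only the window contents
# are buffered rather than the whole partition.
from collections import deque

def bounded_over_sum(rows, preceding):
    """Yield (row, windowed_sum) pairs for each row of a sorted partition."""
    window = deque()
    acc = 0
    for value in rows:
        window.append(value)
        acc += value                      # accumulate the incoming row
        if len(window) > preceding + 1:
            acc -= window.popleft()       # retract the row leaving the window
        yield value, acc

print(list(bounded_over_sum([1, 2, 3, 4, 5], preceding=2)))
# → [(1, 1), (2, 3), (3, 6), (4, 9), (5, 12)]
```

The CoGroup plan in the issue does the same accumulate/retract dance, but reads the "entering" and "leaving" rows from two sorted scans of the same DataSet instead of a heap-resident deque.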



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12304) AvroInputFormat should support schema evolution

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336488#comment-17336488
 ] 

Flink Jira Bot commented on FLINK-12304:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> AvroInputFormat should support schema evolution
> ---
>
> Key: FLINK-12304
> URL: https://issues.apache.org/jira/browse/FLINK-12304
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.8.0
>Reporter: John
>Priority: Major
>  Labels: stale-major
>
> From the avro spec:
> _A reader of Avro data, whether from an RPC or a file, can always parse that 
> data because its schema is provided. But that schema may not be exactly the 
> schema that was expected. For example, if the data was written with a 
> different version of the software than it is read, then records may have had 
> fields added or removed._
> The AvroInputFormat should allow the application to supply a reader's schema 
> to support cases where data was written with an old version of a schema and 
> needs to be read with a newer version.  The reader's schema can have additional 
> fields with defaults so that the old schema can be adapted to the new.  The 
> underlying Avro Java library supports schema resolution, so adding support in 
> AvroInputFormat should be straightforward.
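The schema-resolution behavior requested above reduces to a simple rule: take fields from the written record where present, fall back to the reader schema's defaults otherwise. A minimal plain-Python sketch of that rule (not the Avro library; field representation is simplified to (name, default) pairs, with None standing in for "no default"):

```python
# Hedged sketch of Avro-style schema resolution: an old record is read with
# a newer reader schema, and fields unknown to the writer are filled from
# the reader schema's defaults.
def resolve(record, reader_fields):
    """reader_fields: list of (name, default) pairs; default=None means required."""
    out = {}
    for name, default in reader_fields:
        if name in record:
            out[name] = record[name]          # writer provided the field
        elif default is not None:
            out[name] = default               # filled from the reader schema
        else:
            raise ValueError(f"missing required field {name!r}")
    return out

old_record = {"id": 7, "name": "flink"}
reader = [("id", None), ("name", None), ("version", 1)]  # "version" added later
print(resolve(old_record, reader))
# → {'id': 7, 'name': 'flink', 'version': 1}
```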



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336698#comment-17336698
 ] 

Flink Jira Bot commented on FLINK-7865:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Remove predicate restrictions on TableFunction left outer join
> --
>
> Key: FLINK-7865
> URL: https://issues.apache.org/jira/browse/FLINK-7865
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Xingcan Cui
>Priority: Major
>  Labels: stale-major
>
> To work around the improper translation of lateral table left outer join 
> (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} 
> literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has 
> been fixed in Calcite, we should remove the restrictions. The tasks may 
> include removing Table API/SQL condition check, removing validation tests, 
> enabling integration tests, updating the documents, etc.
> See [this thread on Calcite dev 
> list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E]
>  for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9565) Evaluating scalar UDFs in parallel

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336623#comment-17336623
 ] 

Flink Jira Bot commented on FLINK-9565:
---

This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Evaluating scalar UDFs in parallel
> --
>
> Key: FLINK-9565
> URL: https://issues.apache.org/jira/browse/FLINK-9565
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.4.2
>Reporter: yinhua.dai
>Priority: Major
>  Labels: stale-major
>
> As per 
> [https://stackoverflow.com/questions/50790023/does-flink-sql-support-to-run-projections-in-parallel,]
>  scalar UDFs in the same SQL statement are always evaluated sequentially, even 
> when the UDFs are independent of each other, which may increase latency when a 
> UDF is a time-consuming function.
> It would be great if Flink SQL can support to run those UDF in parallel to 
> reduce calculation latency.
>  
> cc [~fhueske]
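What "running independent scalar UDFs in parallel" could look like can be sketched with a thread pool (a minimal Python sketch; `slow_udf_a`/`slow_udf_b` are hypothetical stand-ins for time-consuming, mutually independent UDFs, not anything from Flink):

```python
# Hedged sketch: instead of evaluating a projection's UDFs one after another,
# submit the independent ones concurrently and collect both results, so the
# per-row latency approaches max(cost_a, cost_b) rather than their sum.
from concurrent.futures import ThreadPoolExecutor
import time

def slow_udf_a(x):
    time.sleep(0.05)   # stand-in for expensive work
    return x + 1

def slow_udf_b(x):
    time.sleep(0.05)
    return x * 2

def project_row_parallel(x):
    with ThreadPoolExecutor(max_workers=2) as pool:
        fa = pool.submit(slow_udf_a, x)
        fb = pool.submit(slow_udf_b, x)
        return fa.result(), fb.result()

print(project_row_parallel(10))  # → (11, 20)
```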



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12298) Make column functions accept custom Range class rather than Expression

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336491#comment-17336491
 ] 

Flink Jira Bot commented on FLINK-12298:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Make column functions accept custom Range class rather than Expression
> --
>
> Key: FLINK-12298
> URL: https://issues.apache.org/jira/browse/FLINK-12298
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: stale-major
>
> I would suggest reworking the column functions into a more typesafe approach 
> with a custom {{Range}} class. Right now {{withColumns}} accepts an array of 
> Expressions. We have the 
> {{org.apache.flink.table.api.scala.ImplicitExpressionOperations#to}} method, 
> but we also have an implicit conversion from {{scala.Range}} to an Expression 
> range. This already introduces ambiguity, as it is unclear what the end 
> product of an expression like {{1 to 9}} will be. This approach defers the 
> checking of expression types to the expression resolution phase.
> I would suggest making 
> {{org.apache.flink.table.api.scala.withColumns#apply}} accept e.g. a 
> {{ColumnRange}} that accepts only {{Integer}} or {{String}} 
> instead of Expressions. Such a class could look like this (just a very rough 
> outline):
> {code}
> class ColumnRange {
>  IndexRange idx(Integer idx);
>  ReferenceRange ref(String reference);
> }
> class IndexRange {
>   IndexRange to(Integer idx);
> }
> class ReferenceRange {
>   ReferenceRange to(String ref);
> }
> {code}
> We could also have implicit conversion from {{scala.Range}} to {{IndexRange}}.
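The typing guarantee the outline above aims for can be illustrated in a small Python sketch (illustrative only, mirroring the rough Java outline, not an actual Flink API): index ranges and name ranges are distinct types, so an ill-typed range fails immediately instead of surfacing later during expression resolution.

```python
# Hedged sketch of the ColumnRange idea: separate builder types for index
# ranges and reference (name) ranges, each rejecting the wrong end type.
class IndexRange:
    def __init__(self, start):
        self.start, self.end = start, start
    def to(self, end):
        if not isinstance(end, int):
            raise TypeError("index range must end with an int")
        self.end = end
        return self

class ReferenceRange:
    def __init__(self, start):
        self.start, self.end = start, start
    def to(self, end):
        if not isinstance(end, str):
            raise TypeError("reference range must end with a str")
        self.end = end
        return self

def idx(i): return IndexRange(i)
def ref(name): return ReferenceRange(name)

r = idx(1).to(9)
print(r.start, r.end)  # → 1 9
```

The point of the design is that `idx(1).to("f")` is rejected at construction time, whereas an Expression-based API would only detect the mismatch during resolution.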



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8663) Execution of DataStreams result in non functionality of size of Window for countWindow

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8663:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Execution of DataStreams result in non functionality of size of Window for 
> countWindow
> --
>
> Key: FLINK-8663
> URL: https://issues.apache.org/jira/browse/FLINK-8663
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.4.0
> Environment: package com.vnl.stocks;
> import java.util.concurrent.TimeUnit;
> import org.apache.flink.api.common.functions.MapFunction;
> import org.apache.flink.streaming.api.datastream.AllWindowedStream;
> import org.apache.flink.streaming.api.datastream.DataStream;
> import org.apache.flink.streaming.api.datastream.WindowedStream;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> import 
> org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
> import org.apache.flink.streaming.api.windowing.time.Time;
> import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
> import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
> public class StocksProcessing {
>     
>     public static void main(String[] args) throws Exception {
>             final StreamExecutionEnvironment env =
>             StreamExecutionEnvironment.getExecutionEnvironment();
>         
>             //Read from a socket stream at map it to StockPrice objects
>             DataStream<StockPrice> socketStockStream = env
>             .socketTextStream("localhost", )
>             .map(new MapFunction<String, StockPrice>() {
>             private String[] tokens;
>         
>             @Override
>             public StockPrice map(String value) throws 
> Exception {
>             tokens = value.split(",");
>             return new StockPrice(tokens[0],
>             Double.parseDouble(tokens[1]));
>             }
>             });
>                 
>             socketStockStream.print();
>             //Generate other stock streams
>             DataStream SPX_stream = env.addSource(new 
> StockSource("SPX", 10));
>               //  DataStream FTSE_stream = env.addSource(new 
> StockSource("FTSE", 20));
>               //  DataStream DJI_stream = env.addSource(new 
> StockSource("DJI", 30));
>               //  DataStream BUX_stream = env.addSource(new 
> StockSource("BUX", 40));
>         
>             //Merge all stock streams together
>                 
>             DataStream stockStream = 
> socketStockStream.union(SPX_stream/*, FTSE_stream, DJI_stream, BUX_stream*/);
>                 
>                 
>                // stockStream.print();
>             Thread.sleep(100);
>                                                 
>             AllWindowedStream<StockPrice, GlobalWindow> windowedStream = 
> stockStream
>                         .countWindowAll(10, 5);
>                         
>                         //.keyBy("symbol")
>                         //.timeWindowAll(Time.of(10, TimeUnit.SECONDS), 
> Time.of(1, TimeUnit.SECONDS));
>                 
>                     //stockStream.keyBy("symbol");
>                     //Compute some simple statistics on a rolling window
>                     DataStream<StockPrice> lowest = 
> windowedStream.maxBy("price");
>                     //DataStream highest = windowedStream.;
>                     /*DataStream maxByStock = 
> windowedStream.groupBy("symbol")
>                     .maxBy("price").flatten();
>                     DataStream rollingMean = 
> windowedStream.groupBy("symbol")
>                     .mapWindow(new WindowMean()).flatten();*/
>                     lowest.print();
>                     
>                       Thread.sleep(100);
>             /*    
>                     AllWindowedStream 
> windowedStream1 = lowest
>                             .countWindowAll(5,2);
>                     //windowedStream1.print();
>                     DataStream highest = 
> windowedStream1.minBy("price");*/
>                     //highest.print();
>                     
>                     env.execute("Stock stream");
>         }
> }
>Reporter: Subham
>Priority: Major
>  Labels: auto-deprioritized-major
>
> I used an AllWindowedStream to process a stream and compute the 
> maximum of each window using countWindowAll. In the output, the size 
> and slide of the window behave incorrectly.
> Refer below example for the bug
> Initial stream : 1,2,3,4,5,6.
> Output 1: (Find min for window 10,5) : 1,6,11.(This is correct)
> 

[jira] [Updated] (FLINK-3983) Allow users to set any (relevant) configuration parameter of the KinesisProducerConfiguration

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3983:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Allow users to set any (relevant) configuration parameter of the 
> KinesisProducerConfiguration
> -
>
> Key: FLINK-3983
> URL: https://issues.apache.org/jira/browse/FLINK-3983
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Currently, users can only set some of the configuration parameters in the 
> {{KinesisProducerConfiguration}} through Properties.
> It would be good to introduce configuration keys for these parameters so that users 
> can change the producer configuration.
> I think these and most of the other variables in the 
> KinesisProducerConfiguration should be exposed via properties:
> - aggregationEnabled
> - collectionMaxCount
> - collectionMaxSize
> - connectTimeout
> - credentialsRefreshDelay
> - failIfThrottled
> - logLevel
> - metricsGranularity
> - metricsLevel
> - metricsNamespace
> - metricsUploadDelay



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10276) Job Manager and Task Manager Metrics Reporter Ports Configuration

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336585#comment-17336585
 ] 

Flink Jira Bot commented on FLINK-10276:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Job Manager and Task Manager Metrics Reporter Ports Configuration
> -
>
> Key: FLINK-10276
> URL: https://issues.apache.org/jira/browse/FLINK-10276
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Configuration, Runtime / Metrics
>Reporter: Deirdre Kong
>Priority: Major
>  Labels: stale-major, starter
>
> *Problem Statement:*
> When deploying Flink using YARN, the job manager and task manager can be on 
> the same node or different nodes.  Say I specify the port range to be 
> 9249-9250, if JM and TM are deployed on the same node, the port for JM will 
> be 9249 and the port for TM will be 9250.  If JM and TM are deployed on 
> different nodes, then the ports for JM and TM will be 9249.
> I can only configure Prometheus once for the ports to scrape JM and TMs 
> metrics.  In this case, I won't know whether port 9249 is for the JM or a TM.  It 
> would be great if we could specify in flink-conf.yaml the ports we want for the 
> JM reporter and the TM reporters.
> *Comment from Till:*
> I think we could extend Vino's proposal for Yarn as well: Maybe it makes 
> sense to allow overriding certain configuration settings for the 
> TaskManagers when deploying on Yarn. That way one could define a fixed port 
> for the JM and a port range for the TMs. Having such a distinction you can 
> configure your Prometheus to scrape for the single JM and the TMs 
> individually. However, Flink does not yet support such a feature. You can 
> open a JIRA issue to track the problem.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9907) add CRC32 checksum in table Api and sql

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9907:
--
Priority: Minor  (was: Major)

> add CRC32 checksum in table Api and sql
> ---
>
> Key: FLINK-9907
> URL: https://issues.apache.org/jira/browse/FLINK-9907
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: xueyu
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>
> CRC32 returns the cyclic redundancy check value of a given string as a 32-bit 
> unsigned value, or null if the string is null. In MySQL and Hive the syntax looks like
> select CRC32('test') = 3632233996
> Though it is not a hash function, the pattern and behavior look 
> similar to hash functions (md5, sha1, sha2, etc.), so I put the codegen in 
> HashCalcCallGen.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12303) Scala 2.12 lambdas does not work in event classes inside streams.

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336489#comment-17336489
 ] 

Flink Jira Bot commented on FLINK-12303:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Scala 2.12 lambdas does not work in event classes inside streams.
> -
>
> Key: FLINK-12303
> URL: https://issues.apache.org/jira/browse/FLINK-12303
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, API / Scala
>Affects Versions: 1.7.2
> Environment: Scala 2.11/2.12, Oracle Java 1.8.0_172
>Reporter: Matěj Novotný
>Priority: Major
>  Labels: stale-major
>
> When you use lambdas inside event classes used in streams, it works in 
> Scala 2.11. It stopped working in Scala 2.12. It compiles but does not 
> process any data and does not throw any exception. I would expect it 
> either not to compile when an event class contains an unsupported field, 
> or at least to throw some exception.
>  
> For more detail check my demonstration repo, please: 
> [https://github.com/matej-novotny/flink-lambda-bug]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8179) Convert CharSequence to String when registering / converting a Table

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8179:
--
Priority: Minor  (was: Major)

> Convert CharSequence to String when registering / converting a Table
> 
>
> Key: FLINK-8179
> URL: https://issues.apache.org/jira/browse/FLINK-8179
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Avro objects store text values as `java.lang.CharSequence`. Right now, these 
> fields are treated as `ANY` type by Calcite (`GenericType` by Flink). So, 
> importing a `DataStream` or `DataSet` of Avro objects results in Table where 
> the text values cannot be used as String fields.
> We should convert `CharSequence` fields to `String` when importing / 
> converting to a Table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-11196) Extend S3 EntropyInjector to use key replacement (instead of key removal) when creating checkpoint metadata files

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336537#comment-17336537
 ] 

Flink Jira Bot commented on FLINK-11196:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Extend S3 EntropyInjector to use key replacement (instead of key removal) 
> when creating checkpoint metadata files
> -
>
> Key: FLINK-11196
> URL: https://issues.apache.org/jira/browse/FLINK-11196
> Project: Flink
>  Issue Type: Improvement
>  Components: FileSystems
>Affects Versions: 1.7.0
>Reporter: Mark Cho
>Priority: Major
>  Labels: pull-request-available, stale-major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We currently use S3 entropy injection when writing out checkpoint data.
> We also use external checkpoints so that we can resume from a checkpoint 
> metadata file later.
> The current implementation of S3 entropy injector makes it difficult to 
> locate the checkpoint metadata files since in the newer versions of Flink, 
> `state.checkpoints.dir` configuration controls where the metadata and state 
> files are written, instead of having two separate paths (one for metadata, 
> one for state files).
> With entropy injection, we replace the entropy marker in the path specified 
> by `state.checkpoints.dir` with entropy (for state files) or we strip out the 
> marker (for metadata files).
>  
> We need to extend the entropy injection so that we can replace the entropy 
> marker with a predictable path (instead of removing it), allowing a 
> prefix query for just the metadata files.
> By not using the entropy key replacement (defaults to empty string), you get 
> the same behavior as it is today (entropy marker removed).
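The proposed marker handling can be sketched in a few lines (a hedged Python sketch under stated assumptions: the marker string, the `inject` helper, and the default replacement are all illustrative, not Flink's actual entropy-injector API):

```python
# Hedged sketch: replace the entropy marker with random entropy for state
# files, but with a fixed, predictable segment for metadata files so that
# all metadata files can be listed under one prefix.
import random
import string

ENTROPY_MARKER = "_entropy_"

def inject(path, for_metadata, replacement="metadata", entropy_len=4):
    if ENTROPY_MARKER not in path:
        return path
    if for_metadata:
        # predictable segment -> a prefix query finds every metadata file
        return path.replace(ENTROPY_MARKER, replacement)
    entropy = "".join(random.choices(string.ascii_lowercase, k=entropy_len))
    return path.replace(ENTROPY_MARKER, entropy)

print(inject("s3://bucket/_entropy_/ckpt-17/_metadata", for_metadata=True))
# → s3://bucket/metadata/ckpt-17/_metadata
```

With an empty `replacement` string this degenerates to today's behavior, where the marker is simply stripped for metadata files.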



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-11783) Deadlock during Join operation

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336519#comment-17336519
 ] 

Flink Jira Bot commented on FLINK-11783:


This issue was labeled "stale-major" 7 days ago and has not received any updates so 
it is being deprioritized. If this ticket is actually Major, please raise the 
priority and ask a committer to assign you the issue or revive the public 
discussion.


> Deadlock during Join operation
> --
>
> Key: FLINK-11783
> URL: https://issues.apache.org/jira/browse/FLINK-11783
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet
>Affects Versions: 1.7.2
>Reporter: Julien Nioche
>Priority: Major
>  Labels: stale-major
> Attachments: flink_is_stuck.png
>
>
> I am running a filtering job on a large dataset with Flink running in 
> distributed mode. Most tasks in the Join operation have completed a while ago 
> and only the tasks from a particular TaskManager are still running. These 
> tasks make progress but extremely slowly.
> When logging onto the machine running this TM I can see that all threads are 
> TIMED_WAITING .
> Could there be a synchronization problem?
> See attachment for a screenshot of the Flink UI and the stack below.
>  
> *{{$ jstack 9183 | grep -A 15 "DataSetFilterJob"}}*
> {{"CHAIN Join (Join at with(JoinOperator.java:543)) -> Map (Map at 
> (DataSetFilterJob.java:67)) (66/150)" #155 prio=5 os_prio=0 
> tid=0x7faa5c01c000 nid=0x248c waiting on condition [0x7fa9d15d5000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x0007bfa89578> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)}}
> {{ at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:98)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:43)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.nextSegment(ChannelReaderInputView.java:228)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.advance(AbstractPagedInputView.java:158)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readByte(AbstractPagedInputView.java:271)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readUnsignedByte(AbstractPagedInputView.java:278)}}
> {{ at org.apache.flink.types.StringValue.readString(StringValue.java:746)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:75)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:80)}}
> {{--}}
> {{"CHAIN Join (Join at with(JoinOperator.java:543)) -> Map (Map at 
> (DataSetFilterJob.java:67)) (65/150)" #154 prio=5 os_prio=0 
> tid=0x7faa5c01b000 nid=0x248b waiting on condition [0x7fa9d14d4000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x0007b8e0eb50> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)}}
> {{ at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:98)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousBlockReader.getNextReturnedBlock(AsynchronousBlockReader.java:43)}}
> {{ at 
> org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.nextSegment(ChannelReaderInputView.java:228)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.advance(AbstractPagedInputView.java:158)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readByte(AbstractPagedInputView.java:271)}}
> {{ at 
> org.apache.flink.runtime.memory.AbstractPagedInputView.readUnsignedByte(AbstractPagedInputView.java:278)}}
> {{ at org.apache.flink.types.StringValue.readString(StringValue.java:746)}}
> {{ at 
> org.apache.flink.api.common.typeutils.base.StringSerializer.deserialize(StringSerializer.java:75)}}
> {{ at 
> 

[jira] [Commented] (FLINK-17912) KafkaShuffleITCase.testAssignedToPartitionEventTime: "Watermark should always increase"

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336963#comment-17336963
 ] 

Flink Jira Bot commented on FLINK-17912:


This issue was labeled "stale-critical" 7 days ago and has not received any updates 
so it is being deprioritized. If this ticket is actually Critical, please raise 
the priority and ask a committer to assign you the issue or revive the public 
discussion.


> KafkaShuffleITCase.testAssignedToPartitionEventTime: "Watermark should always 
> increase"
> ---
>
> Key: FLINK-17912
> URL: https://issues.apache.org/jira/browse/FLINK-17912
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: stale-critical, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2062=logs=1fc6e7bf-633c-5081-c32a-9dea24b05730=0d9ad4c1-5629-5ffc-10dc-113ca91e23c5
> {code}
> 2020-05-22T21:16:24.7188044Z 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2020-05-22T21:16:24.7188796Z  at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
> 2020-05-22T21:16:24.7189596Z  at 
> org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:677)
> 2020-05-22T21:16:24.7190352Z  at 
> org.apache.flink.streaming.util.TestStreamEnvironment.execute(TestStreamEnvironment.java:81)
> 2020-05-22T21:16:24.7191261Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1673)
> 2020-05-22T21:16:24.7191824Z  at 
> org.apache.flink.test.util.TestUtils.tryExecute(TestUtils.java:35)
> 2020-05-22T21:16:24.7192325Z  at 
> org.apache.flink.streaming.connectors.kafka.shuffle.KafkaShuffleITCase.testAssignedToPartition(KafkaShuffleITCase.java:296)
> 2020-05-22T21:16:24.7192962Z  at 
> org.apache.flink.streaming.connectors.kafka.shuffle.KafkaShuffleITCase.testAssignedToPartitionEventTime(KafkaShuffleITCase.java:126)
> 2020-05-22T21:16:24.7193436Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-05-22T21:16:24.7193999Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-05-22T21:16:24.7194720Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-05-22T21:16:24.7195226Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-05-22T21:16:24.7195864Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-05-22T21:16:24.7196574Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-05-22T21:16:24.7197511Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-05-22T21:16:24.7198020Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-05-22T21:16:24.7198494Z  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2020-05-22T21:16:24.7199128Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-05-22T21:16:24.7199689Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-05-22T21:16:24.7200308Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-05-22T21:16:24.7200645Z  at java.lang.Thread.run(Thread.java:748)
> 2020-05-22T21:16:24.7201029Z Caused by: 
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> NoRestartBackoffTimeStrategy
> 2020-05-22T21:16:24.7201643Z  at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
> 2020-05-22T21:16:24.7202275Z  at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
> 2020-05-22T21:16:24.7202863Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
> 2020-05-22T21:16:24.7203525Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:185)
> 2020-05-22T21:16:24.7204072Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:179)
> 2020-05-22T21:16:24.7204618Z  at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:503)
> 2020-05-22T21:16:24.7205255Z  at 
> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:386)
> 2020-05-22T21:16:24.7205716Z  at 
> 

[jira] [Updated] (FLINK-9268) RocksDB errors from WindowOperator

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9268:
--
Labels: auto-deprioritized-major  (was: stale-major)

> RocksDB errors from WindowOperator
> ---------------------------------
>
> Key: FLINK-9268
> URL: https://issues.apache.org/jira/browse/FLINK-9268
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Runtime / State Backends
>Affects Versions: 1.4.2
>Reporter: Narayanan Arunachalam
>Priority: Major
>  Labels: auto-deprioritized-major
>
> The job has no sinks, one Kafka source, does session-based windowing, and 
> uses processing time. The job fails with the error given below after running 
> for a few hours. The only way to recover from this error is to cancel the job 
> and start a new one.
> Using the S3 backend for externalized checkpoints.
> A representative job DAG:
> val streams = sEnv
>  .addSource(makeKafkaSource(config))
>  .map(makeEvent)
>  .keyBy(_.get(EVENT_GROUP_ID))
>  .window(ProcessingTimeSessionWindows.withGap(Time.seconds(60)))
>  .trigger(PurgingTrigger.of(ProcessingTimeTrigger.create()))
>  .apply(makeEventsList)
>  .addSink(makeNoOpSink)
> A representative config:
> state.backend=rocksDB
> checkpoint.enabled=true
> external.checkpoint.enabled=true
> checkpoint.mode=AT_LEAST_ONCE
> checkpoint.interval=90
> checkpoint.timeout=30
> Error:
> TimerException\{java.lang.NegativeArraySizeException}
>  at 
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:252)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NegativeArraySizeException
>  at org.rocksdb.RocksDB.get(Native Method)
>  at org.rocksdb.RocksDB.get(RocksDB.java:810)
>  at 
> org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:86)
>  at 
> org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:49)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime(WindowOperator.java:496)
>  at 
> org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime(HeapInternalTimerService.java:255)
>  at 
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:249)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-6354) Add documentation for migration away from Checkpointed interface

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336765#comment-17336765
 ] 

Flink Jira Bot commented on FLINK-6354:
---

This issue was labeled "stale-major" 7 days ago and has not received any 
updates, so it is being deprioritized. If this ticket is actually Major, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> Add documentation for migration away from Checkpointed interface
> 
>
> Key: FLINK-6354
> URL: https://issues.apache.org/jira/browse/FLINK-6354
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.2.0, 1.2.1, 1.3.0
>Reporter: Aljoscha Krettek
>Priority: Major
>  Labels: stale-major
>
> This should follow the procedure outlined in FLINK-6353.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12525) Impose invariant StreamExecutionEnvironment.setBufferTimeout > 0

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-12525:
---
Priority: Minor  (was: Major)

> Impose invariant StreamExecutionEnvironment.setBufferTimeout > 0
> 
>
> Key: FLINK-12525
> URL: https://issues.apache.org/jira/browse/FLINK-12525
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Network
>Reporter: Robert Stoll
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> The documentation for the [DataStream 
> API|https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/docs/dev/datastream_api.md#controlling-latency]
>  states:
> {quote}buffer timeout of 0 should be avoided, because it can cause severe 
> performance degradation.
> {quote}
> I don't know whether the documentation is inaccurate and there are valid 
> cases where a timeout of 0 makes sense. But if not, then the invariant should 
> not be 
> {code}
> if (timeoutMillis < -1) {
>   throw new IllegalArgumentException("Timeout of buffer must be non-negative 
> or -1");
> }
> {code}
> but {{timeoutMillis < 0}} (a second invariant could also be added).
> IMO it is bad practice to state this only in the documentation. The API should 
> guide the user in this case (in that sense, a second invariant enforcing the 
> quote above would make more sense).
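A minimal sketch of a tightened precondition in the spirit of the issue title (reject 0 as well as any negative value other than the -1 sentinel). The class and method names are illustrative, not the actual Flink source or the reporter's exact proposal:

```java
public class BufferTimeoutCheck {
    // -1 is the documented sentinel meaning "flush only when the buffer is full".
    static final long FLUSH_WHEN_BUFFER_FULL = -1L;

    // Hypothetical tightened invariant: only positive timeouts or -1 pass.
    static long checkBufferTimeout(long timeoutMillis) {
        if (timeoutMillis != FLUSH_WHEN_BUFFER_FULL && timeoutMillis <= 0) {
            throw new IllegalArgumentException(
                    "Timeout of buffer must be positive, or -1 to flush only when the buffer is full");
        }
        return timeoutMillis;
    }

    public static void main(String[] args) {
        System.out.println(checkBufferTimeout(100L)); // prints 100
        System.out.println(checkBufferTimeout(-1L));  // prints -1
        try {
            checkBufferTimeout(0L); // rejected under this tightened check
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: 0");
        }
    }
}
```

Under this check a timeout of 0 fails fast at configuration time instead of silently degrading performance at runtime.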



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-7271) ExpressionReducer does not optimize string-to-time conversion

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-7271:
--
Priority: Minor  (was: Major)

> ExpressionReducer does not optimize string-to-time conversion
> -
>
> Key: FLINK-7271
> URL: https://issues.apache.org/jira/browse/FLINK-7271
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.3.1
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Expressions like {{"1996-11-10".toDate}} or {{"1996-11-10 
> 12:12:12".toTimestamp}} are not recognized by the ExpressionReducer and are 
> evaluated at runtime instead of in the pre-flight phase. In order to optimize 
> the runtime we should allow constant expression reduction here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-8179) Convert CharSequence to String when registering / converting a Table

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-8179:
--
Labels: auto-deprioritized-major  (was: stale-major)

> Convert CharSequence to String when registering / converting a Table
> 
>
> Key: FLINK-8179
> URL: https://issues.apache.org/jira/browse/FLINK-8179
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Avro objects store text values as `java.lang.CharSequence`. Right now, these 
> fields are treated as `ANY` type by Calcite (`GenericType` by Flink). So, 
> importing a `DataStream` or `DataSet` of Avro objects results in a Table 
> where the text values cannot be used as String fields.
> We should convert `CharSequence` fields to `String` when importing / 
> converting to a Table.
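A minimal, dependency-free sketch of the eager conversion idea (the class and helper names are illustrative, not Flink or Avro API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CharSequenceImport {
    // Avro frequently hands text values back as CharSequence (e.g. its Utf8 type);
    // converting once at import time yields plain Strings downstream.
    static String toStringField(CharSequence cs) {
        return cs == null ? null : cs.toString();
    }

    public static void main(String[] args) {
        // StringBuilder stands in for a non-String CharSequence implementation.
        List<CharSequence> avroValues =
                Arrays.asList(new StringBuilder("alice"), "bob", null);
        List<String> tableValues = new ArrayList<>();
        for (CharSequence cs : avroValues) {
            tableValues.add(toStringField(cs)); // eager conversion at import time
        }
        System.out.println(tableValues); // prints [alice, bob, null]
    }
}
```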



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5429) Code generate types between operators in Table API

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-5429:
--
Priority: Minor  (was: Major)

> Code generate types between operators in Table API
> --
>
> Key: FLINK-5429
> URL: https://issues.apache.org/jira/browse/FLINK-5429
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Legacy Planner
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, the Table API uses the generic Row type for shipping records 
> between operators in underlying DataSet and DataStream API. For efficiency 
> reasons we should code generate those records. The final design is up for 
> discussion but here are some ideas:
> A row like {{(a: INT NULL, b: INT NOT NULL, c: STRING)}} could look like
> {code}
> final class GeneratedRow$123 {
>   public boolean a_isNull;
>   public int a;
>   public int b;
>   public String c;
> }
> {code}
> Types could be generated using Janino in the pre-flight phase. The generated 
> types should use primitive types wherever possible.
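To illustrate why generated record classes help, here is a self-contained sketch contrasting a generic Object[]-backed row with the generated shape above. Both classes are toy stand-ins, not Flink's actual Row or generated code:

```java
public class GeneratedRowSketch {
    // Toy stand-in for a generic Row: fields live in an Object[] array,
    // so an INT is stored as a boxed Integer and read back with a cast.
    static final class GenericRow {
        final Object[] fields;
        GenericRow(Object... fields) { this.fields = fields; }
    }

    // Toy stand-in for a generated class for (a: INT NULL, b: INT NOT NULL, c: STRING):
    // primitives plus an explicit null flag, so no boxing and no casts.
    static final class GeneratedRow123 {
        boolean a_isNull;
        int a;
        int b;
        String c;
    }

    public static void main(String[] args) {
        GenericRow generic = new GenericRow(null, 7, "x");

        GeneratedRow123 generated = new GeneratedRow123();
        generated.a_isNull = true; // field a is NULL
        generated.b = 7;
        generated.c = "x";

        int bGeneric = (Integer) generic.fields[1]; // cast + unbox per access
        int bGenerated = generated.b;               // direct primitive read
        System.out.println(bGeneric == bGenerated); // prints true
    }
}
```

In the actual proposal such classes would be emitted as source strings and compiled with Janino in the pre-flight phase, but the access pattern shown here is the efficiency argument.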



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9698) "case class must be static and globally accessible" is too constrained

2021-04-29 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17336615#comment-17336615
 ] 

Flink Jira Bot commented on FLINK-9698:
---

This issue was labeled "stale-major" 7 days ago and has not received any 
updates, so it is being deprioritized. If this ticket is actually Major, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> "case class must be static and globally accessible" is too constrained
> --
>
> Key: FLINK-9698
> URL: https://issues.apache.org/jira/browse/FLINK-9698
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Type Serialization System
>Reporter: Jeff Zhang
>Priority: Major
>  Labels: stale-major
>
> The following code can reproduce this issue. 
> {code}
> object BatchJob {
>   def main(args: Array[String]) {
> // set up the batch execution environment
> val env = ExecutionEnvironment.getExecutionEnvironment
> val tenv = TableEnvironment.getTableEnvironment(env)
> case class Person(id:Int, name:String)
> val ds = env.fromElements(Person(1,"jeff"), Person(2, "andy"))
> tenv.registerDataSet("table_1", ds);
>   }
> }
> {code}
> Although I have a workaround of declaring the case class outside of the main 
> method, this workaround won't work in the Scala shell. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-11063) Make flink-table Scala-free

2021-04-29 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-11063:
---
Labels: auto-deprioritized-major  (was: stale-major)

> Make flink-table Scala-free
> ---
>
> Key: FLINK-11063
> URL: https://issues.apache.org/jira/browse/FLINK-11063
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.7.0
>Reporter: Timo Walther
>Priority: Major
>  Labels: auto-deprioritized-major
>
> Currently, the Table & SQL API is implemented in Scala. This decision was 
> made a long-time ago when the initial code base was created as part of a 
> master's thesis. The community kept Scala because of the nice language 
> features that enable a fluent Table API like {{table.select('field.trim())}} 
> and because Scala allows for quick prototyping (e.g. multi-line comments for 
> code generation). The committers enforced not splitting the code-base into 
> two programming languages.
> However, nowadays the {{flink-table}} module is becoming an increasingly 
> important part of the Flink ecosystem. Connectors, formats, and the SQL 
> client are actually implemented in Java but need to interoperate with 
> {{flink-table}}, which makes these modules dependent on Scala. As mentioned in 
> an earlier mail thread, using Scala for API classes also exposes member 
> variables and methods in Java that should not be exposed to users. Java is 
> still the most important API language, and right now we treat it as a 
> second-class citizen.
> In order to not introduce more technical debt, the community aims to make the 
> {{flink-table}} module Scala-free as a long-term goal. This will be a 
> continuous effort that can not be finished within one release. We aim for 
> avoiding API-breaking changes.
> A full description can be found in the corresponding 
> [FLIP-28|https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free].
> FLIP-28 also contains a rough roadmap and serves as migration guidelines.
> This Jira issue is an umbrella issue for tracking the efforts and possible 
> migration blockers.
> *+Update+*: Due to the big code contribution from Alibaba to Flink SQL, we 
> will only perform porting of API classes for now. This is mostly tracked by 
> FLINK-11448.
> FLIP-28 is legacy and has been integrated into 
> [FLIP-32|https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

