[jira] [Updated] (FLINK-14459) Python module build hangs

2019-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14459:
---
Labels: pull-request-available  (was: )

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
>
> The build of the Python module hangs when installing Conda. See the Travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> I can't reproduce it either on my local Mac or on my repo with Travis. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14434) Dispatcher#createJobManagerRunner should returns on creation succeed, not after startJobManagerRunner

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14434:
---
Labels: pull-request-available  (was: )

> Dispatcher#createJobManagerRunner should returns on creation succeed, not 
> after startJobManagerRunner
> -
>
> Key: FLINK-14434
> URL: https://issues.apache.org/jira/browse/FLINK-14434
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zili Chen
>Assignee: Zili Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
> Attachments: patch.diff
>
>
> In an edge case, suppose
> 1) the job finishes almost immediately
> 2) the Dispatcher thread is suspended in {{#startJobManagerRunner}} after 
> {{jobManagerRunner.start();}} but before {{return jobManagerRunner;}}
> This can happen because
> 1) we complete {{jobManagerRunnerFutures}} only once {{#startJobManagerRunner}} 
> has finished, and
> 2) the creation of the JobManagerRunner doesn't happen in the MainThread.
> A possible execution order is:
> 1) the JobManagerRunner is created in an akka-dispatcher thread
> 2) {{Dispatcher#startJobManagerRunner}} is then applied
> 3) execution reaches {{jobManagerRunner.start();}} but not yet {{return jobManagerRunner;}}
> 4) this thread is suspended
> 5) the job finishes, and its callback executes on the MainThread
> 6) {{jobManagerRunnerFutures.get(jobID).getNow(null)}} returns {{null}} 
> because the akka-dispatcher thread hasn't executed {{return jobManagerRunner;}} yet
> 7) it reports {{There is a newer JobManagerRunner for the job}}, which is 
> actually not the case.
> **Solution**
> Two approaches, which could even be combined:
> 1. return {{jobManagerRunnerFuture}} from {{#createJobManagerRunner}} and make 
> {{#startJobManagerRunner}} a separate action
> 2. once the JobManagerRunner is created, execute {{#startJobManagerRunner}} in 
> the MainThread.
> CC [~trohrmann]
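The proposed ordering can be sketched in a few lines. This is a hedged, self-contained sketch, not Flink's actual Dispatcher code: the class, method names, and the `String` job id are illustrative stand-ins. The point is that the "created" future is registered and completed as soon as the runner object exists, while starting the runner is chained as a separate action.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class DispatcherSketch {
    static class JobManagerRunner {
        void start() { /* begin job execution */ }
    }

    // Stand-in for Dispatcher#jobManagerRunnerFutures.
    final ConcurrentHashMap<String, CompletableFuture<JobManagerRunner>> runners =
            new ConcurrentHashMap<>();

    CompletableFuture<JobManagerRunner> createJobManagerRunner(String jobId) {
        // The future completes as soon as the runner object exists, so a job
        // that finishes immediately still finds its runner in the map.
        CompletableFuture<JobManagerRunner> created =
                CompletableFuture.supplyAsync(JobManagerRunner::new);
        runners.put(jobId, created);
        // Starting is a separate chained action (executed in the main
        // thread in the real proposal), not part of "creation".
        created.thenAccept(JobManagerRunner::start);
        return created;
    }

    public static void main(String[] args) {
        DispatcherSketch d = new DispatcherSketch();
        JobManagerRunner r = d.createJobManagerRunner("job-1").join();
        // After creation completes, the map lookup can no longer observe null.
        System.out.println(d.runners.get("job-1").getNow(null) == r);
    }
}
```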





[jira] [Updated] (FLINK-14175) Upgrade KPL version in flink-connector-kinesis to fix application OOM

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14175:
---
Labels: pull-request-available  (was: )

> Upgrade KPL version in flink-connector-kinesis to fix application OOM
> -
>
> Key: FLINK-14175
> URL: https://issues.apache.org/jira/browse/FLINK-14175
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.6.3, 1.6.4, 1.6.5, 1.7.2, 1.7.3, 1.8.0, 1.8.1, 1.8.2, 
> 1.9.0
>Reporter: Abhilasha Seth
>Priority: Major
>  Labels: pull-request-available
>
> The [KPL 
> version|https://github.com/apache/flink/blob/release-1.9/flink-connectors/flink-connector-kinesis/pom.xml#L38]
>  (0.12.9) used by flink-connector-kinesis in the affected Flink versions has 
> a thread leak bug that causes applications to run out of memory after 
> frequent restarts:
> KPL Issue - [https://github.com/awslabs/amazon-kinesis-producer/issues/224]
> Fix - [https://github.com/awslabs/amazon-kinesis-producer/pull/225/files]
> Upgrading KPL to 0.12.10 or higher is necessary to avoid this issue. The 
> recommended upgrade target is the latest version (0.13.1).
> Note that the KPL version in Flink 1.10.0 has already been updated to the 
> latest version (0.13.1): https://issues.apache.org/jira/browse/FLINK-12847
>  





[jira] [Updated] (FLINK-14416) Add Module interface and ModuleManager

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14416:
---
Labels: pull-request-available  (was: )

> Add Module interface and ModuleManager
> --
>
> Key: FLINK-14416
> URL: https://issues.apache.org/jira/browse/FLINK-14416
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>






[jira] [Updated] (FLINK-14450) Change SchedulingTopology to extend base topology

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14450:
---
Labels: pull-request-available  (was: )

> Change SchedulingTopology to extend base topology
> -
>
> Key: FLINK-14450
> URL: https://issues.apache.org/jira/browse/FLINK-14450
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> This task is to change SchedulingTopology to extend the base topology 
> introduced in FLINK-14330.
> For more details, see FLINK-14330 and the [design 
> doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit#].





[jira] [Updated] (FLINK-8822) RotateLogFile may not work well when sed version is below 4.2

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8822:
--
Labels: pull-request-available  (was: )

> RotateLogFile may not work well when sed version is below 4.2
> -
>
> Key: FLINK-8822
> URL: https://issues.apache.org/jira/browse/FLINK-8822
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Affects Versions: 1.4.0
>Reporter: Xin Liu
>Priority: Major
>  Labels: pull-request-available
>
> In bin/config.sh, rotateLogFilesWithPrefix() uses an extended regular 
> expression to process file names via "sed -E", but when the sed version is 
> below 4.2 this fails with "sed: invalid option -- 'E'",
> and log rotation won't work: there will be only one log file no matter 
> what $MAX_LOG_FILE_NUMBER is.
> Using "sed -r" instead may be more suitable.
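The portability point can be shown with a small, self-contained invocation (illustrative only, not the exact command from bin/config.sh): GNU sed accepted `-r` for extended regular expressions long before it added `-E` as an alias in 4.2, so `-r` works on the older versions this issue describes.

```shell
# -r enables extended regexps on old GNU sed; -E only works from 4.2 on.
# Example: tag a rotated log file name using a capture group.
echo "flink-taskmanager.log.3" | sed -r 's/\.log\.([0-9]+)$/.log.\1.old/'
# prints: flink-taskmanager.log.3.old
```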





[jira] [Updated] (FLINK-14456) Exclude lastJobExecutionResult from ClusterClient

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14456:
---
Labels: pull-request-available  (was: )

> Exclude lastJobExecutionResult from ClusterClient
> -
>
> Key: FLINK-14456
> URL: https://issues.apache.org/jira/browse/FLINK-14456
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: Zili Chen
>Assignee: Zili Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Towards a {{ClusterClient}} interface.





[jira] [Updated] (FLINK-14330) Introduce a unified topology interface

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14330:
---
Labels: pull-request-available  (was: )

> Introduce a unified topology interface
> --
>
> Key: FLINK-14330
> URL: https://issues.apache.org/jira/browse/FLINK-14330
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> When working on FLINK-14312 to partition {{JobGraph}} into logical pipelined 
> regions, I found that we can hardly reuse the existing util 
> {{PipelinedRegionComputeUtil#computePipelinedRegions(..)}}  to do it since 
> it's based on the {{FailoverTopology}}.
> To avoid code duplication, we need a unified topology base for 
> {{FailoverTopology}} and {{JobGraph/LogicalTopology}}.
> Besides that, the inconsistency of {{FailoverTopology}} and 
> {{SchedulingTopology}} is also causing troubles for development and 
> performance.
> That's why I'd propose to unify the interfaces of all these topologies. More 
> details can be found in this [design 
> doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit?usp=sharing].





[jira] [Updated] (FLINK-7629) Scala stream aggregations should support nested field expressions

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7629:
--
Labels: pull-request-available  (was: )

> Scala stream aggregations should support nested field expressions
> -
>
> Key: FLINK-7629
> URL: https://issues.apache.org/jira/browse/FLINK-7629
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: pull-request-available
>
> In the Scala API, {{KeyedStream.maxBy}} and similar methods currently only 
> work with a field name, and not with nested field expressions, such as 
> "fieldA.fieldX". (Their documentation says this should work.)
> The reason for this is that the string overload of {{KeyedStream.aggregate}} 
> uses {{fieldNames2Indices}} and then calls the integer overload. Instead, it 
> should create a {{SumAggregator}} or {{ComparableAggregator}} directly, as 
> the integer overload does (and as the Java API does). The ctors of 
> {{SumAggregator}} or {{ComparableAggregator}} will call 
> {{FieldAccessorFactory.getAccessor}}, which will correctly handle a nested 
> field expression.





[jira] [Updated] (FLINK-8046) ContinuousFileMonitoringFunction wrongly ignores files with exact same timestamp

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8046:
--
Labels: pull-request-available stream  (was: stream)

> ContinuousFileMonitoringFunction wrongly ignores files with exact same 
> timestamp
> 
>
> Key: FLINK-8046
> URL: https://issues.apache.org/jira/browse/FLINK-8046
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.3.2
>Reporter: Juan Miguel Cejuela
>Priority: Major
>  Labels: pull-request-available, stream
>
> The current file monitoring sets the internal variable 
> `globalModificationTime` to filter out files that are "older". However, the 
> current check (for "older") does 
> `boolean shouldIgnore = modificationTime <= globalModificationTime;` (from 
> `shouldIgnore`).
> The comparison should be strictly SMALLER (NOT smaller or equal). The method 
> documentation also states "This happens if the modification time of the file 
> is _smaller_ than...".
> Accepting equality as "older" causes some files with the exact same 
> timestamp to be ignored. The behavior is also non-deterministic: the first 
> file to be accepted ("first" being pretty much random) causes the remaining 
> files with the exact same timestamp to be ignored.
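A minimal sketch of the comparison fix (not Flink's actual `ContinuousFileMonitoringFunction`; the class and method here are illustrative): with a strictly-smaller comparison, a file whose modification time equals the current global modification time is no longer dropped.

```java
public class MonitorCheck {
    static boolean shouldIgnore(long modificationTime, long globalModificationTime) {
        // was: modificationTime <= globalModificationTime, which also
        // ignored files with the exact same timestamp
        return modificationTime < globalModificationTime;
    }

    public static void main(String[] args) {
        // Equal timestamps are now processed instead of ignored.
        System.out.println(shouldIgnore(1000L, 1000L)); // prints: false
    }
}
```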





[jira] [Updated] (FLINK-8090) Improve error message when registering different states under the same name.

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8090:
--
Labels: pull-request-available  (was: )

> Improve error message when registering different states under the same name.
> 
>
> Key: FLINK-8090
> URL: https://issues.apache.org/jira/browse/FLINK-8090
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.4.0
>Reporter: Kostas Kloudas
>Assignee: Xingcan Cui
>Priority: Major
>  Labels: pull-request-available
>
> Currently a {{ProcessFunction}} like this:
> {code}
> final MapStateDescriptor<Integer, Tuple2<Integer, Long>> firstMapStateDescriptor =
>     new MapStateDescriptor<>(
>         "timon-one",
>         BasicTypeInfo.INT_TYPE_INFO,
>         source.getType());
>
> final ListStateDescriptor<Integer> secondListStateDescriptor =
>     new ListStateDescriptor<>(
>         "timon-one",
>         BasicTypeInfo.INT_TYPE_INFO);
>
> new ProcessFunction<Tuple2<Integer, Long>, Object>() {
>     private static final long serialVersionUID = -805125545438296619L;
>
>     private transient MapState<Integer, Tuple2<Integer, Long>> firstMapState;
>     private transient ListState<Integer> secondListState;
>
>     @Override
>     public void open(Configuration parameters) throws Exception {
>         super.open(parameters);
>         firstMapState =
>             getRuntimeContext().getMapState(firstMapStateDescriptor);
>         secondListState =
>             getRuntimeContext().getListState(secondListStateDescriptor);
>     }
>
>     @Override
>     public void processElement(Tuple2<Integer, Long> value, Context ctx,
>             Collector<Object> out) throws Exception {
>         Tuple2<Integer, Long> v = firstMapState.get(value.f0);
>         if (v == null) {
>             v = new Tuple2<>(value.f0, 0L);
>         }
>         firstMapState.put(value.f0, new Tuple2<>(v.f0, v.f1 + value.f1));
>     }
> }
> {code}
> fails with:
> {code}
> java.lang.RuntimeException: Error while getting state
>   at 
> org.apache.flink.runtime.state.DefaultKeyedStateStore.getListState(DefaultKeyedStateStore.java:74)
>   at 
> org.apache.flink.streaming.api.operators.StreamingRuntimeContext.getListState(StreamingRuntimeContext.java:127)
>   at 
> org.apache.flink.queryablestate.itcases.AbstractQueryableStateTestBase$2.open(AbstractQueryableStateTestBase.java:327)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
>   at 
> org.apache.flink.streaming.api.operators.KeyedProcessOperator.open(KeyedProcessOperator.java:58)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:381)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.runtime.state.heap.HeapMapState cannot be cast to 
> org.apache.flink.api.common.state.ListState
>   at 
> org.apache.flink.runtime.state.DefaultKeyedStateStore.getListState(DefaultKeyedStateStore.java:71)
>   ... 9 more
> {code}
> This is cryptic, as it does not explain the cause of the problem. The 
> error message should be something along the lines of "Duplicate state name".





[jira] [Updated] (FLINK-14449) SavepointMigrationTestBase deadline should be setup in the test

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14449:
---
Labels: pull-request-available  (was: )

> SavepointMigrationTestBase deadline should be setup in the test
> ---
>
> Key: FLINK-14449
> URL: https://issues.apache.org/jira/browse/FLINK-14449
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.8.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> The {{SavepointMigrationTestBase}} contains a {{static final Deadline}} that 
> is used in all tests. In practice this makes the deadline quite unreliable, 
> since it is set up when the class is loaded, as opposed to when any test is 
> actually run.
> If fork reuse is enabled, the tests consistently fail with a timeout for this 
> reason.
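The difference between a class-level deadline and a per-test deadline can be sketched as follows. This is a hedged, self-contained illustration, not the actual test base: the class name, field names, and the plain `setUp()` method (which would carry a `@Before`-style annotation in a real JUnit test) are assumptions.

```java
import java.time.Duration;
import java.time.Instant;

public class DeadlineDemo {
    // Problem: a static final deadline starts ticking at class-load time,
    // long before any test in a reused fork actually runs.
    static final Instant STATIC_DEADLINE = Instant.now().plus(Duration.ofMinutes(5));

    // Fix: give each test its own deadline, created when the test starts.
    Instant perTestDeadline;

    void setUp() { // would be annotated @Before under JUnit 4
        perTestDeadline = Instant.now().plus(Duration.ofMinutes(5));
    }

    public static void main(String[] args) {
        DeadlineDemo test = new DeadlineDemo();
        test.setUp();
        // The per-test deadline is never earlier than the class-load one,
        // so slow earlier tests can no longer eat this test's budget.
        System.out.println(!test.perTestDeadline.isBefore(STATIC_DEADLINE));
    }
}
```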





[jira] [Updated] (FLINK-4822) Ensure that the Kafka 0.8 connector is compatible with kafka-consumer-groups.sh

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4822:
--
Labels: pull-request-available  (was: )

> Ensure that the Kafka 0.8 connector is compatible with 
> kafka-consumer-groups.sh
> ---
>
> Key: FLINK-4822
> URL: https://issues.apache.org/jira/browse/FLINK-4822
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Robert Metzger
>Priority: Major
>  Labels: pull-request-available
>
> The Kafka 0.8 connector does not properly create all data structures in 
> ZooKeeper for Kafka's {{kafka-consumer-groups.sh}} tool.
> A user reported the issue here: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Kafka-connector08-not-updating-the-offsets-with-the-zookeeper-td9469.html#a9498
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper





[jira] [Updated] (FLINK-6101) GroupBy fields with arithmetic expression (include UDF) can not be selected

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6101:
--
Labels: pull-request-available  (was: )

> GroupBy fields with arithmetic expression (include UDF) can not be selected
> ---
>
> Key: FLINK-6101
> URL: https://issues.apache.org/jira/browse/FLINK-6101
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Minor
>  Labels: pull-request-available
>
> Currently the Table API does not support selecting groupBy fields by 
> expression, either via the original field name or via the expression itself:
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> which causes
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> (BTW, this syntax is also invalid in an RDBMS such as SQL Server, which 
> indicates that the selected column is invalid because it is not contained in 
> either an aggregate function or the GROUP BY clause.)
> and 
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b%3, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> will also cause
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> Adding an alias in the groupBy clause, "groupBy('e, 'b % 3 as 'b)", is to no 
> avail, and applying a UDF doesn't work either:
> {code}
>table.groupBy('a, Mod('b, 3)).select('a, Mod('b, 3), 'c.count, 'c.count, 
> 'd.count, 'e.avg)
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [a, org.apache.flink.table.api.scala.batch.table.Mod$('b, 3), TMP_0, 
> TMP_1, TMP_2].
> {code}
> The only way to make this work is:
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .select('a, 'b%3 as 'b, 'c, 'd, 'e)
> .groupBy('e, 'b)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> One way to solve this is to support aliases in the groupBy clause (it seems a 
> bit odd compared with SQL, though the Table API has a different groupBy 
> grammar). I would prefer to support selecting the original expressions and 
> UDFs of the groupBy clause (consistent with SQL), as follows:
> {code}
> // use expression
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b % 3, 'c.min, 'e, 'a.avg, 'd.count)
> // use UDF
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, Mod('b,3))
> .select(Mod('b,3), 'c.min, 'e, 'a.avg, 'd.count)
> {code}

> After a look into the code, I found a problem in the groupBy 
> implementation: validation doesn't consider the expressions in the groupBy 
> clause. It should be noted that a table is actually changed after a 
> groupBy operation (it is a new Table), and the groupBy keys replace the 
> original field references in essence.
>  
> What do you think?





[jira] [Updated] (FLINK-8124) EventTimeTrigger (and other triggers) could have less specific generic types

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8124:
--
Labels: pull-request-available  (was: )

> EventTimeTrigger (and other triggers) could have less specific generic types
> 
>
> Key: FLINK-8124
> URL: https://issues.apache.org/jira/browse/FLINK-8124
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Cristian
>Priority: Minor
>  Labels: pull-request-available
>
> When implementing custom WindowAssigners, it is possible to need different 
> implementations of the {{Window}} class (other than {{TimeWindow}}). In such 
> cases, it is not possible to use the existing triggers (e.g. 
> {{EventTimeTrigger}}) because {{EventTimeTrigger}} extends 
> {{Trigger<Object, TimeWindow>}}, which is (unnecessarily?) specific.
> It should be possible to make that class more generic, e.g. 
> {{EventTimeTrigger<W extends Window> extends Trigger<Object, W>}}.
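The widening can be illustrated with a self-contained analogy (Flink's real `Trigger`, `Window`, and `EventTimeTrigger` classes are not reproduced here; all classes below, including the custom `SessionWindow`, are illustrative stand-ins): once the trigger is generic over the window type, it can be used with any `Window` subclass without a cast.

```java
// Stand-ins for the real class hierarchy.
abstract class Window {}
class TimeWindow extends Window {}
class SessionWindow extends Window {} // hypothetical custom window kind

abstract class Trigger<T, W extends Window> {
    abstract boolean onElement(T element, W window);
}

// Generic over the window type, as the issue proposes, instead of
// being fixed to Trigger<Object, TimeWindow>.
class EventTimeTrigger<W extends Window> extends Trigger<Object, W> {
    @Override
    boolean onElement(Object element, W window) {
        return true; // fire unconditionally; the point is the type, not the logic
    }
}

public class TriggerDemo {
    public static void main(String[] args) {
        // Works for a custom window kind without any cast:
        Trigger<Object, SessionWindow> t = new EventTimeTrigger<>();
        System.out.println(t.onElement("e", new SessionWindow())); // prints: true
    }
}
```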





[jira] [Updated] (FLINK-8191) Add a RoundRobinPartitioner to be shipped with the Kafka connector

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8191:
--
Labels: pull-request-available  (was: )

> Add a RoundRobinPartitioner to be shipped with the Kafka connector
> --
>
> Key: FLINK-8191
> URL: https://issues.apache.org/jira/browse/FLINK-8191
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Kafka
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Aegeaner
>Priority: Major
>  Labels: pull-request-available
>
> We should perhaps consider adding a round-robin partitioner, ready for use and 
> shipped with the Kafka connector, alongside the already available 
> {{FlinkFixedPartitioner}}.
> See the original discussion here:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaProducerXX-td16951.html





[jira] [Updated] (FLINK-8037) Missing cast in integer arithmetic in TransactionalIdsGenerator#generateIdsToAbort

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-8037:
--
Labels: kafka kafka-connect pull-request-available  (was: kafka 
kafka-connect)

> Missing cast in integer arithmetic in 
> TransactionalIdsGenerator#generateIdsToAbort
> --
>
> Key: FLINK-8037
> URL: https://issues.apache.org/jira/browse/FLINK-8037
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Ted Yu
>Assignee: Greg Hogan
>Priority: Minor
>  Labels: kafka, kafka-connect, pull-request-available
>
> {code}
>   public Set<Long> generateIdsToAbort() {
> Set<Long> idsToAbort = new HashSet<>();
> for (int i = 0; i < safeScaleDownFactor; i++) {
>   idsToAbort.addAll(generateIdsToUse(i * poolSize * 
> totalNumberOfSubtasks));
> {code}
> The operands are integers, whereas generateIdsToUse() expects a long 
> parameter, so the multiplication can overflow before the widening conversion.
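The overflow can be demonstrated in isolation (the variable names mirror the snippet above, but the values are made up for illustration): Java evaluates `i * poolSize * totalNumberOfSubtasks` entirely in `int` arithmetic and only then widens the wrapped-around result to `long`, so the cast must come before the multiplication.

```java
public class IdRangeDemo {
    public static void main(String[] args) {
        int i = 3, poolSize = 100_000, totalNumberOfSubtasks = 50_000;

        // int math wraps around first, then the garbage is widened to long
        long wrong = i * poolSize * totalNumberOfSubtasks;

        // promoting one operand forces the whole product into long arithmetic
        long right = (long) i * poolSize * totalNumberOfSubtasks;

        System.out.println(wrong == right); // prints: false
        System.out.println(right);          // prints: 15000000000
    }
}
```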





[jira] [Updated] (FLINK-7684) Avoid multiple data copies in MergingWindowSet

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7684:
--
Labels: pull-request-available  (was: )

> Avoid multiple data copies in MergingWindowSet
> --
>
> Key: FLINK-7684
> URL: https://issues.apache.org/jira/browse/FLINK-7684
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.2.0, 1.2.1, 1.3.0, 1.3.1, 1.3.2
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Major
>  Labels: pull-request-available
>
> Currently MergingWindowSet uses a ListState of tuples to persist its mapping. 
> This is inefficient because this ListState of tuples must be converted to a 
> HashMap on each access.
> Furthermore, in some cases it is inefficient to check whether the mapping 
> has changed before saving it to state.
> These two issues cause multiple data copies and construct multiple 
> Lists/Maps per processed element, which is a cause of noticeable 
> performance issues.





[jira] [Updated] (FLINK-7613) Fix documentation error in QuickStart

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7613:
--
Labels: pull-request-available  (was: )

> Fix documentation error in QuickStart
> -
>
> Key: FLINK-7613
> URL: https://issues.apache.org/jira/browse/FLINK-7613
> Project: Flink
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 1.4.0
>Reporter: Raymond Tay
>Priority: Minor
>  Labels: pull-request-available
>
> In the `QuickStart => Run The Example` section, there is a typographical error 
> which points the reader to `*jobmanager*`, but it should be `*taskmanager*` in 
> Apache Flink 1.4.x. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-7539) Make AvroOutputFormat default codec configurable

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7539:
--
Labels: pull-request-available  (was: )

> Make AvroOutputFormat default codec configurable
> 
>
> Key: FLINK-7539
> URL: https://issues.apache.org/jira/browse/FLINK-7539
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Sebastian Klemke
>Priority: Major
>  Labels: pull-request-available
>
> In my organization there is a requirement that all Avro datasets stored on 
> HDFS be compressed. Currently, this requires invoking setCodec() 
> manually on all AvroOutputFormat instances. To ease setting up 
> AvroOutputFormat instances, we'd like to be able to configure the default 
> codec site-wide, ideally via flink-conf.yaml.
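One way such a site-wide default might look is sketched below. This is a hedged illustration only: the configuration key `avro.output.default-codec`, the `Codec` enum, and the lookup helper are all assumptions, not an existing Flink option or API.

```java
import java.util.Map;

public class CodecDefaults {
    // Illustrative subset of Avro codecs.
    enum Codec { NULL, DEFLATE, SNAPPY }

    // Hypothetical lookup: fall back to no compression when the
    // (assumed) key is absent from the site configuration.
    static Codec defaultCodec(Map<String, String> conf) {
        return Codec.valueOf(conf.getOrDefault("avro.output.default-codec", "NULL"));
    }

    public static void main(String[] args) {
        // A flink-conf.yaml-style setting, modeled as a map here.
        Map<String, String> conf = Map.of("avro.output.default-codec", "SNAPPY");
        System.out.println(defaultCodec(conf)); // prints: SNAPPY
    }
}
```

An output format could then call its existing `setCodec()` with this value unless the user overrides it per instance.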





[jira] [Updated] (FLINK-7561) Add support for pre-aggregation in DataStream API

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7561:
--
Labels: pull-request-available  (was: )

> Add support for pre-aggregation in DataStream API
> -
>
> Key: FLINK-7561
> URL: https://issues.apache.org/jira/browse/FLINK-7561
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (FLINK-7486) flink-mesos: Support for adding unique attribute / group_by attribute constraints

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7486:
--
Labels: pull-request-available  (was: )

> flink-mesos: Support for adding unique attribute / group_by attribute 
> constraints
> -
>
> Key: FLINK-7486
> URL: https://issues.apache.org/jira/browse/FLINK-7486
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Mesos
>Affects Versions: 1.3.2
>Reporter: Bhumika Bayani
>Assignee: Bhumika Bayani
>Priority: Major
>  Labels: pull-request-available
>
> In our setup, we have multiple mesos-workers. In spite of this, the Flink 
> application master most of the time ends up spawning all task managers on the 
> same mesos-worker.
> We intend to ensure HA of task managers. We would like to make sure each 
> task manager runs on a different mesos-worker, and on a mesos-worker that 
> does not share the AZ attribute with earlier task manager instances.
> Netflix Fenzo supports adding UniqueHostAttribute and BalancedHostAttribute 
> constraints. Flink-mesos should also enable us to add these kinds of 
> constraints.





[jira] [Updated] (FLINK-7384) Unify event and processing time handling in the AbstractKeyedCEPPatternOperator.

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7384:
--
Labels: pull-request-available  (was: )

> Unify event and processing time handling in the 
> AbstractKeyedCEPPatternOperator.
> 
>
> Key: FLINK-7384
> URL: https://issues.apache.org/jira/browse/FLINK-7384
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Kostas Kloudas
>Assignee: Dawid Wysakowicz
>Priority: Minor
>  Labels: pull-request-available
>
> With the recent changes introduced in 
> https://issues.apache.org/jira/browse/FLINK-7293, the code paths for 
> event- and processing-time handling are very close. This gives an opportunity 
> to unify the two paths.
> To do this when operating in processing time, the user will specify an 
> interval (like the watermark interval in event time) during which elements 
> will be buffered; only when this interval expires will the elements be 
> emitted. This is the same as in the event-time case, where elements between 
> watermarks are buffered.
> This change will remove the need to register a processing-time timer for 
> every millisecond, and it will also allow emitting timed-out patterns in 
> processing time without having to wait for the "next" element to arrive.





[jira] [Updated] (FLINK-7307) Add proper command line parsing tool to ClusterEntrypoint

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7307:
--
Labels: flip-6 pull-request-available  (was: flip-6)

> Add proper command line parsing tool to ClusterEntrypoint
> -
>
> Key: FLINK-7307
> URL: https://issues.apache.org/jira/browse/FLINK-7307
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.4.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: flip-6, pull-request-available
>
> We need to add a proper command line parsing tool to 
> {{ClusterEntrypoint#parseArguments}}. At the moment, we are simply using the 
> {{ParameterTool}} as a temporary solution.





[jira] [Updated] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6105:
--
Labels: pull-request-available  (was: )

> Properly handle InterruptedException in HadoopInputFormatBase
> -
>
> Key: FLINK-6105
> URL: https://issues.apache.org/jira/browse/FLINK-6105
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> When catching InterruptedException, we should throw InterruptedIOException 
> instead of IOException.
> The following example is from HadoopInputFormatBase:
> {code}
> try {
>   splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
>   throw new IOException("Could not get Splits.", e);
> }
> {code}
> There may be other places where IOE is thrown.
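A rough sketch of the proposed change, using invented names to stand in for the Hadoop call (the real fix lives in `HadoopInputFormatBase`): wrap the `InterruptedException` in an `InterruptedIOException` and restore the thread's interrupt flag, so callers can still distinguish an interrupt from an ordinary I/O failure.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class SplitFetcher {
    // Stand-in for the Hadoop call that may be interrupted.
    static String getSplits(boolean interrupt) throws InterruptedException {
        if (interrupt) {
            throw new InterruptedException("interrupted while fetching splits");
        }
        return "splits";
    }

    // Wraps InterruptedException in InterruptedIOException (a subclass of
    // IOException), keeping the method's IOException contract while
    // preserving the interrupt semantics.
    public static String fetchSplits(boolean interrupt) throws IOException {
        try {
            return getSplits(interrupt);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore interrupt status
            InterruptedIOException iioe =
                new InterruptedIOException("Could not get Splits.");
            iioe.initCause(e);
            throw iioe;
        }
    }
}
```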





[jira] [Updated] (FLINK-6502) Add support ElasticsearchSink for DataSet

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6502:
--
Labels: pull-request-available  (was: )

> Add support ElasticsearchSink for DataSet
> -
>
> Key: FLINK-6502
> URL: https://issues.apache.org/jira/browse/FLINK-6502
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataSet
>Affects Versions: 1.2.1
>Reporter: wyp
>Priority: Major
>  Labels: pull-request-available
>
> Currently, Flink only supports writing data from a DataStream to 
> Elasticsearch through {{ElasticsearchSink}}. We should be able to write data 
> from a DataSet to Elasticsearch too. See 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ElasticsearchSink-on-DataSet-td12980.html]





[jira] [Updated] (FLINK-14445) Python module build failed when making sdist

2019-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14445:
---
Labels: pull-request-available  (was: )

> Python module build failed when making sdist
> 
>
> Key: FLINK-14445
> URL: https://issues.apache.org/jira/browse/FLINK-14445
> Project: Flink
>  Issue Type: Bug
>Reporter: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> From the error log of the python module build on Travis, it 
> seems the invocation of {{sdist-make}} failed, and then the phase of building 
> the python module exited.
> The instance log: https://api.travis-ci.com/v3/job/246710918/log.txt





[jira] [Updated] (FLINK-9378) Improve TableException message with TypeName usage

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9378:
--
Labels: pull-request-available  (was: )

> Improve TableException message with TypeName usage
> --
>
> Key: FLINK-9378
> URL: https://issues.apache.org/jira/browse/FLINK-9378
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Trivial
>  Labels: pull-request-available
>
> Currently, TableException uses the simple class name. With an error message 
> like the following, it is not clear what the actual issue is: {noformat}
> Exception in thread "main" org.apache.flink.table.api.TableException: Result 
> field does not match requested type. Requested: Date; Actual: Date
>   at 
> org.apache.flink.table.api.TableEnvironment$$anonfun$generateRowConverterFunction$1.apply(TableEnvironment.scala:953)
> {noformat}
> or
> {noformat}Caused by: org.apache.flink.table.api.TableException: Type is not 
> supported: Date
>   at 
> org.apache.flink.table.api.TableException$.apply(exceptions.scala:53){noformat}
> For more details, have a look at FLINK-9341.
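Reporting the fully qualified class name disambiguates such messages: both `java.sql.Date` and `java.util.Date` render as "Date" under `getSimpleName()`, while `getName()` keeps the package. A minimal illustration (the helper methods are invented for this sketch):

```java
public class TypeNames {
    // Builds the "Requested: X; Actual: Y" message with simple names:
    // two distinct classes can collapse to the same label.
    public static String ambiguous(Class<?> requested, Class<?> actual) {
        return "Requested: " + requested.getSimpleName()
             + "; Actual: " + actual.getSimpleName();
    }

    // Using the fully qualified name makes the mismatch visible.
    public static String precise(Class<?> requested, Class<?> actual) {
        return "Requested: " + requested.getName()
             + "; Actual: " + actual.getName();
    }
}
```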





[jira] [Updated] (FLINK-14441) Fix ValueLiteralExpression#getValueAs when valueClass is Period.class

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14441:
---
Labels: pull-request-available  (was: )

> Fix ValueLiteralExpression#getValueAs when valueClass is Period.class
> -
>
> Key: FLINK-14441
> URL: https://issues.apache.org/jira/browse/FLINK-14441
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.10.0
>Reporter: hailong wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>
> In ValueLiteralExpression#getValueAs, when valueClass is Period.class, the 
> expected class is inconsistent with the class of the returned value.





[jira] [Updated] (FLINK-5413) Convert TableEnvironmentITCases to unit tests

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5413:
--
Labels: pull-request-available  (was: )

> Convert TableEnvironmentITCases to unit tests
> -
>
> Key: FLINK-5413
> URL: https://issues.apache.org/jira/browse/FLINK-5413
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: GaoLun
>Priority: Major
>  Labels: pull-request-available
>
> The following IT cases could be converted into unit tests:
> - {{org.apache.flink.table.api.scala.batch.TableEnvironmentITCase}}
> - {{org.apache.flink.table.api.java.batch.TableEnvironmentITCase}}





[jira] [Updated] (FLINK-12122) Spread out tasks evenly across all available registered TaskManagers

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-12122:
---
Labels: pull-request-available  (was: )

> Spread out tasks evenly across all available registered TaskManagers
> 
>
> Key: FLINK-12122
> URL: https://issues.apache.org/jira/browse/FLINK-12122
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.6.4, 1.7.2, 1.8.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2019-05-21-12-28-29-538.png, 
> image-2019-05-21-13-02-50-251.png
>
>
> With Flip-6, we changed the default behaviour of how slots are assigned to 
> {{TaskManagers}}. Instead of evenly spreading them out over all registered 
> {{TaskManagers}}, we randomly pick slots from {{TaskManagers}}, with a 
> tendency to first fill up a TM before using another one. This is a 
> regression with respect to the pre-Flip-6 code.
> I suggest changing the behaviour so that we try to evenly distribute slots 
> across all available {{TaskManagers}} by considering how many of their slots 
> are already allocated.
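A minimal sketch of such a utilization-based choice; all names here are illustrative, not the actual Flink slot-allocation code, and the input shape (`TM id -> {allocatedSlots, totalSlots}`) is an assumption of this sketch.

```java
import java.util.Comparator;
import java.util.Map;

public class SlotSpreader {
    // Picks the TaskManager with the lowest slot utilization
    // (allocated / total), approximating the proposed "spread out"
    // strategy instead of filling one TM up first.
    public static String pickLeastLoaded(Map<String, int[]> taskManagers) {
        return taskManagers.entrySet().stream()
            .min(Comparator.comparingDouble(
                (Map.Entry<String, int[]> e) ->
                    e.getValue()[0] / (double) e.getValue()[1]))
            .map(Map.Entry::getKey)
            .orElseThrow(() ->
                new IllegalStateException("no TaskManagers registered"));
    }
}
```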





[jira] [Updated] (FLINK-14397) Failed to run Hive UDTF with array arguments

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14397:
---
Labels: pull-request-available  (was: )

> Failed to run Hive UDTF with array arguments
> 
>
> Key: FLINK-14397
> URL: https://issues.apache.org/jira/browse/FLINK-14397
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Priority: Major
>  Labels: pull-request-available
>
> Tried to call 
> {{org.apache.hadoop.hive.contrib.udtf.example.GenericUDTFExplode2}} (in 
> hive-contrib) with query:  "{{select x,y from foo, lateral 
> table(hiveudtf(arr)) as T(x,y)}}". Failed with exception:
> {noformat}
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to 
> [Ljava.lang.Integer;
> {noformat}
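The exception arises because an `Object[]` that merely contains `Integer` elements is not an `Integer[]` at runtime, so a direct cast fails; copying into a properly typed array avoids it. A minimal illustration (the helper is hypothetical, not the actual fix in the Hive connector):

```java
import java.util.Arrays;

public class ArrayConvert {
    // Copies an Object[] into a typed Integer[]. A plain (Integer[]) cast
    // on the Object[] itself throws ClassCastException, because the array's
    // runtime component type is Object. Note: if an element is not an
    // Integer, this copy throws ArrayStoreException instead.
    public static Integer[] toIntegerArray(Object[] values) {
        return Arrays.copyOf(values, values.length, Integer[].class);
    }
}
```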





[jira] [Updated] (FLINK-5476) Fail fast if trying to submit a job to a non-existing Flink cluster

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5476:
--
Labels: pull-request-available  (was: )

> Fail fast if trying to submit a job to a non-existing Flink cluster
> ---
>
> Key: FLINK-5476
> URL: https://issues.apache.org/jira/browse/FLINK-5476
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Till Rohrmann
>Assignee: Dmytro Shkvyra
>Priority: Minor
>  Labels: pull-request-available
>
> In case of entering a wrong job manager address when submitting a job via 
> {{flink run}}, the {{JobClientActor}} waits by default {{60 s}} until a 
> {{JobClientActorConnectionException}}, indicating that the {{JobManager}} is 
> no longer reachable, is thrown. In order to fail fast in case of wrong 
> connection information, we could change it such that it initially uses a much 
> lower timeout and only increases the timeout after it has at least once 
> successfully connected to a {{JobManager}}.
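One possible shape of such an escalating timeout, as a sketch: the 2 s initial probe and doubling schedule are illustrative assumptions, only the 60 s cap comes from the issue, and this is not the actual `JobClientActor` change.

```java
public class ConnectTimeouts {
    static final long INITIAL_MS = 2_000L;  // illustrative fast-fail probe
    static final long MAX_MS = 60_000L;     // the current default wait

    // Doubles the timeout per failed attempt until a connection has ever
    // succeeded, capped at the old 60 s default; once a JobManager has been
    // reached at least once, the full timeout is used.
    public static long timeoutForAttempt(int attempt, boolean everConnected) {
        if (everConnected) {
            return MAX_MS;
        }
        long t = INITIAL_MS << Math.min(attempt, 5);
        return Math.min(t, MAX_MS);
    }
}
```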





[jira] [Updated] (FLINK-14429) Wrong app final status when running batch job on yarn with non-detached mode

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14429:
---
Labels: pull-request-available  (was: )

> Wrong app final status when running batch job on yarn with non-detached mode
> 
>
> Key: FLINK-14429
> URL: https://issues.apache.org/jira/browse/FLINK-14429
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.9.0
>Reporter: liupengcheng
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2019-10-17-16-47-47-038.png
>
>
> Recently, we found that the application's final status is not correct when an 
> application fails while running a batch job on YARN in non-detached mode: it 
> reported SUCCEEDED, but FAILED is what we expected.
> !image-2019-10-17-16-47-47-038.png!
>  
> But the logs and the client reported an error and the job failed (caused by an OOM):
> {code:java}
> 2019-10-10 14:36:21,797 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job TeraSort 
> (d82cbfaae905c695597083b1476e51b8) switched from state FAILING to FAILED.
> org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException:
>  Connection for partition 
> a254412fc7464cd4e0fe04ab9e3a6309@8d5afff58c86dd7f5bc78946f0101699 not 
> reachable.
>   at 
> org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:168)
>   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:237)
>   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.setup(SingleInputGate.java:215)
>   at 
> org.apache.flink.runtime.taskmanager.InputGateWithMetrics.setup(InputGateWithMetrics.java:65)
>   at 
> org.apache.flink.runtime.taskmanager.Task.setupPartitionsAndGates(Task.java:866)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:621)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Connecting the channel failed: Connecting to 
> remote task manager + 'zjy-hadoop-prc-st164.bj/10.152.47.8:45704' has failed. 
> This might indicate that the remote task manager has been lost.
>   at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:197)
>   at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:134)
>   at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:86)
>   at 
> org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:68)
>   at 
> org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:165)
>   ... 7 more
> Caused by: 
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connecting to remote task manager + 
> 'zjy-hadoop-prc-st164.bj/10.152.47.8:45704' has failed. This might indicate 
> that the remote task manager has been lost.
>   at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:220)
>   at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:134)
>   at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
>   at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
>   at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
>   at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
>   at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
>   at 
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327)
>   at 
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343)
>   at 
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
>   at 
> 

[jira] [Updated] (FLINK-14413) Some quotes from NOTICE files that cause encoding problems

2019-10-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14413:
---
Labels: pull-request-available  (was: )

> Some quotes from NOTICE files that cause encoding problems
> --
>
> Key: FLINK-14413
> URL: https://issues.apache.org/jira/browse/FLINK-14413
> Project: Flink
>  Issue Type: Bug
>  Components: Release System
>Affects Versions: 1.8.2, 1.9.0, 1.10.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.8.3, 1.9.2
>
>
> Some NOTICE files contain quotes that, at least on my system, result in some 
> encoding errors when generating the binary licensing. One example can be 
> found 
> [here|https://github.com/apache/flink/blob/88b48619e2734505a6c2ba0d53168528bc0dc143/flink-filesystems/flink-fs-hadoop-shaded/src/main/resources/META-INF/NOTICE#L1867];
>  the closing quotes would be replaced with a question mark.
> We can either replace these with quotes that don't cause issues, or try to 
> solve the encoding issue _somehow_ (so far I've been unsuccessful).





[jira] [Updated] (FLINK-14389) Restore task state in new DefaultScheduler

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14389:
---
Labels: pull-request-available  (was: )

> Restore task state in new DefaultScheduler
> --
>
> Key: FLINK-14389
> URL: https://issues.apache.org/jira/browse/FLINK-14389
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Gary Yao
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> The new {{DefaultScheduler}} should restore the state of restarted tasks.





[jira] [Updated] (FLINK-14421) Add 'L' when define a long value

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14421:
---
Labels: pull-request-available  (was: )

> Add 'L' when define a long value
> 
>
> Key: FLINK-14421
> URL: https://issues.apache.org/jira/browse/FLINK-14421
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.9.0
>Reporter: hailong wang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> In SqlDateTimeUtils, DateTimeUtils and ApiExpressionUtils, the
> MILLIS_PER_DAY field is defined as follows:
> {code:java}
> public static final long MILLIS_PER_DAY = 86400000;
> {code}
> We should change it to 86400000L to be consistent with the other fields.
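For context, MILLIS_PER_DAY is 24 * 60 * 60 * 1000 = 86400000 ms. Beyond consistency, the L suffix matters once the literal participates in arithmetic: without it, the expression is evaluated in 32-bit int arithmetic and can overflow before being widened to long. A small sketch (the helper methods are invented for illustration):

```java
public class MillisPerDay {
    public static final long MILLIS_PER_DAY = 86400000L;

    // Without the L suffix, days * 86400000 is computed as int and
    // overflows before the widening to long happens.
    public static long daysToMillisInt(int days)  { return days * 86400000; }

    // With the L suffix, the multiplication is done in long arithmetic.
    public static long daysToMillisLong(int days) { return days * 86400000L; }
}
```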





[jira] [Updated] (FLINK-14415) ValueLiteralExpression#equals should take array value into account

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14415:
---
Labels: pull-request-available  (was: )

> ValueLiteralExpression#equals should take array value into account
> --
>
> Key: FLINK-14415
> URL: https://issues.apache.org/jira/browse/FLINK-14415
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>
> {{ValueLiteralExpression#equals}} uses {{Objects.equals}} to check the 
> equality of two value objects. However, the value object might be an array, 
> and {{Objects.equals}} falls back to reference equality for arrays, which 
> leads to a wrong result. 
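A sketch of an equality check that treats object arrays specially; this is a simplification of the idea, not the actual fix, and primitive arrays such as `int[]` would need additional branches.

```java
import java.util.Arrays;
import java.util.Objects;

public class LiteralEquality {
    // Objects.equals on two distinct-but-equal arrays returns false,
    // because arrays do not override equals(). Dispatch arrays to
    // Arrays.deepEquals instead; other values keep the old path.
    public static boolean valueEquals(Object a, Object b) {
        if (a instanceof Object[] && b instanceof Object[]) {
            return Arrays.deepEquals((Object[]) a, (Object[]) b);
        }
        return Objects.equals(a, b);
    }
}
```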





[jira] [Updated] (FLINK-14414) Support string serializable for SymbolType

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14414:
---
Labels: pull-request-available  (was: )

> Support string serializable for SymbolType
> --
>
> Key: FLINK-14414
> URL: https://issues.apache.org/jira/browse/FLINK-14414
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>
> Currently, SymbolType has no serializable string representation, because it 
> is an extension type. But we find that we still need to serialize the symbol 
> type, because symbol literals are used in expressions, e.g. {{EXTRACT(DAY 
> FROM DATE '1990-12-01')}}. If we want to serialize this expression, the DAY 
> symbol and its type also need to be serialized. 





[jira] [Updated] (FLINK-14409) MapType doesn't accept any subclass of java.util.Map

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14409:
---
Labels: pull-request-available  (was: )

> MapType doesn't accept any subclass of java.util.Map
> 
>
> Key: FLINK-14409
> URL: https://issues.apache.org/jira/browse/FLINK-14409
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the conversion class of MapType is {{java.util.Map}}, but 
> {{java.util.Map}} is an interface, not a concrete class. So when verifying an 
> instance of {{HashMap}} for MapType, it fails. 
> For example:
> {code:java}
>   Map map = new HashMap<>();
>   map.put("key1", 1);
>   map.put("key2", 2);
>   map.put("key3", 3);
>   assertEquals(
>   "{key1=1, key2=2, key3=3}",
>   new ValueLiteralExpression(
>   map,
>   DataTypes.MAP(DataTypes.STRING(), 
> DataTypes.INT()))
>   .toString());
> {code}
> throws exception:
> {code}
> org.apache.flink.table.api.ValidationException: Data type 'MAP' 
> does not support a conversion from class 'java.util.HashMap'.
>   at 
> org.apache.flink.table.expressions.ValueLiteralExpression.validateValueDataType(ValueLiteralExpression.java:236)
>   at 
> org.apache.flink.table.expressions.ValueLiteralExpression.(ValueLiteralExpression.java:66)
> {code}
> It's easy to fix this by checking whether the value's class is a subclass of 
> Map. But I'm wondering what the default conversion class should be. 
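Checking assignability instead of class equality is one way to accept subclasses: `Map.class.isAssignableFrom(HashMap.class)` is true. A minimal illustration with a hypothetical helper, not the actual `ValueLiteralExpression` validation code:

```java
public class ConversionCheck {
    // Accepts any subclass of the declared conversion class, so HashMap
    // (or TreeMap, LinkedHashMap, ...) passes for a MAP type declared
    // against the java.util.Map interface.
    public static boolean supportsConversion(Class<?> declared, Class<?> actual) {
        return declared.isAssignableFrom(actual);
    }
}
```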





[jira] [Updated] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14370:
---
Labels: pull-request-available test-stability  (was: test-stability)

> KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink
>  fails on Travis
> ---
>
> Key: FLINK-14370
> URL: https://issues.apache.org/jira/browse/FLINK-14370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.10.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> The 
> {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}}
>  fails on Travis with
> {code}
> Test 
> testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase)
>  failed with:
> java.lang.AssertionError: Job should fail!
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> https://api.travis-ci.com/v3/job/244297223/log.txt





[jira] [Updated] (FLINK-14406) Add metric for manage memory

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14406:
---
Labels: pull-request-available  (was: )

> Add metric for manage memory
> 
>
> Key: FLINK-14406
> URL: https://issues.apache.org/jira/browse/FLINK-14406
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Task
>Reporter: lining
>Priority: Major
>  Labels: pull-request-available
>
> If users want to monitor memory usage in real time, they currently cannot, 
> because there are no metrics for managed memory.
> So we could add metrics for managed memory.





[jira] [Updated] (FLINK-14381) Partition field names should be got from CatalogTable instead of source/sink

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14381:
---
Labels: pull-request-available  (was: )

> Partition field names should be got from CatalogTable instead of source/sink
> 
>
> Key: FLINK-14381
> URL: https://issues.apache.org/jira/browse/FLINK-14381
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Now PartitionableTableSource and PartitionableTableSink have 
> "getPartitionFieldNames" method, this should be removed, and planner rules 
> should get it from CatalogManager.
> The partition field names are information belonging to the table; sources and 
> sinks should only be fed with such information, not asked to produce it.





[jira] [Updated] (FLINK-13062) Set ScheduleMode based on boundedness of streaming Pipeline

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13062:
---
Labels: pull-request-available  (was: )

> Set ScheduleMode based on boundedness of streaming Pipeline
> ---
>
> Key: FLINK-13062
> URL: https://issues.apache.org/jira/browse/FLINK-13062
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Major
>  Labels: pull-request-available
>
> The new Blink-based Table Runner needs "streaming pipelines" to be executed 
> with {{ScheduleMode.LAZY_FROM_SOURCES}} if all sources are bounded. The 
> current Blink code base uses a global flag for this and configures the 
> {{StreamGraphGenerator}} accordingly.
> We propose to add an {{isBounded()}} property to {{Transformation}} (formerly 
> known as {{StreamTransformation}}). The property would only be explicitly 
> settable on sources; other transformations inherit the property from their 
> inputs. The {{StreamGraphGenerator}} must use 
> {{ScheduleMode.LAZY_FROM_SOURCES}} if all sources are bounded, otherwise, it 
> should use {{ScheduleMode.EAGER}}, as is the currently existing behaviour.
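The proposed selection logic boils down to: lazy scheduling if and only if every source is bounded. A standalone sketch under that reading; the class and method names are illustrative, not the actual `StreamGraphGenerator` API.

```java
public class ScheduleModeChooser {
    public enum ScheduleMode { LAZY_FROM_SOURCES, EAGER }

    // Sources set isBounded explicitly; the pipeline may use lazy
    // scheduling only if all of them are bounded. A single unbounded
    // source keeps the existing EAGER streaming behaviour.
    public static ScheduleMode choose(boolean[] sourceBounded) {
        for (boolean bounded : sourceBounded) {
            if (!bounded) {
                return ScheduleMode.EAGER;
            }
        }
        return ScheduleMode.LAZY_FROM_SOURCES;
    }
}
```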





[jira] [Updated] (FLINK-14401) create DefaultCatalogFunctionFactory to instantiate regular java class-based udf

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14401:
---
Labels: pull-request-available  (was: )

> create DefaultCatalogFunctionFactory to instantiate regular java class-based 
> udf
> 
>
> Key: FLINK-14401
> URL: https://issues.apache.org/jira/browse/FLINK-14401
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive, Table SQL / API
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>






[jira] [Updated] (FLINK-14202) Optimize the execution plan for Python Calc when there is a condition

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14202:
---
Labels: pull-request-available  (was: )

> Optimize the execution plan for Python Calc when there is a condition
> -
>
> Key: FLINK-14202
> URL: https://issues.apache.org/jira/browse/FLINK-14202
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> As discussed in [https://github.com/apache/flink/pull/9748]:
> "For the filter, we calculate these condition UDFs together with other UDFs 
> and do the filter later. I think we can optimize it a bit, i.e., calculate 
> the conditions first and then check whether to call the other UDFs. This can 
> be easily achieved in the SplitRule."





[jira] [Updated] (FLINK-14395) Refactor ES 6 connector to split table-specific code into flink-sql-connector-elasticsearch6

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14395:
---
Labels: pull-request-available  (was: )

> Refactor ES 6 connector to split table-specific code into 
> flink-sql-connector-elasticsearch6
> 
>
> Key: FLINK-14395
> URL: https://issues.apache.org/jira/browse/FLINK-14395
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, table-specific code for the ES6 connector is misplaced: it lives in 
> the es6 connector module but belongs in the es6 SQL connector module, so the 
> relevant code should be moved there.
> For more details and context, see: 
> https://github.com/apache/flink/pull/9720#issuecomment-541996415



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14396) Implement rudimentary non-blocking network output

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14396:
---
Labels: pull-request-available  (was: )

> Implement rudimentary non-blocking network output
> -
>
> Key: FLINK-14396
> URL: https://issues.apache.org/jira/browse/FLINK-14396
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Network
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Considering the mailbox model and future unaligned-checkpoint requirements, 
> task network output should be non-blocking. In other words, as long as the 
> output is available, it should never block for a subsequent single record 
> write.
> In the first version, we only implement non-blocking output for the most 
> regular case; the following cases are not solved and keep the previous 
> behavior:
>  * Big records which might span multiple buffers
>  * Flatmap-like operators which might emit multiple records per processed 
> record
>  * Broadcast watermarks which might request multiple buffers at a time
> The solution is to provide the RecordWriter#isAvailable method and the 
> respective LocalBufferPool#isAvailable for judging the output beforehand. As 
> long as there is at least one available buffer in the LocalBufferPool, the 
> RecordWriter is available for network output in most cases. This does not 
> include runtime handling of this non-blocking and availability behavior in 
> StreamInputProcessor.
> Note: This requires adjusting the minimum number of buffers in the output 
> LocalBufferPool to (numberOfSubpartitions + 1) and also adjusting the monitor 
> of the backpressure future.
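The "check availability before writing" pattern described above can be sketched in plain Java (a toy buffer pool, not Flink's LocalBufferPool; all names are invented for the sketch):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class AvailabilitySketch {
    // Hypothetical sketch: a writer is available as long as its pool holds at
    // least one free buffer, so callers can check isAvailable() instead of
    // blocking inside a record write.
    static class BufferPool {
        private final Queue<byte[]> free = new ArrayDeque<>();

        BufferPool(int buffers, int bufferSize) {
            for (int i = 0; i < buffers; i++) free.add(new byte[bufferSize]);
        }

        boolean isAvailable() { return !free.isEmpty(); }

        byte[] request() { return free.poll(); } // returns null when exhausted

        void recycle(byte[] buffer) { free.add(buffer); }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 32);
        while (pool.isAvailable()) { // never block on a single record write
            pool.request();
        }
        System.out.println(pool.isAvailable()); // prints false
    }
}
```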



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4714) Set task state to RUNNING after state has been restored

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4714:
--
Labels: pull-request-available  (was: )

> Set task state to RUNNING after state has been restored
> ---
>
> Key: FLINK-4714
> URL: https://issues.apache.org/jira/browse/FLINK-4714
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Wei-Che Wei
>Priority: Major
>  Labels: pull-request-available
>
> The task state is set to {{RUNNING}} as soon as the {{Task}} is executed. 
> That, however, happens before the state of the {{StreamTask}} invokable has 
> been restored. As a result, the {{CheckpointCoordinator}} starts to trigger 
> checkpoints even though the {{StreamTask}} is not ready.
> In order to avoid aborting checkpoints and to start them properly, we should 
> switch the task state to {{RUNNING}} after the state has been restored.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-6096) Refactor the migration of old versioned savepoints

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6096:
--
Labels: pull-request-available  (was: )

> Refactor the migration of old versioned savepoints
> --
>
> Key: FLINK-6096
> URL: https://issues.apache.org/jira/browse/FLINK-6096
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Xiaogang Shi
>Assignee: Xiaogang Shi
>Priority: Major
>  Labels: pull-request-available
>
> Existing code for the migration of old-versioned savepoints does not allow 
> correct deserialization of classes that changed across versions. I think 
> we should create a migration package for each old-versioned savepoint and put 
> all migrated classes of that savepoint there. A mapping can be deployed to 
> record the migrated classes in the savepoint so that we can correctly 
> deserialize them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-6192) reuse zookeeper client created by CuratorFramework

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6192:
--
Labels: pull-request-available  (was: )

> reuse zookeeper client created by CuratorFramework
> -
>
> Key: FLINK-6192
> URL: https://issues.apache.org/jira/browse/FLINK-6192
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN, Runtime / Coordination
>Reporter: Tao Wang
>Assignee: Tao Wang
>Priority: Major
>  Labels: pull-request-available
>
> Now in YARN mode, there are three places using a zookeeper client (web 
> monitor, jobmanager and resourcemanager) in the ApplicationMaster/JobManager, 
> while there are two in the TaskManager. Each of them creates a new zookeeper 
> client when it needs one.
> I believe there are more places doing the same thing, but within one JVM one 
> CuratorFramework is enough for all connections to one zookeeper cluster, so 
> we need a singleton to reuse them.
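The singleton sharing described above can be sketched in plain Java (a placeholder class, not the actual CuratorFramework wiring; all names are invented for the sketch):

```java
public class CuratorSingletonSketch {
    // Hypothetical sketch: share one client per JVM instead of creating a new
    // one at every call site (double-checked locking on a volatile field).
    static final class SharedClient {
        private static volatile SharedClient instance;

        private SharedClient() {}

        static SharedClient getInstance() {
            if (instance == null) {
                synchronized (SharedClient.class) {
                    if (instance == null) instance = new SharedClient();
                }
            }
            return instance;
        }
    }

    public static void main(String[] args) {
        // All call sites observe the same client instance.
        System.out.println(SharedClient.getInstance() == SharedClient.getInstance()); // prints true
    }
}
```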



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5362) Implement methods to access BipartiteGraph properties

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5362:
--
Labels: pull-request-available  (was: )

> Implement methods to access BipartiteGraph properties
> -
>
> Key: FLINK-5362
> URL: https://issues.apache.org/jira/browse/FLINK-5362
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4647) Implement BipartiteGraph reader

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4647:
--
Labels: pull-request-available  (was: )

> Implement BipartiteGraph reader
> ---
>
> Key: FLINK-4647
> URL: https://issues.apache.org/jira/browse/FLINK-4647
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> Implement reading a bipartite graph from a CSV file. This should be similar 
> to how a regular graph is read from a file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4648) Implement bipartite graph generators

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4648:
--
Labels: pull-request-available  (was: )

> Implement bipartite graph generators
> 
>
> Key: FLINK-4648
> URL: https://issues.apache.org/jira/browse/FLINK-4648
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> Implement generators for bipartite graphs.
> Should implement at least:
> * *BipartiteGraphGenerator* (maybe requires a better name) that will generate 
> a bipartite graph where every vertex of one set is connected only to some 
> vertices from another set
> * *CompleteBipartiteGraphGenerator* that will generate a graph where every 
> vertex of one set is connected to every vertex of another set
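The complete case can be sketched in plain Java (an edge-list builder, not a Gelly generator; all names are invented for the sketch):

```java
import java.util.ArrayList;
import java.util.List;

public class CompleteBipartiteSketch {
    // Hypothetical sketch of a CompleteBipartiteGraphGenerator: every vertex
    // of the top set is connected to every vertex of the bottom set.
    static List<long[]> completeBipartiteEdges(long topCount, long bottomCount) {
        List<long[]> edges = new ArrayList<>();
        for (long t = 0; t < topCount; t++) {
            for (long b = 0; b < bottomCount; b++) {
                edges.add(new long[]{t, b}); // edge from top vertex t to bottom vertex b
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(completeBipartiteEdges(2, 3).size()); // prints 6
    }
}
```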



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-2184) Cannot get last element with maxBy/minBy

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-2184:
--
Labels: pull-request-available  (was: )

> Cannot get last element with maxBy/minBy
> 
>
> Key: FLINK-2184
> URL: https://issues.apache.org/jira/browse/FLINK-2184
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Scala
>Reporter: Gábor Hermann
>Priority: Minor
>  Labels: pull-request-available
>
> In the streaming Scala API there is no method
> {{maxBy(int positionToMaxBy, boolean first)}}
> nor
> {{minBy(int positionToMinBy, boolean first)}}
> like in the Java API, where _first_ set to {{true}} indicates that the latest 
> found element will be returned.
> These methods should be added to the Scala API too, in order to be consistent.
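The tie-breaking semantics of such a `first` flag can be illustrated in plain Java (not the Flink API; the method and its arguments are invented for the sketch):

```java
import java.util.Arrays;
import java.util.List;

public class MaxBySemantics {
    // Hypothetical sketch of maxBy(position, first): among elements with an
    // equal maximal key, keep either the earlier-seen or the later-seen one.
    static int[] maxBy(List<int[]> elements, int pos, boolean first) {
        int[] best = null;
        for (int[] e : elements) {
            if (best == null
                    || e[pos] > best[pos]
                    || (e[pos] == best[pos] && !first)) {
                best = e; // ties replace the current best only when first == false
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<int[]> data = Arrays.asList(new int[]{1, 5}, new int[]{2, 5}, new int[]{3, 4});
        System.out.println(maxBy(data, 1, true)[0]);  // prints 1 (earliest maximum)
        System.out.println(maxBy(data, 1, false)[0]); // prints 2 (latest maximum)
    }
}
```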



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5244) Implement methods for BipartiteGraph transformations

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5244:
--
Labels: pull-request-available  (was: )

> Implement methods for BipartiteGraph transformations
> 
>
> Key: FLINK-5244
> URL: https://issues.apache.org/jira/browse/FLINK-5244
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> BipartiteGraph should implement methods for transforming graph, like map, 
> filter, join, union, difference, etc. similarly to Graph class.
> Depends on: https://issues.apache.org/jira/browse/FLINK-2254



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-2186) Rework CSV import to support very wide files

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-2186:
--
Labels: pull-request-available  (was: )

> Rework CSV import to support very wide files
> 
>
> Key: FLINK-2186
> URL: https://issues.apache.org/jira/browse/FLINK-2186
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Scala, Library / Machine Learning
>Reporter: Theodore Vasiloudis
>Assignee: Anton Solovev
>Priority: Major
>  Labels: pull-request-available
>
> In the current readCsvFile implementation, importing CSV files with many 
> columns can range from cumbersome to impossible.
> For example to import an 11 column file we need to write:
> {code}
> val cancer = env.readCsvFile[(String, String, String, String, String, String, 
> String, String, String, String, 
> String)]("/path/to/breast-cancer-wisconsin.data")
> {code}
> For many use cases in Machine Learning we might have CSV files with thousands 
> or millions of columns that we want to import as vectors.
> In that case using the current readCsvFile method becomes impossible.
> We therefore need to rework the current function, or create a new one that 
> will allow us to import CSV files with an arbitrary number of columns.
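One possible direction, sketched in plain Java (not the Flink API; names are invented for the sketch), is to parse an arbitrarily wide line straight into a numeric vector instead of a fixed-arity tuple:

```java
public class WideCsvToVector {
    // Hypothetical sketch: map each CSV line to a double[] vector, so the
    // column count does not need to be encoded in a tuple type.
    static double[] parseLine(String line, String delimiter) {
        String[] fields = line.split(delimiter);
        double[] vector = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            vector[i] = Double.parseDouble(fields[i].trim());
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] v = parseLine("1000025,5,1,1,1,2,1,3,1,1,2", ",");
        System.out.println(v.length); // prints 11 - no 11-field tuple type needed
    }
}
```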



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4649) Implement bipartite graph metrics

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4649:
--
Labels: pull-request-available  (was: )

> Implement bipartite graph metrics
> -
>
> Key: FLINK-4649
> URL: https://issues.apache.org/jira/browse/FLINK-4649
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> Implement metrics calculation for a bipartite graph. Should be similar to 
> EdgeMetrics and VertexMetrics. 
> Paper that describes bipartite graph metrics: 
> http://jponnela.com/web_documents/twomode.pdf 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-1707) Add an Affinity Propagation Library Method

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-1707:
--
Labels: algorithm pull-request-available requires-design-doc  (was: 
algorithm requires-design-doc)

> Add an Affinity Propagation Library Method
> --
>
> Key: FLINK-1707
> URL: https://issues.apache.org/jira/browse/FLINK-1707
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Graph Processing (Gelly)
>Reporter: Vasia Kalavri
>Assignee: Josep Rubió
>Priority: Minor
>  Labels: algorithm, pull-request-available, requires-design-doc
> Attachments: Binary_Affinity_Propagation_in_Flink_design_doc.pdf
>
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf
> Design doc:
> https://docs.google.com/document/d/1QULalzPqMVICi8jRVs3S0n39pell2ZVc7RNemz_SGA4/edit?usp=sharing
> Example spreadsheet:
> https://docs.google.com/spreadsheets/d/1CurZCBP6dPb1IYQQIgUHVjQdyLxK0JDGZwlSXCzBcvA/edit?usp=sharing
> Graph:
> https://docs.google.com/drawings/d/1PC3S-6AEt2Gp_TGrSfiWzkTcL7vXhHSxvM6b9HglmtA/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5104) Implement BipartiteGraph validator

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5104:
--
Labels: pull-request-available  (was: )

> Implement BipartiteGraph validator
> --
>
> Key: FLINK-5104
> URL: https://issues.apache.org/jira/browse/FLINK-5104
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Graph Processing (Gelly)
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> BipartiteGraph should have a validator similar to GraphValidator for Graph 
> class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3555) Web interface does not render job information properly

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3555:
--
Labels: pull-request-available  (was: )

> Web interface does not render job information properly
> --
>
> Key: FLINK-3555
> URL: https://issues.apache.org/jira/browse/FLINK-3555
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.0.0
>Reporter: Till Rohrmann
>Assignee: Sergey_Sokur
>Priority: Minor
>  Labels: pull-request-available
> Attachments: Chrome.png, Safari.png
>
>
> In Chrome and Safari, the different tabs of the detailed job view are not 
> properly rendered: the text overflows the surrounding box. I would guess 
> that this is some kind of CSS issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5031) Consecutive DataStream.split() ignored

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5031:
--
Labels: pull-request-available  (was: )

> Consecutive DataStream.split() ignored
> --
>
> Key: FLINK-5031
> URL: https://issues.apache.org/jira/browse/FLINK-5031
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.1.3, 1.2.0
>Reporter: Fabian Hueske
>Assignee: Renkai Ge
>Priority: Major
>  Labels: pull-request-available
>
> The output of the following program 
> {code}
> static final class ThresholdSelector implements OutputSelector<Long> {
>   long threshold;
>   public ThresholdSelector(long threshold) {
>   this.threshold = threshold;
>   }
>   @Override
>   public Iterable<String> select(Long value) {
>   if (value < threshold) {
>   return Collections.singletonList("Less");
>   } else {
>   return Collections.singletonList("GreaterEqual");
>   }
>   }
> }
> public static void main(String[] args) throws Exception {
>   StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>   env.setParallelism(1);
>   SplitStream<Long> split1 = env.generateSequence(1, 11)
>   .split(new ThresholdSelector(6));
>   // stream11 should be [1,2,3,4,5]
>   DataStream<Long> stream11 = split1.select("Less");
>   SplitStream<Long> split2 = stream11
> //.map(new MapFunction<Long, Long>() {
> //@Override
> //public Long map(Long value) throws Exception {
> //return value;
> //}
> //})
>   .split(new ThresholdSelector(3));
>   DataStream<Long> stream21 = split2.select("Less");
>   // stream21 should be [1,2]
>   stream21.print();
>   env.execute();
> }
> {code}
> should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second 
> {{split}} operation is ignored.
> The program is evaluated correctly if the identity {{MapFunction}} is added 
> to the program.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4272) Create a JobClient for job control and monitoring

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4272:
--
Labels: pull-request-available  (was: )

> Create a JobClient for job control and monitoring 
> --
>
> Key: FLINK-4272
> URL: https://issues.apache.org/jira/browse/FLINK-4272
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: pull-request-available
>
> The aim of this new feature is to expose a client to the user which allows 
> to cancel a running job, retrieve accumulators for a running job, or perform 
> other actions in the future. Let's call it {{JobClient}} for now (although 
> this clashes with the existing JobClient class which could be renamed to 
> JobClientActorUtils instead).
> The new client should be returned from the {{ClusterClient}} class upon job 
> submission. The client should also be instantiable by users to retrieve 
> the JobClient from a JobID.
> We should expose the new JobClient to the Java and Scala APIs using a new 
> method on the {{ExecutionEnvironment}} / {{StreamExecutionEnvironment}} 
> called {{executeWithControl()}} (perhaps we can find a better name).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14232) Support global failure handling for DefaultScheduler (SchedulerNG)

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14232:
---
Labels: pull-request-available  (was: )

> Support global failure handling for DefaultScheduler (SchedulerNG)
> --
>
> Key: FLINK-14232
> URL: https://issues.apache.org/jira/browse/FLINK-14232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Global failure handling(full restarts) is widely used in ExecutionGraph 
> components and even other components to recover the job from an inconsistent 
> state. 
> We need to support it for DefaultScheduler to not break the safety net. More 
> details see [here|https://github.com/apache/flink/pull/9663/files#r326892524].
> There can be follow-ups of this task to replace usages of full restarts with 
> JVM termination in cases that are considered bugs/unexpected.
> Implementation plan:
> 1. Add {{getGlobalFailureHandlingResult(Throwable)}} in 
> {{ExecutionFailureHandler}}
> 2. Add an interface {{handleGlobalFailure(Throwable)}} in {{SchedulerNG}} and 
> implement it in {{DefaultScheduler}}
> 3. Add an interface {{notifyGlobalFailure(Throwable)}} in 
> {{InternalTaskFailuresListener}} and rework the implementations to use 
> {{SchedulerNG#handleGlobalFailure}}
> 4. Rework {{ExecutionGraph#failGlobal}} to invoke 
> {{InternalTaskFailuresListener#notifyGlobalFailure}} for ng scheduler
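The forwarding described in the implementation plan can be sketched in plain Java (the interface shapes are assumed from the method names in the plan; the actual Flink signatures may differ):

```java
public class GlobalFailureSketch {
    // Hypothetical sketch of the interfaces named in the plan.
    interface SchedulerNG {
        void handleGlobalFailure(Throwable cause);
    }

    interface InternalTaskFailuresListener {
        void notifyGlobalFailure(Throwable cause);
    }

    // Step 3 of the plan: the listener forwards global failures to the scheduler.
    static String forwardFailure(String message) {
        StringBuilder log = new StringBuilder();
        SchedulerNG scheduler = cause -> log.append("restart-all:").append(cause.getMessage());
        InternalTaskFailuresListener listener = scheduler::handleGlobalFailure;
        listener.notifyGlobalFailure(new RuntimeException(message));
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(forwardFailure("inconsistent state")); // prints restart-all:inconsistent state
    }
}
```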



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14394) Remove unnecessary interface method BufferProvider#requestBufferBlocking

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14394:
---
Labels: pull-request-available  (was: )

> Remove unnecessary interface method BufferProvider#requestBufferBlocking
> 
>
> Key: FLINK-14394
> URL: https://issues.apache.org/jira/browse/FLINK-14394
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Network
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently BufferProvider#requestBufferBlocking method is only used for unit 
> tests, so we could refactor the related tests to use methods of 
> BufferProvider#requestBufferBuilderBlocking or BufferProvider#requestBuffer 
> instead. Then we could remove this legacy method completely to clean up the 
> interface.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14363) Prevent vertex from being affected by outdated deployment (SchedulerNG)

2019-10-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14363:
---
Labels: pull-request-available  (was: )

> Prevent vertex from being affected by outdated deployment (SchedulerNG)
> ---
>
> Key: FLINK-14363
> URL: https://issues.apache.org/jira/browse/FLINK-14363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> DefaultScheduler currently will cancel the latest execution of a vertex when 
> the vertex version is outdated. This may lead to undesirable failover of a 
> healthy vertex.
> I'd propose to remove the vertex cancel process in 
> {{DefaultScheduler#stopDeployment}} because the cancellation is not needed 
> here, since the version of a scheduled vertex is only outdated after the 
> vertex has been canceled by others.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14366) Annotate MiniCluster tests in flink-tests with AlsoRunWithSchedulerNG

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14366:
---
Labels: pull-request-available  (was: )

> Annotate MiniCluster tests in flink-tests with AlsoRunWithSchedulerNG
> -
>
> Key: FLINK-14366
> URL: https://issues.apache.org/jira/browse/FLINK-14366
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> This task is to annotate all MiniCluster tests with AlsoRunWithSchedulerNG in 
> flink-tests, so that we can know breaking changes in time when further 
> improving the new generation scheduler.
> We should also guarantee that the annotated tests pass, either by fixing 
> failed tests, or by not annotating a failed test and opening a ticket to 
> track it.
> The tickets for failed tests should be linked in this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14365) Annotate MiniCluster tests in core modules with AlsoRunWithSchedulerNG

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14365:
---
Labels: pull-request-available  (was: )

> Annotate MiniCluster tests in core modules with AlsoRunWithSchedulerNG
> --
>
> Key: FLINK-14365
> URL: https://issues.apache.org/jira/browse/FLINK-14365
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> This task is to annotate MiniCluster tests with AlsoRunWithSchedulerNG in 
> flink core modules, so that we can know breaking changes in time when further 
> improving the new generation scheduler.
> Core modules are the basic flink modules as defined in {{MODULES_CORE}} in 
> flink/travis/stage.sh.
> MODULES_CORE="\
> flink-annotations,\
> flink-test-utils-parent/flink-test-utils,\
> flink-state-backends/flink-statebackend-rocksdb,\
> flink-clients,\
> flink-core,\
> flink-java,\
> flink-optimizer,\
> flink-runtime,\
> flink-runtime-web,\
> flink-scala,\
> flink-streaming-java,\
> flink-streaming-scala,\
> flink-metrics,\
> flink-metrics/flink-metrics-core"
> Note that the test bases in flink-test-utils will not be annotated in this 
> task, since it enables MiniCluster tests in flink-tests and other non-core 
> modules.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14359) Create a module called flink-sql-connector-hbase to shade HBase

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14359:
---
Labels: pull-request-available  (was: )

> Create a module called flink-sql-connector-hbase to shade HBase
> ---
>
> Key: FLINK-14359
> URL: https://issues.apache.org/jira/browse/FLINK-14359
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / HBase
>Reporter: Jingsong Lee
>Assignee: Zheng Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> We need to do the same thing for HBase as was done for kafka and 
> elasticsearch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14390) Add class for SqlOperators, and add sql operations to AlgoOperator, BatchOperator and StreamOperator.

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14390:
---
Labels: pull-request-available  (was: )

> Add class for SqlOperators, and add sql operations to AlgoOperator, 
> BatchOperator and StreamOperator.
> -
>
> Key: FLINK-14390
> URL: https://issues.apache.org/jira/browse/FLINK-14390
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Xu Yang
>Priority: Major
>  Labels: pull-request-available
>
> This PR adds sql operators (select, groupby, join, etc.) that apply to 
> AlgoOperators. It is equivalent to applying the sql operators to 
> the output tables of the AlgoOperators.
>  * Add SqlOperators for implementation of the sql operators that apply on 
> AlgoOperators.
>  * Update AlgoOperator by adding some methods of sql operations.
>  * Update BatchOperator by adding some methods of sql operations.
>  * Update StreamOperator by adding some methods of sql operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4883) Prevent UDFs implementations through Scala singleton objects

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4883:
--
Labels: pull-request-available  (was: )

> Prevent UDFs implementations through Scala singleton objects
> 
>
> Key: FLINK-4883
> URL: https://issues.apache.org/jira/browse/FLINK-4883
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Reporter: Stefan Richter
>Assignee: Renkai Ge
>Priority: Major
>  Labels: pull-request-available
>
> Currently, users can create and use UDFs in Scala like this:
> {code}
> object FlatMapper extends RichCoFlatMapFunction[Long, String, (Long, String)] 
> {
> ...
> }
> {code}
> However, this is problematic: the UDF is now a singleton that Flink could use 
> across several operator instances, which leads to job failures. We should 
> detect and prevent the usage of singleton UDFs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4705) Instrument FixedLengthRecordSorter

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4705:
--
Labels: pull-request-available  (was: )

> Instrument FixedLengthRecordSorter
> --
>
> Key: FLINK-4705
> URL: https://issues.apache.org/jira/browse/FLINK-4705
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Major
>  Labels: pull-request-available
>
> The {{NormalizedKeySorter}} sorts on the concatenation of (potentially 
> partial) keys plus an 8-byte pointer to the record. After sorting each 
> pointer must be dereferenced, which is not cache friendly.
> The {{FixedLengthRecordSorter}} sorts on the concatenation of full keys 
> followed by the remainder of the record. The records can then be deserialized 
> in sequence.
> Instrumenting the {{FixedLengthRecordSorter}} requires implementing the 
> comparator methods {{writeWithKeyNormalization}} and 
> {{readWithKeyNormalization}}.
> Testing {{JaccardIndex}} on an m4.16xlarge the scale 18 runtime dropped from 
> 71.8 to 68.8 s (4.3% faster) and the scale 20 runtime dropped from 546.1 to 
> 501.8 s (8.8% faster).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4615) Reusing the memory allocated for the drivers and iterators

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4615:
--
Labels: pull-request-available  (was: )

> Reusing the memory allocated for the drivers and iterators
> --
>
> Key: FLINK-4615
> URL: https://issues.apache.org/jira/browse/FLINK-4615
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
>  Labels: pull-request-available
>
> Raising this as a subtask so that changes can be committed individually and 
> reviewed more closely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4205) Implement stratified sampling for DataSet

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4205:
--
Labels: pull-request-available  (was: )

> Implement stratified sampling for DataSet
> -
>
> Key: FLINK-4205
> URL: https://issues.apache.org/jira/browse/FLINK-4205
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataSet
>Reporter: Do Le Quoc
>Priority: Major
>  Labels: pull-request-available
>
> A DataSet might consist of data from disparate sources, so every data 
> source should be represented fairly in a sample. For this, stratified 
> sampling is needed to ensure that data from every source (stratum) is 
> selected and none of the minorities are excluded. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-2055) Implement Streaming HBaseSink

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-2055:
--
Labels: pull-request-available  (was: )

> Implement Streaming HBaseSink
> -
>
> Key: FLINK-2055
> URL: https://issues.apache.org/jira/browse/FLINK-2055
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Erli Ding
>Priority: Major
>  Labels: pull-request-available
>
> As per : 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4521) Fix "Submit new Job" panel in development mode

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4521:
--
Labels: pull-request-available  (was: )

> Fix "Submit new Job" panel in development mode
> --
>
> Key: FLINK-4521
> URL: https://issues.apache.org/jira/browse/FLINK-4521
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> If the web frontend is started in development mode, the "Submit new Job" 
> panel is empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3030) Enhance Dashboard to show Execution Attempts

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3030:
--
Labels: pull-request-available  (was: )

> Enhance Dashboard to show Execution Attempts
> 
>
> Key: FLINK-3030
> URL: https://issues.apache.org/jira/browse/FLINK-3030
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Ivan Mushketyk
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the web dashboard shows only the latest execution attempt. We 
> should make all execution attempts and their accumulators available for 
> inspection.
> The REST monitoring API supports this, so it should be a change only to the 
> frontend part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3322) MemoryManager creates too much GC pressure with iterative jobs

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3322:
--
Labels: pull-request-available  (was: )

> MemoryManager creates too much GC pressure with iterative jobs
> --
>
> Key: FLINK-3322
> URL: https://issues.apache.org/jira/browse/FLINK-3322
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.0.0
>Reporter: Gabor Gevay
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
>  Labels: pull-request-available
> Attachments: FLINK-3322.docx, FLINK-3322_reusingmemoryfordrivers.docx
>
>
> When taskmanager.memory.preallocate is false (the default), released memory 
> segments are not added to a pool, but the GC is expected to take care of 
> them. This puts too much pressure on the GC with iterative jobs, where the 
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the 
> JVM from 1 to 50 GB, performance gradually degrades by more than a factor of 
> 10. (It will generate some lookup tables to /tmp on the first run for a few minutes.) 
> (I think the slowdown might also depend somewhat on 
> taskmanager.memory.fraction, because more unused non-managed memory results 
> in rarer GCs.)
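A minimal sketch of the pooling that preallocation enables (purely illustrative, not Flink's MemoryManager; names like `allocate`/`release` are assumptions): released segments go back into a free list instead of being left to the garbage collector, so an iterative job reuses the same buffers every superstep rather than reallocating them.

```java
import java.util.ArrayDeque;

// Illustrative segment pool: reuse released byte[] segments instead of
// handing them to the GC, which is what removes the per-superstep pressure.
public class Main {
    static final int SEGMENT_SIZE = 32 * 1024;
    static final ArrayDeque<byte[]> freeSegments = new ArrayDeque<>();
    static int allocations;  // counts actual `new byte[]` calls, for the demo

    static byte[] allocate() {
        byte[] seg = freeSegments.poll();
        if (seg == null) {
            allocations++;
            seg = new byte[SEGMENT_SIZE];  // only allocate when the pool is empty
        }
        return seg;
    }

    static void release(byte[] seg) {
        freeSegments.push(seg);  // return to the pool instead of dropping the reference
    }

    public static void main(String[] args) {
        for (int superstep = 0; superstep < 100; superstep++) {
            byte[] seg = allocate();  // operator re-acquires its memory
            release(seg);             // ...and gives it back at superstep end
        }
        // prints: byte[] allocations: 1 -- one allocation serves all supersteps
        System.out.println("byte[] allocations: " + allocations);
    }
}
```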



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3783) Support weighted random sampling with reservoir

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3783:
--
Labels: pull-request-available  (was: )

> Support weighted random sampling with reservoir
> ---
>
> Key: FLINK-3783
> URL: https://issues.apache.org/jira/browse/FLINK-3783
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataSet
>Reporter: GaoLun
>Assignee: GaoLun
>Priority: Minor
>  Labels: pull-request-available
>
> In default random sampling, all items have the same probability of being 
> selected. But in weighted random sampling, the probability of each item being 
> selected is determined by its weight with respect to the weights of the other 
> items.
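One standard way to do this in a single pass is the A-Res scheme of Efraimidis and Spirakis: each item gets the key u^(1/w) with u drawn uniformly from (0,1), and the k items with the largest keys form the sample. The sketch below is illustrative; the names are not part of the DataSet API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hedged sketch of weighted reservoir sampling (A-Res): a min-heap keeps the
// k items whose keys u^(1/w) are largest, so heavier items are more likely
// to be retained.
public class Main {
    static class Entry {
        final double key; final String item;
        Entry(double key, String item) { this.key = key; this.item = item; }
    }

    static List<String> weightedSample(List<String> items, List<Double> weights,
                                       int k, long seed) {
        Random rnd = new Random(seed);
        PriorityQueue<Entry> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.key));
        for (int i = 0; i < items.size(); i++) {
            double key = Math.pow(rnd.nextDouble(), 1.0 / weights.get(i));
            heap.offer(new Entry(key, items.get(i)));
            if (heap.size() > k) heap.poll();  // evict the smallest key
        }
        List<String> out = new ArrayList<>();
        for (Entry e : heap) out.add(e.item);
        return out;
    }

    public static void main(String[] args) {
        // "c" carries almost all the weight, so it is retained with high probability
        List<String> sample = weightedSample(
            List.of("a", "b", "c", "d"), List.of(1.0, 1.0, 100.0, 1.0), 2, 42L);
        System.out.println(sample);
    }
}
```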



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3857) Add reconnect attempt to Elasticsearch host

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3857:
--
Labels: pull-request-available  (was: )

> Add reconnect attempt to Elasticsearch host
> ---
>
> Key: FLINK-3857
> URL: https://issues.apache.org/jira/browse/FLINK-3857
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Affects Versions: 1.0.2, 1.1.0
>Reporter: Fabian Hueske
>Assignee: Subhobrata Dey
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the connection to the Elasticsearch host is opened in 
> {{ElasticsearchSink.open()}}. In case the connection is lost (maybe due to a 
> changed DNS entry), the sink fails.
> I propose to catch the Exception for lost connections in the {{invoke()}} 
> method and try to re-open the connection for a configurable number of times 
> with a certain delay.
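The proposed behavior can be sketched as a plain retry loop (hypothetical names such as `maxRetries` and `retryDelayMillis` are illustrative configuration, not actual ElasticsearchSink options): on failure, re-attempt a bounded number of times with a delay, and only fail the sink once all retries are exhausted.

```java
// Standalone sketch of reconnect-with-retry; the Action stands in for
// "re-open the client and send the bulk request".
public class Main {
    interface Action { void run() throws Exception; }

    static int attemptsUsed;  // exposed for the demo below

    static void runWithReconnect(Action action, int maxRetries, long retryDelayMillis)
            throws Exception {
        attemptsUsed = 0;
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            attemptsUsed++;
            try {
                action.run();
                return;  // success: stop retrying
            } catch (Exception e) {
                last = e;
                Thread.sleep(retryDelayMillis);  // wait before re-opening the connection
            }
        }
        throw last;  // all retries exhausted: fail the sink
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2};  // simulated connection: fails twice, then succeeds
        runWithReconnect(() -> {
            if (failures[0]-- > 0) throw new RuntimeException("connection lost");
        }, 3, 1L);
        // prints: succeeded after 3 attempts
        System.out.println("succeeded after " + attemptsUsed + " attempts");
    }
}
```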



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3996) Add addition, subtraction and multiply by scalar to DenseVector.scala and SparseVector.scala

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3996:
--
Labels: pull-request-available  (was: )

> Add addition, subtraction and multiply by scalar to DenseVector.scala and 
> SparseVector.scala
> 
>
> Key: FLINK-3996
> URL: https://issues.apache.org/jira/browse/FLINK-3996
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / Machine Learning
>Reporter: Daniel Blazevski
>Assignee: Daniel Blazevski
>Priority: Minor
>  Labels: pull-request-available
>
> This is a small change to add vector operations. With it, one can now do 
> things like:
> val v1 = DenseVector(0.1, 0.1)
> val v2 = DenseVector(0.2, 0.2)
> val v3 = v1 + v2
> instead of what currently has to be done:
> val v1 = DenseVector(0.1, 0.1)
> val v2 = DenseVector(0.2, 0.2)
> val v3 = (v1.asBreeze + v2.asBreeze).fromBreeze



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4016) FoldApplyWindowFunction is not properly initialized

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-4016:
--
Labels: easyfix pull-request-available  (was: easyfix)

> FoldApplyWindowFunction is not properly initialized
> ---
>
> Key: FLINK-4016
> URL: https://issues.apache.org/jira/browse/FLINK-4016
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.0.3
>Reporter: RWenden
>Priority: Blocker
>  Labels: easyfix, pull-request-available
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> FoldApplyWindowFunction's output type is not set.
> We're using constructions like (excerpt):
>   .keyBy(0)
>   .countWindow(10, 5)
>   .fold(...)
> Running this stream gives a runtime exception in FoldApplyWindowFunction:
> "No initial value was serialized for the fold window function. Probably the 
> setOutputType method was not called."
> This can be easily fixed in WindowedStream.java by (around line# 449):
> FoldApplyWindowFunction foldApplyWindowFunction =
> new FoldApplyWindowFunction<>(initialValue, foldFunction, function);
> foldApplyWindowFunction.setOutputType(resultType, input.getExecutionConfig());
> operator = new EvictingWindowOperator<>(windowAssigner,
> windowAssigner.getWindowSerializer(getExecutionEnvironment().getConfig()),
> keySel,
> input.getKeyType().createSerializer(getExecutionEnvironment().getConfig()),
> stateDesc,
> new InternalIterableWindowFunction<>(foldApplyWindowFunction),
> trigger,
> evictor);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-1725) New Partitioner for better load balancing for skewed data

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-1725:
--
Labels: LoadBalancing Partitioner pull-request-available  (was: 
LoadBalancing Partitioner)

> New Partitioner for better load balancing for skewed data
> -
>
> Key: FLINK-1725
> URL: https://issues.apache.org/jira/browse/FLINK-1725
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 0.8.1
>Reporter: Anis Nasir
>Assignee: Anis Nasir
>Priority: Major
>  Labels: LoadBalancing, Partitioner, pull-request-available
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hi,
> We have recently studied the problem of load balancing in Storm [1].
> In particular, we focused on key distribution of the stream for skewed data.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than key grouping while being 
> more scalable than shuffle grouping in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> Partial key grouping is very easy to implement: it requires just a few lines 
> of code in Java when implemented as a custom grouping in Storm [2].
> For all these reasons, we believe it will be a nice addition to the standard 
> Partitioners available in Flink. If the community thinks it's a good idea, we 
> will be happy to offer support in the porting.
> References:
> [1]. 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2]. https://github.com/gdfm/partial-key-grouping
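The "power of both choices" idea can be sketched as a small standalone partitioner (illustrative only, not Flink's Partitioner interface; the hash seeds are arbitrary): each key maps to two candidate channels via two hash functions, and each record goes to whichever of the two has received less load so far, so even a single hot key is split across two channels.

```java
import java.util.Arrays;

// Toy Partial Key Grouping partitioner: pick the less-loaded of two
// hash-derived candidate channels per record.
public class Main {
    final long[] load;  // records sent to each channel so far

    Main(int channels) { load = new long[channels]; }

    // Two cheap hash functions via different mixing seeds (seeds are arbitrary).
    static int hash(Object key, int seed, int channels) {
        int h = key.hashCode() * 31 + seed;
        return Math.floorMod(h ^ (h >>> 16), channels);
    }

    int partition(Object key) {
        int c1 = hash(key, 0x9E3779B9, load.length);
        int c2 = hash(key, 0x85EBCA6B, load.length);
        int chosen = load[c1] <= load[c2] ? c1 : c2;  // power of both choices
        load[chosen]++;
        return chosen;
    }

    public static void main(String[] args) {
        Main p = new Main(4);
        // A heavily skewed key stream is spread over (at most) two channels
        // instead of hammering a single one.
        for (int i = 0; i < 1000; i++) p.partition("hot-key");
        System.out.println(Arrays.toString(p.load));
    }
}
```

The trade-off versus plain key grouping is that downstream operators must be able to merge the two partial aggregates for a key.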



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-3109) Join two streams with two different buffer time

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-3109:
--
Labels: easyfix patch pull-request-available  (was: easyfix patch)

> Join two streams with two different buffer time
> ---
>
> Key: FLINK-3109
> URL: https://issues.apache.org/jira/browse/FLINK-3109
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 0.10.1
>Reporter: Wang Yangjun
>Priority: Major
>  Labels: easyfix, patch, pull-request-available
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Currently, Flink streaming only supports joining two streams on the same 
> window. How can this problem be solved?
> For example, there are two streams. One is advertisements shown to users, 
> with tuples described as (id, shown timestamp). The other is the click 
> stream -- (id, clicked timestamp). We want to get a joined stream that 
> includes every advertisement clicked by a user within 20 minutes of being 
> shown.
> After an advertisement is shown, some users click it immediately. It is also 
> possible that a "click" message arrives at the server earlier than the 
> corresponding "show" message because of network delay. We assume that the 
> maximum delay is one minute.
> The need is therefore to always keep a 20-minute buffer of the "show" stream 
> and a 1-minute buffer of the "click" stream.
> It would be great to have an API like:
> showStream.join(clickStream)
> .where(keySelector)
> .buffer(Time.of(20, TimeUnit.MINUTES))
> .equalTo(keySelector)
> .buffer(Time.of(1, TimeUnit.MINUTES))
> .apply(JoinFunction)
> http://stackoverflow.com/questions/33849462/how-to-avoid-repeated-tuples-in-flink-slide-window-join/34024149#34024149



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-2030) Implement discrete and continuous histograms

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-2030:
--
Labels: pull-request-available  (was: )

> Implement discrete and continuous histograms
> 
>
> Key: FLINK-2030
> URL: https://issues.apache.org/jira/browse/FLINK-2030
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>Priority: Minor
>  Labels: pull-request-available
>
> For the implementation of the decision tree in 
> https://issues.apache.org/jira/browse/FLINK-1727, we need to implement a 
> histogram with online updates, merging and equalization features. A reference 
> implementation is provided in [1]
> [1].http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf
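The core of the referenced histogram [1] is easy to sketch (this is a rough illustration, not the interface FLINK-1727 ends up with): a point is inserted as a new (centroid, count = 1) bin kept in sorted order, and whenever the bin budget is exceeded, the two closest bins are merged into their weighted average.

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch of the Ben-Haim & Yom-Tov online histogram: bounded number of
// (centroid, count) bins, maintained under streaming updates.
public class Main {
    final int maxBins;
    final List<double[]> bins = new ArrayList<>();  // each bin: {centroid, count}

    Main(int maxBins) { this.maxBins = maxBins; }

    void update(double p) {
        // insert as a fresh unit-count bin, keeping bins sorted by centroid
        int i = 0;
        while (i < bins.size() && bins.get(i)[0] < p) i++;
        bins.add(i, new double[]{p, 1});
        if (bins.size() > maxBins) mergeClosest();
    }

    // Merge the two adjacent bins with the smallest centroid gap into their
    // count-weighted average.
    void mergeClosest() {
        int best = 0;
        for (int i = 1; i < bins.size() - 1; i++) {
            if (bins.get(i + 1)[0] - bins.get(i)[0]
                    < bins.get(best + 1)[0] - bins.get(best)[0]) {
                best = i;
            }
        }
        double[] a = bins.get(best), b = bins.remove(best + 1);
        double count = a[1] + b[1];
        a[0] = (a[0] * a[1] + b[0] * b[1]) / count;
        a[1] = count;
    }

    long totalCount() {
        long n = 0;
        for (double[] bin : bins) n += (long) bin[1];
        return n;
    }

    public static void main(String[] args) {
        Main h = new Main(5);
        for (int i = 0; i < 100; i++) h.update(i % 10);
        // bin count stays at the 5-bin budget; total count is preserved
        System.out.println(h.bins.size() + " bins, " + h.totalCount() + " points");
    }
}
```

Merging two histograms (needed for a distributed decision tree) works the same way: concatenate the bin lists, then merge closest pairs until the budget is met.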



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14045) Rewrite DefaultExecutionSlotAllocator to use SlotProviderStrategy

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14045:
---
Labels: pull-request-available  (was: )

> Rewrite DefaultExecutionSlotAllocator to use SlotProviderStrategy
> -
>
> Key: FLINK-14045
> URL: https://issues.apache.org/jira/browse/FLINK-14045
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently DefaultExecutionSlotAllocator uses the SlotProvider directly. This 
> ticket is about rewriting DefaultExecutionSlotAllocator to use 
> SlotProviderStrategy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14342) Remove method FunctionDefinition#getLanguage

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14342:
---
Labels: pull-request-available  (was: )

> Remove method FunctionDefinition#getLanguage 
> -
>
> Key: FLINK-14342
> URL: https://issues.apache.org/jira/browse/FLINK-14342
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> As discussed in 
> [https://github.com/apache/flink/pull/9748#discussion_r329575228], all the 
> Python *ScalarFunction* implementations need to implement the interface 
> *PythonFunction*, and we can determine whether a ScalarFunction is Python by 
> checking if it's an instance of *PythonFunction*, so the method 
> *FunctionDefinition#getLanguage* is no longer needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13999) Correct the documentation of MATCH_RECOGNIZE

2019-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13999:
---
Labels: pull-request-available  (was: )

> Correct the documentation of MATCH_RECOGNIZE
> 
>
> Key: FLINK-13999
> URL: https://issues.apache.org/jira/browse/FLINK-13999
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Dian Fu
>Priority: Major
>  Labels: pull-request-available
>
> Regarding the following 
> [example|https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/streaming/match_recognize.html#aggregations]
>  in the doc:
> {code:java}
> SELECT *
> FROM Ticker
> MATCH_RECOGNIZE (
> PARTITION BY symbol
> ORDER BY rowtime
> MEASURES
> FIRST(A.rowtime) AS start_tstamp,
> LAST(A.rowtime) AS end_tstamp,
> AVG(A.price) AS avgPrice
> ONE ROW PER MATCH
> AFTER MATCH SKIP TO FIRST B
> PATTERN (A+ B)
> DEFINE
> A AS AVG(A.price) < 15
> ) MR;
> {code}
> Given the inputs shown in the doc, it should be:
> {code:java}
>  symbol   start_tstamp   end_tstamp  avgPrice
> =  ==  ==  
> ACME   01-APR-11 10:00:00  01-APR-11 10:00:03 14.5{code}
> instead of:
> {code:java}
>  symbol   start_tstamp   end_tstamp  avgPrice
> =  ==  ==  
> ACME   01-APR-11 10:00:00  01-APR-11 10:00:03 14.5
> ACME   01-APR-11 10:00:04  01-APR-11 10:00:09 13.5
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14272) Support Blink planner for Python UDF

2019-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14272:
---
Labels: pull-request-available  (was: )

> Support Blink planner for Python UDF
> 
>
> Key: FLINK-14272
> URL: https://issues.apache.org/jira/browse/FLINK-14272
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently, Python UDFs only work in the legacy planner; we should also 
> support them in the Blink planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14355) Example code in state processor API docs doesn't compile

2019-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14355:
---
Labels: pull-request-available  (was: )

> Example code in state processor API docs doesn't compile
> 
>
> Key: FLINK-14355
> URL: https://issues.apache.org/jira/browse/FLINK-14355
> Project: Flink
>  Issue Type: Bug
>  Components: API / State Processor
>Reporter: Mitch Wasson
>Priority: Major
>  Labels: pull-request-available
>
> The example code in this doc page doesn't compile:
> [https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html]
> Here are two instances I found:
>  * Reading State java and scala contain references to undefined 
> {{stateDescriptor}} variable
>  * Reading Keyed State scala has some invalid scala ("{{override def 
> open(Configuration parameters)"}})
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14227) Add Razorpay to Chinese Powered By page

2019-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14227:
---
Labels: pull-request-available  (was: )

> Add Razorpay to Chinese Powered By page
> ---
>
> Key: FLINK-14227
> URL: https://issues.apache.org/jira/browse/FLINK-14227
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Project Website
>Reporter: Fabian Hueske
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Razorpay was added to the English Powered By page with commit: 
> [87a034140e97be42616e1a3dbe58e4f7a014e560|https://github.com/apache/flink-web/commit/87a034140e97be42616e1a3dbe58e4f7a014e560].
> It should be added to the Chinese Powered By (and index.html) page as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14215) Add Docs for TM and JM Environment Variable Setting

2019-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14215:
---
Labels: pull-request-available  (was: )

> Add Docs for TM and JM Environment Variable Setting
> ---
>
> Key: FLINK-14215
> URL: https://issues.apache.org/jira/browse/FLINK-14215
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.9.0
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.2
>
>
> Add description for 
>   /**
>* Prefix for passing custom environment variables to Flink's master 
> process.
>* For example for passing LD_LIBRARY_PATH as an env variable to the 
> AppMaster, set:
>* containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
>* in the flink-conf.yaml.
>*/
>   public static final String CONTAINERIZED_MASTER_ENV_PREFIX = 
> "containerized.master.env.";
>   /**
>* Similar to the {@see CONTAINERIZED_MASTER_ENV_PREFIX}, this 
> configuration prefix allows
>* setting custom environment variables for the workers (TaskManagers).
>*/
>   public static final String CONTAINERIZED_TASK_MANAGER_ENV_PREFIX = 
> "containerized.taskmanager.env.";



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14208) Optimize Python UDFs with parameters of constant values

2019-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14208:
---
Labels: pull-request-available  (was: )

> Optimize Python UDFs with parameters of constant values
> ---
>
> Key: FLINK-14208
> URL: https://issues.apache.org/jira/browse/FLINK-14208
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Assignee: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> We need to support Python UDFs with parameters of constant values. Note that 
> constant parameters do not need to be transferred between the Java operator 
> and the Python worker.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14027) Add documentation for Python user-defined functions

2019-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14027:
---
Labels: pull-request-available  (was: )

> Add documentation for Python user-defined functions
> ---
>
> Key: FLINK-14027
> URL: https://issues.apache.org/jira/browse/FLINK-14027
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Documentation
>Reporter: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> We should add documentation about how to use Python user-defined functions.
> Python dependencies should be included in the document. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14344) Snapshot master hook state asynchronously

2019-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14344:
---
Labels: pull-request-available  (was: )

> Snapshot master hook state asynchronously
> -
>
> Key: FLINK-14344
> URL: https://issues.apache.org/jira/browse/FLINK-14344
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Biao Liu
>Assignee: Biao Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently we snapshot the master hook state synchronously. As a part of 
> reworking threading model of {{CheckpointCoordinator}}, we have to make this 
> non-blocking to satisfy the requirement of running in main thread.
> The behavior of snapshotting master hook state is similar to task state 
> snapshotting. Master state snapshotting is taken before task state 
> snapshotting. Because in master hook, there might be external system 
> initialization which task state snapshotting might depend on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14266) Introduce RowCsvInputFormat to new CSV module

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14266:
---
Labels: pull-request-available  (was: )

> Introduce RowCsvInputFormat to new CSV module
> -
>
> Key: FLINK-14266
> URL: https://issues.apache.org/jira/browse/FLINK-14266
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> We currently have an old CSV format, but it is not standard CSV support. We 
> should support the RFC-compliant CSV format for table/SQL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13818) Check whether web submission are enabled

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13818:
---
Labels: pull-request-available  (was: )

> Check whether web submission are enabled
> 
>
> Key: FLINK-13818
> URL: https://issues.apache.org/jira/browse/FLINK-13818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Reporter: Chesnay Schepler
>Assignee: Yadong Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> The WebUI should preemptively check whether web-submissions are enabled (via 
> FLINK-13817), and adjust the web-submission page accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-1430) Add test for streaming scala api completeness

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-1430:
--
Labels: pull-request-available  (was: )

> Add test for streaming scala api completeness
> -
>
> Key: FLINK-1430
> URL: https://issues.apache.org/jira/browse/FLINK-1430
> Project: Flink
>  Issue Type: Test
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Márton Balassi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> Currently the completeness of the streaming scala api is not tested.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14377) Translate ProgramOptions relevant for job execution to ConfigOptions.

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14377:
---
Labels: pull-request-available  (was: )

> Translate ProgramOptions relevant for job execution to ConfigOptions.
> -
>
> Key: FLINK-14377
> URL: https://issues.apache.org/jira/browse/FLINK-14377
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Affects Versions: 1.10.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>Priority: Major
>  Labels: pull-request-available
>
> A subset of the {{ProgramOptions}} will be needed by the executor for 
> {{JobGraph}} creation and/or {{Executor}} selection.
> These parameters are the:
>  * jars
>  * classpaths
>  * parallelism
>  * savepointSettings
>  * detached (or not)
>  * shutdown_on_attached
> This issue aims at introducing the {{ConfigOption}} counterparts of these 
> parameters and mapping the one to the other.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14347) YARNSessionFIFOITCase.checkForProhibitedLogContents found a log with prohibited string

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14347:
---
Labels: pull-request-available  (was: )

> YARNSessionFIFOITCase.checkForProhibitedLogContents found a log with 
> prohibited string
> --
>
> Key: FLINK-14347
> URL: https://issues.apache.org/jira/browse/FLINK-14347
> Project: Flink
>  Issue Type: Test
>  Components: Deployment / YARN, Tests
>Affects Versions: 1.10.0, 1.9.1, 1.8.3
>Reporter: Caizhi Weng
>Assignee: Zili Chen
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.1, 1.8.3
>
>
> YARNSessionFIFOITCase.checkForProhibitedLogContents fails with the following 
> exception:
> {code:java}
> 14:55:27.643 [ERROR]   
> YARNSessionFIFOITCase.checkForProhibitedLogContents:77->YarnTestBase.ensureNoProhibitedStringInLogFiles:461
>  Found a file 
> /home/travis/build/apache/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-1_0/application_1570546069180_0001/container_1570546069180_0001_01_01/jobmanager.log
>  with a prohibited string (one of [Exception, Started 
> SelectChannelConnector@0.0.0.0:8081]). Excerpts:23760[{code}
> Travis log link: [https://travis-ci.org/apache/flink/jobs/595082243]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

