[jira] [Commented] (FLINK-7798) Add support for windowed joins to Table API

2017-10-10 Thread Xingcan Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199824#comment-16199824
 ] 

Xingcan Cui commented on FLINK-7798:


Hi [~fhueske], as you said, most of the related code is already there. To 
enable stream window joins in the Table API, it seems we just need to remove 
the validation snippet in {{operators.scala}}.
{code:java}
if (tableEnv.isInstanceOf[StreamTableEnvironment]
  && !right.isInstanceOf[LogicalTableFunctionCall]) {
  failValidation(s"Join on stream tables is currently not supported.")
}
{code}
In addition, do you think it's necessary to add some (temporary) restrictions 
in the table validation process, so that we can provide friendlier exception 
messages (instead of something like "...TableException: Cannot generate a valid 
execution plan for the given query: ...")?
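To make the restriction idea concrete, a temporary validation check could fail fast with an actionable message. The sketch below is purely illustrative; the class, method, and message texts are hypothetical and not Flink's actual validation API:

```java
// Hypothetical sketch of a temporary restriction in the validation phase:
// detect an unsupported stream-join variant early and fail with a descriptive
// message instead of the generic "Cannot generate a valid execution plan".
public class JoinValidationSketch {
    static void validateStreamJoin(boolean hasWindowBounds, boolean hasEqualityPredicate) {
        if (!hasWindowBounds) {
            throw new IllegalArgumentException(
                "Stream joins in the Table API currently require window bounds "
                + "on both inputs, e.g. 'ltime >= rtime - 5.minutes && ltime < rtime'.");
        }
        if (!hasEqualityPredicate) {
            throw new IllegalArgumentException(
                "Stream joins in the Table API currently require at least one equi-join predicate.");
        }
    }

    public static void main(String[] args) {
        try {
            validateStreamJoin(false, true);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A user who forgets the window bounds would then see the specific hint instead of an optimizer failure.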

> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7802) Bug in PushProjectIntoTableSourceScanRule when empty field collection was pushed into TableSource

2017-10-10 Thread godfrey he (JIRA)
godfrey he created FLINK-7802:
-

 Summary: Bug in PushProjectIntoTableSourceScanRule when empty 
field collection was pushed into TableSource
 Key: FLINK-7802
 URL: https://issues.apache.org/jira/browse/FLINK-7802
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: godfrey he
Assignee: godfrey he


Currently, if no fields are used, an empty field collection is pushed into the 
TableSource in PushProjectIntoTableSourceScanRule. This can cause exceptions, 
e.g. java.lang.IllegalArgumentException: At least one field must be 
specified   at 
org.apache.flink.api.java.io.RowCsvInputFormat.(RowCsvInputFormat.java:50)

Consider SQL such as: select count(1) from tbl. 

So even if no fields are used, we should keep at least one column for the 
TableSource so it can read some data.
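A minimal sketch of the proposed workaround, using a hypothetical helper (not the actual rule code): when the projection is empty, keep one arbitrary column so formats like RowCsvInputFormat still receive at least one field.

```java
import java.util.Arrays;

// Illustrative helper: decide which fields to push into the TableSource.
public class ProjectionPushDownSketch {
    static int[] fieldsToPush(int[] usedFields, int arity) {
        if (usedFields.length > 0) {
            return usedFields;
        }
        // No field is referenced (e.g. SELECT COUNT(1) FROM tbl): keep the
        // first column as a placeholder so the source still produces rows.
        return arity > 0 ? new int[] {0} : usedFields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fieldsToPush(new int[0], 3)));      // [0]
        System.out.println(Arrays.toString(fieldsToPush(new int[] {1, 2}, 3))); // [1, 2]
    }
}
```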




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7661) Add credit field in PartitionRequest message

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199749#comment-16199749
 ] 

ASF GitHub Bot commented on FLINK-7661:
---

Github user zhijiangW commented on the issue:

https://github.com/apache/flink/pull/4698
  
@zentol , I have rebased onto the latest master and resolved the conflicts.


> Add credit field in PartitionRequest message
> 
>
> Key: FLINK-7661
> URL: https://issues.apache.org/jira/browse/FLINK-7661
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: zhijiang
>Assignee: zhijiang
>
> Currently the {{PartitionRequest}} message contains the {{ResultPartitionID}} | 
> {{queueIndex}} | {{InputChannelID}} fields.
> We will add a new {{credit}} field indicating the initial credit of the 
> {{InputChannel}}. This info can be obtained from the {{InputChannel}} directly 
> after assigning exclusive buffers to it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4698: [FLINK-7661][network] Add credit field in PartitionReques...

2017-10-10 Thread zhijiangW
Github user zhijiangW commented on the issue:

https://github.com/apache/flink/pull/4698
  
@zentol , I have rebased onto the latest master and resolved the conflicts.


---


[GitHub] flink issue #3478: Flink 4816 Executions failed from "DEPLOYING" should reta...

2017-10-10 Thread tony810430
Github user tony810430 commented on the issue:

https://github.com/apache/flink/pull/3478
  
Hi @ramkrish86 
Are you still working on this PR?


---


[jira] [Commented] (FLINK-7423) Always reuse an instance to get elements from the inputFormat

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199716#comment-16199716
 ] 

ASF GitHub Bot commented on FLINK-7423:
---

Github user XuPingyong closed the pull request at:

https://github.com/apache/flink/pull/4525


> Always reuse an instance to get elements from the inputFormat
> ---
>
> Key: FLINK-7423
> URL: https://issues.apache.org/jira/browse/FLINK-7423
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>
> In InputFormatSourceFunction.java:
> {code:java}
> OUT nextElement = serializer.createInstance();
> while (isRunning) {
>     format.open(splitIterator.next());
>     // for each element we also check if cancel
>     // was called by checking the isRunning flag
>     while (isRunning && !format.reachedEnd()) {
>         nextElement = format.nextRecord(nextElement);
>         if (nextElement != null) {
>             ctx.collect(nextElement);
>         } else {
>             break;
>         }
>     }
>     format.close();
>     completedSplitsCounter.inc();
>     if (isRunning) {
>         isRunning = splitIterator.hasNext();
>     }
> }
> {code}
> the format may return a different instance or null from nextRecord(), which 
> may cause an exception.
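The hazard and the fix can be sketched outside of Flink (the names below are illustrative, not the actual InputFormatSourceFunction code): keep the reuse object in its own variable instead of overwriting it with nextRecord()'s return value, which may be a different instance or null.

```java
import java.util.Arrays;
import java.util.Iterator;

// Sketch of the fixed read loop: the reuse instance is created once and never
// replaced by whatever the format hands back.
public class ReuseLoopSketch {
    interface Format { int[] nextRecord(int[] reuse); } // stand-in for InputFormat

    static int drain(Format format, int limit) {
        int[] reuse = new int[1];                 // like serializer.createInstance()
        int count = 0;
        for (int i = 0; i < limit; i++) {
            int[] returned = format.nextRecord(reuse);
            if (returned == null) {
                break;                            // reuse is still a valid instance
            }
            count++;                              // ctx.collect(returned) would go here
        }
        return count;
    }

    public static void main(String[] args) {
        Iterator<int[]> data =
            Arrays.asList(new int[] {1}, new int[] {2}, (int[]) null).iterator();
        Format f = reuse -> data.hasNext() ? data.next() : null;
        System.out.println(drain(f, 10));         // stops at the null record
    }
}
```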



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7798) Add support for windowed joins to Table API

2017-10-10 Thread Xingcan Cui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui reassigned FLINK-7798:
--

Assignee: Xingcan Cui

> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4525: [FLINK-7423] Always reuse an instance to get eleme...

2017-10-10 Thread XuPingyong
Github user XuPingyong closed the pull request at:

https://github.com/apache/flink/pull/4525


---


[jira] [Commented] (FLINK-7394) Manage exclusive buffers in RemoteInputChannel

2017-10-10 Thread zhijiang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199702#comment-16199702
 ] 

zhijiang commented on FLINK-7394:
-

[~Zentol], thank you for merging and let me know the rule.

> Manage exclusive buffers in RemoteInputChannel
> --
>
> Key: FLINK-7394
> URL: https://issues.apache.org/jira/browse/FLINK-7394
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> This is a part of work for credit-based network flow control. 
> The basic works are:
> * Exclusive buffers are assigned to {{RemoteInputChannel}} after created by 
> {{SingleInputGate}}.
> * {{RemoteInputChannel}} implements {{BufferRecycler}} interface to manage 
> the exclusive buffers itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7762) Make WikipediaEditsSourceTest a proper test

2017-10-10 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 reassigned FLINK-7762:
-

Assignee: Hai Zhou UTC+8

> Make WikipediaEditsSourceTest a proper test
> ---
>
> Key: FLINK-7762
> URL: https://issues.apache.org/jira/browse/FLINK-7762
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Reporter: Aljoscha Krettek
>Assignee: Hai Zhou UTC+8
>Priority: Minor
>
> {{WikipediaEditsSourceTest}} is currently an ITCase even though it is named 
> like a test. Making it a proper test reduces runtime and also makes it more 
> stable because we don't run a whole Flink job.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7669) org.apache.flink.api.common.ExecutionConfig cannot be cast to org.apache.flink.api.common.ExecutionConfig

2017-10-10 Thread Raymond Tay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199687#comment-16199687
 ] 

Raymond Tay commented on FLINK-7669:


I tried that suggestion but it didn't work. Here is what I did:
(a) pulled Flink's trunk (git log HEAD hash: 
427dfe42e2bea891b40e662bc97cdea57cdae3f5) 
(b) cleaned Flink's build dir
(c) rebuilt Flink
  (c.1) reinstalled Flink into the local ivy/mvn cache
(d) applied the parameters to "build-target/conf/flink-conf.yaml"
(e) rebuilt the fat jar (telling sbt to look in the ivy/mvn cache)
(f) started a local Flink cluster
(g) ran the program via "flink run "


> org.apache.flink.api.common.ExecutionConfig cannot be cast to 
> org.apache.flink.api.common.ExecutionConfig
> -
>
> Key: FLINK-7669
> URL: https://issues.apache.org/jira/browse/FLINK-7669
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
> Environment: - OS: macOS Sierra 
> - Oracle JDK 1.8
> - Scala 2.11.11
> - sbt 0.13.16
> - Build from trunk code at commit hash 
> {{42cc3a2a9c41dda7cf338db36b45131db9150674}}
> -- started a local flink node 
>Reporter: Raymond Tay
>
> Latest code pulled from trunk threw errors at runtime when I ran a job 
> against it, but when I ran the JAR against the stable version {{1.3.2}} it 
> was OK. Here is the stacktrace. 
> An exception is being thrown :
> {noformat}
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> Submitting job with JobID: 05dd8e60c6fda3b96fc22ef6cf389a23. Waiting for job 
> completion.
> Connected to JobManager at 
> Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-234825544] with leader 
> session id ----.
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Failed to submit job 05dd8e60c6fda3b96fc22ef6cf389a23 
> (Flink Streaming Job)
>   at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:479)
>   at 
> org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
>   at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:443)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1501)
>   at 
> org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:629)
>   at 
> org.example.streams.split.SimpleSplitStreams$.main(04splitstreams.scala:53)
>   at 
> org.example.streams.split.SimpleSplitStreams.main(04splitstreams.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
>   at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:383)
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:840)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:285)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1088)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1135)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1132)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:44)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1132)
> Caused by: 

[jira] [Commented] (FLINK-5486) Lack of synchronization in BucketingSink#handleRestoredBucketState()

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199652#comment-16199652
 ] 

ASF GitHub Bot commented on FLINK-5486:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4356
  
Okay. Thanks, Ted. Will retest again soon.


> Lack of synchronization in BucketingSink#handleRestoredBucketState()
> 
>
> Key: FLINK-5486
> URL: https://issues.apache.org/jira/browse/FLINK-5486
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Ted Yu
>Assignee: mingleizhang
> Fix For: 1.3.3
>
>
> Here is related code:
> {code}
>   
> handlePendingFilesForPreviousCheckpoints(bucketState.pendingFilesPerCheckpoint);
>   synchronized (bucketState.pendingFilesPerCheckpoint) {
> bucketState.pendingFilesPerCheckpoint.clear();
>   }
> {code}
> The handlePendingFilesForPreviousCheckpoints() call should be enclosed inside 
> the synchronization block. Otherwise during the processing of 
> handlePendingFilesForPreviousCheckpoints(), some entries of the map may be 
> cleared.
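The fix described above can be sketched with illustrative names (not the actual BucketingSink code): perform the processing and the clear() under the same monitor, so a concurrent thread cannot clear entries while they are still being handled.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: both the handling of pending files and the clear() hold the same
// intrinsic lock on the map, closing the race window described in FLINK-5486.
public class SyncSketch {
    static int handled = 0;

    static void handlePendingFiles(Map<Long, String> pending) {
        handled += pending.size();   // stands in for handlePendingFilesForPreviousCheckpoints
    }

    static void restore(Map<Long, String> pendingFilesPerCheckpoint) {
        synchronized (pendingFilesPerCheckpoint) {
            handlePendingFiles(pendingFilesPerCheckpoint); // now inside the lock
            pendingFilesPerCheckpoint.clear();
        }
    }

    public static void main(String[] args) {
        Map<Long, String> pending = new HashMap<>();
        pending.put(1L, "part-0-1.pending");
        pending.put(2L, "part-0-2.pending");
        restore(pending);
        System.out.println(handled + " " + pending.size()); // 2 0
    }
}
```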



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4356: [FLINK-5486] Fix lacking of synchronization in BucketingS...

2017-10-10 Thread zhangminglei
Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4356
  
Okay. Thanks, Ted. Will retest again soon.


---


[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints

2017-10-10 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199614#comment-16199614
 ] 

Jing Fan commented on FLINK-4808:
-

[~StephanEwen] Do we have any update on this jira? 

> Allow skipping failed checkpoints
> -
>
> Key: FLINK-4808
> URL: https://issues.apache.org/jira/browse/FLINK-4808
> Project: Flink
>  Issue Type: New Feature
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2, 1.1.3
>Reporter: Stephan Ewen
>
> Currently, if Flink cannot complete a checkpoint, it results in a failure and 
> recovery.
> To make the impact of less stable storage infrastructure on the performance 
> of Flink less severe, Flink should be able to tolerate a certain number of 
> failed checkpoints and simply keep executing.
> This should be controllable via a parameter, for example:
> {code}
> env.getCheckpointConfig().setAllowedFailedCheckpoints(3);
> {code}
> A value of {{-1}} could indicate an infinite number of checkpoint failures 
> tolerated by Flink.
> The default value should still be {{0}}, to keep compatibility with the 
> existing behavior.
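The proposed tolerance policy could be sketched as follows (a hypothetical class, not Flink's API): fail the job only after more than the allowed number of consecutive checkpoint failures, with -1 meaning failures are always tolerated and 0 preserving today's fail-on-first-failure behavior.

```java
// Sketch of a checkpoint-failure tolerance counter.
public class CheckpointFailureManagerSketch {
    private final int allowed;   // e.g. setAllowedFailedCheckpoints(3); 0 = current behavior
    private int consecutiveFailures = 0;

    CheckpointFailureManagerSketch(int allowed) {
        this.allowed = allowed;
    }

    /** Returns true if the job should fail because of this checkpoint failure. */
    boolean onCheckpointFailure() {
        if (allowed == -1) {
            return false;                        // infinite tolerance
        }
        consecutiveFailures++;
        return consecutiveFailures > allowed;
    }

    void onCheckpointSuccess() {
        consecutiveFailures = 0;                 // a success resets the streak
    }

    public static void main(String[] args) {
        CheckpointFailureManagerSketch m = new CheckpointFailureManagerSketch(3);
        boolean fail = false;
        for (int i = 0; i < 3; i++) {
            fail = m.onCheckpointFailure();
        }
        System.out.println(fail);                // three failures: still tolerated
        System.out.println(m.onCheckpointFailure()); // fourth in a row: job fails
    }
}
```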



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6697) Add batch multi-window support

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199524#comment-16199524
 ] 

ASF GitHub Bot commented on FLINK-6697:
---

GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/4796

[FLINK-6697] [table] Add support for window.rowtime to batch Table API.

## What is the purpose of the change

This PR adds support for the `windowAlias.rowtime` property to the batch 
Table API. The `rowtime` expression is already supported by the streaming Table 
API. 

This PR improves the unification of batch and streaming Table API queries.

## Brief change log

- adjust the validation and return type of the `RowtimeAttribute` expression
- pass rowtime index to DataSet window operators
- adjust `TimeWindowPropertyCollector` to set rowtime attribute in batch 
windows
- extend existing tests to check the new window property

## Verifying this change

- existing tests in `GroupWindowITCase` were extended to check the 
`rowtime` window property.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **no**
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**

## Documentation

  - Does this pull request introduce a new feature? **yes**
  - If yes, how is the feature documented? **Documentation is extended**



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink tableBatchRowtime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4796.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4796


commit 06fddde1ee6ad3aea17b7c11495090964cb0ef1c
Author: Fabian Hueske 
Date:   2017-08-06T21:55:56Z

[FLINK-6697] [table] Add support for window.rowtime to batch Table API.




> Add batch multi-window support
> --
>
> Key: FLINK-6697
> URL: https://issues.apache.org/jira/browse/FLINK-6697
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> Multiple consecutive windows on batch tables are not tested yet, and I think 
> they are also not supported because the syntax is not defined for batch yet.
> The following should be supported:
> {code}
> val t = table
> .window(Tumble over 2.millis on 'rowtime as 'w)
> .groupBy('w)
> .select('w.rowtime as 'rowtime, 'int.count as 'int)
> .window(Tumble over 4.millis on 'rowtime as 'w2)
> .groupBy('w2)
> .select('w2.rowtime, 'w2.end, 'int.count)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4796: [FLINK-6697] [table] Add support for window.rowtim...

2017-10-10 Thread fhueske
GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/4796

[FLINK-6697] [table] Add support for window.rowtime to batch Table API.

## What is the purpose of the change

This PR adds support for the `windowAlias.rowtime` property to the batch 
Table API. The `rowtime` expression is already supported by the streaming Table 
API. 

This PR improves the unification of batch and streaming Table API queries.

## Brief change log

- adjust the validation and return type of the `RowtimeAttribute` expression
- pass rowtime index to DataSet window operators
- adjust `TimeWindowPropertyCollector` to set rowtime attribute in batch 
windows
- extend existing tests to check the new window property

## Verifying this change

- existing tests in `GroupWindowITCase` were extended to check the 
`rowtime` window property.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **no**
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**

## Documentation

  - Does this pull request introduce a new feature? **yes**
  - If yes, how is the feature documented? **Documentation is extended**



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink tableBatchRowtime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4796.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4796


commit 06fddde1ee6ad3aea17b7c11495090964cb0ef1c
Author: Fabian Hueske 
Date:   2017-08-06T21:55:56Z

[FLINK-6697] [table] Add support for window.rowtime to batch Table API.




---


[jira] [Commented] (FLINK-7657) SQL Timestamps Converted To Wrong Type By Optimizer Causing ClassCastException

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199500#comment-16199500
 ] 

ASF GitHub Bot commented on FLINK-7657:
---

Github user kmurra commented on the issue:

https://github.com/apache/flink/pull/4746
  
The biggest change here is in the test cases -- I generalized the test 
table source to have some basic filtering logic and allow for generic datasets.

I moved the Literal build logic to the 
RexNodeToExpressionConverter.visitLiteral.  I also rewrote several of the 
conversion methods to more closely align with the intended behavior of the code 
- that we're preserving the string values of the various time-related literals 
in the local timezone.  This made a bunch of the epoch-millisecond 
modifications go away.
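The "preserve the local string value" idea can be illustrated with plain JDK classes (this is a sketch of the principle, not the PR's code): format the literal's calendar back in its own time zone, so the SQL text "2017-05-01" survives regardless of the JVM's offset from UTC.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Sketch: a date literal rendered in the zone it was parsed in keeps its
// original string value, avoiding epoch-millisecond offset adjustments.
public class DateLiteralSketch {
    static String localDateString(Calendar literal) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(literal.getTimeZone()); // format in the literal's own zone
        return fmt.format(literal.getTime());
    }

    public static void main(String[] args) {
        for (String zone : new String[] {"UTC", "America/Los_Angeles", "Asia/Shanghai"}) {
            Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zone));
            cal.clear();
            cal.set(2017, Calendar.MAY, 1);
            System.out.println(localDateString(cal)); // "2017-05-01" in every zone
        }
    }
}
```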


> SQL Timestamps Converted To Wrong Type By Optimizer Causing ClassCastException
> --
>
> Key: FLINK-7657
> URL: https://issues.apache.org/jira/browse/FLINK-7657
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1, 1.3.2
>Reporter: Kent Murra
>Assignee: Kent Murra
>Priority: Critical
>
> I have a SQL statement using the Table API that has a timestamp in it. When 
> the execution environment tries to optimize the SQL, it causes an exception 
> (attached below). The result is that any SQL query with a timestamp, date, or 
> time literal cannot be executed if any table source is marked with 
> FilterableTableSource. 
> {code:none} 
> Exception in thread "main" java.lang.RuntimeException: Error while applying 
> rule PushFilterIntoTableSourceScanRule, args 
> [rel#30:FlinkLogicalCalc.LOGICAL(input=rel#29:Subset#0.LOGICAL,expr#0..1={inputs},expr#2=2017-05-01,expr#3=>($t1,
>  $t2),data=$t0,last_updated=$t1,$condition=$t3), Scan(table:[test_table], 
> fields:(data, last_updated))]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:235)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:650)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368)
>   at 
> org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:266)
>   at 
> org.apache.flink.table.api.BatchTableEnvironment.optimize(BatchTableEnvironment.scala:298)
>   at 
> org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:328)
>   at 
> org.apache.flink.table.api.BatchTableEnvironment.writeToSink(BatchTableEnvironment.scala:135)
>   at org.apache.flink.table.api.Table.writeToSink(table.scala:800)
>   at org.apache.flink.table.api.Table.writeToSink(table.scala:773)
>   at 
> com.remitly.flink.TestReproductionApp$.delayedEndpoint$com$remitly$flink$TestReproductionApp$1(TestReproductionApp.scala:27)
>   at 
> com.remitly.flink.TestReproductionApp$delayedInit$body.apply(TestReproductionApp.scala:22)
>   at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
>   at 
> scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
>   at scala.App$$anonfun$main$1.apply(App.scala:76)
>   at scala.App$$anonfun$main$1.apply(App.scala:76)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
>   at scala.App$class.main(App.scala:76)
>   at 
> com.remitly.flink.TestReproductionApp$.main(TestReproductionApp.scala:22)
>   at com.remitly.flink.TestReproductionApp.main(TestReproductionApp.scala)
> Caused by: java.lang.ClassCastException: java.util.GregorianCalendar cannot 
> be cast to java.util.Date
>   at 
> org.apache.flink.table.expressions.Literal.dateToCalendar(literals.scala:107)
>   at 
> org.apache.flink.table.expressions.Literal.toRexNode(literals.scala:80)
>   at 
> org.apache.flink.table.expressions.BinaryComparison$$anonfun$toRexNode$1.apply(comparison.scala:35)
>   at 
> org.apache.flink.table.expressions.BinaryComparison$$anonfun$toRexNode$1.apply(comparison.scala:35)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at 
> org.apache.flink.table.expressions.BinaryComparison.toRexNode(comparison.scala:35)
>   at 
> org.apache.flink.table.plan.rules.logical.PushFilterIntoTableSourceScanRule$$anonfun$1.apply(PushFilterIntoTableSourceScanRule.scala:92)
>   at 
> 

[GitHub] flink issue #4746: [FLINK-7657] [Table] Adding logic to convert RexLiteral t...

2017-10-10 Thread kmurra
Github user kmurra commented on the issue:

https://github.com/apache/flink/pull/4746
  
The biggest change here is in the test cases -- I generalized the test 
table source to have some basic filtering logic and allow for generic datasets.

I moved the Literal build logic to the 
RexNodeToExpressionConverter.visitLiteral.  I also rewrote several of the 
conversion methods to more closely align with the intended behavior of the code 
- that we're preserving the string values of the various time-related literals 
in the local timezone.  This made a bunch of the epoch-millisecond 
modifications go away.


---


[jira] [Created] (FLINK-7801) Integrate list command into REST client

2017-10-10 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7801:


 Summary: Integrate list command into REST client
 Key: FLINK-7801
 URL: https://issues.apache.org/jira/browse/FLINK-7801
 Project: Flink
  Issue Type: Sub-task
  Components: Client, REST
Affects Versions: 1.4.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann


The {{RestClusterClient}} should be able to retrieve all currently running jobs 
from the cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6569) flink-table KafkaJsonTableSource example doesn't work

2017-10-10 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai closed FLINK-6569.
-
   Resolution: Invalid
Fix Version/s: (was: 1.4.0)

> flink-table KafkaJsonTableSource example doesn't work
> -
>
> Key: FLINK-6569
> URL: https://issues.apache.org/jira/browse/FLINK-6569
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
>
> The code example uses 
> {code}
> TypeInformation typeInfo = Types.ROW(
>   new String[] { "id", "name", "score" },
>   new TypeInformation[] { Types.INT(), Types.STRING(), Types.DOUBLE() }
> );
> {code}
> the correct way of using it is something like
> {code}
> TypeInformation typeInfo = Types.ROW_NAMED(
> new String[] { "id", "zip", "date" },
> Types.LONG, Types.INT, Types.SQL_DATE);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7793) SlotManager releases idle TaskManager in standalone mode

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199448#comment-16199448
 ] 

ASF GitHub Bot commented on FLINK-7793:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4795

[FLINK-7793] [flip6] Defer slot release to ResourceManager

## What is the purpose of the change

This commit changes the SlotManager behaviour such that, upon a TaskManager 
timeout, the ResourceManager is only notified without the slots being removed. 
The ResourceManager can then decide whether to stop the TaskManager and remove 
its slots from the SlotManager, or to keep the TaskManager around.

## Brief change log

- Rename `ResourceManagerActions` into `ResourceActions`
- Remove automatic slot removal from `SlotManager` in case of `TaskManager` 
timeout
- Add `ResourceManager#releaseResource` method which removes the slots 
depending on the `stopWorker` call

## Verifying this change

This change added tests and can be verified as follows:

- `SlotManagerTest#testTaskManagerTimeoutDoesNotRemoveSlots`

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixTaskManagerRelease

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4795.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4795


commit 6b7aec2ce3799a35ed21bd04345abe0ab0b8298e
Author: Till Rohrmann 
Date:   2017-10-10T14:15:53Z

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview

commit 62b7ffd1ca924320ddb5a32073358f5b3c53be74
Author: Till Rohrmann 
Date:   2017-10-10T16:39:40Z

[FLINK-7793] [flip6] Defer slot release to ResourceManager

This commit changes the SlotManager behaviour such that upon a TaskManager 
timeout the ResourceManager is only notified without removing the slots. The 
ResourceManager can then decide whether to stop the TaskManager and remove 
its slots from the SlotManager, or to keep the TaskManager around.

Add test case




> SlotManager releases idle TaskManager in standalone mode
> 
>
> Key: FLINK-7793
> URL: https://issues.apache.org/jira/browse/FLINK-7793
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager
>Affects Versions: 1.4.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
>
> The {{SlotManager}} releases idle {{TaskManagers}} and removes all their 
> slots. This also happens in standalone mode where we cannot release task 
> managers. 
> I suggest to let the {{ResourceManager}} decide whether a resource can be 
> released or not. Only in the former case, we will remove the associated slots 
> from the {{SlotManager}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4795: [FLINK-7793] [flip6] Defer slot release to Resourc...

2017-10-10 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4795

[FLINK-7793] [flip6] Defer slot release to ResourceManager

## What is the purpose of the change

This commit changes the SlotManager behaviour such that upon a TaskManager 
timeout the ResourceManager is only notified without removing the slots. The 
ResourceManager can then decide whether to stop the TaskManager and remove 
its slots from the SlotManager, or to keep the TaskManager around.
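The decision flow described above can be sketched as follows. This is a hypothetical, heavily simplified illustration of the protocol, not Flink's actual `SlotManager`/`ResourceManager` code; the names follow the PR, but the bodies and the boolean return value of `releaseResource` are assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch: on a TaskManager timeout, the SlotManager only notifies
// the ResourceManager (via ResourceActions) and removes the slots only if the
// ResourceManager actually stopped the worker. A standalone ResourceManager
// cannot stop workers, so it returns false and the slots are kept.
class SlotManager {
    interface ResourceActions {
        // ResourceManager decides; returns true if the worker was stopped.
        boolean releaseResource(String taskManagerId);
    }

    private final Map<String, Integer> slotsPerTaskManager = new HashMap<>();
    private final ResourceActions resourceActions;

    SlotManager(ResourceActions actions) {
        this.resourceActions = actions;
    }

    void registerTaskManager(String id, int numSlots) {
        slotsPerTaskManager.put(id, numSlots);
    }

    // Deferred release: notify first, remove slots only on a positive answer.
    void onTaskManagerTimeout(String id) {
        if (resourceActions.releaseResource(id)) {
            slotsPerTaskManager.remove(id);
        }
    }

    boolean hasSlots(String id) {
        return slotsPerTaskManager.containsKey(id);
    }
}
```

With `id -> false` (standalone mode) the timed-out TaskManager keeps its slots registered; with `id -> true` (an active resource manager such as Yarn/Mesos) they are removed.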

## Brief change log

- Rename `ResourceManagerActions` into `ResourceActions`
- Remove automatic slot removal from `SlotManager` in case of `TaskManager` 
timeout
- Add `ResourceManager#releaseResource` method which removes the slots 
depending on the `stopWorker` call

## Verifying this change

This change added tests and can be verified as follows:

- `SlotManagerTest#testTaskManagerTimeoutDoesNotRemoveSlots`

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixTaskManagerRelease

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4795.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4795


commit 6b7aec2ce3799a35ed21bd04345abe0ab0b8298e
Author: Till Rohrmann 
Date:   2017-10-10T14:15:53Z

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview

commit 62b7ffd1ca924320ddb5a32073358f5b3c53be74
Author: Till Rohrmann 
Date:   2017-10-10T16:39:40Z

[FLINK-7793] [flip6] Defer slot release to ResourceManager

This commit changes the SlotManager behaviour such that upon a TaskManager 
timeout the ResourceManager is only notified without removing the slots. The 
ResourceManager can then decide whether to stop the TaskManager and remove 
its slots from the SlotManager, or to keep the TaskManager around.

Add test case




---


[jira] [Created] (FLINK-7800) Enable window joins without equi-join predicates

2017-10-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-7800:


 Summary: Enable window joins without equi-join predicates
 Key: FLINK-7800
 URL: https://issues.apache.org/jira/browse/FLINK-7800
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: Fabian Hueske


Currently, windowed joins can only be translated if they have at least one 
equi-join predicate. This limitation exists due to the lack of a good cross 
join strategy for the DataSet API.

Due to the window, windowed joins do not have to be executed as cross joins. 
Hence, the equi-join limitation does not need to be enforced (even though 
non-equi joins are executed with a parallelism of 1 right now).

We can resolve this issue by adding a boolean flag to the 
{{FlinkLogicalJoinConverter}} rule to permit non-equi joins and add such a rule 
to the logical optimization set of the DataStream API.
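The proposed flag can be illustrated with a minimal sketch. This is not Flink's actual `FlinkLogicalJoinConverter` (which is a Calcite rule written in Scala); the class and the `equiKeyCount` parameter are stand-ins for the real join-condition analysis, used here only to show the intended matching behaviour.

```java
// Sketch: the converter rejects a join without equi-join predicates unless it
// was constructed with allowNonEquiJoins = true, as proposed for the rule
// variant added to the DataStream logical optimization set.
class JoinConverterSketch {
    private final boolean allowNonEquiJoins;

    JoinConverterSketch(boolean allowNonEquiJoins) {
        this.allowNonEquiJoins = allowNonEquiJoins;
    }

    // equiKeyCount stands in for the number of equi-join keys extracted
    // from the join condition in the real rule.
    boolean matches(int equiKeyCount) {
        return equiKeyCount > 0 || allowNonEquiJoins;
    }
}
```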



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7799) Improve performance of windowed joins

2017-10-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-7799:


 Summary: Improve performance of windowed joins
 Key: FLINK-7799
 URL: https://issues.apache.org/jira/browse/FLINK-7799
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: Fabian Hueske
Priority: Critical


The performance of windowed joins can be improved by changing the state access 
patterns.
Right now, rows are inserted into a MapState with their timestamp as key. Since 
we use a time resolution of 1ms, this means that the full key space of the 
state must be iterated and many map entries must be accessed when joining or 
evicting rows. 

A better strategy would be to group the time into larger intervals and register 
the rows in their respective interval. Another benefit would be that we can 
directly access the state entries because we know exactly which timestamps to 
look up. Hence, we can limit the state access to the relevant section during 
joining and state eviction. 

A good interval size needs to be identified and might depend on the size of 
the window.
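The interval-bucketing idea can be sketched as below. This is a self-contained illustration using a plain `HashMap` rather than Flink's `MapState`; the 5-second interval is an arbitrary assumption standing in for the tuning knob the issue mentions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Rows are grouped under the start timestamp of a fixed interval instead of
// one entry per millisecond, so joining and eviction only touch the buckets
// that can overlap the time range of interest.
class IntervalBucketedRows<T> {
    static final long INTERVAL_MS = 5_000L;
    private final Map<Long, List<T>> buckets = new HashMap<>();

    static long bucketFor(long timestamp) {
        return timestamp - (timestamp % INTERVAL_MS);
    }

    void add(long timestamp, T row) {
        buckets.computeIfAbsent(bucketFor(timestamp), k -> new ArrayList<>()).add(row);
    }

    // Direct lookups of the buckets overlapping [lower, upper]; no full key
    // space iteration. A real join would still filter individual rows, since
    // a bucket may contain rows slightly outside the range.
    List<T> rowsInRange(long lower, long upper) {
        List<T> result = new ArrayList<>();
        for (long b = bucketFor(lower); b <= upper; b += INTERVAL_MS) {
            List<T> rows = buckets.get(b);
            if (rows != null) {
                result.addAll(rows);
            }
        }
        return result;
    }

    // Eviction drops whole buckets that lie entirely before the watermark.
    void evictBefore(long watermark) {
        buckets.keySet().removeIf(b -> b + INTERVAL_MS <= watermark);
    }
}
```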



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7798) Add support for windowed joins to Table API

2017-10-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-7798:


 Summary: Add support for windowed joins to Table API
 Key: FLINK-7798
 URL: https://issues.apache.org/jira/browse/FLINK-7798
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: Fabian Hueske
Priority: Critical
 Fix For: 1.4.0


Currently, windowed joins on streaming tables are only supported through SQL.

The Table API should support these joins as well. For that, we have to adjust 
the Table API validation and translate the API into the respective logical 
plan. Since most of the code should already be there for the batch Table API 
joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-5725) Support windowed JOIN between two streaming tables

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-5725:
-
Summary: Support windowed JOIN between two streaming tables  (was: Support 
windowed JOIN between two streams in the SQL API)

> Support windowed JOIN between two streaming tables
> --
>
> Key: FLINK-5725
> URL: https://issues.apache.org/jira/browse/FLINK-5725
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: hongyuhong
>
> As described in the title.
> This jira proposes to support joining two streaming tables in the SQL API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7797) Add support for windowed outer joins for streaming tables

2017-10-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-7797:


 Summary: Add support for windowed outer joins for streaming tables
 Key: FLINK-7797
 URL: https://issues.apache.org/jira/browse/FLINK-7797
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: Fabian Hueske


Currently, only windowed inner joins for streaming tables are supported.
This issue is about adding support for windowed LEFT, RIGHT, and FULL OUTER 
joins.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-5725) Support windowed JOIN between two streams in the SQL API

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-5725:
-
Summary: Support windowed JOIN between two streams in the SQL API  (was: 
Support JOIN between two streams in the SQL API)

> Support windowed JOIN between two streams in the SQL API
> 
>
> Key: FLINK-5725
> URL: https://issues.apache.org/jira/browse/FLINK-5725
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: hongyuhong
>
> As described in the title.
> This jira proposes to support joining two streaming tables in the SQL API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7410) Use toString method to display operator names for UserDefinedFunction

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-7410.

   Resolution: Implemented
Fix Version/s: 1.4.0

Implemented for 1.4.0 with 427dfe42e2bea891b40e662bc97cdea57cdae3f5

> Use toString method to display operator names for UserDefinedFunction
> -
>
> Key: FLINK-7410
> URL: https://issues.apache.org/jira/browse/FLINK-7410
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
> Fix For: 1.4.0
>
>
> *Motivation*
> Operator names set in the Table API are used for visualization and logging, 
> so it is important to make these names simple and readable. Currently, a 
> UserDefinedFunction's name contains the class's canonical name and an MD5 
> value, making the name too long and unfriendly to users. 
> As shown in the following example, 
> {quote}
> select: (a, b, c, 
> org$apache$flink$table$expressions$utils$RichFunc1$281f7e61ec5d8da894f5783e2e17a4f5(a)
>  AS _c3, 
> org$apache$flink$table$expressions$utils$RichFunc2$fb99077e565685ebc5f48b27edc14d98(c)
>  AS _c4)
> {quote}
> *Changes:*
>   
> Use {{toString}} method to display operator names for UserDefinedFunction. 
> The method will return the class name by default. Users can also override 
> the method to return whatever they want.
> What do you think [~fhueske] ?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7491) Support COLLECT Aggregate function in Flink SQL

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-7491.

   Resolution: Fixed
Fix Version/s: 1.4.0

Implemented for 1.4.0 with dccdba199a8fbb8b5186f0952410c1b1b3dff14f

> Support COLLECT Aggregate function in Flink SQL
> ---
>
> Key: FLINK-7491
> URL: https://issues.apache.org/jira/browse/FLINK-7491
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7776) do not emit duplicated records in group aggregation

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-7776.

   Resolution: Fixed
Fix Version/s: 1.4.0

Implemented for 1.4.0 with 4047be49e10cacc5e4ce932a0b8433afffa82a58

> do not emit duplicated records in group aggregation
> ---
>
> Key: FLINK-7776
> URL: https://issues.apache.org/jira/browse/FLINK-7776
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
> Fix For: 1.4.0
>
>
> The current group aggregation will compare the last {{Row}} and the current 
> {{Row}} when {{generateRetraction}} is {{true}} and {{firstRow}} is {{false}} 
> in {{GroupAggProcessFunction}}.
> This logic should be applied to all cases when {{firstRow}} is {{false}}: if 
> the current {{Row}} is the same as the previous {{Row}}, we do not emit any records.
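The proposed rule reduces to a simple comparison against the previously emitted row. The sketch below is a minimal stand-in, not the actual `GroupAggProcessFunction`; the `DedupEmitter` name and `shouldEmit` method are made up for illustration.

```java
import java.util.Objects;

// Whenever this is not the first row for a key, compare the new aggregate
// row with the previous one and emit nothing if they are equal.
class DedupEmitter<R> {
    private R lastRow;
    private boolean firstRow = true;

    // Returns true if the row should be emitted downstream.
    boolean shouldEmit(R newRow) {
        boolean emit = firstRow || !Objects.equals(lastRow, newRow);
        lastRow = newRow;
        firstRow = false;
        return emit;
    }
}
```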



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6233) Support rowtime inner equi-join between two streams in the SQL API

2017-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-6233.

   Resolution: Fixed
Fix Version/s: 1.4.0

Implemented for 1.4.0 with 655d8b16193ac7131fa1f58fb4ba7ff96e439438

> Support rowtime inner equi-join between two streams in the SQL API
> --
>
> Key: FLINK-6233
> URL: https://issues.apache.org/jira/browse/FLINK-6233
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: Xingcan Cui
> Fix For: 1.4.0
>
>
> The goal of this issue is to add support for inner equi-joins on row-time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.rowtime , o.productId, o.orderId, s.rowtime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL 
> '1' HOUR}} can only use rowtime, which is a system attribute. The time 
> condition only supports a bounded time range like {{o.rowtime BETWEEN 
> s.rowtime - INTERVAL '1' HOUR AND s.rowtime + INTERVAL '1' HOUR}}, does not 
> support unbounded ranges like {{o.rowtime  s.rowtime}}, and should include 
> both streams' rowtime attributes. {{o.rowtime between rowtime () and 
> rowtime () + 1}} should also not be supported.
> A row-time stream join will not be able to handle late data, because 
> inserting a row into a sorted order would shift all other computations. 
> This would be too expensive to maintain. Therefore, we will throw an error 
> if a user tries to use a row-time stream join with late data handling.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 
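The bounded time predicate from the example query can be made concrete with a small sketch. This is an illustration only, assuming the `[s.rowtime, s.rowtime + 1h]` bound from the sample SQL; it is not the operator implementation.

```java
// An Orders row joins a Shipments row only if o.rowtime lies in
// [s.rowtime, s.rowtime + 1 hour]. The same bounds tell the join operator
// how long each row must be retained in state before it can be evicted.
class TimeBoundedJoinCondition {
    static final long ONE_HOUR_MS = 60 * 60 * 1000L;

    static boolean matches(long orderRowtime, long shipRowtime) {
        return orderRowtime >= shipRowtime
            && orderRowtime <= shipRowtime + ONE_HOUR_MS;
    }
}
```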



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6233) Support rowtime inner equi-join between two streams in the SQL API

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199358#comment-16199358
 ] 

ASF GitHub Bot commented on FLINK-6233:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4625


> Support rowtime inner equi-join between two streams in the SQL API
> --
>
> Key: FLINK-6233
> URL: https://issues.apache.org/jira/browse/FLINK-6233
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: Xingcan Cui
>
> The goal of this issue is to add support for inner equi-joins on row-time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.rowtime , o.productId, o.orderId, s.rowtime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL 
> '1' HOUR}} can only use rowtime, which is a system attribute. The time 
> condition only supports a bounded time range like {{o.rowtime BETWEEN 
> s.rowtime - INTERVAL '1' HOUR AND s.rowtime + INTERVAL '1' HOUR}}, does not 
> support unbounded ranges like {{o.rowtime  s.rowtime}}, and should include 
> both streams' rowtime attributes. {{o.rowtime between rowtime () and 
> rowtime () + 1}} should also not be supported.
> A row-time stream join will not be able to handle late data, because 
> inserting a row into a sorted order would shift all other computations. 
> This would be too expensive to maintain. Therefore, we will throw an error 
> if a user tries to use a row-time stream join with late data handling.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7776) do not emit duplicated records in group aggregation

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199359#comment-16199359
 ] 

ASF GitHub Bot commented on FLINK-7776:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4785


> do not emit duplicated records in group aggregation
> ---
>
> Key: FLINK-7776
> URL: https://issues.apache.org/jira/browse/FLINK-7776
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> The current group aggregation will compare the last {{Row}} and the current 
> {{Row}} when {{generateRetraction}} is {{true}} and {{firstRow}} is {{false}} 
> in {{GroupAggProcessFunction}}.
> This logic should be applied to all cases when {{firstRow}} is {{false}}: if 
> the current {{Row}} is the same as the previous {{Row}}, we do not emit any records.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7491) Support COLLECT Aggregate function in Flink SQL

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199360#comment-16199360
 ] 

ASF GitHub Bot commented on FLINK-7491:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4585


> Support COLLECT Aggregate function in Flink SQL
> ---
>
> Key: FLINK-7491
> URL: https://issues.apache.org/jira/browse/FLINK-7491
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7410) Use toString method to display operator names for UserDefinedFunction

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199361#comment-16199361
 ] 

ASF GitHub Bot commented on FLINK-7410:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4624


> Use toString method to display operator names for UserDefinedFunction
> -
>
> Key: FLINK-7410
> URL: https://issues.apache.org/jira/browse/FLINK-7410
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>
> *Motivation*
> Operator names set in the Table API are used for visualization and logging, 
> so it is important to make these names simple and readable. Currently, a 
> UserDefinedFunction's name contains the class's canonical name and an MD5 
> value, making the name too long and unfriendly to users. 
> As shown in the following example, 
> {quote}
> select: (a, b, c, 
> org$apache$flink$table$expressions$utils$RichFunc1$281f7e61ec5d8da894f5783e2e17a4f5(a)
>  AS _c3, 
> org$apache$flink$table$expressions$utils$RichFunc2$fb99077e565685ebc5f48b27edc14d98(c)
>  AS _c4)
> {quote}
> *Changes:*
>   
> Use {{toString}} method to display operator names for UserDefinedFunction. 
> The method will return the class name by default. Users can also override 
> the method to return whatever they want.
> What do you think [~fhueske] ?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4585: [FLINK-7491] [Table API & SQL] add MultiSetTypeInf...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4585


---


[GitHub] flink pull request #4624: [FLINK-7410] [table] Use toString method to displa...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4624


---


[GitHub] flink pull request #4785: [FLINK-7776][TableAPI & SQL] do not emit duplicate...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4785


---


[GitHub] flink pull request #4625: [FLINK-6233] [table] Support time-bounded stream i...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4625


---


[jira] [Resolved] (FLINK-7704) Port JobPlanHandler to new REST endpoint

2017-10-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-7704.
--
Resolution: Done

Done via 9829ca00dff201879724847b498fe0432219cb53

> Port JobPlanHandler to new REST endpoint
> 
>
> Key: FLINK-7704
> URL: https://issues.apache.org/jira/browse/FLINK-7704
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Hai Zhou UTC+8
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{JobPlanHandler}} to new REST endpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7704) Port JobPlanHandler to new REST endpoint

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199343#comment-16199343
 ] 

ASF GitHub Bot commented on FLINK-7704:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4768


> Port JobPlanHandler to new REST endpoint
> 
>
> Key: FLINK-7704
> URL: https://issues.apache.org/jira/browse/FLINK-7704
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Hai Zhou UTC+8
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{JobPlanHandler}} to new REST endpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4768: [FLINK-7704][flip6] Add JobPlanHandler for new Res...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4768


---


[jira] [Created] (FLINK-7796) RocksDBKeyedStateBackend#RocksDBFullSnapshotOperation should close snapshotCloseableRegistry

2017-10-10 Thread Ted Yu (JIRA)
Ted Yu created FLINK-7796:
-

 Summary: RocksDBKeyedStateBackend#RocksDBFullSnapshotOperation 
should close snapshotCloseableRegistry
 Key: FLINK-7796
 URL: https://issues.apache.org/jira/browse/FLINK-7796
 Project: Flink
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


snapshotCloseableRegistry, being a CloseableRegistry, depends on an invocation 
of its close() method to release certain resources.

It seems close() can be called from releaseSnapshotResources()



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7750) Strange behaviour in savepoints

2017-10-10 Thread Razvan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199262#comment-16199262
 ] 

Razvan commented on FLINK-7750:
---

[~aljoscha] Many thanks for clarifying this. Is there a way to update 
documentation to specify the savepoint location has to be accessible from all 
hosts as it's not really clear here 
(https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/savepoints.html)?
 Is there a way I can update the documentation myself?

> Strange behaviour in savepoints
> ---
>
> Key: FLINK-7750
> URL: https://issues.apache.org/jira/browse/FLINK-7750
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.3.2
>Reporter: Razvan
>Priority: Blocker
>
> I recently upgraded from 1.2.0 and savepoint creation behaves strangely. 
> Whenever I try to create a savepoint with a specified directory, Apache Flink 
> creates a folder on the active JobManager (even if I trigger savepoint 
> creation from a different JobManager) which contains only _metadata. And 
> another folder on the TaskManager where the job is running which contains the 
> actual savepoint.
> Obviously if I try to restore it says it can't find the savepoint.
> This worked in 1.2.0.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5486) Lack of synchronization in BucketingSink#handleRestoredBucketState()

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199255#comment-16199255
 ] 

ASF GitHub Bot commented on FLINK-5486:
---

Github user tedyu commented on the issue:

https://github.com/apache/flink/pull/4356
  
retest this please


> Lack of synchronization in BucketingSink#handleRestoredBucketState()
> 
>
> Key: FLINK-5486
> URL: https://issues.apache.org/jira/browse/FLINK-5486
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Ted Yu
>Assignee: mingleizhang
> Fix For: 1.3.3
>
>
> Here is related code:
> {code}
>   
> handlePendingFilesForPreviousCheckpoints(bucketState.pendingFilesPerCheckpoint);
>   synchronized (bucketState.pendingFilesPerCheckpoint) {
> bucketState.pendingFilesPerCheckpoint.clear();
>   }
> {code}
> The handlePendingFilesForPreviousCheckpoints() call should be enclosed inside 
> the synchronization block. Otherwise during the processing of 
> handlePendingFilesForPreviousCheckpoints(), some entries of the map may be 
> cleared.
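The fix the issue asks for can be sketched as follows. This is a simplified, self-contained illustration (not the actual `BucketingSink` code, and the map's value type and the handling step are stand-ins): the read over the map and the subsequent `clear()` must sit in the same synchronized block, so no other thread can mutate the map between the two steps.

```java
import java.util.HashMap;
import java.util.Map;

// The handling of pending files and the clear() happen atomically with
// respect to concurrent writers synchronizing on the same map.
class RestoreSketch {
    private final Map<Long, String> pendingFilesPerCheckpoint = new HashMap<>();

    void addPending(long checkpointId, String file) {
        synchronized (pendingFilesPerCheckpoint) {
            pendingFilesPerCheckpoint.put(checkpointId, file);
        }
    }

    int handleRestoredState() {
        synchronized (pendingFilesPerCheckpoint) {
            // Stand-in for handlePendingFilesForPreviousCheckpoints(...):
            // process every entry, then clear, under one lock acquisition.
            int handled = pendingFilesPerCheckpoint.size();
            pendingFilesPerCheckpoint.clear();
            return handled;
        }
    }
}
```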



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4356: [FLINK-5486] Fix lacking of synchronization in BucketingS...

2017-10-10 Thread tedyu
Github user tedyu commented on the issue:

https://github.com/apache/flink/pull/4356
  
retest this please


---


[jira] [Created] (FLINK-7795) Utilize error-prone to discover common coding mistakes

2017-10-10 Thread Ted Yu (JIRA)
Ted Yu created FLINK-7795:
-

 Summary: Utilize error-prone to discover common coding mistakes
 Key: FLINK-7795
 URL: https://issues.apache.org/jira/browse/FLINK-7795
 Project: Flink
  Issue Type: Improvement
Reporter: Ted Yu


http://errorprone.info/ is a tool which detects common coding mistakes.
We should incorporate it into the Flink build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4794: [build][minor] Add missing licenses

2017-10-10 Thread yew1eb
GitHub user yew1eb opened a pull request:

https://github.com/apache/flink/pull/4794

[build][minor] Add missing licenses



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yew1eb/flink add_missing_licenses

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4794.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4794






---


[GitHub] flink issue #4790: [FLINK-7764] [kafka] Enable the operator settings for Fli...

2017-10-10 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4790
  
cc @aljoscha and @pnowojski 
I think there is some work for Flink 1.4 to make the Kafka 0.10 sink a 
regular sink function, so that the code that constructs the sink transformation 
is not needed any more.

This change would be relevant to 1.3.3 though.


---


[jira] [Commented] (FLINK-7764) FlinkKafkaProducer010 does not accept name, uid, or parallelism

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199129#comment-16199129
 ] 

ASF GitHub Bot commented on FLINK-7764:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4790
  
cc @aljoscha and @pnowojski 
I think there is some work for Flink 1.4 to make the Kafka 0.10 sink a 
regular sink function, so that the code that constructs the sink transformation 
is not needed any more.

This change would be relevant to 1.3.3 though.


> FlinkKafkaProducer010 does not accept name, uid, or parallelism
> ---
>
> Key: FLINK-7764
> URL: https://issues.apache.org/jira/browse/FLINK-7764
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>
> As [reported on the user 
> list|https://lists.apache.org/thread.html/1e97b79cb5611a942beb609a61b5fba6d62a49c9c86336718a9a0004@%3Cuser.flink.apache.org%3E]:
> When I try to use KafkaProducer with timestamps it fails to set name, uid or 
> parallelism. It uses default values.
> {code}
> FlinkKafkaProducer010.FlinkKafkaProducer010Configuration producer = 
> FlinkKafkaProducer010
> .writeToKafkaWithTimestamps(stream, topicName, schema, props, 
> partitioner);
> producer.setFlushOnCheckpoint(flushOnCheckpoint);
> producer.name("foo")
> .uid("bar")
> .setParallelism(5);
> return producer;
> {code}
> As operator name it shows "FlinKafkaProducer 0.10.x” with the typo.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6225) Support Row Stream for CassandraSink

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199094#comment-16199094
 ] 

ASF GitHub Bot commented on FLINK-6225:
---

Github user PangZhi commented on the issue:

https://github.com/apache/flink/pull/3748
  
@zentol Do you mind taking another look.


> Support Row Stream for CassandraSink
> 
>
> Key: FLINK-6225
> URL: https://issues.apache.org/jira/browse/FLINK-6225
> Project: Flink
>  Issue Type: New Feature
>  Components: Cassandra Connector
>Affects Versions: 1.3.0
>Reporter: Jing Fan
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> Currently in CassandraSink, specifying a query is not supported for row streams. 
> The solution should be similar to CassandraTupleSink.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #3748: [FLINK-6225] [Cassandra Connector] add CassandraTableSink

2017-10-10 Thread PangZhi
Github user PangZhi commented on the issue:

https://github.com/apache/flink/pull/3748
  
@zentol Do you mind taking another look.


---


[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-10 Thread Ryan Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199070#comment-16199070
 ] 

Ryan Hobbs commented on FLINK-7737:
---

In the case of hflush(), the buffer is simply flushed; there is no 
guarantee that the data is synced to disk, so in failure scenarios 
we have seen data loss when hflush() is used.  Is it possible for Flink to pass 
in the SYNC_BLOCK flag on create()? If that flag is set, I believe hflush() 
will perform an hsync().  
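Ryan's suggested dispatch can be sketched with self-contained stand-ins (the `Stream`/`DurableStream` types below are hypothetical placeholders, not Hadoop's `FSDataOutputStream`/`HdfsDataOutputStream`): always flush, and additionally sync when the stream advertises durable-sync support.

```java
import java.util.ArrayList;
import java.util.List;

public class HsyncDispatchSketch {
    // Hypothetical stand-in for a stream that only buffers and flushes.
    static class Stream {
        final List<String> ops = new ArrayList<>();
        void hflush() { ops.add("hflush"); }
    }

    // Hypothetical stand-in for a stream that can also sync to disk.
    static class DurableStream extends Stream {
        void hsync() { ops.add("hsync"); }
    }

    // Flush the buffer; if the stream supports it, also force a disk sync.
    static void hflushOrSync(Stream os) {
        os.hflush();
        if (os instanceof DurableStream) {
            ((DurableStream) os).hsync();
        }
    }

    public static void main(String[] args) {
        Stream plain = new Stream();
        DurableStream durable = new DurableStream();
        hflushOrSync(plain);
        hflushOrSync(durable);
        System.out.println(plain.ops);   // flush only
        System.out.println(durable.ops); // flush, then sync
    }
}
```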

> On HCFS systems, FSDataOutputStream does not issue hsync only hflush which 
> leads to data loss
> -
>
> Key: FLINK-7737
> URL: https://issues.apache.org/jira/browse/FLINK-7737
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.3.2
> Environment: Dev
>Reporter: Ryan Hobbs
>
> During several tests where we simulated failure conditions, we have observed 
> that on HCFS systems where the data stream is of type FSDataOutputStream, 
> Flink will issue hflush() and not hsync() which results in data loss.
> In the class *StreamWriterBase.java* the code below will execute hsync if the 
> output stream is of type *HdfsDataOutputStream* but not for streams of type 
> *FSDataOutputStream*.  Is this by design?
> {code}
> protected void hflushOrSync(FSDataOutputStream os) throws IOException {
> try {
> // At this point the refHflushOrSync cannot be null,
> // since register method would have thrown if it was.
> this.refHflushOrSync.invoke(os);
> if (os instanceof HdfsDataOutputStream) {
>   ((HdfsDataOutputStream) 
> os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
>   }
>   } catch (InvocationTargetException e) {
> String msg = "Error while trying to hflushOrSync!";
> LOG.error(msg + " " + e.getCause());
> Throwable cause = e.getCause();
> if (cause != null && cause instanceof IOException) {
> throw (IOException) cause;
>   }
> throw new RuntimeException(msg, e);
>   } catch (Exception e) {
> String msg = "Error while trying to hflushOrSync!";
> LOG.error(msg + " " + e);
> throw new RuntimeException(msg, e);
>   }
>   }
> {code}
> Could a potential fix be to perform a sync even on streams of type 
> *FSDataOutputStream*?
> {code}
>  if (os instanceof HdfsDataOutputStream) {
> ((HdfsDataOutputStream) 
> os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
> } else if (os instanceof FSDataOutputStream) {
> os.hsync();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6233) Support rowtime inner equi-join between two streams in the SQL API

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198954#comment-16198954
 ] 

ASF GitHub Bot commented on FLINK-6233:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4625
  
Merging


> Support rowtime inner equi-join between two streams in the SQL API
> --
>
> Key: FLINK-6233
> URL: https://issues.apache.org/jira/browse/FLINK-6233
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: Xingcan Cui
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.rowtime , o.productId, o.orderId, s.rowtime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time-condition {{o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL 
> '1' HOUR}} only can use rowtime that is a system attribute, the time 
> condition only support bounded time range like {{o.rowtime BETWEEN s.rowtime 
> - INTERVAL '1' HOUR AND s.rowtime + INTERVAL '1' HOUR}}, not support 
> unbounded like {{o.rowtime  s.rowtime}} ,  and  should include both two 
> stream's rowtime attribute, {{o.rowtime between rowtime () and rowtime () + 
> 1}} should also not be supported.
> A row-time stream join will not be able to handle late data, because 
> inserting a row into a sorted order would shift all other computations. 
> This would be too expensive to maintain. Therefore, we will throw an error if 
> a user tries to use a row-time stream join with late data handling.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 
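The bounded time condition of such a join reduces to a simple per-pair predicate; a minimal sketch (hypothetical helper, not Flink's actual join operator):

```java
public class TimeBoundedJoinSketch {
    // True iff the left row's time lies within [rightTime + lowerBound, rightTime + upperBound],
    // mirroring "l.rowtime BETWEEN r.rowtime + lower AND r.rowtime + upper".
    static boolean withinBounds(long leftTime, long rightTime, long lowerBound, long upperBound) {
        return leftTime >= rightTime + lowerBound && leftTime <= rightTime + upperBound;
    }

    public static void main(String[] args) {
        long hour = 60 * 60 * 1000L;
        // o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR
        System.out.println(withinBounds(30 * 60 * 1000L, 0L, 0L, hour)); // 30 min after: joins
        System.out.println(withinBounds(-1L, 0L, 0L, hour));             // before s.rowtime: no join
    }
}
```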



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7776) do not emit duplicated records in group aggregation

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198952#comment-16198952
 ] 

ASF GitHub Bot commented on FLINK-7776:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4785
  
Merging


> do not emit duplicated records in group aggregation
> ---
>
> Key: FLINK-7776
> URL: https://issues.apache.org/jira/browse/FLINK-7776
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> the current group aggregation in {{GroupAggProcessFunction}} will compare the last 
> {{Row}} and the current {{Row}} when {{generateRetraction}} is {{true}} and 
> {{firstRow}} is {{false}}.
> This logic should be applied to all cases when {{firstRow}} is {{false}}: if the 
> current {{Row}} is the same as the previous {{Row}}, we do not emit any records.
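The proposed rule can be sketched as a small predicate (a hypothetical stand-in, not the actual `GroupAggProcessFunction` logic): always emit the first result, and afterwards emit only when the aggregate actually changed.

```java
import java.util.Objects;

public class DedupEmitSketch {
    // Decide whether a new aggregation result should be emitted downstream:
    // emit the first result unconditionally; afterwards, emit only on change.
    static boolean shouldEmit(boolean firstRow, Object previous, Object current) {
        return firstRow || !Objects.equals(previous, current);
    }

    public static void main(String[] args) {
        System.out.println(shouldEmit(true, null, "cnt=1"));      // first row: emit
        System.out.println(shouldEmit(false, "cnt=1", "cnt=1"));  // unchanged: suppress
        System.out.println(shouldEmit(false, "cnt=1", "cnt=2"));  // changed: emit
    }
}
```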



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4625: [FLINK-6233] [table] Support time-bounded stream inner jo...

2017-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4625
  
Merging


---


[jira] [Commented] (FLINK-7491) Support COLLECT Aggregate function in Flink SQL

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198950#comment-16198950
 ] 

ASF GitHub Bot commented on FLINK-7491:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4585
  
Merging


> Support COLLECT Aggregate function in Flink SQL
> ---
>
> Key: FLINK-7491
> URL: https://issues.apache.org/jira/browse/FLINK-7491
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7410) Use toString method to display operator names for UserDefinedFunction

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198948#comment-16198948
 ] 

ASF GitHub Bot commented on FLINK-7410:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4624
  
Merging


> Use toString method to display operator names for UserDefinedFunction
> -
>
> Key: FLINK-7410
> URL: https://issues.apache.org/jira/browse/FLINK-7410
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>
> *Motivation*
> Operator names set in the Table API are used for visualization and logging; it 
> is important to make these names simple and readable. Currently, a 
> UserDefinedFunction’s name contains the class's canonical name and an MD5 value, 
> making the name too long and unfriendly to users. 
> As shown in the following example, 
> {quote}
> select: (a, b, c, 
> org$apache$flink$table$expressions$utils$RichFunc1$281f7e61ec5d8da894f5783e2e17a4f5(a)
>  AS _c3, 
> org$apache$flink$table$expressions$utils$RichFunc2$fb99077e565685ebc5f48b27edc14d98(c)
>  AS _c4)
> {quote}
> *Changes:*
>   
> Use the {{toString}} method to display operator names for UserDefinedFunction. 
> The method returns the class name by default. Users can also override the 
> method to return whatever they want.
> What do you think [~fhueske] ?
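A minimal sketch of the proposed behavior, assuming a hypothetical `UserDefinedFunction` base class (class names are illustrative, not Flink's):

```java
public class FunctionNameSketch {
    // Hypothetical stand-in for a user-defined function base class whose
    // toString() yields a short, readable operator name by default.
    static abstract class UserDefinedFunction {
        @Override
        public String toString() {
            return getClass().getSimpleName();
        }
    }

    // Uses the default: the simple class name.
    static class RichFunc1 extends UserDefinedFunction {}

    // Overrides toString() for a custom display name.
    static class RichFunc2 extends UserDefinedFunction {
        @Override
        public String toString() {
            return "myUdf";
        }
    }

    public static void main(String[] args) {
        System.out.println(new RichFunc1()); // short class name, no canonical name or MD5
        System.out.println(new RichFunc2()); // user-chosen name
    }
}
```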



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4785: [FLINK-7776][TableAPI & SQL] do not emit duplicated recor...

2017-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4785
  
Merging


---


[GitHub] flink issue #4585: [FLINK-7491] [Table API & SQL] add MultiSetTypeInfo; add ...

2017-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4585
  
Merging


---


[GitHub] flink issue #4624: [FLINK-7410] [table] Use toString method to display opera...

2017-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4624
  
Merging


---


[jira] [Created] (FLINK-7794) Link Broken in -- https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html

2017-10-10 Thread Paul Wu (JIRA)
Paul Wu created FLINK-7794:
--

 Summary: Link Broken in -- 
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html
 Key: FLINK-7794
 URL: https://issues.apache.org/jira/browse/FLINK-7794
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.3.0
Reporter: Paul Wu
Priority: Minor



Broken url link  "predefined data sources"  in page 
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-10 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198898#comment-16198898
 ] 

Vijay Srinivasaraghavan commented on FLINK-7737:


I believe hflush() routes the data to the DataNode, but the data is lost since 
no sync to disk happens (I will let Ryan confirm). 

I think we cannot generalize the hsync() call since the `SyncFlag` is 
NameNode-specific - 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java#L599

> On HCFS systems, FSDataOutputStream does not issue hsync only hflush which 
> leads to data loss
> -
>
> Key: FLINK-7737
> URL: https://issues.apache.org/jira/browse/FLINK-7737
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.3.2
> Environment: Dev
>Reporter: Ryan Hobbs
>
> During several tests where we simulated failure conditions, we have observed 
> that on HCFS systems where the data stream is of type FSDataOutputStream, 
> Flink will issue hflush() and not hsync() which results in data loss.
> In the class *StreamWriterBase.java* the code below will execute hsync if the 
> output stream is of type *HdfsDataOutputStream* but not for streams of type 
> *FSDataOutputStream*.  Is this by design?
> {code}
> protected void hflushOrSync(FSDataOutputStream os) throws IOException {
> try {
> // At this point the refHflushOrSync cannot be null,
> // since register method would have thrown if it was.
> this.refHflushOrSync.invoke(os);
> if (os instanceof HdfsDataOutputStream) {
>   ((HdfsDataOutputStream) 
> os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
>   }
>   } catch (InvocationTargetException e) {
> String msg = "Error while trying to hflushOrSync!";
> LOG.error(msg + " " + e.getCause());
> Throwable cause = e.getCause();
> if (cause != null && cause instanceof IOException) {
> throw (IOException) cause;
>   }
> throw new RuntimeException(msg, e);
>   } catch (Exception e) {
> String msg = "Error while trying to hflushOrSync!";
> LOG.error(msg + " " + e);
> throw new RuntimeException(msg, e);
>   }
>   }
> {code}
> Could a potential fix be to perform a sync even on streams of type 
> *FSDataOutputStream*?
> {code}
>  if (os instanceof HdfsDataOutputStream) {
> ((HdfsDataOutputStream) 
> os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
> } else if (os instanceof FSDataOutputStream) {
> os.hsync();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7793) SlotManager releases idle TaskManager in standalone mode

2017-10-10 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7793:


 Summary: SlotManager releases idle TaskManager in standalone mode
 Key: FLINK-7793
 URL: https://issues.apache.org/jira/browse/FLINK-7793
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.4.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann


The {{SlotManager}} releases idle {{TaskManagers}} and removes all their slots. 
This also happens in standalone mode where we cannot release task managers. 

I suggest letting the {{ResourceManager}} decide whether a resource can be 
released or not. Only in the former case will we remove the associated slots 
from the {{SlotManager}}.
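The suggested division of responsibility might look roughly like this (all names hypothetical; the real interaction is RPC-based):

```java
import java.util.HashSet;
import java.util.Set;

public class ReleaseDecisionSketch {
    // Hypothetical callback: the ResourceManager decides whether a resource
    // can actually be released (a standalone deployment would return false).
    interface ResourceActions {
        boolean releaseResource(String taskManagerId);
    }

    // Only remove the TaskManager's slots if the release really happened.
    static boolean handleIdleTaskManager(String id, ResourceActions rm, Set<String> registeredSlots) {
        if (rm.releaseResource(id)) {
            registeredSlots.removeIf(slot -> slot.startsWith(id + "/"));
            return true;
        }
        return false; // standalone: keep the slots registered
    }

    public static void main(String[] args) {
        Set<String> slots = new HashSet<>(Set.of("tm1/0", "tm1/1", "tm2/0"));
        ResourceActions standalone = tm -> false; // cannot release task managers
        ResourceActions elastic = tm -> true;     // e.g. Yarn/Mesos could release
        System.out.println(handleIdleTaskManager("tm1", standalone, slots) + " " + slots.size()); // false 3
        System.out.println(handleIdleTaskManager("tm1", elastic, slots) + " " + slots.size());    // true 1
    }
}
```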



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7709) Port CheckpointStatsDetailsHandler to new REST endpoint

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198841#comment-16198841
 ] 

ASF GitHub Bot commented on FLINK-7709:
---

Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/4763


> Port CheckpointStatsDetailsHandler to new REST endpoint
> ---
>
> Key: FLINK-7709
> URL: https://issues.apache.org/jira/browse/FLINK-7709
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{CheckpointStatsDetailsHandler}} to new REST endpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4763: [FLINK-7709] Add CheckpointStatisticDetailsHandler...

2017-10-10 Thread tillrohrmann
Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/4763


---


[jira] [Resolved] (FLINK-7709) Port CheckpointStatsDetailsHandler to new REST endpoint

2017-10-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-7709.
--
Resolution: Done

Added via 0a286d0ff98afa68034daff4634f526eaaf97897

> Port CheckpointStatsDetailsHandler to new REST endpoint
> ---
>
> Key: FLINK-7709
> URL: https://issues.apache.org/jira/browse/FLINK-7709
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{CheckpointStatsDetailsHandler}} to new REST endpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4793: [FLINK-7653] Properly implement Dispatcher#request...

2017-10-10 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4793

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

## What is the purpose of the change

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview

## Verifying this change

Tested the changes manually.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixRequestStatusOverview

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4793


commit 6b7aec2ce3799a35ed21bd04345abe0ab0b8298e
Author: Till Rohrmann 
Date:   2017-10-10T14:15:53Z

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview
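The request-then-compile flow described above might be sketched with `CompletableFuture` composition (simplified hypothetical types, not the actual Flink classes):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ClusterOverviewSketch {
    // Simplified hypothetical records; the real overviews carry more fields.
    record ResourceOverview(int taskManagers, int slots) {}
    record ClusterOverview(int taskManagers, int slots, long runningJobs) {}

    // Combine the ResourceManager's overview with per-job status futures,
    // mirroring "request everything, then compile the ClusterOverview".
    static CompletableFuture<ClusterOverview> requestClusterOverview(
            CompletableFuture<ResourceOverview> resourceOverview,
            List<CompletableFuture<String>> jobStatuses) {
        CompletableFuture<Void> allJobs =
                CompletableFuture.allOf(jobStatuses.toArray(new CompletableFuture[0]));
        return resourceOverview.thenCombine(allJobs, (resources, ignored) -> {
            long running = jobStatuses.stream()
                    .map(CompletableFuture::join) // safe: allOf has completed
                    .filter("RUNNING"::equals)
                    .count();
            return new ClusterOverview(resources.taskManagers(), resources.slots(), running);
        });
    }

    public static void main(String[] args) {
        CompletableFuture<ClusterOverview> overview = requestClusterOverview(
                CompletableFuture.completedFuture(new ResourceOverview(2, 8)),
                List.of(CompletableFuture.completedFuture("RUNNING"),
                        CompletableFuture.completedFuture("FINISHED")));
        System.out.println(overview.join()); // 2 TMs, 8 slots, 1 running job
    }
}
```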




---


[jira] [Commented] (FLINK-7653) Properly implement DispatcherGateway methods on the Dispatcher

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198819#comment-16198819
 ] 

ASF GitHub Bot commented on FLINK-7653:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4793

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

## What is the purpose of the change

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview

## Verifying this change

Tested the changes manually.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixRequestStatusOverview

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4793


commit 6b7aec2ce3799a35ed21bd04345abe0ab0b8298e
Author: Till Rohrmann 
Date:   2017-10-10T14:15:53Z

[FLINK-7653] Properly implement Dispatcher#requestClusterOverview

This commit implements the ClusterOverview generation on the Dispatcher. In
order to do this, the Dispatcher requests the ResourceOverview from the
ResourceManager and the job status from all JobMasters. After receiving all
information, it is compiled into the ClusterOverview.

Note: StatusOverview has been renamed to ClusterOverview




> Properly implement DispatcherGateway methods on the Dispatcher
> --
>
> Key: FLINK-7653
> URL: https://issues.apache.org/jira/browse/FLINK-7653
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Affects Versions: 1.4.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Till Rohrmann
>
> Currently, {{DispatcherGateway}} methods such as {{listJobs}}, 
> {{requestStatusOverview}}, and probably other new methods that will be added 
> as we port more existing REST handlers to the new endpoint, have only dummy 
> placeholder implementations in the {{Dispatcher}} marked with TODOs.
> This ticket is to track that they are all eventually properly implemented.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7652) Port CurrentJobIdsHandler to new REST endpoint

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198803#comment-16198803
 ] 

ASF GitHub Bot commented on FLINK-7652:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/4734
  
I agree with @zentol. Moreover, I would like to change 
`MultipleJobsDetails` to not split the job details into running and finished. 
Just a collection of `JobDetails`.
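A flattened `MultipleJobsDetails` along those lines might look like this (hypothetical sketch using Java records; the real classes carry more fields):

```java
import java.util.Collection;
import java.util.List;

public class JobsDetailsSketch {
    // Hypothetical per-job summary; the job's status lives on each entry.
    record JobDetails(String jobId, String status) {}

    // Flattened container: one collection of JobDetails instead of separate
    // "running" and "finished" collections.
    record MultipleJobsDetails(Collection<JobDetails> jobs) {}

    public static void main(String[] args) {
        MultipleJobsDetails details = new MultipleJobsDetails(List.of(
                new JobDetails("job-1", "RUNNING"),
                new JobDetails("job-2", "FINISHED")));
        details.jobs().forEach(j -> System.out.println(j.jobId() + ": " + j.status()));
    }
}
```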


> Port CurrentJobIdsHandler to new REST endpoint
> --
>
> Key: FLINK-7652
> URL: https://issues.apache.org/jira/browse/FLINK-7652
> Project: Flink
>  Issue Type: Sub-task
>  Components: REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{CurrentJobIdsHandler}} to new REST endpoint



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4734: [FLINK-7652] [flip6] Port CurrentJobIdsHandler to new RES...

2017-10-10 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/4734
  
I agree with @zentol. Moreover, I would like to change 
`MultipleJobsDetails` to not split the job details into running and finished. 
Just a collection of `JobDetails`.


---


[jira] [Commented] (FLINK-7790) Unresolved query parameters are not omitted from request

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198797#comment-16198797
 ] 

ASF GitHub Bot commented on FLINK-7790:
---

Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/4788


> Unresolved query parameters are not omitted from request
> 
>
> Key: FLINK-7790
> URL: https://issues.apache.org/jira/browse/FLINK-7790
> Project: Flink
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4788: [FLINK-7790] [REST] Unresolved query params not ad...

2017-10-10 Thread zentol
Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/4788


---


[jira] [Closed] (FLINK-7790) Unresolved query parameters are not omitted from request

2017-10-10 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-7790.
---
Resolution: Fixed

1.4: 6b3fdc288587fe0982f2ffa2e476e0fd3cd61188

> Unresolved query parameters are not omitted from request
> 
>
> Key: FLINK-7790
> URL: https://issues.apache.org/jira/browse/FLINK-7790
> Project: Flink
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7378) Create a fix size (non rebalancing) buffer pool type for the floating buffers

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198784#comment-16198784
 ] 

ASF GitHub Bot commented on FLINK-7378:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4485


> Create a fix size (non rebalancing) buffer pool type for the floating buffers
> -
>
> Key: FLINK-7378
> URL: https://issues.apache.org/jira/browse/FLINK-7378
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently the number of network buffers in {{LocalBufferPool}} for 
> {{SingleInputGate}} is limited by {{a *  + b}}, where a 
> is the number of exclusive buffers for each channel and b is the number of 
> floating buffers shared by all channels.
> Considering the credit-based flow control feature, we want to create a fix 
> size buffer pool used to manage the floating buffers for {{SingleInputGate}}. 
> And the exclusive buffers are assigned to {{InputChannel}} directly.
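A fixed-size, non-rebalancing pool as described could be sketched as follows (hypothetical, not Flink's `LocalBufferPool`): the capacity is set once, and `request()` simply returns null when all floating buffers are handed out.

```java
import java.util.ArrayDeque;

public class FixedBufferPoolSketch {
    // Hypothetical fixed-size pool of floating buffers: the capacity never
    // rebalances; request() returns null when the pool is exhausted.
    static class FixedBufferPool {
        private final ArrayDeque<byte[]> available = new ArrayDeque<>();

        FixedBufferPool(int numBuffers, int bufferSize) {
            for (int i = 0; i < numBuffers; i++) {
                available.add(new byte[bufferSize]);
            }
        }

        byte[] request() { return available.poll(); }

        void recycle(byte[] buffer) { available.add(buffer); }

        int availableBuffers() { return available.size(); }
    }

    public static void main(String[] args) {
        FixedBufferPool pool = new FixedBufferPool(2, 1024);
        byte[] a = pool.request();
        byte[] b = pool.request();
        System.out.println(pool.request() == null); // exhausted: true
        pool.recycle(a);
        System.out.println(pool.availableBuffers()); // 1
    }
}
```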



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7394) Manage exclusive buffers in RemoteInputChannel

2017-10-10 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198781#comment-16198781
 ] 

Chesnay Schepler commented on FLINK-7394:
-

1.4: 1f4bdf44b6fa4fd071d9a45aca7dd6b7a4d3090d

> Manage exclusive buffers in RemoteInputChannel
> --
>
> Key: FLINK-7394
> URL: https://issues.apache.org/jira/browse/FLINK-7394
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> This is a part of work for credit-based network flow control. 
> The basic works are:
> * Exclusive buffers are assigned to {{RemoteInputChannel}} after created by 
> {{SingleInputGate}}.
> * {{RemoteInputChannel}} implements {{BufferRecycler}} interface to manage 
> the exclusive buffers itself.
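The recycling scheme described above can be sketched as follows (hypothetical types, not Flink's actual classes): recycling an exclusive buffer returns it to the owning channel rather than to a shared pool.

```java
import java.util.ArrayDeque;

public class ExclusiveBufferSketch {
    // Hypothetical recycler interface, mirroring the idea that a channel
    // manages its own exclusive buffers.
    interface BufferRecycler {
        void recycle(byte[] buffer);
    }

    static class RemoteChannel implements BufferRecycler {
        private final ArrayDeque<byte[]> exclusiveBuffers = new ArrayDeque<>();

        // Exclusive buffers are assigned to the channel once, after creation.
        void assignExclusiveBuffers(int count, int size) {
            for (int i = 0; i < count; i++) {
                exclusiveBuffers.add(new byte[size]);
            }
        }

        byte[] take() { return exclusiveBuffers.poll(); }

        // Recycling returns the buffer to this channel, not to a shared pool.
        @Override
        public void recycle(byte[] buffer) { exclusiveBuffers.add(buffer); }

        int available() { return exclusiveBuffers.size(); }
    }

    public static void main(String[] args) {
        RemoteChannel channel = new RemoteChannel();
        channel.assignExclusiveBuffers(2, 1024);
        byte[] buf = channel.take();
        System.out.println(channel.available()); // 1
        channel.recycle(buf);
        System.out.println(channel.available()); // 2
    }
}
```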



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7699) Define the BufferListener interface to replace EventListener in BufferProvider

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198785#comment-16198785
 ] 

ASF GitHub Bot commented on FLINK-7699:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4735


> Define the BufferListener interface to replace EventListener in BufferProvider
> --
>
> Key: FLINK-7699
> URL: https://issues.apache.org/jira/browse/FLINK-7699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently the {{EventListener}} is used in {{BufferProvider}} to be notified 
> of an available buffer or a destroyed pool. 
> To make the semantics clear, we define a new {{BufferListener}} interface which 
> can opt for a one-time-only notification or be notified repeatedly. We 
> can also notify of pool destruction via the explicit method 
> {{notifyBufferDestroyed}}.
> The {{RemoteInputChannel}} will implement the {{BufferListener}} interface to 
> wait for floating buffers from {{BufferProvider}}.
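The opt-in one-time vs. repeated notification could be sketched like this (hypothetical interface shape, guided by the description above): the boolean return value tells the provider whether to keep the listener registered.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BufferListenerSketch {
    // Hypothetical listener: the return value of notifyBufferAvailable tells
    // the provider whether the listener wants further notifications.
    interface BufferListener {
        boolean notifyBufferAvailable(byte[] buffer); // true = notify again
        void notifyBufferDestroyed();
    }

    static class Provider {
        private final List<BufferListener> listeners = new ArrayList<>();

        void register(BufferListener listener) { listeners.add(listener); }

        // On each available buffer, drop listeners that asked for one-time delivery.
        void onBufferAvailable(byte[] buffer) {
            Iterator<BufferListener> it = listeners.iterator();
            while (it.hasNext()) {
                if (!it.next().notifyBufferAvailable(buffer)) {
                    it.remove();
                }
            }
        }

        int listenerCount() { return listeners.size(); }
    }

    public static void main(String[] args) {
        Provider provider = new Provider();
        provider.register(new BufferListener() {
            public boolean notifyBufferAvailable(byte[] b) { return false; } // one-time
            public void notifyBufferDestroyed() {}
        });
        provider.register(new BufferListener() {
            public boolean notifyBufferAvailable(byte[] b) { return true; } // repeated
            public void notifyBufferDestroyed() {}
        });
        provider.onBufferAvailable(new byte[16]);
        System.out.println(provider.listenerCount()); // 1: only the repeated listener remains
    }
}
```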



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7394) Manage exclusive buffers in RemoteInputChannel

2017-10-10 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198787#comment-16198787
 ] 

Chesnay Schepler commented on FLINK-7394:
-

[~zjwang] Please leave it to the committer to close the JIRA, as we otherwise 
risk a JIRA being closed even though no commit has been 
merged.

> Manage exclusive buffers in RemoteInputChannel
> --
>
> Key: FLINK-7394
> URL: https://issues.apache.org/jira/browse/FLINK-7394
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> This is a part of work for credit-based network flow control. 
> The basic works are:
> * Exclusive buffers are assigned to {{RemoteInputChannel}} after created by 
> {{SingleInputGate}}.
> * {{RemoteInputChannel}} implements {{BufferRecycler}} interface to manage 
> the exclusive buffers itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4485: [FLINK-7378][core]Create a fix size (non rebalanci...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4485


---


[jira] [Commented] (FLINK-7378) Create a fix size (non rebalancing) buffer pool type for the floating buffers

2017-10-10 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198780#comment-16198780
 ] 

Chesnay Schepler commented on FLINK-7378:
-

1.4: 450d9df9e96718575ab2979f256f99be4d699636

> Create a fix size (non rebalancing) buffer pool type for the floating buffers
> -
>
> Key: FLINK-7378
> URL: https://issues.apache.org/jira/browse/FLINK-7378
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently the number of network buffers in {{LocalBufferPool}} for 
> {{SingleInputGate}} is limited by {{a *  + b}}, where a 
> is the number of exclusive buffers for each channel and b is the number of 
> floating buffers shared by all channels.
> Considering the credit-based flow control feature, we want to create a fix 
> size buffer pool used to manage the floating buffers for {{SingleInputGate}}. 
> And the exclusive buffers are assigned to {{InputChannel}} directly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7394) Manage exclusive buffers in RemoteInputChannel

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198783#comment-16198783
 ] 

ASF GitHub Bot commented on FLINK-7394:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4499


> Manage exclusive buffers in RemoteInputChannel
> --
>
> Key: FLINK-7394
> URL: https://issues.apache.org/jira/browse/FLINK-7394
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> This is a part of work for credit-based network flow control. 
> The basic works are:
> * Exclusive buffers are assigned to {{RemoteInputChannel}} after created by 
> {{SingleInputGate}}.
> * {{RemoteInputChannel}} implements {{BufferRecycler}} interface to manage 
> the exclusive buffers itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4735: [FLINK-7699][core] Define the BufferListener inter...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4735


---


[GitHub] flink pull request #4499: [FLINK-7394][core] Manage exclusive buffers in Rem...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4499


---


[jira] [Reopened] (FLINK-7699) Define the BufferListener interface to replace EventListener in BufferProvider

2017-10-10 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reopened FLINK-7699:
-

> Define the BufferListener interface to replace EventListener in BufferProvider
> --
>
> Key: FLINK-7699
> URL: https://issues.apache.org/jira/browse/FLINK-7699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently the {{EventListener}} is used in {{BufferProvider}} to be notified 
> of an available buffer or a destroyed pool. 
> To make the semantics clear, we define a new {{BufferListener}} interface which 
> can opt for a one-time-only notification or be notified repeatedly. We 
> can also notify of pool destruction via the explicit method 
> {{notifyBufferDestroyed}}.
> The {{RemoteInputChannel}} will implement the {{BufferListener}} interface to 
> wait for floating buffers from {{BufferProvider}}.
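The listener contract described above (one-time vs. repeated notification, plus an explicit destruction callback) could look roughly like the following. This is a hypothetical sketch; the names and signatures are illustrative, not Flink's actual interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;

interface BufferListener {
    /** Return true to keep receiving notifications, false for one-time only. */
    boolean notifyBufferAvailable(byte[] buffer);

    /** Explicit pool-destruction callback, replacing a generic event. */
    void notifyBufferDestroyed();
}

class BufferProviderSketch {
    private final Deque<BufferListener> listeners = new ArrayDeque<>();

    void register(BufferListener l) {
        listeners.add(l);
    }

    // Hand an available buffer to the oldest waiting listener; its return
    // value decides whether it stays registered.
    void onBufferAvailable(byte[] buffer) {
        BufferListener l = listeners.poll();
        if (l != null && l.notifyBufferAvailable(buffer)) {
            listeners.add(l); // opted into repeated notification
        }
    }

    // Destruction is signaled explicitly to every registered listener.
    void destroy() {
        BufferListener l;
        while ((l = listeners.poll()) != null) {
            l.notifyBufferDestroyed();
        }
    }
}
```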



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7699) Define the BufferListener interface to replace EventListener in BufferProvider

2017-10-10 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-7699.
---
Resolution: Fixed

1.4: 331b7778e4533cd140ffe6e3edaa79994122a592

> Define the BufferListener interface to replace EventListener in BufferProvider
> --
>
> Key: FLINK-7699
> URL: https://issues.apache.org/jira/browse/FLINK-7699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently, {{EventListener}} is used in {{BufferProvider}} to receive 
> notifications of buffer availability or pool destruction. 
> To make the semantics clearer, we define a new {{BufferListener}} interface which 
> can opt for a one-time-only notification or be notified repeatedly. The 
> destruction of the pool can also be signaled via the explicit method 
> {{notifyBufferDestroyed}}.
> The {{RemoteInputChannel}} will implement the {{BufferListener}} interface to 
> wait for floating buffers from {{BufferProvider}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7792) CliFrontend tests suppress logging

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198747#comment-16198747
 ] 

ASF GitHub Bot commented on FLINK-7792:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/4792

[FLINK-7792] [tests][client] Only suppress stdout for CLI tests

## What is the purpose of the change

This PR modifies CliFrontendTestUtils to suppress only stdout. 
Previously we also suppressed stderr, which disabled logging as well.

## Verifying this change

To verify that this does not cause noisy tests, run all tests under 
`org.apache.flink.client` in `flink-clients` which are the sole users of this 
method.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 7792

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4792


commit 57c8293435de3e2c42fa6e37a95a3c62f69f710c
Author: zentol 
Date:   2017-10-10T14:38:08Z

[FLINK-7792] [tests][client] Only suppress stdout for CLI tests




> CliFrontend tests suppress logging
> --
>
> Key: FLINK-7792
> URL: https://issues.apache.org/jira/browse/FLINK-7792
> Project: Flink
>  Issue Type: Bug
>  Components: Client, Tests
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.4.0
>
>
> The CliFrontendTests in flink-clients call 
> `CliFrontendTestUtils#pipeSystemOutToNull` to suppress the various print 
> statements.
> However, this method also redirects stderr, causing all log output to be 
> suppressed as well.
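The fix described in this issue amounts to redirecting only System.out to a null stream while leaving System.err (and any log output routed to it) untouched. A minimal sketch, with illustrative names rather than Flink's actual utility methods:

```java
import java.io.OutputStream;
import java.io.PrintStream;

class CliTestOutputSketch {
    // Capture the real stdout once so it can be restored after the test.
    private static final PrintStream ORIGINAL_OUT = System.out;

    // Swallow stdout only; System.err stays untouched, so logging survives.
    static void pipeSystemOutToNull() {
        System.setOut(new PrintStream(new OutputStream() {
            @Override
            public void write(int b) {
                // discard
            }
        }));
    }

    static void restoreSystemOut() {
        System.setOut(ORIGINAL_OUT);
    }
}
```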



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4792: [FLINK-7792] [tests][client] Only suppress stdout ...

2017-10-10 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/4792

[FLINK-7792] [tests][client] Only suppress stdout for CLI tests

## What is the purpose of the change

This PR modifies CliFrontendTestUtils to suppress only stdout. 
Previously we also suppressed stderr, which disabled logging as well.

## Verifying this change

To verify that this does not cause noisy tests, run all tests under 
`org.apache.flink.client` in `flink-clients` which are the sole users of this 
method.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 7792

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4792


commit 57c8293435de3e2c42fa6e37a95a3c62f69f710c
Author: zentol 
Date:   2017-10-10T14:38:08Z

[FLINK-7792] [tests][client] Only suppress stdout for CLI tests




---


[jira] [Closed] (FLINK-7699) Define the BufferListener interface to replace EventListener in BufferProvider

2017-10-10 Thread zhijiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang closed FLINK-7699.
---
Resolution: Fixed

> Define the BufferListener interface to replace EventListener in BufferProvider
> --
>
> Key: FLINK-7699
> URL: https://issues.apache.org/jira/browse/FLINK-7699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently, {{EventListener}} is used in {{BufferProvider}} to receive 
> notifications of buffer availability or pool destruction. 
> To make the semantics clearer, we define a new {{BufferListener}} interface which 
> can opt for a one-time-only notification or be notified repeatedly. The 
> destruction of the pool can also be signaled via the explicit method 
> {{notifyBufferDestroyed}}.
> The {{RemoteInputChannel}} will implement the {{BufferListener}} interface to 
> wait for floating buffers from {{BufferProvider}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7792) CliFrontend tests suppress logging

2017-10-10 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-7792:

Priority: Trivial  (was: Major)

> CliFrontend tests suppress logging
> --
>
> Key: FLINK-7792
> URL: https://issues.apache.org/jira/browse/FLINK-7792
> Project: Flink
>  Issue Type: Bug
>  Components: Client, Tests
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.4.0
>
>
> The CliFrontendTests in flink-clients call 
> `CliFrontendTestUtils#pipeSystemOutToNull` to suppress the various print 
> statements.
> However, this method also redirects stderr, causing all log output to be 
> suppressed as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7792) CliFrontend tests suppress logging

2017-10-10 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7792:
---

 Summary: CliFrontend tests suppress logging
 Key: FLINK-7792
 URL: https://issues.apache.org/jira/browse/FLINK-7792
 Project: Flink
  Issue Type: Bug
  Components: Client, Tests
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.4.0


The CliFrontendTests in flink-clients call 
`CliFrontendTestUtils#pipeSystemOutToNull` to suppress the various print 
statements.

However, this method also redirects stderr, causing all log output to be 
suppressed as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4791: [hotfix] [Javadoc] Fix typos

2017-10-10 Thread GJL
GitHub user GJL opened a pull request:

https://github.com/apache/flink/pull/4791

[hotfix] [Javadoc] Fix typos

Fix typos in Javadoc for classes:

- `org.apache.flink.api.java.typeutils.InputTypeConfigurable`
- `org.apache.flink.core.fs.FileSystem`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/GJL/flink hotfix-javadoc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4791.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4791


commit e690748cc89f4cc96123edeeec320e268558e7a3
Author: gyao 
Date:   2017-10-10T14:26:52Z

[hotfix] [Javadoc] Fix typo in Javadoc for class InputTypeConfigurable

commit 0217118978d604ec3f52dd980ae2abd224397264
Author: gyao 
Date:   2017-10-10T14:27:05Z

[hotfix] [Javadoc] Fix typo in Javadoc for class FileSystem




---


[jira] [Closed] (FLINK-7394) Manage exclusive buffers in RemoteInputChannel

2017-10-10 Thread zhijiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang closed FLINK-7394.
---
Resolution: Fixed

> Manage exclusive buffers in RemoteInputChannel
> --
>
> Key: FLINK-7394
> URL: https://issues.apache.org/jira/browse/FLINK-7394
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> This is part of the work on credit-based network flow control. 
> The basic tasks are:
> * Exclusive buffers are assigned to {{RemoteInputChannel}} after it is created by 
> {{SingleInputGate}}.
> * {{RemoteInputChannel}} implements the {{BufferRecycler}} interface to manage 
> the exclusive buffers itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7378) Create a fix size (non rebalancing) buffer pool type for the floating buffers

2017-10-10 Thread zhijiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang closed FLINK-7378.
---
Resolution: Fixed

> Create a fix size (non rebalancing) buffer pool type for the floating buffers
> -
>
> Key: FLINK-7378
> URL: https://issues.apache.org/jira/browse/FLINK-7378
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: zhijiang
>Assignee: zhijiang
> Fix For: 1.4.0
>
>
> Currently the number of network buffers in {{LocalBufferPool}} for 
> {{SingleInputGate}} is limited by {{a * numberOfChannels + b}}, where a 
> is the number of exclusive buffers for each channel and b is the number of 
> floating buffers shared by all channels.
> Considering the credit-based flow control feature, we want to create a 
> fixed-size buffer pool used to manage the floating buffers for 
> {{SingleInputGate}}. The exclusive buffers are assigned to {{InputChannel}} directly.
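A fixed-size, non-rebalancing pool for the floating buffers could be sketched as below. This is a hypothetical illustration of the idea (the pool's capacity never changes, unlike a rebalancing {{LocalBufferPool}}); the names are not Flink's actual classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A pool whose capacity is fixed at construction time: requests may fail
// when the pool is empty, and recycling never grows the pool.
class FixedBufferPoolSketch {
    private final Deque<byte[]> buffers = new ArrayDeque<>();
    private final int poolSize;

    FixedBufferPoolSketch(int poolSize, int bufferSize) {
        this.poolSize = poolSize;
        for (int i = 0; i < poolSize; i++) {
            buffers.add(new byte[bufferSize]);
        }
    }

    /** Returns null when all floating buffers are handed out. */
    byte[] requestBuffer() {
        return buffers.poll();
    }

    /** Recycled buffers go back to this pool; its size never changes. */
    void recycle(byte[] buffer) {
        if (buffers.size() < poolSize) {
            buffers.add(buffer);
        }
    }

    int availableBuffers() {
        return buffers.size();
    }
}
```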



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7704) Port JobPlanHandler to new REST endpoint

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198732#comment-16198732
 ] 

ASF GitHub Bot commented on FLINK-7704:
---

Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4768
  
CC @tillrohrmann 


> Port JobPlanHandler to new REST endpoint
> 
>
> Key: FLINK-7704
> URL: https://issues.apache.org/jira/browse/FLINK-7704
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Hai Zhou UTC+8
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> Port existing {{JobPlanHandler}} to new REST endpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4768: [FLINK-7704][flip6] Add JobPlanHandler for new RestServer...

2017-10-10 Thread yew1eb
Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4768
  
CC @tillrohrmann 


---


[jira] [Commented] (FLINK-1268) FileOutputFormat with overwrite does not clear local output directories

2017-10-10 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198727#comment-16198727
 ] 

Gabor Gevay commented on FLINK-1268:


This issue just happened to me. I ran my job locally with parallelism 8, and 
then later with 4, and then I was debugging for an hour to figure out what went 
wrong.

> FileOutputFormat with overwrite does not clear local output directories
> ---
>
> Key: FLINK-1268
> URL: https://issues.apache.org/jira/browse/FLINK-1268
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Reporter: Till Rohrmann
>Priority: Minor
>
> I noticed that the FileOutputFormat does not clear the output directories if 
> it writes to local disk. This has the consequence that previous partitions 
> are still contained in the directory if one decreases the DOP between 
> subsequent runs. If one reads the data from this directory, then more 
> partitions will be read in than were actually written. This can lead to a 
> wrong user code behaviour which is hard to debug. I'm aware that in case of a 
> distributed execution the TaskManagers or the Tasks have to be responsible 
> for the cleanup and if multiple Tasks are running on a TaskManager, then the 
> cleanup has to be coordinated.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7653) Properly implement DispatcherGateway methods on the Dispatcher

2017-10-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-7653:


Assignee: Till Rohrmann

> Properly implement DispatcherGateway methods on the Dispatcher
> --
>
> Key: FLINK-7653
> URL: https://issues.apache.org/jira/browse/FLINK-7653
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Affects Versions: 1.4.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Till Rohrmann
>
> Currently, {{DispatcherGateway}} methods such as {{listJobs}}, 
> {{requestStatusOverview}}, and probably other new methods that will be added 
> as we port more existing REST handlers to the new endpoint, have only dummy 
> placeholder implementations in the {{Dispatcher}} marked with TODOs.
> This ticket is to track that they are all eventually properly implemented.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7653) Properly implement DispatcherGateway methods on the Dispatcher

2017-10-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-7653:
-
Affects Version/s: 1.4.0

> Properly implement DispatcherGateway methods on the Dispatcher
> --
>
> Key: FLINK-7653
> URL: https://issues.apache.org/jira/browse/FLINK-7653
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, REST, Webfrontend
>Affects Versions: 1.4.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Till Rohrmann
>
> Currently, {{DispatcherGateway}} methods such as {{listJobs}}, 
> {{requestStatusOverview}}, and probably other new methods that will be added 
> as we port more existing REST handlers to the new endpoint, have only dummy 
> placeholder implementations in the {{Dispatcher}} marked with TODOs.
> This ticket is to track that they are all eventually properly implemented.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-10 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198722#comment-16198722
 ] 

Stephan Ewen commented on FLINK-7737:
-

Or, let me rephrase the question: Is the data replicated to the required number 
of nodes after {{hflush()}}, or only after {{hsync()}}?

If that is the case, then would the best solution be to just always call 
{{hsync()}} on checkpoints, rather than {{hflush()}}?

> On HCFS systems, FSDataOutputStream does not issue hsync only hflush which 
> leads to data loss
> -
>
> Key: FLINK-7737
> URL: https://issues.apache.org/jira/browse/FLINK-7737
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.3.2
> Environment: Dev
>Reporter: Ryan Hobbs
>
> During several tests where we simulated failure conditions, we have observed 
> that on HCFS systems where the data stream is of type FSDataOutputStream, 
> Flink will issue hflush() and not hsync() which results in data loss.
> In the class *StreamWriterBase.java* the code below will execute hsync if the 
> output stream is of type *HdfsDataOutputStream* but not for streams of type 
> *FSDataOutputStream*.  Is this by design?
> {code}
> protected void hflushOrSync(FSDataOutputStream os) throws IOException {
>   try {
>     // At this point the refHflushOrSync cannot be null,
>     // since register method would have thrown if it was.
>     this.refHflushOrSync.invoke(os);
>     if (os instanceof HdfsDataOutputStream) {
>       ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
>     }
>   } catch (InvocationTargetException e) {
>     String msg = "Error while trying to hflushOrSync!";
>     LOG.error(msg + " " + e.getCause());
>     Throwable cause = e.getCause();
>     if (cause != null && cause instanceof IOException) {
>       throw (IOException) cause;
>     }
>     throw new RuntimeException(msg, e);
>   } catch (Exception e) {
>     String msg = "Error while trying to hflushOrSync!";
>     LOG.error(msg + " " + e);
>     throw new RuntimeException(msg, e);
>   }
> }
> {code}
> Could a potential fix be to perform a sync even on streams of type 
> *FSDataOutputStream*?
> {code}
> if (os instanceof HdfsDataOutputStream) {
>   ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
> } else if (os instanceof FSDataOutputStream) {
>   os.hsync();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

