[jira] [Closed] (FLINK-7566) if there's only one checkpointing metadata file in <dir>, `flink run -s <dir>` should successfully resume from that metadata file

2018-04-04 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-7566.
---
Resolution: Won't Fix

> if there's only one checkpointing metadata file in <dir>, `flink run -s 
> <dir>` should successfully resume from that metadata file 
> --
>
> Key: FLINK-7566
> URL: https://issues.apache.org/jira/browse/FLINK-7566
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
>
> Currently, if we want to start a Flink job from a checkpoint file, we have 
> to run `flink run -s <dir>/checkpoint_metadata-x` by explicitly 
> specifying the checkpoint metadata file name 'checkpoint_metadata-x'. 
> Since the metadata file name always changes, it's not easy to programmatically 
> restart a failed Flink job. The error from jobmanager.log looks like:
> {code:java}
> 2017-08-30 07:25:04,907 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job  
> (22defcf962ff2ac2e7fe99354f5ab168) switched from state FAILING to FAILED.
> org.apache.flink.runtime.execution.SuppressRestartsException: Unrecoverable 
> failure. This suppresses job restarts. Please check the stack trace for the 
> root cause.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1396)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.io.IOException: Cannot find meta data file in directory 
> s3:///checkpoints. Please try to load the savepoint directly from the 
> meta data file instead of the directory.
>   at 
> org.apache.flink.runtime.checkpoint.savepoint.SavepointStore.loadSavepointWithHandle(SavepointStore.java:262)
>   at 
> org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:69)
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1140)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
>   ... 10 more
> {code}
> What I want is this: users should be able to start a Flink job by 
> running `flink run -s <dir>` if there's only one checkpointing metadata file 
> in <dir>. If there's none or more than one metadata file, the command can fail 
> like it does right now. This way, we can programmatically restart a failed 
> Flink job by hardcoding <dir>.
> To achieve that, I think there are two approaches we can take:
> 1) modify {{CheckpointCoordinator.restoreSavepoint}} to check how many 
> metadata files are in <dir> (see the sketch below)
> 2) add another command-line option like '-sd' / '--savepointdirectory' to 
> explicitly load a dir
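> A minimal sketch of approach 1, assuming a hypothetical helper (this is not 
> the actual Flink API): resolve the path only if exactly one metadata file is 
> found in <dir>, and fail otherwise, as today:
> {code:java}
> import java.io.File;
> import java.io.IOException;
> 
> public final class SavepointPathResolver {
> 
>     /** Returns the single checkpoint metadata file in dir, or throws. */
>     public static File resolveMetadataFile(File dir) throws IOException {
>         File[] candidates = dir.listFiles(
>             (d, name) -> name.startsWith("checkpoint_metadata-"));
>         if (candidates == null || candidates.length != 1) {
>             throw new IOException("Expected exactly one metadata file in " + dir
>                 + " but found " + (candidates == null ? 0 : candidates.length));
>         }
>         return candidates[0];
>     }
> }
> {code}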



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8837) add @Experimental annotation and properly annotate some classes

2018-04-04 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-8837:

Affects Version/s: 1.5.0

> add @Experimental annotation and properly annotate some classes
> 
>
> Key: FLINK-8837
> URL: https://issues.apache.org/jira/browse/FLINK-8837
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.5.0
>Reporter: Stephan Ewen
>Assignee: Bowen Li
>Priority: Blocker
> Fix For: 1.5.0
>
>
> The class {{DataStreamUtils}} came from 'flink-contrib' and has now accidentally 
> moved to the fully supported API packages. It should be in the package 
> 'experimental' to properly communicate that it is not guaranteed to be API 
> stable.
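> A minimal sketch of what such an annotation could look like, modeled on 
> Flink's existing annotations such as {{@PublicEvolving}} (the actual 
> definition may differ):
> {code:java}
> import java.lang.annotation.Documented;
> import java.lang.annotation.ElementType;
> import java.lang.annotation.Target;
> 
> /** Marks classes, methods, and fields as experimental, i.e. not API-stable. */
> @Documented
> @Target({ ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR })
> public @interface Experimental {
> }
> {code}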



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9127) Filesystem State Backend logged incorrectly

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426530#comment-16426530
 ] 

ASF GitHub Bot commented on FLINK-9127:
---

Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/5810
  
The original logging is correct - the filesystem state backend is actually the 
memory state backend + filesystem checkpointing. No need to change the logging. 
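For reference, a minimal sketch of how the filesystem backend is typically 
configured (the HDFS path is a placeholder): working state stays on the heap, 
while checkpoints go to the configured URI:
{code:java}
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsBackendExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // working state lives on the TaskManager heap (like the memory backend);
        // only checkpoints are written to the file system below
        env.setStateBackend(new FsStateBackend("hdfs://namenode:8020/flink/checkpoints"));
    }
}
{code}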


> Filesystem State Backend logged incorrectly
> ---
>
> Key: FLINK-9127
> URL: https://issues.apache.org/jira/browse/FLINK-9127
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.3.2, 1.4.2
>Reporter: Scott Kidder
>Priority: Trivial
>
> When using a filesystem backend, the 
> '[StateBackendLoader|https://github.com/apache/flink/blob/1f9c2d9740ffea2b59b8f5f3da287a0dc890ddbf/flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackendLoader.java#L123]'
>  class produces a log message stating: "State backend is set to heap memory". 
> Example:
> {{2018-04-04 00:45:49,591 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask           - State backend 
> is set to heap memory (checkpoints to filesystem 
> "hdfs://hdfs:8020/flink/checkpoints")}}
> It looks like this resulted from some copy-pasta of the previous 
> case-statement that matches on the memory backend. This bug is also present 
> in earlier releases (1.3.2, 1.4.0) of Flink in the 'AbstractStateBackend' 
> class.
> This log statement should be corrected to indicate that a filesystem backend 
> is in use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5810: [FLINK-9127] [Core] Filesystem State Backend logged incor...

2018-04-04 Thread bowenli86
Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/5810
  
The original logging is correct - the filesystem state backend is actually the 
memory state backend + filesystem checkpointing. No need to change the logging. 


---


[jira] [Commented] (FLINK-9015) Upgrade Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426458#comment-16426458
 ] 

Shuyi Chen commented on FLINK-9015:
---

Duplicate of [FLINK-9134|https://issues.apache.org/jira/browse/FLINK-9134]. 
I'll close this one.

> Upgrade Calcite dependency to 1.17
> --
>
> Key: FLINK-9015
> URL: https://issues.apache.org/jira/browse/FLINK-9015
> Project: Flink
>  Issue Type: Task
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-9015) Upgrade Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen closed FLINK-9015.
-
Resolution: Duplicate

> Upgrade Calcite dependency to 1.17
> --
>
> Key: FLINK-9015
> URL: https://issues.apache.org/jira/browse/FLINK-9015
> Project: Flink
>  Issue Type: Task
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9134) Update Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426455#comment-16426455
 ] 

Shuyi Chen commented on FLINK-9134:
---

Actually, let's close 
[FLINK-9015|https://issues.apache.org/jira/browse/FLINK-9015].

> Update Calcite dependency to 1.17
> -
>
> Key: FLINK-9134
> URL: https://issues.apache.org/jira/browse/FLINK-9134
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Shuyi Chen
>Priority: Major
>
> This is an umbrella issue for tasks that need to be performed when upgrading 
> to Calcite 1.17 once it is released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (FLINK-9134) Update Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen reopened FLINK-9134:
---
  Assignee: Shuyi Chen

> Update Calcite dependency to 1.17
> -
>
> Key: FLINK-9134
> URL: https://issues.apache.org/jira/browse/FLINK-9134
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Shuyi Chen
>Priority: Major
>
> This is an umbrella issue for tasks that need to be performed when upgrading 
> to Calcite 1.17 once it is released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-9134) Update Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen closed FLINK-9134.
-
Resolution: Duplicate

> Update Calcite dependency to 1.17
> -
>
> Key: FLINK-9134
> URL: https://issues.apache.org/jira/browse/FLINK-9134
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Priority: Major
>
> This is an umbrella issue for tasks that need to be performed when upgrading 
> to Calcite 1.17 once it is released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9089) Upgrade Orc dependency to 1.4.3

2018-04-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-9089:
--
Description: 
Currently flink-orc uses Orc 1.4.1 release.


This issue upgrades to Orc 1.4.3

  was:
Currently flink-orc uses Orc 1.4.1 release.

This issue upgrades to Orc 1.4.3


> Upgrade Orc dependency to 1.4.3
> ---
>
> Key: FLINK-9089
> URL: https://issues.apache.org/jira/browse/FLINK-9089
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Minor
>
> Currently flink-orc uses Orc 1.4.1 release.
> This issue upgrades to Orc 1.4.3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5818: sql() is deprecated.

2018-04-04 Thread mayyamus
GitHub user mayyamus opened a pull request:

https://github.com/apache/flink/pull/5818

sql() is deprecated.

## Brief change log

- Replace the deprecated method (`sql()`) with `sqlQuery()` (see the sketch below)
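
A minimal before/after sketch of the change, assuming an existing 
`TableEnvironment tableEnv` (the query itself is illustrative):

```java
// before (deprecated):
// Table result = tableEnv.sql("SELECT user, COUNT(*) AS cnt FROM Orders GROUP BY user");

// after:
Table result = tableEnv.sqlQuery("SELECT user, COUNT(*) AS cnt FROM Orders GROUP BY user");
```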

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mayyamus/flink fix_table_docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5818.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5818


commit a80733769591f4f87fdf38794771604583c4d49b
Author: mayyamus 
Date:   2018-04-05T01:54:11Z

sql() is deprecated.




---


[jira] [Commented] (FLINK-8703) Migrate tests from LocalFlinkMiniCluster to MiniClusterResource

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426402#comment-16426402
 ] 

ASF GitHub Bot commented on FLINK-8703:
---

Github user pnowojski commented on a diff in the pull request:

https://github.com/apache/flink/pull/5669#discussion_r179332025
  
--- Diff: 
flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java
 ---
@@ -1063,23 +1078,27 @@ public void runCancelingOnEmptyInputTest() throws 
Exception {
 
final AtomicReference error = new 
AtomicReference<>();
 
+   final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
--- End diff --

As you wish, I can open a follow up since it's a trivial fixup.


> Migrate tests from LocalFlinkMiniCluster to MiniClusterResource
> ---
>
> Key: FLINK-8703
> URL: https://issues.apache.org/jira/browse/FLINK-8703
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Aljoscha Krettek
>Assignee: Chesnay Schepler
>Priority: Blocker
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5669: [FLINK-8703][tests] Port KafkaTestBase to MiniClus...

2018-04-04 Thread pnowojski
Github user pnowojski commented on a diff in the pull request:

https://github.com/apache/flink/pull/5669#discussion_r179332025
  
--- Diff: 
flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java
 ---
@@ -1063,23 +1078,27 @@ public void runCancelingOnEmptyInputTest() throws 
Exception {
 
final AtomicReference error = new 
AtomicReference<>();
 
+   final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
--- End diff --

As you wish, I can open a follow up since it's a trivial fixup.


---


[jira] [Created] (FLINK-9139) Allow yarn.tags as a job parameter

2018-04-04 Thread Nikhil Simha (JIRA)
Nikhil Simha created FLINK-9139:
---

 Summary: Allow yarn.tags as a job parameter
 Key: FLINK-9139
 URL: https://issues.apache.org/jira/browse/FLINK-9139
 Project: Flink
  Issue Type: Improvement
  Components: Cluster Management, Configuration
Reporter: Nikhil Simha


Currently `yarn.tags` is accepted as a parameter from flink-conf.yml. I want to 
be able to pass that as a job command-line argument, like `yarnslots` and 
`yarnqueue`.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-5153) Allow setting custom application tags for Flink on YARN

2018-04-04 Thread Nikhil Simha (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426359#comment-16426359
 ] 

Nikhil Simha commented on FLINK-5153:
-

I want to be able to supply an application tag at job submission time - as a 
parameter like yarn.taskmanagerslots or yarn.name - instead of modifying the 
flink-conf. 

> Allow setting custom application tags for Flink on YARN
> ---
>
> Key: FLINK-5153
> URL: https://issues.apache.org/jira/browse/FLINK-5153
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Robert Metzger
>Assignee: Patrick Lucas
>Priority: Major
> Fix For: 1.3.0
>
>
> https://issues.apache.org/jira/browse/YARN-1399 added support in YARN to tag 
> applications.
> We should introduce a configuration variable in Flink allowing users to 
> specify a comma-separated list of tags they want to assign to their Flink on 
> YARN applications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-5153) Allow setting custom application tags for Flink on YARN

2018-04-04 Thread Nikhil Simha (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426359#comment-16426359
 ] 

Nikhil Simha edited comment on FLINK-5153 at 4/5/18 12:17 AM:
--

I want to be able to supply an application tag at job submission time - as a 
parameter like yarn.taskmanagerslots or yarn.name - instead of modifying the 
flink-conf. How do I do that?


was (Author: nikhilsimha):
I want to be able to supply an application tag at job submission time - as a 
parameter like yarn.taskmanagerslots or yarn.name - instead of modifying the 
flink-conf. 

> Allow setting custom application tags for Flink on YARN
> ---
>
> Key: FLINK-5153
> URL: https://issues.apache.org/jira/browse/FLINK-5153
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Robert Metzger
>Assignee: Patrick Lucas
>Priority: Major
> Fix For: 1.3.0
>
>
> https://issues.apache.org/jira/browse/YARN-1399 added support in YARN to tag 
> applications.
> We should introduce a configuration variable in Flink allowing users to 
> specify a comma-separated list of tags they want to assign to their Flink on 
> YARN applications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase

2018-04-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307281#comment-16307281
 ] 

Ted Yu edited comment on FLINK-6105 at 4/4/18 11:18 PM:


In 
flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/RollingSink.java
 :
{code}
  try {
Thread.sleep(500);
  } catch (InterruptedException e1) {
// ignore it
  }
{code}

Interrupt status should be restored, or an InterruptedIOException should be thrown.
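A minimal sketch of the two alternatives (surrounding context elided):
{code:java}
try {
    Thread.sleep(500);
} catch (InterruptedException e1) {
    // restore the interrupt status so callers can still observe it ...
    Thread.currentThread().interrupt();
    // ... or, alternatively, surface it as a java.io.InterruptedIOException:
    // throw new InterruptedIOException("interrupted while waiting");
}
{code}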


was (Author: yuzhih...@gmail.com):
In 
flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/RollingSink.java
 :
{code}
  try {
Thread.sleep(500);
  } catch (InterruptedException e1) {
// ignore it
  }
{code}
Interrupt status should be restored, or an InterruptedIOException should be thrown.

> Properly handle InterruptedException in HadoopInputFormatBase
> -
>
> Key: FLINK-6105
> URL: https://issues.apache.org/jira/browse/FLINK-6105
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Major
>
> When catching InterruptedException, we should throw InterruptedIOException 
> instead of IOException.
> The following example is from HadoopInputFormatBase :
> {code}
> try {
>   splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
>   throw new IOException("Could not get Splits.", e);
> }
> {code}
> There may be other places where IOE is thrown.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7151) FLINK SQL support create temporary function and table

2018-04-04 Thread Shuyi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426316#comment-16426316
 ] 

Shuyi Chen commented on FLINK-7151:
---

I don't have a concrete timeline, but I will try to implement the table DDL 
before the Flink 1.6 release.

> FLINK SQL support create temporary function and table
> -
>
> Key: FLINK-7151
> URL: https://issues.apache.org/jira/browse/FLINK-7151
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: yuemeng
>Assignee: yuemeng
>Priority: Major
>
> Based on CREATE TEMPORARY FUNCTION and CREATE TEMPORARY TABLE, we can 
> register a UDF, UDAF, or UDTF using SQL:
> {code}
> CREATE TEMPORARY function 'TOPK' AS 
> 'com..aggregate.udaf.distinctUdaf.topk.ITopKUDAF';
> INSERT INTO db_sink SELECT id, TOPK(price, 5, 'DESC') FROM kafka_source GROUP 
> BY id;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8554) Upgrade AWS SDK

2018-04-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-8554:
--
Description: 
AWS SDK 1.11.271 fixes a lot of bugs.

One of which would exhibit the following:
{code}
Caused by: java.lang.NullPointerException
at com.amazonaws.metrics.AwsSdkMetrics.getRegion(AwsSdkMetrics.java:729)
at com.amazonaws.metrics.MetricAdmin.getRegion(MetricAdmin.java:67)
at sun.reflect.GeneratedMethodAccessor132.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
{code}

  was:
AWS SDK 1.11.271 fixes a lot of bugs.

One of which would exhibit the following:

{code}
Caused by: java.lang.NullPointerException
at com.amazonaws.metrics.AwsSdkMetrics.getRegion(AwsSdkMetrics.java:729)
at com.amazonaws.metrics.MetricAdmin.getRegion(MetricAdmin.java:67)
at sun.reflect.GeneratedMethodAccessor132.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
{code}


> Upgrade AWS SDK
> ---
>
> Key: FLINK-8554
> URL: https://issues.apache.org/jira/browse/FLINK-8554
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>
> AWS SDK 1.11.271 fixes a lot of bugs.
> One of which would exhibit the following:
> {code}
> Caused by: java.lang.NullPointerException
>   at com.amazonaws.metrics.AwsSdkMetrics.getRegion(AwsSdkMetrics.java:729)
>   at com.amazonaws.metrics.MetricAdmin.getRegion(MetricAdmin.java:67)
>   at sun.reflect.GeneratedMethodAccessor132.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-04-04 Thread Jamie Grier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426132#comment-16426132
 ] 

Jamie Grier commented on FLINK-9061:


Yup, sounds good to me :)

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-04-04 Thread Steven Zhen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426124#comment-16426124
 ] 

Steven Zhen Wu edited comment on FLINK-9061 at 4/4/18 8:24 PM:
---

[~jgrier] Amazon doesn't want to reveal internal details, hence they are 
sometimes pretty vague. My understanding is that the prefix (before the 
random/entropy part) has to be fixed. Either way, the latest proposal doesn't 
prevent users from setting the first few chars as the random/entropy part. I 
just think it is important to give users full control over key names.


was (Author: stevenz3wu):
[~jgrier] Amazon doesn't want to reveal internal details, hence they are 
sometimes pretty vague. My understanding is that the prefix (before the 
random/entropy part) has to be fixed. Either way, the latest proposal doesn't 
prevent users from setting the first few chars as the random/entropy part. I 
just think it is important to give users full control over key names.

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-04-04 Thread Steven Zhen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426124#comment-16426124
 ] 

Steven Zhen Wu edited comment on FLINK-9061 at 4/4/18 8:20 PM:
---

[~jgrier] Amazon doesn't want to reveal internal details, hence they are 
sometimes pretty vague. My understanding is that the prefix (before the 
random/entropy part) has to be fixed. Either way, the latest proposal doesn't 
prevent users from setting the first few chars as the random/entropy part. I 
just think it is important to give users full control over key names.


was (Author: stevenz3wu):
[~jgrier] Amazon doesn't want to reveal internal details, hence they are 
sometimes pretty vague. My understanding is that the prefix (before the 
random/entropy part) has to be fixed. Either way, the latest proposal doesn't 
prevent users from setting the first few chars as the random/entropy part. 

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-04-04 Thread Steven Zhen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426124#comment-16426124
 ] 

Steven Zhen Wu commented on FLINK-9061:
---

[~jgrier] Amazon doesn't want to reveal internal details, hence they are 
sometimes pretty vague. My understanding is that the prefix (before the 
random/entropy part) has to be fixed. Either way, the latest proposal doesn't 
prevent users from setting the first few chars as the random/entropy part. 

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-04-04 Thread Jamie Grier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426103#comment-16426103
 ] 

Jamie Grier commented on FLINK-9061:


Okay, this is the best documentation I've found on this:  
[https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html]
 and even that is very vague.

It does appear that it doesn't have to be the very first characters, but it 
brings up an interesting question.  What are the exact constraints here?  Which 
part of the key name is and isn't used for partitioning, exactly?  I mean, 
technically all of our checkpoint objects do in fact have several characters of 
uniqueness, since the last part of the full object key name is the GUID.

Anyway, not having full info sucks.

[~stevenz3wu] I think your proposal sounds good.  Thanks for offering to do the 
PR :)  That should work well, and logical listing of sub-directories should 
still be possible in this scheme by issuing parallel S3 list requests for each 
possible prefix and merging the results.

Shall we proceed with this approach then?
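
For illustration, a minimal sketch of the path-rewriting idea from the issue 
description (the hash choice and key layout are placeholders, not a settled 
design):
{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public final class EntropyPaths {

    /** Rewrites "flink/checkpoints" to e.g. "s3://bucket/3f2a/flink/checkpoints". */
    public static String injectEntropy(String bucket, String path) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
            .digest(path.getBytes(StandardCharsets.UTF_8));
        // first two digest bytes become a 4-hex-char entropy prefix
        String hash = String.format("%02x%02x", digest[0], digest[1]);
        return "s3://" + bucket + "/" + hash + "/" + path;
    }
}
{code}
Listing under such a scheme would then fan out over the 65536 possible 
prefixes in parallel and merge the results, as described above.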

 

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9136) Remove StreamingProgramTestBase

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425945#comment-16425945
 ] 

ASF GitHub Bot commented on FLINK-9136:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5817

[FLINK-9136][tests] Remove StreamingProgramTestBase

Builds on #5816.

## What is the purpose of the change

This PR removes the `StreamingProgramTestBase` class.

The class discourages reusing cluster resources as every test must be set up 
in a new class. Additionally, it appears to mimic JUnit's `@Before/@After` 
life-cycle, but actually doesn't, as the `postSubmit` method is not called 
if `testProgram` fails with an exception. This can lead to resource 
leaks, like for example in the `ContinuousFileProcessingITCase`.

Existing usages were ported to the `AbstractTestBase`.

If a test was using preSubmit to set up data and postSubmit to verify the 
result, then the methods were merged into a single method.

Otherwise, `preSubmit` implementations were annotated with `@Before`, 
`postSubmit` implementations with `@After`, and `testProgram` implementations 
with `@Test` (see the sketch below).

Additionally, the visibility of these methods was set to `public`.
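
A schematic sketch of the migration pattern (the class name and method bodies 
are illustrative):
{code:java}
import org.apache.flink.test.util.AbstractTestBase;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class MyProgramITCase extends AbstractTestBase {

    @Before
    public void preSubmit() throws Exception {
        // set up input data
    }

    @Test
    public void testProgram() throws Exception {
        // run the streaming job under test
    }

    @After
    public void postSubmit() throws Exception {
        // verify results / clean up; this now also runs if the test fails
    }
}
{code}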



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9136

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5817


commit b3a31933432f4ad4eaeffd28ced16a570910986d
Author: zentol 
Date:   2018-04-04T17:49:35Z

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase

commit a56c6f969a5fb03260fa0d2705356f0d01badb18
Author: zentol 
Date:   2018-04-04T17:22:53Z

[FLINK-9136][tests] Remove StreamingProgramTestBase




> Remove StreamingProgramTestBase
> ---
>
> Key: FLINK-9136
> URL: https://issues.apache.org/jira/browse/FLINK-9136
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The {{StreamingProgramTestBase}} should be removed. We can move all existing 
> tests to the {{AbstractTestBase}} with junit annotations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5817: [FLINK-9136][tests] Remove StreamingProgramTestBas...

2018-04-04 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5817

[FLINK-9136][tests] Remove StreamingProgramTestBase

Builds on #5816.

## What is the purpose of the change

This PR removes the `StreamingProgramTestBase` class.

The class discourages reusing cluster resources as every test must be set up 
in a new class. Additionally, it appears to mimic JUnit's `@Before/@After` 
life-cycle, but actually doesn't, as the `postSubmit` method is not called 
if `testProgram` fails with an exception. This can lead to resource 
leaks, like for example in the `ContinuousFileProcessingITCase`.

Existing usages were ported to the `AbstractTestBase`.

If a test was using preSubmit to set up data and postSubmit to verify the 
result, then the methods were merged into a single method.

Otherwise, `preSubmit` implementations were annotated with `@Before`, 
`postSubmit` implementations with `@After`, and `testProgram` implementations 
with `@Test`.

Additionally, the visibility of these methods was set to `public`.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9136

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5817


commit b3a31933432f4ad4eaeffd28ced16a570910986d
Author: zentol 
Date:   2018-04-04T17:49:35Z

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase

commit a56c6f969a5fb03260fa0d2705356f0d01badb18
Author: zentol 
Date:   2018-04-04T17:22:53Z

[FLINK-9136][tests] Remove StreamingProgramTestBase




---


[jira] [Created] (FLINK-9138) Enhance BucketingSink to also flush data by time interval

2018-04-04 Thread Narayanan Arunachalam (JIRA)
Narayanan Arunachalam created FLINK-9138:


 Summary: Enhance BucketingSink to also flush data by time interval
 Key: FLINK-9138
 URL: https://issues.apache.org/jira/browse/FLINK-9138
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Affects Versions: 1.4.2
Reporter: Narayanan Arunachalam


BucketingSink now supports flushing data to the file system by size limit and 
by period of inactivity. It will be useful to also flush data at a specified 
time interval. This way, the data will be written out when write throughput is 
low but there are no significant time gaps between the writes. This 
reduces the ETA for the data in the file system and should help move the 
checkpoints faster as well.
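
For context, a minimal sketch of the flush-related knobs BucketingSink exposes 
today, plus the proposed time-based rollover shown as a hypothetical setter 
(not an existing API):
{code:java}
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

BucketingSink<String> sink = new BucketingSink<>("hdfs://namenode:8020/data");
sink.setBatchSize(1024L * 1024L * 128L);       // existing: roll part files at 128 MB
sink.setInactiveBucketThreshold(60L * 1000L);  // existing: close buckets idle for 60 s
// proposed (hypothetical): also roll part files every 5 minutes regardless of size
// sink.setBatchRolloverInterval(5L * 60L * 1000L);
{code}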



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9134) Update Calcite dependency to 1.17

2018-04-04 Thread Shuyi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425926#comment-16425926
 ] 

Shuyi Chen commented on FLINK-9134:
---

Hi [~twalthr], this is a duplicate of 
[FLINK-9015|https://issues.apache.org/jira/browse/FLINK-9015]. I'll merge and 
close this one.

> Update Calcite dependency to 1.17
> -
>
> Key: FLINK-9134
> URL: https://issues.apache.org/jira/browse/FLINK-9134
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Priority: Major
>
> This is an umbrella issue for tasks that need to be performed when upgrading 
> to Calcite 1.17 once it is released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9137) Merge TopSpeedWindowingExampleITCase into StreamingExamplesITCase

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425924#comment-16425924
 ] 

ASF GitHub Bot commented on FLINK-9137:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5816

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase

## What is the purpose of the change

This PR merges the `TopSpeedWindowingExampleITCase` into the 
`StreamingExamplesITCase`. The latter tests all streaming examples, and I don't 
see a reason to not include the `TopSpeedWindowingExample` as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9137

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5816.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5816


commit b3a31933432f4ad4eaeffd28ced16a570910986d
Author: zentol 
Date:   2018-04-04T17:49:35Z

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase




> Merge TopSpeedWindowingExampleITCase into StreamingExamplesITCase
> -
>
> Key: FLINK-9137
> URL: https://issues.apache.org/jira/browse/FLINK-9137
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5816: [FLINK-9137][tests] Merge TopSpeedWindowingExample...

2018-04-04 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5816

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase

## What is the purpose of the change

This PR merges the `TopSpeedWindowingExampleITCase` into the 
`StreamingExamplesITCase`. The latter tests all streaming examples, and I don't 
see a reason to not include the `TopSpeedWindowingExample` as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9137

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5816.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5816


commit b3a31933432f4ad4eaeffd28ced16a570910986d
Author: zentol 
Date:   2018-04-04T17:49:35Z

[FLINK-9137][tests] Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase




---


[jira] [Created] (FLINK-9137) Merge TopSpeedWindowingExampleITCase into StreamingExamplesITCase

2018-04-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9137:
---

 Summary: Merge TopSpeedWindowingExampleITCase into 
StreamingExamplesITCase
 Key: FLINK-9137
 URL: https://issues.apache.org/jira/browse/FLINK-9137
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9136) Remove StreamingProgramTestBase

2018-04-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9136:
---

 Summary: Remove StreamingProgramTestBase
 Key: FLINK-9136
 URL: https://issues.apache.org/jira/browse/FLINK-9136
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


The {{StreamingProgramTestBase}} should be removed. We can move all existing 
tests to the {{AbstractTestBase}} with junit annotations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425798#comment-16425798
 ] 

Timo Walther edited comment on FLINK-9110 at 4/4/18 5:21 PM:
-

Fixed in 1.6.0: 29fbc95cad2ad05fd08fb82eeac89e0ade011ea6
Fixed in 1.5.0: 4e7b15dad6eb36397f3a07f9f58a1216657493bb
Fixed in 1.4.3: 7610b597b49676299b2dc609d6fd60d4bdccfa2e


was (Author: twalthr):
Fixed in 1.6.0: 29fbc95cad2ad05fd08fb82eeac89e0ade011ea6
Fixed in 1.5.0: 4e7b15dad6eb36397f3a07f9f58a1216657493bb
Fixed in 1.4.0: 7610b597b49676299b2dc609d6fd60d4bdccfa2e

> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-9107) Document timer coalescing for ProcessFunctions

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-9107.
-
   Resolution: Fixed
Fix Version/s: (was: 1.3.4)

Fixed in 1.6.0: 7b0fc58f75494c9a2c71d551632445ded85c0a45
Fixed in 1.5.0: f083622a200c79395ecf16e2be6f8b540fe85178
Fixed in 1.4.3: ca8f4ca4fdf4fbd98a86b32a2a77dbb00742e164

> Document timer coalescing for ProcessFunctions
> --
>
> Key: FLINK-9107
> URL: https://issues.apache.org/jira/browse/FLINK-9107
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Streaming
>Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
> Fix For: 1.5.0, 1.4.3
>
>
> In a {{ProcessFunction}}, registering timers for each event via 
> {{ctx.timerService().registerEventTimeTimer()}} using times like 
> {{ctx.timestamp() + timeout}} will get a millisecond accuracy and may thus 
> create one timer per millisecond which may lead to some overhead in the 
> {{TimerService}}.
> This problem can be mitigated by using timer coalescing if the desired 
> accuracy of the timer can be larger than 1ms. A timer firing at full seconds 
> only, for example, can be realised like this:
> {code}
> coalescedTime = ((ctx.timestamp() + timeout) / 1000) * 1000;
> ctx.timerService().registerEventTimeTimer(coalescedTime);
> {code}
> As a result, only a single timer may exist for every second since we do not 
> add timers for timestamps that are already there.
> This should be documented in the {{ProcessFunction}} docs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9107) Document timer coalescing for ProcessFunctions

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425881#comment-16425881
 ] 

ASF GitHub Bot commented on FLINK-9107:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5790


> Document timer coalescing for ProcessFunctions
> --
>
> Key: FLINK-9107
> URL: https://issues.apache.org/jira/browse/FLINK-9107
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Streaming
>Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
> Fix For: 1.5.0, 1.4.3, 1.3.4
>
>
> In a {{ProcessFunction}}, registering timers for each event via 
> {{ctx.timerService().registerEventTimeTimer()}} using times like 
> {{ctx.timestamp() + timeout}} will get a millisecond accuracy and may thus 
> create one timer per millisecond which may lead to some overhead in the 
> {{TimerService}}.
> This problem can be mitigated by using timer coalescing if the desired 
> accuracy of the timer can be larger than 1ms. A timer firing at full seconds 
> only, for example, can be realised like this:
> {code}
> coalescedTime = ((ctx.timestamp() + timeout) / 1000) * 1000;
> ctx.timerService().registerEventTimeTimer(coalescedTime);
> {code}
> As a result, only a single timer may exist for every second since we do not 
> add timers for timestamps that are already there.
> This should be documented in the {{ProcessFunction}} docs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5790: [FLINK-9107][docs] document timer coalescing for P...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5790


---


[jira] [Commented] (FLINK-9131) Disable spotbugs on travis

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425874#comment-16425874
 ] 

ASF GitHub Bot commented on FLINK-9131:
---

Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/5815


> Disable spotbugs on travis
> --
>
> Key: FLINK-9131
> URL: https://issues.apache.org/jira/browse/FLINK-9131
> Project: Flink
>  Issue Type: Improvement
>  Components: Travis
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Critical
> Fix For: 1.5.0
>
>
> The misc profile that also runs spotbugs is consistently timing out on travis 
> at the moment.
> The spotbugs plugin is a major contributor to the compilation time, for 
> example it doubles the compile time for flink-runtime.
> I suggest temporarily disabling spotbugs and re-enabling it at a later point 
> when we figure out the daily cron jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-9131) Disable spotbugs on travis

2018-04-04 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-9131.
---
   Resolution: Fixed
Fix Version/s: 1.5.0

master: bd8b47956bde0e7ff7ed5cc6d4bd79a875957835
1.5: 2ae48534ed6df42da9530f95d2b9179a6dbc9015

> Disable spotbugs on travis
> --
>
> Key: FLINK-9131
> URL: https://issues.apache.org/jira/browse/FLINK-9131
> Project: Flink
>  Issue Type: Improvement
>  Components: Travis
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Critical
> Fix For: 1.5.0
>
>
> The misc profile that also runs spotbugs is consistently timing out on travis 
> at the moment.
> The spotbugs plugin is a major contributor to the compilation time, for 
> example it doubles the compile time for flink-runtime.
> I suggest temporarily disabling spotbugs and re-enabling it at a later point 
> when we figure out the daily cron jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5815: [FLINK-9131][travis] Disable spotbugs plugin

2018-04-04 Thread zentol
Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/5815


---


[jira] [Commented] (FLINK-9107) Document timer coalescing for ProcessFunctions

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425865#comment-16425865
 ] 

ASF GitHub Bot commented on FLINK-9107:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5790
  
Thank you @NicoK. I will merge this...


> Document timer coalescing for ProcessFunctions
> --
>
> Key: FLINK-9107
> URL: https://issues.apache.org/jira/browse/FLINK-9107
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Streaming
>Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
> Fix For: 1.5.0, 1.4.3, 1.3.4
>
>
> In a {{ProcessFunction}}, registering timers for each event via 
> {{ctx.timerService().registerEventTimeTimer()}} using times like 
> {{ctx.timestamp() + timeout}} will get a millisecond accuracy and may thus 
> create one timer per millisecond which may lead to some overhead in the 
> {{TimerService}}.
> This problem can be mitigated by using timer coalescing if the desired 
> accuracy of the timer can be larger than 1ms. A timer firing at full seconds 
> only, for example, can be realised like this:
> {code}
> coalescedTime = ((ctx.timestamp() + timeout) / 1000) * 1000;
> ctx.timerService().registerEventTimeTimer(coalescedTime);
> {code}
> As a result, only a single timer may exist for every second since we do not 
> add timers for timestamps that are already there.
> This should be documented in the {{ProcessFunction}} docs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5790: [FLINK-9107][docs] document timer coalescing for ProcessF...

2018-04-04 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5790
  
Thank you @NicoK. I will merge this...


---


[jira] [Created] (FLINK-9135) Remove AggregateReduceFunctionsRule once CALCITE-2216 is fixed

2018-04-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-9135:


 Summary: Remove AggregateReduceFunctionsRule once CALCITE-2216 is 
fixed
 Key: FLINK-9135
 URL: https://issues.apache.org/jira/browse/FLINK-9135
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.5.0, 1.6.0
Reporter: Fabian Hueske


We had to copy and slightly modify {{AggregateReduceFunctionsRule}} from 
Calcite to fix FLINK-8903.

We proposed the changes to Calcite as CALCITE-2216. Once that issue is fixed 
and we have updated the Calcite dependency to a version that includes the fix, 
we can remove our custom rule.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425834#comment-16425834
 ] 

Timo Walther commented on FLINK-9110:
-

Also merged to flink-web: 8b7c2dcb2e46ee1dbc6f4dab49c8b4d89b46027d

> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-9110.
-
   Resolution: Fixed
Fix Version/s: 1.4.3

Fixed in 1.6.0: 29fbc95cad2ad05fd08fb82eeac89e0ade011ea6
Fixed in 1.5.0: 4e7b15dad6eb36397f3a07f9f58a1216657493bb
Fixed in 1.4.0: 7610b597b49676299b2dc609d6fd60d4bdccfa2e

> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425787#comment-16425787
 ] 

ASF GitHub Bot commented on FLINK-9110:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5788


> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5788: [FLINK-9110][docs] fix local bundler installation

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5788


---


[jira] [Updated] (FLINK-9031) DataSet Job result changes when adding rebalance after union

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-9031:
-
Fix Version/s: 1.3.4

> DataSet Job result changes when adding rebalance after union
> 
>
> Key: FLINK-9031
> URL: https://issues.apache.org/jira/browse/FLINK-9031
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API, Local Runtime, Optimizer
>Affects Versions: 1.3.1
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Critical
> Fix For: 1.5.0, 1.4.3, 1.3.4
>
> Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt
>
>
> A user [reported this issue on the user mailing 
> list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E].
> {quote}I am using Flink 1.3.1 and I have found a strange behavior when 
> running the following logic:
>  # Read data from a file and store it in a DataSet.
>  # Split the dataset in two by checking whether "field1" of the POJOs is 
> empty, so that the first dataset contains only elements with a non-empty 
> "field1" and the second dataset contains the other elements.
>  # Group each dataset, one by "field1" and the other by another field, and 
> subsequently reduce them.
>  # Merge the two datasets by union.
>  # Write the final dataset as JSON.
> What I expected in the output was to find only one element per specific 
> value of "field1" because:
>  # Reducing the first dataset grouped by "field1" should generate only one 
> element per distinct value of "field1".
>  # The second dataset should contain only elements with an empty "field1".
>  # Making a union of them should not duplicate any record.
> This does not happen: when I read the generated JSON files I see some 
> duplicate (non-empty) values of "field1".
>  Strangely, this does not happen when the union between the two datasets is 
> not computed. In this case the first dataset produces only elements with 
> distinct values of "field1", while the second dataset produces only records 
> with an empty field "value1".
> {quote}
> The user has not enabled object reuse.
> Later he reported that injecting a rebalance() after the union resolves the 
> problem. I had a look at the execution plans for both cases (attached to 
> this issue) but could not identify a problem.
> Hence I assume this might be an issue with the runtime code, but we need to 
> look deeper into this. The user also provided an example program consisting 
> of two classes, which are attached to the issue as well.
>  
>  
>  
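A hedged, self-contained sketch of the reported pipeline, for orientation only 
({{Person}}, its fields, the trivial reduce functions, and the paths are 
illustrative stand-ins for the user's attached classes, not the actual 
reproducer):
{code:java}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class UnionDedupSketch {

    // Simplified stand-in for the Person POJO attached to the issue.
    public static class Person {
        public String field1;
        public String field2;
        public Person() {}
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Person> input = env.readCsvFile("/path/to/input.csv")
                .pojoType(Person.class, "field1", "field2");

        // Step 2: split by whether "field1" is empty.
        DataSet<Person> withField1 = input.filter(p -> !p.field1.isEmpty());
        DataSet<Person> withoutField1 = input.filter(p -> p.field1.isEmpty());

        // Step 3: group and reduce; after this, the first dataset should hold
        // at most one element per distinct "field1" value.
        DataSet<Person> reduced1 = withField1.groupBy("field1").reduce((a, b) -> a);
        DataSet<Person> reduced2 = withoutField1.groupBy("field2").reduce((a, b) -> a);

        // Steps 4-5: the reported duplicates only appear once this union is
        // added; per the report, adding .rebalance() after union() avoids them.
        reduced1.union(reduced2).writeAsText("/path/to/output");

        env.execute("FLINK-9031 reproduction sketch");
    }
}
{code}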



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (FLINK-9031) DataSet Job result changes when adding rebalance after union

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reopened FLINK-9031:
--

Added fix for Flink 1.3.4

> DataSet Job result changes when adding rebalance after union
> 
>
> Key: FLINK-9031
> URL: https://issues.apache.org/jira/browse/FLINK-9031
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API, Local Runtime, Optimizer
>Affects Versions: 1.3.1
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Critical
> Fix For: 1.5.0, 1.4.3, 1.3.4
>
> Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt
>
>
> A user [reported this issue on the user mailing 
> list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E].
> {quote}I am using Flink 1.3.1 and I have found a strange behavior when 
> running the following logic:
>  # Read data from a file and store it in a DataSet.
>  # Split the dataset in two by checking whether "field1" of the POJOs is 
> empty, so that the first dataset contains only elements with a non-empty 
> "field1" and the second dataset contains the other elements.
>  # Group each dataset, one by "field1" and the other by another field, and 
> subsequently reduce them.
>  # Merge the two datasets by union.
>  # Write the final dataset as JSON.
> What I expected in the output was to find only one element per specific 
> value of "field1" because:
>  # Reducing the first dataset grouped by "field1" should generate only one 
> element per distinct value of "field1".
>  # The second dataset should contain only elements with an empty "field1".
>  # Making a union of them should not duplicate any record.
> This does not happen: when I read the generated JSON files I see some 
> duplicate (non-empty) values of "field1".
>  Strangely, this does not happen when the union between the two datasets is 
> not computed. In this case the first dataset produces only elements with 
> distinct values of "field1", while the second dataset produces only records 
> with an empty field "value1".
> {quote}
> The user has not enabled object reuse.
> Later he reported that injecting a rebalance() after the union resolves the 
> problem. I had a look at the execution plans for both cases (attached to 
> this issue) but could not identify a problem.
> Hence I assume this might be an issue with the runtime code, but we need to 
> look deeper into this. The user also provided an example program consisting 
> of two classes, which are attached to the issue as well.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-9031) DataSet Job result changes when adding rebalance after union

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-9031.

Resolution: Fixed

> DataSet Job result changes when adding rebalance after union
> 
>
> Key: FLINK-9031
> URL: https://issues.apache.org/jira/browse/FLINK-9031
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API, Local Runtime, Optimizer
>Affects Versions: 1.3.1
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Critical
> Fix For: 1.5.0, 1.4.3, 1.3.4
>
> Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt
>
>
> A user [reported this issue on the user mailing 
> list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E].
> {quote}I am using Flink 1.3.1 and I have found a strange behavior when 
> running the following logic:
>  # Read data from a file and store it in a DataSet.
>  # Split the dataset in two by checking whether "field1" of the POJOs is 
> empty, so that the first dataset contains only elements with a non-empty 
> "field1" and the second dataset contains the other elements.
>  # Group each dataset, one by "field1" and the other by another field, and 
> subsequently reduce them.
>  # Merge the two datasets by union.
>  # Write the final dataset as JSON.
> What I expected in the output was to find only one element per specific 
> value of "field1" because:
>  # Reducing the first dataset grouped by "field1" should generate only one 
> element per distinct value of "field1".
>  # The second dataset should contain only elements with an empty "field1".
>  # Making a union of them should not duplicate any record.
> This does not happen: when I read the generated JSON files I see some 
> duplicate (non-empty) values of "field1".
>  Strangely, this does not happen when the union between the two datasets is 
> not computed. In this case the first dataset produces only elements with 
> distinct values of "field1", while the second dataset produces only records 
> with an empty field "value1".
> {quote}
> The user has not enabled object reuse.
> Later he reported that injecting a rebalance() after the union resolves the 
> problem. I had a look at the execution plans for both cases (attached to 
> this issue) but could not identify a problem.
> Hence I assume this might be an issue with the runtime code, but we need to 
> look deeper into this. The user also provided an example program consisting 
> of two classes, which are attached to the issue as well.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9031) DataSet Job result changes when adding rebalance after union

2018-04-04 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425774#comment-16425774
 ] 

Fabian Hueske commented on FLINK-9031:
--

Fixed for 1.3.4 with cb38b6defbea5f92b6f3a5874acacb56523534f0

> DataSet Job result changes when adding rebalance after union
> 
>
> Key: FLINK-9031
> URL: https://issues.apache.org/jira/browse/FLINK-9031
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API, Local Runtime, Optimizer
>Affects Versions: 1.3.1
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Critical
> Fix For: 1.5.0, 1.4.3
>
> Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt
>
>
> A user [reported this issue on the user mailing 
> list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E].
> {quote}I am using Flink 1.3.1 and I have found a strange behavior when 
> running the following logic:
>  # Read data from a file and store it in a DataSet.
>  # Split the dataset in two by checking whether "field1" of the POJOs is 
> empty, so that the first dataset contains only elements with a non-empty 
> "field1" and the second dataset contains the other elements.
>  # Group each dataset, one by "field1" and the other by another field, and 
> subsequently reduce them.
>  # Merge the two datasets by union.
>  # Write the final dataset as JSON.
> What I expected in the output was to find only one element per specific 
> value of "field1" because:
>  # Reducing the first dataset grouped by "field1" should generate only one 
> element per distinct value of "field1".
>  # The second dataset should contain only elements with an empty "field1".
>  # Making a union of them should not duplicate any record.
> This does not happen: when I read the generated JSON files I see some 
> duplicate (non-empty) values of "field1".
>  Strangely, this does not happen when the union between the two datasets is 
> not computed. In this case the first dataset produces only elements with 
> distinct values of "field1", while the second dataset produces only records 
> with an empty field "value1".
> {quote}
> The user has not enabled object reuse.
> Later he reported that injecting a rebalance() after the union resolves the 
> problem. I had a look at the execution plans for both cases (attached to 
> this issue) but could not identify a problem.
> Hence I assume this might be an issue with the runtime code, but we need to 
> look deeper into this. The user also provided an example program consisting 
> of two classes, which are attached to the issue as well.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5788: [FLINK-9110][docs] fix local bundler installation

2018-04-04 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5788
  
Merging...


---


[jira] [Commented] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425754#comment-16425754
 ] 

ASF GitHub Bot commented on FLINK-9110:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5788
  
Merging...


> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9110) Building docs with Ruby 2.5 fails if bundler is not globally installed

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425753#comment-16425753
 ] 

ASF GitHub Bot commented on FLINK-9110:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5788
  
Thank you @NicoK. I tested it locally on MacOS. The changes look good.


> Building docs with Ruby 2.5 fails if bundler is not globally installed
> --
>
> Key: FLINK-9110
> URL: https://issues.apache.org/jira/browse/FLINK-9110
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0
>
>
> If {{bundler}} is not installed, {{build_docs.sh}} attempts to install it 
> locally but updating the {{$PATH}} environment variable is broken at least in 
> my setup with ruby 2.5 because of this command failing:
> {code}
> > ruby -rubygems -e 'puts Gem.user_dir'
> Traceback (most recent call last):
> 1: from 
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
> /usr/lib64/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- ubygems (LoadError)
> > ruby -e 'puts Gem.user_dir'
> /home/nico/.gem/ruby/2.5.0
> {code}
> Additionally, the {{bundle}} binary is not even in that path:
> {code}
> > find ~/.gem/ruby/2.*/bin
> /home/nico/.gem/ruby/2.4.0/bin
> /home/nico/.gem/ruby/2.4.0/bin/bundle.ruby2.4
> /home/nico/.gem/ruby/2.4.0/bin/bundler.ruby2.4
> /home/nico/.gem/ruby/2.5.0/bin
> /home/nico/.gem/ruby/2.5.0/bin/bundle.ruby2.5
> /home/nico/.gem/ruby/2.5.0/bin/bundler.ruby2.5
> {code}
> but indeed here:
> {code}
> > ls ~/.gem/ruby/2.*/gems/bundler-*/exe/bundle
> /home/nico/.gem/ruby/2.4.0/gems/bundler-1.15.3/exe/bundle
> /home/nico/.gem/ruby/2.5.0/gems/bundler-1.16.1/exe/bundle
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5788: [FLINK-9110][docs] fix local bundler installation

2018-04-04 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5788
  
Thank you @NicoK. I tested it locally on MacOS. The changes look good.


---


[jira] [Updated] (FLINK-7235) Backport CALCITE-1884 to the Flink repository before Calcite 1.14

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-7235:

Parent Issue: FLINK-9134  (was: FLINK-8507)

> Backport CALCITE-1884 to the Flink repository before Calcite 1.14
> -
>
> Key: FLINK-7235
> URL: https://issues.apache.org/jira/browse/FLINK-7235
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Major
>
> We need to backport CALCITE-1884 in order to unblock upgrading Calcite to 
> 1.13.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7237) Remove DateTimeUtils from Flink once Calcite is upgraded to 1.14

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-7237:

Parent Issue: FLINK-9134  (was: FLINK-8507)

> Remove DateTimeUtils from Flink once Calcite is upgraded to 1.14
> 
>
> Key: FLINK-7237
> URL: https://issues.apache.org/jira/browse/FLINK-7237
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9059) Add support for unified table source and sink declaration in environment file

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425717#comment-16425717
 ] 

ASF GitHub Bot commented on FLINK-9059:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179182994
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

Thanks for pointing this out. You are right. Sinks can have a schema but no 
statistics. I was just wondering if we really need most of the refactorings in 
this PR. We need to rework the `TableSourceDescriptor` class in the near future 
because a Java user can access all `protected` fields, which is not very nice 
API design.


> Add support for unified table source and sink declaration in environment file
> -
>
> Key: FLINK-9059
> URL: https://issues.apache.org/jira/browse/FLINK-9059
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> 1) Add a common property called "type" with the single value 'source'.
> 2) In the yaml file, replace "sources" with "tables".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5758: [FLINK-9059][Table API & SQL] add table type attri...

2018-04-04 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179182994
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

Thanks for pointing this out. You are right. Sinks can have a schema but no 
statistics. I was just wondering if we really need most of the refactorings in 
this PR. We need to rework the `TableSourceDescriptor` class in the near future 
because a Java user can access all `protected` fields, which is not very nice 
API design.


---


[jira] [Created] (FLINK-9134) Update Calcite dependency to 1.17

2018-04-04 Thread Timo Walther (JIRA)
Timo Walther created FLINK-9134:
---

 Summary: Update Calcite dependency to 1.17
 Key: FLINK-9134
 URL: https://issues.apache.org/jira/browse/FLINK-9134
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Timo Walther


This is an umbrella issue for tasks that need to be performed when upgrading to 
Calcite 1.17 once it is released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-8508) Remove RexSimplify from Flink repo

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-8508.
-
   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in 1.6.0: 176a893d2084cc48f9b2b7849ada9a1bd56e
Fixed in 1.5.0: 2c626d1404439f8fdc81f64d9db4531e5530771a

> Remove RexSimplify from Flink repo
> --
>
> Key: FLINK-8508
> URL: https://issues.apache.org/jira/browse/FLINK-8508
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> RexSimplify was copied to the Flink repo due to 
> [CALCITE-2110|https://issues.apache.org/jira/browse/CALCITE-2110]; we should 
> remove it once Flink upgrades its Calcite dependency to 1.16.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8508) Remove RexSimplify from Flink repo

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425703#comment-16425703
 ] 

ASF GitHub Bot commented on FLINK-8508:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5793


> Remove RexSimplify from Flink repo
> --
>
> Key: FLINK-8508
> URL: https://issues.apache.org/jira/browse/FLINK-8508
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> RexSimplify was copied to the Flink repo due to 
> [CALCITE-2110|https://issues.apache.org/jira/browse/CALCITE-2110]; we should 
> remove it once Flink upgrades its Calcite dependency to 1.16.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5793: [FLINK-8508][Table API & SQL] Remove RexSimplify c...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5793


---


[jira] [Commented] (FLINK-9059) Add support for unified table source and sink declaration in environment file

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425676#comment-16425676
 ] 

ASF GitHub Bot commented on FLINK-9059:
---

Github user walterddr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179178156
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

I was wondering whether a sink could have its own "schema configuration" to 
align with the output table schema. 
For example, a CassandraTableSink / JDBCTableSink would definitely throw an 
exception when trying to execute an insert with mismatched schemas.


> Add support for unified table source and sink declaration in environment file
> -
>
> Key: FLINK-9059
> URL: https://issues.apache.org/jira/browse/FLINK-9059
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> 1) Add a common property called "type" with the single value 'source'.
> 2) In the yaml file, replace "sources" with "tables".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5758: [FLINK-9059][Table API & SQL] add table type attri...

2018-04-04 Thread walterddr
Github user walterddr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179178156
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

I was wondering whether a sink could have its own "schema configuration" to 
align with the output table schema. 
For example, a CassandraTableSink / JDBCTableSink would definitely throw an 
exception when trying to execute an insert with mismatched schemas.


---


[jira] [Resolved] (FLINK-8509) Remove SqlGroupedWindowFunction from Flink repo

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-8509.
-
   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in 1.6.0: 98a8b642f24c813a4929dfd780163778dc5bd010
Fixed in 1.5.0: a298e6e4f81ec2cab4c86dd07f441eaebada4915

> Remove SqlGroupedWindowFunction from Flink repo
> ---
>
> Key: FLINK-8509
> URL: https://issues.apache.org/jira/browse/FLINK-8509
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> SqlGroupedWindowFunction was copied to the Flink repo due to 
> [CALCITE-2133|https://issues.apache.org/jira/browse/CALCITE-2133]; we should 
> remove it once Flink upgrades its Calcite dependency to 1.16.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9133) Improve documentability of REST API

2018-04-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9133:
---

 Summary: Improve documentability of REST API
 Key: FLINK-9133
 URL: https://issues.apache.org/jira/browse/FLINK-9133
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, REST
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


In several places of the REST API we use custom JSON (de)serializers for 
writing data. This is very problematic with regard to the documentation, as 
there is no way to actually generate it when these serializers are used.

I doubt we can fix this issue entirely at the moment, but I already found areas 
that we can improve.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5794: [Flink-8509][Table API & SQL] Remove SqlGroupedWin...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5794


---


[jira] [Created] (FLINK-9132) Cluster runs out of task slots when a job falls into restart loop

2018-04-04 Thread Alex Smirnov (JIRA)
Alex Smirnov created FLINK-9132:
---

 Summary: Cluster runs out of task slots when a job falls into 
restart loop
 Key: FLINK-9132
 URL: https://issues.apache.org/jira/browse/FLINK-9132
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.4.2
 Environment: env.java.opts in flink-conf.yaml file:

 

env.java.opts: -Xloggc:/home/user/flink/log/flinkServer-gc.log  -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseG1GC 
-XX:MaxGCPauseMillis=150 -XX:InitiatingHeapOccupancyPercent=55 
-XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=2 -XX:-ResizePLAB 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=100M
Reporter: Alex Smirnov
 Attachments: FailedJob.java, jconsole-classes.png

If there is a job that is restarting in a loop, the Task Manager hosting it 
goes down after some time. The Job Manager automatically assigns the job to 
another Task Manager, and the new Task Manager goes down as well. After some 
time, all Task Managers are gone and the cluster becomes paralyzed.

I attached to the Task Manager's Java process using jconsole and noticed that 
the number of loaded classes increases dramatically if a job is in a restart 
loop and restores from a checkpoint.

See the attachment for the graph with G1GC enabled for the node. The standard 
GC performs even worse - the Task Manager shuts down within 20 minutes after 
the restart loop starts.

I have also attached a minimal program to reproduce the problem.

Please let me know if additional information is required from me.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9008) End-to-end test: Quickstarts

2018-04-04 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425648#comment-16425648
 ] 

mingleizhang edited comment on FLINK-9008 at 4/4/18 2:54 PM:
-

Hi [~till.rohrmann], I would like to confirm one thing with you regarding the 
example of a Flink application program that will be packaged into a jar file. I 
know there is already an example, {{PopularPlacesToES}}; should I package this 
into a jar, or should I instead write a simple job that uses only one operator 
like {{filter}} and package that into the jar? Any suggestions? Thank you very 
much.


was (Author: mingleizhang):
Hi, [~till.rohrmann] I would like to confirm one stuff with you here for the 
example of a flink application program that will package into a jar file. I 
know there is already an example of {{PopularPlacesToES}}, should I package 
this into a jar or instead I can write a simple job which  support only one 
operator like {{filter}}. Any suggestions ? Thank you very much.

> End-to-end test: Quickstarts
> 
>
> Key: FLINK-9008
> URL: https://issues.apache.org/jira/browse/FLINK-9008
> Project: Flink
>  Issue Type: Sub-task
>  Components: Quickstarts, Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: mingleizhang
>Priority: Critical
> Fix For: 1.5.0
>
>
> We could add an end-to-end test which verifies Flink's quickstarts. It should 
> do the following:
> # create a new Flink project using the quickstarts archetype 
> # add a new Flink dependency to the {{pom.xml}} (e.g. Flink connector or 
> library) 
> # run {{mvn clean package -Pbuild-jar}}
> # verify that no core dependencies are contained in the jar file
> # Run the program



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9008) End-to-end test: Quickstarts

2018-04-04 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425648#comment-16425648
 ] 

mingleizhang commented on FLINK-9008:
-

Hi [~till.rohrmann], I would like to confirm one thing with you regarding the 
example of a Flink application program that will be packaged into a jar file. I 
know there is already an example, {{PopularPlacesToES}}; should I package this 
into a jar, or should I instead write a simple job that uses only one operator 
like {{filter}}? Any suggestions? Thank you very much.

> End-to-end test: Quickstarts
> 
>
> Key: FLINK-9008
> URL: https://issues.apache.org/jira/browse/FLINK-9008
> Project: Flink
>  Issue Type: Sub-task
>  Components: Quickstarts, Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: mingleizhang
>Priority: Critical
> Fix For: 1.5.0
>
>
> We could add an end-to-end test which verifies Flink's quickstarts. It should 
> do the following:
> # create a new Flink project using the quickstarts archetype 
> # add a new Flink dependency to the {{pom.xml}} (e.g. Flink connector or 
> library) 
> # run {{mvn clean package -Pbuild-jar}}
> # verify that no core dependencies are contained in the jar file
> # Run the program



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-8563) Support consecutive DOT operators

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-8563.
-
   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in 1.6.0: 284995172bad3cbc5844d8198654e1de7f513591
Fixed in 1.5.0: 4b10dd684b2ee8a1b74f1297b79fe0852f0172e5

> Support consecutive DOT operators 
> --
>
> Key: FLINK-8563
> URL: https://issues.apache.org/jira/browse/FLINK-8563
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> We added support for accessing fields of arrays of composite types in 
> FLINK-7923. However, accessing another nested subfield is not supported by 
> Calcite. See CALCITE-2162. We should fix this once we upgrade to Calcite 1.16.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9131) Disable spotbugs on travis

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425641#comment-16425641
 ] 

ASF GitHub Bot commented on FLINK-9131:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5815

[FLINK-9131][travis] Disable spotbugs plugin

## What is the purpose of the change

This PR disables spotbugs on travis. This will buy us 7-8 minutes in the 
misc profile that currently times out.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9131

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5815.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5815


commit 521462ad59dcfea7f12d2d3827846028154c8a80
Author: zentol 
Date:   2018-04-04T14:43:14Z

[FLINK-9131][travis] Disable spotbugs plugin




> Disable spotbugs on travis
> --
>
> Key: FLINK-9131
> URL: https://issues.apache.org/jira/browse/FLINK-9131
> Project: Flink
>  Issue Type: Improvement
>  Components: Travis
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Critical
>
> The misc profile that also runs spotbugs is consistently timing out on travis 
> at the moment.
> The spotbugs plugin is a major contributor to the compilation time; for 
> example, it doubles the compile time for flink-runtime.
> I suggest temporarily disabling spotbugs and re-enabling it at a later point, 
> once we figure out the daily cron jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-9108) invalid ProcessWindowFunction link in Document

2018-04-04 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-9108.
---
   Resolution: Fixed
Fix Version/s: 1.5.0

master: a71b9031821b80b74df855df4a565467bb32550a
1.5: 198446962c67a297cf98d45b3efa94ee56d1dd7a

> invalid ProcessWindowFunction link in Document 
> ---
>
> Key: FLINK-9108
> URL: https://issues.apache.org/jira/browse/FLINK-9108
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Trivial
> Fix For: 1.5.0
>
> Attachments: QQ截图20180329184203.png
>
>
> !QQ截图20180329184203.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5815: [FLINK-9131][travis] Disable spotbugs plugin

2018-04-04 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5815

[FLINK-9131][travis] Disable spotbugs plugin

## What is the purpose of the change

This PR disables spotbugs on travis. This will buy us 7-8 minutes in the 
misc profile that currently times out.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 9131

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5815.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5815


commit 521462ad59dcfea7f12d2d3827846028154c8a80
Author: zentol 
Date:   2018-04-04T14:43:14Z

[FLINK-9131][travis] Disable spotbugs plugin




---


[GitHub] flink pull request #5792: [Flink-8563][Table API & SQL] add unittest for con...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5792


---


[jira] [Commented] (FLINK-9108) invalid ProcessWindowFunction link in Document

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425636#comment-16425636
 ] 

ASF GitHub Bot commented on FLINK-9108:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5785
  
ehh...whoops. Yes, I closed it by accident, thanks for catching it. I'll 
merge the commit in a second...


> invalid ProcessWindowFunction link in Document 
> ---
>
> Key: FLINK-9108
> URL: https://issues.apache.org/jira/browse/FLINK-9108
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Trivial
> Attachments: QQ截图20180329184203.png
>
>
> !QQ截图20180329184203.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5785: [FLINK-9108][docs] Fix invalid link

2018-04-04 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5785
  
ehh...whoops. Yes, I closed it by accident, thanks for catching it. I'll 
merge the commit in a second...


---


[jira] [Created] (FLINK-9131) Disable spotbugs on travis

2018-04-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9131:
---

 Summary: Disable spotbugs on travis
 Key: FLINK-9131
 URL: https://issues.apache.org/jira/browse/FLINK-9131
 Project: Flink
  Issue Type: Improvement
  Components: Travis
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


The misc profile that also runs spotbugs is consistently timing out on travis 
at the moment.

The spotbugs plugin is a major contributor to the compilation time; for 
example, it doubles the compile time for flink-runtime.

I suggest temporarily disabling spotbugs and re-enabling it at a later point, 
once we figure out the daily cron jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5792: [Flink-8563][Table API & SQL] add unittest for consecutiv...

2018-04-04 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5792
  
Thank you @suez1224. Merging...


---


[jira] [Commented] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc

2018-04-04 Thread Gary Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425626#comment-16425626
 ] 

Gary Yao commented on FLINK-9130:
-

[~fhueske], after FLINK-9104 the FLIP-6 REST API docs should include the 
option. 
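For reference, the flag under discussion is a boolean field in the FLIP-6 
savepoint trigger request body. A hedged sketch of such a request, with an 
illustrative target directory and the job id left as a placeholder (the 
generated REST docs remain authoritative):
{code}
POST /jobs/:jobid/savepoints
{
  "target-directory": "s3://bucket/savepoints",
  "cancel-job": true
}
{code}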

> Add cancel-job option to SavepointHandlers JavaDoc
> --
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Minor
>
> The Savepoint JavaDocs are missing the {{cancel-job}} option.
> See discussion on ML here: 
> [https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-8507) Upgrade Calcite dependency to 1.16

2018-04-04 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-8507.
-
   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in 1.6.0: 3e21f0f8ea1d1e5d6c4a34fda1f8fa821ffd6a40
Fixed in 1.5.0: 7e240edca02d03648ccec471f141bca70b8c79db

> Upgrade Calcite dependency to 1.16
> --
>
> Key: FLINK-8507
> URL: https://issues.apache.org/jira/browse/FLINK-8507
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425623#comment-16425623
 ] 

ASF GitHub Bot commented on FLINK-8507:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5791


> Upgrade Calcite dependency to 1.16
> --
>
> Key: FLINK-8507
> URL: https://issues.apache.org/jira/browse/FLINK-8507
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5791: [FLINK-8507][Table API & SQL] upgrade calcite depe...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5791


---


[jira] [Commented] (FLINK-9108) invalid ProcessWindowFunction link in Document

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425620#comment-16425620
 ] 

ASF GitHub Bot commented on FLINK-9108:
---

Github user Matrix42 commented on the issue:

https://github.com/apache/flink/pull/5785
  
@zentol Is this closed by accident?


> invalid ProcessWindowFunction link in Document 
> ---
>
> Key: FLINK-9108
> URL: https://issues.apache.org/jira/browse/FLINK-9108
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Trivial
> Attachments: QQ截图20180329184203.png
>
>
> !QQ截图20180329184203.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5785: [FLINK-9108][docs] Fix invalid link

2018-04-04 Thread Matrix42
Github user Matrix42 commented on the issue:

https://github.com/apache/flink/pull/5785
  
@zentol Is this closed by accident?


---


[jira] [Updated] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-9130:
-
Priority: Minor  (was: Critical)

> Add cancel-job option to SavepointHandlers JavaDoc
> --
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Minor
>
> The Savepoint JavaDocs are missing the {{cancel-job}} option.
> See discussion on ML here: 
> [https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc

2018-04-04 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425612#comment-16425612
 ] 

Fabian Hueske commented on FLINK-9130:
--

Thanks for the correction [~gjy]. I've changed the Jira title and description.

Does the REST API documentation include the {{cancel-job}} option?

> Add cancel-job option to SavepointHandlers JavaDoc
> --
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Savepoint JavaDocs are missing the {{cancel-job}} option.
> See discussion on ML here: 
> [https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-9130:
-
Description: 
The Savepoint JavaDocs are missing the {{cancel-job}} option.

See discussion on ML here: 
[https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E]

  was:
The Savepoint REST documentation is missing the {{cancel-job}} option.

See discussion on ML here: 
https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E


> Add cancel-job option to SavepointHandlers JavaDoc
> --
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Savepoint JavaDocs are missing the {{cancel-job}} option.
> See discussion on ML here: 
> [https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc

2018-04-04 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-9130:
-
Summary: Add cancel-job option to SavepointHandlers JavaDoc  (was: Add 
cancel-job option to SavepointHandlers JavaDoc and regenerate REST docs)

> Add cancel-job option to SavepointHandlers JavaDoc
> --
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Savepoint REST documentation is missing the {{cancel-job}} option.
> See discussion on ML here: 
> https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc and regenerate REST docs

2018-04-04 Thread Gary Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425603#comment-16425603
 ] 

Gary Yao commented on FLINK-9130:
-

The Javadocs for 
{{org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers}} are 
missing the {{cancel-job}} option. The REST documentation is not generated from 
the Javadocs.
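
For reference, a minimal sketch of what the option does at the REST level 
(endpoint path and request-body field names as I understand the 1.5 savepoint 
trigger API; host, port, job id, and target directory are placeholders):

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class TriggerSavepointAndCancel {
    public static void main(String[] args) throws Exception {
        // Placeholder job id; in practice this comes from /jobs or the web UI.
        String jobId = "00000000000000000000000000000000";
        URL url = new URL("http://localhost:8081/jobs/" + jobId + "/savepoints");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // "cancel-job": true asks the cluster to cancel the job once the
        // savepoint completes -- the option the Javadocs should document.
        String body = "{\"target-directory\": \"s3://bucket/savepoints\", \"cancel-job\": true}";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
{code}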

> Add cancel-job option to SavepointHandlers JavaDoc and regenerate REST docs
> ---
>
> Key: FLINK-9130
> URL: https://issues.apache.org/jira/browse/FLINK-9130
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Savepoint REST documentation is missing the {{cancel-job}} option.
> See discussion on ML here: 
> https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-7726) Move marshalling testbases out of legacy namespace

2018-04-04 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-7726.
---
   Resolution: Fixed
Fix Version/s: 1.4.0  (was: 1.5.0)

master: bc4638a3c96049de3ef615159cf83bbd88019575

> Move marshalling testbases out of legacy namespace
> --
>
> Key: FLINK-7726
> URL: https://issues.apache.org/jira/browse/FLINK-7726
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Tests
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
> Fix For: 1.4.0
>
>
> The marshalling test bases currently reside under 
> {{org.apache.flink.runtime.rest.handler.legacy.messages}} which doesn't make 
> sense as this isn't legacy code.
> We should do this once the port of all handlers has been finished to avoid 
> merge conflicts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-8254) REST API documentation wonky due to shading

2018-04-04 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-8254.
---
Resolution: Fixed

master: 37fe082bd594c5c092362077e0329f138c84e544
1.5: 69b9515b47b11be70dfb4d6176883c91086609b3

> REST API documentation wonky due to shading
> ---
>
> Key: FLINK-8254
> URL: https://issues.apache.org/jira/browse/FLINK-8254
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
> Fix For: 1.5.0
>
>
> The REST API documentation isn't quite correct as all jackson annotations are 
> being ignored. Our annotations come from flink-shaded-jackson, but the tool 
> we use (jackson-module-jsonSchema) checks against vanilla jackson.
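
To make the failure mode concrete, here is a self-contained sketch; the 
annotation and class names are illustrative stand-ins for the vanilla jackson 
annotation and its relocated copy under {{org.apache.flink.shaded.jackson2}}, 
not the real types:

{code:java}
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

public class ShadingMismatch {
    // Stand-in for com.fasterxml.jackson.annotation.JsonProperty,
    // i.e. the type jackson-module-jsonSchema looks for.
    @Retention(RetentionPolicy.RUNTIME)
    @interface VanillaJsonProperty {}

    // Stand-in for the relocated (shaded) copy that Flink's REST
    // messages are actually annotated with.
    @Retention(RetentionPolicy.RUNTIME)
    @interface ShadedJsonProperty {}

    static class RestMessage {
        @ShadedJsonProperty
        String jobId;
    }

    public static void main(String[] args) throws Exception {
        Field f = RestMessage.class.getDeclaredField("jobId");
        // The two annotation types are distinct Class objects, so a lookup
        // against the vanilla type finds nothing: the annotation is ignored.
        Annotation a = f.getAnnotation(VanillaJsonProperty.class);
        System.out.println(a); // prints "null"
    }
}
{code}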



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425569#comment-16425569
 ] 

ASF GitHub Bot commented on FLINK-8507:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5791
  
LGTM, will merge in next batch...


> Upgrade Calcite dependency to 1.16
> --
>
> Key: FLINK-8507
> URL: https://issues.apache.org/jira/browse/FLINK-8507
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5791: [FLINK-8507][Table API & SQL] upgrade calcite dependency ...

2018-04-04 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5791
  
LGTM, will merge in next batch...


---


[jira] [Created] (FLINK-9130) Add cancel-job option to SavepointHandlers JavaDoc and regenerate REST docs

2018-04-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-9130:


 Summary: Add cancel-job option to SavepointHandlers JavaDoc and 
regenerate REST docs
 Key: FLINK-9130
 URL: https://issues.apache.org/jira/browse/FLINK-9130
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.5.0, 1.6.0
Reporter: Fabian Hueske


The Savepoint REST documentation is missing the {{cancel-job}} option.

See discussion on ML here: 
https://lists.apache.org/thread.html/dc05751fa6507388dcefc0c845facef6b36b086e256b52c3e37d71dc@%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9059) Add support for unified table source and sink declaration in environment file

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425558#comment-16425558
 ] 

ASF GitHub Bot commented on FLINK-9059:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179149794
  
--- Diff: 
flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/config/Environment.java
 ---
@@ -29,38 +30,47 @@
 
 /**
  * Environment configuration that represents the content of an environment 
file. Environment files
- * define sources, execution, and deployment behavior. An environment 
might be defined by default or
+ * define tables, execution, and deployment behavior. An environment might 
be defined by default or
  * as part of a session. Environments can be merged or enriched with 
properties (e.g. from CLI command).
  *
  * In future versions, we might restrict the merging or enrichment of 
deployment properties to not
  * allow overwriting of a deployment by a session.
  */
 public class Environment {
 
-   private Map sources;
+   private Map tables;
--- End diff --

Why not maintain two separate maps for sources and sinks? Then we don't 
need instanceof checks. If a table is both, we can simply add it to both maps.
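
For illustration, a sketch of the two-map layout suggested here (the type 
names are placeholders, not the SQL client's real descriptor classes):

{code:java}
import java.util.HashMap;
import java.util.Map;

public class Environment {
    // Placeholder descriptor types.
    static class SourceDescriptor {}
    static class SinkDescriptor {}

    private final Map<String, SourceDescriptor> sources = new HashMap<>();
    private final Map<String, SinkDescriptor> sinks = new HashMap<>();

    // A table that is both a source and a sink is simply registered twice.
    public void addSource(String name, SourceDescriptor d) { sources.put(name, d); }
    public void addSink(String name, SinkDescriptor d) { sinks.put(name, d); }

    // Lookups return concrete types, so callers never need an instanceof check.
    public SourceDescriptor getSource(String name) { return sources.get(name); }
    public SinkDescriptor getSink(String name) { return sinks.get(name); }
}
{code}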


> Add support for unified table source and sink declaration in environment file
> -
>
> Key: FLINK-9059
> URL: https://issues.apache.org/jira/browse/FLINK-9059
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> 1) Add a common property called "type" with the single value 'source'.
> 2) In the YAML file, replace "sources" with "tables".
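
For illustration, a hypothetical environment file after the rename; the 
connector and format keys below are made up for the example, only "tables" 
and "type" come from the issue description:

{code:yaml}
# "sources" becomes "tables"; each entry declares its role via "type".
tables:
  - name: MyOrders
    type: source          # currently the only supported value
    connector:
      type: kafka
    format:
      type: json
{code}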



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5758: [FLINK-9059][Table API & SQL] add table type attri...

2018-04-04 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179149794
  
--- Diff: 
flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/config/Environment.java
 ---
@@ -29,38 +30,47 @@
 
 /**
  * Environment configuration that represents the content of an environment 
file. Environment files
- * define sources, execution, and deployment behavior. An environment 
might be defined by default or
+ * define tables, execution, and deployment behavior. An environment might 
be defined by default or
  * as part of a session. Environments can be merged or enriched with 
properties (e.g. from CLI command).
  *
  * In future versions, we might restrict the merging or enrichment of 
deployment properties to not
  * allow overwriting of a deployment by a session.
  */
 public class Environment {
 
-   private Map sources;
+   private Map tables;
--- End diff --

Why not maintain two separate maps for sources and sinks? Then we don't 
need instanceof checks. If a table is both, we can simply add it to both maps.


---


[jira] [Commented] (FLINK-9059) Add support for unified table source and sink declaration in environment file

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425559#comment-16425559
 ] 

ASF GitHub Bot commented on FLINK-9059:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179150731
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

I'm wondering if we really need these changes. A sink will never have a 
schema or statistics.
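
A sketch of the split this comment hints at (names illustrative, and the 
placeholder classes stand in for the real descriptor types): keep connector 
and format on the shared base, and move schema/statistics to the source side 
only:

{code:java}
// Placeholder types standing in for the real descriptors.
class ConnectorDescriptor {}
class FormatDescriptor {}
class Schema {}
class Statistics {}

abstract class TableDescriptor {
    protected ConnectorDescriptor connector;
    protected FormatDescriptor format;
}

class SourceTableDescriptor extends TableDescriptor {
    // Only sources describe what they read and expose optimizer statistics.
    protected Schema schema;
    protected Statistics statistics;
}

class SinkTableDescriptor extends TableDescriptor {
    // No schema or statistics: a sink only needs connector and format.
}
{code}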


> Add support for unified table source and sink declaration in environment file
> -
>
> Key: FLINK-9059
> URL: https://issues.apache.org/jira/browse/FLINK-9059
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
> Fix For: 1.5.0
>
>
> 1) Add a common property called "type" with the single value 'source'.
> 2) In the YAML file, replace "sources" with "tables".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5758: [FLINK-9059][Table API & SQL] add table type attri...

2018-04-04 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/5758#discussion_r179150731
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptor.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.descriptors
+
+import org.apache.flink.table.descriptors.DescriptorProperties.toScala
+import 
org.apache.flink.table.descriptors.StatisticsValidator.{STATISTICS_COLUMNS, 
STATISTICS_ROW_COUNT, readColumnStats}
+import org.apache.flink.table.plan.stats.TableStats
+
+import scala.collection.JavaConverters._
+
+/**
+  * Common class for all descriptors describing table sources and sinks.
+  */
+abstract class TableDescriptor extends Descriptor {
+
+  protected var connectorDescriptor: Option[ConnectorDescriptor] = None
+  protected var formatDescriptor: Option[FormatDescriptor] = None
+  protected var schemaDescriptor: Option[Schema] = None
+  protected var statisticsDescriptor: Option[Statistics] = None
--- End diff --

I'm wondering if we really need these changes. A sink will never have a 
schema or statistics.


---


[GitHub] flink pull request #5812: [FLINK-9128] [flip6] Add support for scheduleRunAs...

2018-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5812


---

