[jira] [Commented] (FLINK-13769) BatchFineGrainedRecoveryITCase.testProgram failed on Travis

2019-08-19 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911033#comment-16911033
 ] 

Till Rohrmann commented on FLINK-13769:
---

Another instance: https://api.travis-ci.org/v3/job/573915348/log.txt

> BatchFineGrainedRecoveryITCase.testProgram failed on Travis
> ---
>
> Key: FLINK-13769
> URL: https://issues.apache.org/jira/browse/FLINK-13769
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Andrey Zagrebin
>Assignee: Andrey Zagrebin
>Priority: Critical
>  Labels: test-stability
>
> {{BatchFineGrainedRecoveryITCase.testProgram}} failed on Travis.
> {code}
> 23:14:26.860 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time 
> elapsed: 50.007 s <<< FAILURE! - in 
> org.apache.flink.test.recovery.BatchFineGrainedRecoveryITCase
> 23:14:26.868 [ERROR] 
> testProgram(org.apache.flink.test.recovery.BatchFineGrainedRecoveryITCase)  
> Time elapsed: 49.469 s  <<< ERROR!
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.test.recovery.BatchFineGrainedRecoveryITCase.testProgram(BatchFineGrainedRecoveryITCase.java:225)
> Caused by: java.util.concurrent.CompletionException: 
> akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka.tcp://flink@localhost:39333/user/taskmanager_3#-344551647]] after 
> [1 ms]. Message of type 
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation]. A typical reason 
> for `AskTimeoutException` is that the recipient actor didn't send a reply.
> Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka.tcp://flink@localhost:39333/user/taskmanager_3#-344551647]] after 
> [1 ms]. Message of type 
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation]. A typical reason 
> for `AskTimeoutException` is that the recipient actor didn't send a reply.
> {code}
> [https://travis-ci.org/apache/flink/jobs/573523669]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-7860) Support YARN proxy user in Flink (impersonation)

2019-08-19 Thread Shengnan YU (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911025#comment-16911025
 ] 

Shengnan YU commented on FLINK-7860:


Is there any plan for this improvement?

> Support YARN proxy user in Flink (impersonation)
> 
>
> Key: FLINK-7860
> URL: https://issues.apache.org/jira/browse/FLINK-7860
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / YARN
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13790) Support -e option with a sql script file as input

2019-08-19 Thread Zhenghua Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911020#comment-16911020
 ] 

Zhenghua Gao commented on FLINK-13790:
--

OK, I will create a PR for it soon.

> Support -e option with a sql script file as input
> -
>
> Key: FLINK-13790
> URL: https://issues.apache.org/jira/browse/FLINK-13790
> Project: Flink
>  Issue Type: Sub-task
>  Components: Command Line Client
>Reporter: Bowen Li
>Assignee: Zhenghua Gao
>Priority: Major
> Fix For: 1.10.0
>
>
> We expect users to run SQL directly from the command line, something like: 
> {{sql-client embedded -f "query in string"}}, which will execute the given file 
> without entering interactive mode.
> This is related to FLINK-12828.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (FLINK-13750) Separate HA services between client-/ and server-side

2019-08-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848
 ] 

TisonKun edited comment on FLINK-13750 at 8/20/19 5:03 AM:
---

Hi [~Zentol] & [~till.rohrmann].

After some investigation, I noticed that {{ClusterClient}} does not need to hold a field 
such as {{highAvailabilityServices}}. Given the target that {{ClusterClient}} becomes an 
interface, i.e., not an abstract class, we can shift the initialization logic down into 
{{RestClusterClient}} and {{MiniClusterClient}}.

Here are two possible directions for the separation; I post them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a 
limited set of high-availability services regarded as client-side services, 
without introducing any new class or interface (a prototype can be found at 
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd).
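
As a rough illustration of direction 1 (a sketch only; the class and method names 
here are made up and differ from the actual prototype):

{code:java}
// Hedged sketch of direction 1: static factories that hand out only the
// client-side leader retrieval services, so no new class or interface is needed.
public final class ClientHaServiceFactories {

	private ClientHaServiceFactories() {}

	public static LeaderRetrievalService createWebMonitorLeaderRetrievalService(Configuration configuration) throws Exception {
		switch (HighAvailabilityMode.fromConfig(configuration)) {
			case NONE:
				// Standalone: point a StandaloneLeaderRetrievalService at the static REST address.
				return new StandaloneLeaderRetrievalService(
					configuration.getString(RestOptions.ADDRESS) + ":" + configuration.getInteger(RestOptions.PORT));
			case ZOOKEEPER:
				// ZooKeeper: start a CuratorFramework and reuse the existing ZooKeeperUtils helper.
				return ZooKeeperUtils.createLeaderRetrievalService(
					ZooKeeperUtils.startCuratorFramework(configuration), configuration);
			default:
				throw new IllegalStateException("Unsupported high-availability mode.");
		}
	}
}
{code}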

pros:

- easy to implement
- in the custom HA scenario, users don't need to modify their code, though their 
implementation may have an issue similar to FLINK-13500.

cons:

- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} knows the details of the Standalone and ZooKeeper 
implementations.

nit:
for the prototype, we might separate {{getDispatcherLeaderRetrievalService}} 
and {{getWebMonitorLeaderRetrievalService}}, but the downside is that we might 
initialize the {{CuratorFramework}} and custom HA services more than once.

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}}, which looks 
like:


{code:java}
interface RetrieverOnlyHighAvailabilityService {
  LeaderRetrievalService getDispatcherLeaderRetrievalService();
  LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for different high-availability backends.
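
A standalone implementation of the interface could then be as small as the 
following sketch (assuming the existing {{StandaloneLeaderRetrievalService}} and 
{{HighAvailabilityServices.DEFAULT_LEADER_ID}}; the class name is made up):

{code:java}
// Hedged sketch of direction 2 for the standalone backend: both retrieval
// services simply resolve to a static address taken from the configuration.
class StandaloneRetrieverOnlyHaServices implements RetrieverOnlyHighAvailabilityService {

	private final String staticAddress;

	StandaloneRetrieverOnlyHaServices(String staticAddress) {
		this.staticAddress = staticAddress;
	}

	@Override
	public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
		return new StandaloneLeaderRetrievalService(staticAddress, HighAvailabilityServices.DEFAULT_LEADER_ID);
	}

	@Override
	public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
		return new StandaloneLeaderRetrievalService(staticAddress, HighAvailabilityServices.DEFAULT_LEADER_ID);
	}
}
{code}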

pros:

- a clear conceptual separation between the high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration through to create a 
{{RetrieverOnlyHighAvailabilityService}}, and only that implementation knows the 
details.

cons:

- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every 
high-availability backend. {{RetrieverOnlyHighAvailabilityService}} has methods 
like {{HighAvailabilityServices}}, but both inheritance and composition seem 
improper here.
- in the {{MiniClusterClient}} scenario, we actually use the services passed from the 
{{MiniCluster}}; either we treat it as a special case or completely change the 
{{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.

nit:

it is not true for the current codebase that every {{ClusterClient}} shares the 
same retrieval requirements: only {{RestClusterClient}} needs 
{{getWebMonitorLeaderRetrievalService}}. Alternatively, at a more conceptual 
level, the client should only communicate with the WebMonitor, and requests to 
the Dispatcher should be routed by the WebMonitor.


[GitHub] [flink] flinkbot edited a comment on issue #9482: [FLINK-13758] [flink-clients] failed to submit JobGraph when registered hdfs file in DistributedCache.

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9482: [FLINK-13758] [flink-clients] failed 
to submit JobGraph when registered hdfs file in DistributedCache.
URL: https://github.com/apache/flink/pull/9482#issuecomment-522564432
 
 
   ## CI report:
   
   * 78d5077eca1d8a05cbea426d24ac8b476c29828e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123718680)
   * 95441b71309a8f0562ec31cb06bdb01e61142ce8 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123819442)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13197) support querying Hive's view in Flink

2019-08-19 Thread Bowen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910975#comment-16910975
 ] 

Bowen Li commented on FLINK-13197:
--

[~lirui] I think so.

> support querying Hive's view in Flink
> -
>
> Key: FLINK-13197
> URL: https://issues.apache.org/jira/browse/FLINK-13197
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Bowen Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 1.10.0
>
>
> One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
> interoperability, that is, Flink should understand existing Hive meta-objects, 
> and Hive meta-objects created through Flink should be understood by Hive.
> Take the example of a Hive view v1 in the HiveCatalog hc and database db. Unlike 
> an equivalent Flink view, whose full path in the expanded query would be 
> hc.db.v1, the Hive view's full path in the expanded query should be db.v1 
> so that Hive can understand it, no matter whether it was created by Hive or Flink.
> [~lirui] can you help to ensure that Flink can also query Hive's views with both 
> the Flink planner and the Blink planner?
> cc [~xuefuz]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315497072
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 * ToC
 {:toc}
 
-## Available State Backends
+## 可用的 State Backends
 
-Out of the box, Flink bundles these state backends:
+Flink 内置了以下这些开箱即用的 state backends :
 
  - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBackend.
+如果不设置,默认使用 MemoryStateBackend。
 
 
-### The MemoryStateBackend
+### MemoryStateBackend
 
-The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
+*MemoryStateBackend* 将数据存储在 Java 堆中。 Key/value 的状态值和窗口算子的触发器都是通过 hash table 
来存储。
 
-Upon checkpoints, this state backend will snapshot the state and send it as 
part of the checkpoint acknowledgement messages to the
-JobManager (master), which stores it on its heap as well.
+在 CheckPoint 时,State Backend 对状态进行快照,并将快照信息作为 CheckPoint 应答消息的一部分发送给 
JobManager(master),同时 JobManager 也将快照信息存储在堆内存中。
 
-The MemoryStateBackend can be configured to use asynchronous snapshots. While 
we strongly encourage the use of asynchronous snapshots to avoid blocking 
pipelines, please note that this is currently enabled 
-by default. To disable this feature, users can instantiate a 
`MemoryStateBackend` with the corresponding boolean flag in the constructor set 
to `false`(this should only used for debug), e.g.:
+MemoryStateBackend 能配置异步快照。强烈建议使用异步快照来防止数据流阻塞,注意,异步快照默认是开启的。
+用户可以在实例化 `MemoryStateBackend` 的时候,将相应布尔类型的构造参数设置为 `false` 来关闭异步快照(仅在 debug 
的时候使用),例如:
 
 {% highlight java %}
 new MemoryStateBackend(MAX_MEM_STATE_SIZE, false);
 {% endhighlight %}
 
-Limitations of the MemoryStateBackend:
+MemoryStateBackend 的限制:
 
-  - The size of each individual state is by default limited to 5 MB. This 
value can be increased in the constructor of the MemoryStateBackend.
-  - Irrespective of the configured maximal state size, the state cannot be 
larger than the akka frame size (see [Configuration]({{ site.baseurl 
}}/ops/config.html)).
-  - The aggregate state must fit into the JobManager memory.
+  - 默认情况下,每个独立的状态大小限制是 5 MB。在 MemoryStateBackend 的构造器中可以增加其大小。
+  - 无论配置的 `MAX_MEM_STATE_SIZE` 有多大,都不能大于 akka frame 大小(看[配置参数]({{ site.baseurl 
}}/zh/ops/config.html))。
+  - 聚合后的状态大小必须小于 JobManager 的内存。
 
-The MemoryStateBackend is encouraged for:
+MemoryStateBackend 适用场景:
 
-  - Local development and debugging
-  - Jobs that do hold little state, such as jobs that consist only of 
record-at-a-time functions (Map, FlatMap, Filter, ...). The Kafka Consumer 
requires very little state.
+  - 本地开发和调试。
+  - 状态很小的 Job,例如:由每次只处理一条记录的函数(Map、FlatMap、Filter 等)构成的 Job。Kafka Consumer 
仅仅需要非常小的状态。
 
 
-### The FsStateBackend
+### FsStateBackend
 
-The *FsStateBackend* is configured with a file system URL (type, address, 
path), such as "hdfs://namenode:40010/flink/checkpoints" or 
"file:///data/flink/checkpoints".
+*FsStateBackend* 需要配置一个文件系统的 
URL(类型、地址、路径),例如:"hdfs://namenode:40010/flink/checkpoints" 或 
"file:///data/flink/checkpoints"。
 
-The FsStateBackend holds in-flight data in the TaskManager's memory. Upon 
checkpointing, it writes state snapshots into files in the configured file 
system and directory. Minimal metadata is stored in the JobManager's memory 
(or, in high-availability mode, in the metadata checkpoint).
+FsStateBackend 将正在运行中的状态数据保存在 TaskManager 的内存中。CheckPoint 时,将状态快照写入到配置的文件系统目录中。
+少量的元数据信息存储到 JobManager 的内存中(高可用模式下,将其写入到 CheckPoint 的元数据文件

[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496929
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 * ToC
 {:toc}
 
-## Available State Backends
+## 可用的 State Backends
 
-Out of the box, Flink bundles these state backends:
+Flink 内置了以下这些开箱即用的 state backends :
 
  - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBackend.
+如果不设置,默认使用 MemoryStateBackend。
 
 
-### The MemoryStateBackend
+### MemoryStateBackend
 
-The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
+*MemoryStateBackend* 将数据存储在 Java 堆中。 Key/value 的状态值和窗口算子的触发器都是通过 hash table 
来存储。
 
-Upon checkpoints, this state backend will snapshot the state and send it as 
part of the checkpoint acknowledgement messages to the
-JobManager (master), which stores it on its heap as well.
+在 CheckPoint 时,State Backend 对状态进行快照,并将快照信息作为 CheckPoint 应答消息的一部分发送给 
JobManager(master),同时 JobManager 也将快照信息存储在堆内存中。
 
-The MemoryStateBackend can be configured to use asynchronous snapshots. While 
we strongly encourage the use of asynchronous snapshots to avoid blocking 
pipelines, please note that this is currently enabled 
-by default. To disable this feature, users can instantiate a 
`MemoryStateBackend` with the corresponding boolean flag in the constructor set 
to `false`(this should only used for debug), e.g.:
+MemoryStateBackend 能配置异步快照。强烈建议使用异步快照来防止数据流阻塞,注意,异步快照默认是开启的。
+用户可以在实例化 `MemoryStateBackend` 的时候,将相应布尔类型的构造参数设置为 `false` 来关闭异步快照(仅在 debug 
的时候使用),例如:
 
 {% highlight java %}
 new MemoryStateBackend(MAX_MEM_STATE_SIZE, false);
 {% endhighlight %}
 
-Limitations of the MemoryStateBackend:
+MemoryStateBackend 的限制:
 
-  - The size of each individual state is by default limited to 5 MB. This 
value can be increased in the constructor of the MemoryStateBackend.
-  - Irrespective of the configured maximal state size, the state cannot be 
larger than the akka frame size (see [Configuration]({{ site.baseurl 
}}/ops/config.html)).
-  - The aggregate state must fit into the JobManager memory.
+  - 默认情况下,每个独立的状态大小限制是 5 MB。在 MemoryStateBackend 的构造器中可以增加其大小。
+  - 无论配置的 `MAX_MEM_STATE_SIZE` 有多大,都不能大于 akka frame 大小(看[配置参数]({{ site.baseurl 
}}/zh/ops/config.html))。
+  - 聚合后的状态大小必须小于 JobManager 的内存。
 
 Review comment:
   聚合后的状态必须能够放进 JobManager 的内存中。
   
   Removed the word “大小” (size) from your suggested wording.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496505
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 * ToC
 {:toc}
 
-## Available State Backends
+## 可用的 State Backends
 
-Out of the box, Flink bundles these state backends:
+Flink 内置了以下这些开箱即用的 state backends :
 
  - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBackend.
+如果不设置,默认使用 MemoryStateBackend。
 
 
-### The MemoryStateBackend
+### MemoryStateBackend
 
-The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
+*MemoryStateBackend* 将数据存储在 Java 堆中。 Key/value 的状态值和窗口算子的触发器都是通过 hash table 
来存储。
 
 Review comment:
   在 *MemoryStateBackend* 内部,数据以 Java 对象的形式存储在堆中。 Key/value 
形式的状态和窗口算子持有存储着状态值、触发器的 hash table。


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496559
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 * ToC
 {:toc}
 
-## Available State Backends
+## 可用的 State Backends
 
-Out of the box, Flink bundles these state backends:
+Flink 内置了以下这些开箱即用的 state backends :
 
  - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBackend.
+如果不设置,默认使用 MemoryStateBackend。
 
 
-### The MemoryStateBackend
+### MemoryStateBackend
 
-The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
+*MemoryStateBackend* 将数据存储在 Java 堆中。 Key/value 的状态值和窗口算子的触发器都是通过 hash table 
来存储。
 
 Review comment:
   Personally, I feel it would be better to leave “Key/Value” untranslated.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496390
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
 
 Review comment:
   Agreed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496439
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 Review comment:
   状态内部的存储格式、状态在 CheckPoint 时如何持久化以及持久化在哪里均取决于选择的 **State Backend**。


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496276
 
 

 ##
 File path: docs/dev/stream/state/state_backends.zh.md
 ##
 @@ -22,13 +22,15 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Flink provides different state backends that specify how and where state is 
stored.
+Flink 提供了多种 state backends,它用于指定状态的存储方式和位置。
 
-State can be located on Java’s heap or off-heap. Depending on your state 
backend, Flink can also manage the state for the application, meaning Flink 
deals with the memory management (possibly spilling to disk if necessary) to 
allow applications to hold very large state. By default, the configuration file 
*flink-conf.yaml* determines the state backend for all Flink jobs.
+状态可以位于 Java 的堆或堆外内存。 Flink 可以依赖 state backend 管理应用程序的状态。
+为了让应用程序可以维护非常大的状态,Flink 自己管理内存(如果有必要可以溢写到磁盘)。
+默认情况下,所有 Flink Job 的 state backend 将使用配置文件 *flink-conf.yaml* 中的配置。
 
-However, the default state backend can be overridden on a per-job basis, as 
shown below.
+但是,每一个 Job 的 state backend 配置会覆盖默认的 state backend 配置,如下所示。
 
 Review comment:
   默认情况下,所有 Flink Job 会使用配置文件 *flink-conf.yaml* 中指定的 state backend。
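   
   For reference, the per-job override that the quoted paragraph refers to is the 
   snippet from the English docs, along the lines of:
   
   {% highlight java %}
   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"));
   {% endhighlight %}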


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315496196
 
 

 ##
 File path: docs/dev/stream/state/state_backends.zh.md
 ##
 @@ -22,13 +22,15 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Flink provides different state backends that specify how and where state is 
stored.
+Flink 提供了多种 state backends,它用于指定状态的存储方式和位置。
 
-State can be located on Java’s heap or off-heap. Depending on your state 
backend, Flink can also manage the state for the application, meaning Flink 
deals with the memory management (possibly spilling to disk if necessary) to 
allow applications to hold very large state. By default, the configuration file 
*flink-conf.yaml* determines the state backend for all Flink jobs.
+状态可以位于 Java 的堆或堆外内存。 Flink 可以依赖 state backend 管理应用程序的状态。
+为了让应用程序可以维护非常大的状态,Flink 自己管理内存(如果有必要可以溢写到磁盘)。
 
 Review comment:
   取决于你的 state backend,Flink 也可以自己管理应用程序的状态。
   为了让应用程序可以维护非常大的状态,Flink 可以自己管理内存(如果有必要可以溢写到磁盘)。


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] 1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] Translate "State Backends" page into Chinese

2019-08-19 Thread GitBox
1996fanrui commented on a change in pull request #9454: [FLINK-13644][docs-zh] 
Translate "State Backends" page into Chinese
URL: https://github.com/apache/flink/pull/9454#discussion_r315495987
 
 

 ##
 File path: docs/ops/state/state_backends.zh.md
 ##
 @@ -22,114 +22,111 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API]({{ site.baseurl 
}}/dev/datastream_api.html) often hold state in various forms:
+用 [Data Stream API]({{ site.baseurl }}/zh/dev/datastream_api.html) 
编写的程序通常以各种形式保存状态:
 
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store 
values
-- Transformation functions may implement the `CheckpointedFunction` interface 
to make their local variables fault tolerant
+- 在 Window 触发之前要么收集元素、要么聚合
+- 转换函数可以使用 key/value 格式的状态接口来存储状态
+- 转换函数可以实现 `CheckpointedFunction` 接口,使其本地变量具有容错能力
 
-See also [state section]({{ site.baseurl }}/dev/stream/state/index.html) in 
the streaming API guide.
+另请参阅 Streaming API 指南中的 [状态部分]({{ site.baseurl 
}}/zh/dev/stream/state/index.html) 。
 
-When checkpointing is activated, such state is persisted upon checkpoints to 
guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted 
upon checkpoints depends on the
-chosen **State Backend**.
+CheckPoint 开启时,CheckPoint 持续地进行来防止数据丢失,并且能够完整地恢复。
+在 Flink 内部以及 CheckPoint 上,状态如何存储以及存储在哪里取决于选择的 **State Backend**。
 
 * ToC
 {:toc}
 
-## Available State Backends
+## 可用的 State Backends
 
-Out of the box, Flink bundles these state backends:
+Flink 内置了以下这些开箱即用的 state backends :
 
  - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBackend.
+如果不设置,默认使用 MemoryStateBackend。
 
 
-### The MemoryStateBackend
+### MemoryStateBackend
 
-The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
+*MemoryStateBackend* 将数据存储在 Java 堆中。 Key/value 的状态值和窗口算子的触发器都是通过 hash table 
来存储。
 
-Upon checkpoints, this state backend will snapshot the state and send it as 
part of the checkpoint acknowledgement messages to the
-JobManager (master), which stores it on its heap as well.
+在 CheckPoint 时,State Backend 对状态进行快照,并将快照信息作为 CheckPoint 应答消息的一部分发送给 
JobManager(master),同时 JobManager 也将快照信息存储在堆内存中。
 
-The MemoryStateBackend can be configured to use asynchronous snapshots. While 
we strongly encourage the use of asynchronous snapshots to avoid blocking 
pipelines, please note that this is currently enabled 
-by default. To disable this feature, users can instantiate a 
`MemoryStateBackend` with the corresponding boolean flag in the constructor set 
to `false`(this should only used for debug), e.g.:
+MemoryStateBackend 能配置异步快照。强烈建议使用异步快照来防止数据流阻塞,注意,异步快照默认是开启的。
+用户可以在实例化 `MemoryStateBackend` 的时候,将相应布尔类型的构造参数设置为 `false` 来关闭异步快照(仅在 debug 
的时候使用),例如:
 
 {% highlight java %}
 new MemoryStateBackend(MAX_MEM_STATE_SIZE, false);
 {% endhighlight %}
 
-Limitations of the MemoryStateBackend:
+MemoryStateBackend 的限制:
 
-  - The size of each individual state is by default limited to 5 MB. This 
value can be increased in the constructor of the MemoryStateBackend.
-  - Irrespective of the configured maximal state size, the state cannot be 
larger than the akka frame size (see [Configuration]({{ site.baseurl 
}}/ops/config.html)).
-  - The aggregate state must fit into the JobManager memory.
+  - 默认情况下,每个独立的状态大小限制是 5 MB。在 MemoryStateBackend 的构造器中可以增加其大小。
+  - 无论配置的 `MAX_MEM_STATE_SIZE` 有多大,都不能大于 akka frame 大小(看[配置参数]({{ site.baseurl 
}}/zh/ops/config.html))。
 
 Review comment:
   无论配置的最大状态内存大小(MAX_MEM_STATE_SIZE)有多大,都不能大于 akka frame 大小


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9488: [hotfix][docs] Update local setup tutorials to fit new log messages

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9488: [hotfix][docs] Update local setup 
tutorials to fit new log messages
URL: https://github.com/apache/flink/pull/9488#issuecomment-522823318
 
 
   ## CI report:
   
   * 720808139b09bcd08509d7dbd583c35e03cf2be2 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123818243)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zjuwangg commented on a change in pull request #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version

2019-08-19 Thread GitBox
zjuwangg commented on a change in pull request #9447: 
[FLINK-13643][docs]Document the workaround for users with a different minor 
Hive version
URL: https://github.com/apache/flink/pull/9447#discussion_r315486159
 
 

 ##
 File path: docs/dev/table/hive/index.zh.md
 ##
 @@ -40,7 +40,16 @@ You do not need to modify your existing Hive Metastore or 
change the data placem
 
 ## Supported Hive Version's
 
-Flink supports Hive `2.3.4` and `1.2.1` and relies on Hive's compatibility 
guarantee's for other versions.
+Flink supports Hive `2.3.4` and `1.2.1` and relies on Hive's compatibility 
guarantees for other minor versions.
+
+If you use a different minor Hive version such as `1.2.2` or `2.3.1`, it 
should also be ok to 
+choose the closest version `1.2.1` (for `1.2.2`) or `2.3.4` (for `2.3.1`) as a 
workaround. For 
 
 Review comment:
   thx~
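   
   For illustration (a hedged sketch, not taken from the PR; the catalog name and 
   conf path are made up), a user on Hive `1.2.2` would simply declare the closest 
   supported version when creating the catalog:
   
   {% highlight java %}
   // Running against a Hive 1.2.2 metastore: declare the closest supported
   // version, 1.2.1, when constructing the HiveCatalog.
   HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive/conf", "1.2.1");
   tableEnv.registerCatalog("myhive", hive);
   {% endhighlight %}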


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13516) YARNSessionFIFOSecuredITCase fails on Java 11

2019-08-19 Thread Haibo Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910939#comment-16910939
 ] 

Haibo Sun commented on FLINK-13516:
---

Hi, [~Zentol]. I can help with this task. Can you assign it to me?

> YARNSessionFIFOSecuredITCase fails on Java 11
> -
>
> Key: FLINK-13516
> URL: https://issues.apache.org/jira/browse/FLINK-13516
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN, Tests
>Reporter: Chesnay Schepler
>Priority: Major
> Fix For: 1.10.0
>
>
> {{YARNSessionFIFOSecuredITCase#testDetachedMode}} times out when run on Java 
> 11. This may be related to security changes in Java 11.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13515) ClassLoaderITCase fails on Java 11

2019-08-19 Thread Haibo Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910938#comment-16910938
 ] 

Haibo Sun commented on FLINK-13515:
---

Hi, [~Zentol]. I can help with this task. Can you assign it to me?

> ClassLoaderITCase fails on Java 11
> --
>
> Key: FLINK-13515
> URL: https://issues.apache.org/jira/browse/FLINK-13515
> Project: Flink
>  Issue Type: Sub-task
>  Components: Command Line Client, Tests
>Reporter: Chesnay Schepler
>Priority: Major
> Fix For: 1.10.0
>
>
> {{ClassLoaderITCase#testCheckpointedStreamingClassloaderJobWithCustomClassLoader}}
>  fails on Java 11 because the usercode exception can be deserialized in the 
> client. This shouldn't be possible since the user-jar isn't on the classpath 
> of the client.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot edited a comment on issue #9482: [FLINK-13758] [flink-clients] failed to submit JobGraph when registered hdfs file in DistributedCache.

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9482: [FLINK-13758] [flink-clients] failed 
to submit JobGraph when registered hdfs file in DistributedCache.
URL: https://github.com/apache/flink/pull/9482#issuecomment-522564432
 
 
   ## CI report:
   
   * 78d5077eca1d8a05cbea426d24ac8b476c29828e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123718680)
   * 95441b71309a8f0562ec31cb06bdb01e61142ce8 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123819442)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13197) support querying Hive's view in Flink

2019-08-19 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910930#comment-16910930
 ] 

Rui Li commented on FLINK-13197:


[~phoenixjiangnan] I suppose this has to wait for FLINK-12905 to get in first, 
is that correct?

> support querying Hive's view in Flink
> -
>
> Key: FLINK-13197
> URL: https://issues.apache.org/jira/browse/FLINK-13197
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Bowen Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 1.10.0
>
>
> One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
> interoperability, that is, Flink should understand existing Hive meta-objects, 
> and Hive meta-objects created through Flink should be understood by Hive.
> Take the example of a Hive view v1 in the HiveCatalog hc and database db. Unlike 
> an equivalent Flink view, whose full path in the expanded query would be 
> hc.db.v1, the Hive view's full path in the expanded query should be db.v1 
> so that Hive can understand it, no matter whether it was created by Hive or Flink.
> [~lirui] can you help to ensure that Flink can also query Hive's views with both 
> the Flink planner and the Blink planner?
> cc [~xuefuz]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (FLINK-13418) Avoid InfluxdbReporter to report unnecessary tags

2019-08-19 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910926#comment-16910926
 ] 

ouyangwulin edited comment on FLINK-13418 at 8/20/19 2:20 AM:
--

We use the InfluxdbReporter in a production environment at NTES. As the number of 
Flink tasks grows, the tags increase quickly, and the disk I/O of the InfluxDB 
machine is no longer sufficient for the growing index. As a temporary workaround, 
we switched to SSDs. So we think this Jira is a good idea.


was (Author: ouyangwuli):
We use the InfluxdbReporter in a production environment at NTES. As the number of 
Flink tasks grows, the tags increase quickly, and the disk I/O of the InfluxDB 
machine is no longer sufficient for the growing index. As a temporary workaround, 
we switched to SSDs. So we think the Jira is a good idea.

> Avoid InfluxdbReporter to report unnecessary tags
> -
>
> Key: FLINK-13418
> URL: https://issues.apache.org/jira/browse/FLINK-13418
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Reporter: Yun Tang
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently, when building measurement info within {{InfluxdbReporter}}, it 
> involves all variables as tags (please see the code 
> [here|https://github.com/apache/flink/blob/d57741cef9d4773cc487418baa961254d0d47524/flink-metrics/flink-metrics-influxdb/src/main/java/org/apache/flink/metrics/influxdb/MeasurementInfoProvider.java#L54]).
>  However, while users can adjust their own scope format to drop unnecessary 
> scopes, {{InfluxdbReporter}} still reports all the scopes as tags to 
> InfluxDB.
> This is because the current {{MetricGroup}} lacks any method to get only the 
> necessary scopes; it offers only {{#getScopeComponents()}} or 
> {{#getAllVariables()}}. In other 
> words, InfluxDB needs a tag-key and a tag-value to compose its tags, while we 
> can only get all variables (without any filtering according to the scope format) or 
> only the scope components (which could be treated as tag-values). I think that's why 
> the previous implementation has to report all tags.
> From our experience with InfluxDB, since the number of tags contributes to the 
> overall series count in InfluxDB, it is never a good idea to include too many 
> tags, not to mention that the [default limit of series per 
> database|https://docs.influxdata.com/influxdb/v1.7/troubleshooting/errors/#error-max-series-per-database-exceeded]
>  is only one million.
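
A minimal sketch of the direction this suggests, assuming a reporter-side 
whitelist (the class, method, and {{allowedTagKeys}} parameter are hypothetical):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.flink.metrics.MetricGroup;

// Hypothetical sketch: keep only the variables that the user's scope format
// actually references, instead of turning every variable into an InfluxDB tag.
final class TagFilter {

	static Map<String, String> filterTags(MetricGroup group, Set<String> allowedTagKeys) {
		Map<String, String> tags = new HashMap<>();
		for (Map.Entry<String, String> variable : group.getAllVariables().entrySet()) {
			// Variable keys look like "<host>" or "<job_name>"; only whitelisted keys survive.
			if (allowedTagKeys.contains(variable.getKey())) {
				tags.put(variable.getKey(), variable.getValue());
			}
		}
		return tags;
	}
}
{code}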



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] xuchao0903 commented on issue #9352: FLINK-13317 Merge NetUtils and ClientUtils

2019-08-19 Thread GitBox
xuchao0903 commented on issue #9352: FLINK-13317 Merge NetUtils and ClientUtils
URL: https://github.com/apache/flink/pull/9352#issuecomment-522824555
 
 
   @tillrohrmann 
   Thank you for helping me to resolve the checkstyle violation.  
   I'm not familiar with Git.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (FLINK-13418) Avoid InfluxdbReporter to report unnecessary tags

2019-08-19 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910926#comment-16910926
 ] 

ouyangwulin edited comment on FLINK-13418 at 8/20/19 2:17 AM:
--

We use the InfluxdbReporter in a production environment at NTES. As the number of 
Flink tasks grows, the tags increase quickly, and the disk I/O of the InfluxDB 
machine is no longer sufficient for the growing index. As a temporary workaround, 
we switched to SSDs. So we think the Jira is a good idea.


was (Author: ouyangwuli):
I use the InfluxdbReporter in a production environment at NTES. As the number of 
Flink tasks grows, the tags increase quickly, and the disk I/O of the InfluxDB 
machine is no longer sufficient for the growing index. As a temporary workaround, 
I switched to SSDs. So I think the Jira is a good idea.

> Avoid InfluxdbReporter to report unnecessary tags
> -
>
> Key: FLINK-13418
> URL: https://issues.apache.org/jira/browse/FLINK-13418
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Reporter: Yun Tang
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently, when building measurement info within {{InfluxdbReporter}}, it 
> involves all variables as tags (please see the code 
> [here|https://github.com/apache/flink/blob/d57741cef9d4773cc487418baa961254d0d47524/flink-metrics/flink-metrics-influxdb/src/main/java/org/apache/flink/metrics/influxdb/MeasurementInfoProvider.java#L54]).
>  However, while users can adjust their own scope format to drop unnecessary 
> scopes, {{InfluxdbReporter}} still reports all the scopes as tags to 
> InfluxDB.
> This is because the current {{MetricGroup}} lacks any method to get only the 
> necessary scopes; it offers only {{#getScopeComponents()}} or 
> {{#getAllVariables()}}. In other 
> words, InfluxDB needs a tag-key and a tag-value to compose its tags, while we 
> can only get all variables (without any filtering according to the scope format) or 
> only the scope components (which could be treated as tag-values). I think that's why 
> the previous implementation has to report all tags.
> From our experience with InfluxDB, since the number of tags contributes to the 
> overall series count in InfluxDB, it is never a good idea to include too many 
> tags, not to mention that the [default limit of series per 
> database|https://docs.influxdata.com/influxdb/v1.7/troubleshooting/errors/#error-max-series-per-database-exceeded]
>  is only one million.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13418) Avoid InfluxdbReporter to report unnecessary tags

2019-08-19 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910926#comment-16910926
 ] 

ouyangwulin commented on FLINK-13418:
-

I use the InfluxdbReporter in a production environment at NTES. As the number of 
Flink tasks grows, the tags increase quickly, and the disk I/O of the InfluxDB 
machine is no longer sufficient for the growing index. As a temporary workaround, 
I switched to SSDs. So I think the Jira is a good idea.

> Avoid InfluxdbReporter to report unnecessary tags
> -
>
> Key: FLINK-13418
> URL: https://issues.apache.org/jira/browse/FLINK-13418
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Reporter: Yun Tang
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently, when building measurement info within {{InfluxdbReporter}}, it 
> involves all variables as tags (please see the code 
> [here|https://github.com/apache/flink/blob/d57741cef9d4773cc487418baa961254d0d47524/flink-metrics/flink-metrics-influxdb/src/main/java/org/apache/flink/metrics/influxdb/MeasurementInfoProvider.java#L54]).
>  However, while users can adjust their own scope format to drop unnecessary 
> scopes, {{InfluxdbReporter}} still reports all the scopes as tags to 
> InfluxDB.
> This is because the current {{MetricGroup}} lacks any method to get only the 
> necessary scopes; it offers only {{#getScopeComponents()}} or 
> {{#getAllVariables()}}. In other 
> words, InfluxDB needs a tag-key and a tag-value to compose its tags, while we 
> can only get all variables (without any filtering according to the scope format) or 
> only the scope components (which could be treated as tag-values). I think that's why 
> the previous implementation has to report all tags.
> From our experience with InfluxDB, since the number of tags contributes to the 
> overall series count in InfluxDB, it is never a good idea to include too many 
> tags, not to mention that the [default limit of series per 
> database|https://docs.influxdata.com/influxdb/v1.7/troubleshooting/errors/#error-max-series-per-database-exceeded]
>  is only one million.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot commented on issue #9488: [hotfix][docs] Update local setup tutorials to fit new log messages

2019-08-19 Thread GitBox
flinkbot commented on issue #9488: [hotfix][docs] Update local setup tutorials 
to fit new log messages
URL: https://github.com/apache/flink/pull/9488#issuecomment-522823318
 
 
   ## CI report:
   
   * 720808139b09bcd08509d7dbd583c35e03cf2be2 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123818243)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #9488: [hotfix][docs] Update local setup tutorials to fit new log messages

2019-08-19 Thread GitBox
flinkbot commented on issue #9488: [hotfix][docs] Update local setup tutorials 
to fit new log messages
URL: https://github.com/apache/flink/pull/9488#issuecomment-522821968
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 720808139b09bcd08509d7dbd583c35e03cf2be2 (Tue Aug 20 
02:04:56 UTC 2019)
   
✅ no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] TisonKun opened a new pull request #9488: [hotfix][docs] Update local setup tutorials to fit new log messages

2019-08-19 Thread GitBox
TisonKun opened a new pull request #9488: [hotfix][docs] Update local setup 
tutorials to fit new log messages
URL: https://github.com/apache/flink/pull/9488
 
 
   ## What is the purpose of the change
   
   With the scheduler/slot refactoring we changed some log messages, and the 
changes should be reflected in the documentation.
   
   ## Brief change log
   
   Update local setup tutorials to fit new log messages
   
   ## Verifying this change
   
   It is about documentation.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   
   cc @rmetzger. Also a nit: this should be backported to the 1.9 docs.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] wuchong closed pull request #9423: [FLINK-13699][table-api] Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME types

2019-08-19 Thread GitBox
wuchong closed pull request #9423: [FLINK-13699][table-api] Fix TableFactory 
doesn't work with DDL when containing TIMESTAMP/DATE/TIME types
URL: https://github.com/apache/flink/pull/9423
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] wuchong commented on issue #9423: [FLINK-13699][table-api] Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME types

2019-08-19 Thread GitBox
wuchong commented on issue #9423: [FLINK-13699][table-api] Fix TableFactory 
doesn't work with DDL when containing TIMESTAMP/DATE/TIME types
URL: https://github.com/apache/flink/pull/9423#issuecomment-522816376
 
 
   Travis passed in my own repo: 
https://travis-ci.org/wuchong/flink/builds/573858501
   
   Merged with commits:
   
   b837a589f1bda5d8352e9760af39937f9194c670 
d20175ee62cd9b3ce8912745240b57c88c5af51c


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13683) Translate "Code Style - Component Guide" page into Chinese

2019-08-19 Thread ChaojianZhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910891#comment-16910891
 ] 

ChaojianZhang commented on FLINK-13683:
---

Hi [~jark], I've finished this work; the pull request is 
[https://github.com/apache/flink-web/pull/247]. Please help review it, 
thanks.

> Translate "Code Style - Component Guide" page into Chinese
> --
>
> Key: FLINK-13683
> URL: https://issues.apache.org/jira/browse/FLINK-13683
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Project Website
>Reporter: Jark Wu
>Assignee: ChaojianZhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Translate page 
> https://flink.apache.org/zh/contributing/code-style-and-quality-components.html
>  into Chinese. The page is located in 
> https://github.com/apache/flink-web/blob/asf-site/contributing/code-style-and-quality-components.zh.md.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13683) Translate "Code Style - Component Guide" page into Chinese

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13683:
---
Labels: pull-request-available  (was: )

> Translate "Code Style - Component Guide" page into Chinese
> --
>
> Key: FLINK-13683
> URL: https://issues.apache.org/jira/browse/FLINK-13683
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Project Website
>Reporter: Jark Wu
>Assignee: ChaojianZhang
>Priority: Major
>  Labels: pull-request-available
>
> Translate page 
> https://flink.apache.org/zh/contributing/code-style-and-quality-components.html
>  into Chinese. The page is located in 
> https://github.com/apache/flink-web/blob/asf-site/contributing/code-style-and-quality-components.zh.md.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (FLINK-13332) PackagedProgram#isUsingInteractiveMode implemented error prone

2019-08-19 Thread TisonKun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun closed FLINK-13332.

Resolution: Not A Problem

The same as FLINK-13333.

> PackagedProgram#isUsingInteractiveMode implemented error prone
> --
>
> Key: FLINK-13332
> URL: https://issues.apache.org/jira/browse/FLINK-13332
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Affects Versions: 1.10.0
>Reporter: TisonKun
>Priority: Minor
>
> Currently, {{PackagedProgram#isUsingInteractiveMode}} is implemented as
> {code:java}
> public boolean isUsingInteractiveMode() {
>     return this.program == null;
> }
> {code}
> is hacky and a better implementation would be
> {code:java}
> public boolean isUsingInteractiveMode() {
>     return hasMainMethod(mainClass);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] TisonKun commented on issue #9452: [FLINK-13714] Remove Program-related code.

2019-08-19 Thread GitBox
TisonKun commented on issue #9452: [FLINK-13714] Remove Program-related code.
URL: https://github.com/apache/flink/pull/9452#issuecomment-522811842
 
 
   Hi @kl0u, with this PR merged we can fix FLINK-13332 and FLINK-13333 as a 
side effect. I also attached a patch on 
[FLINK-13716](https://issues.apache.org/jira/browse/FLINK-13716) that fixes 
the Chinese documentation side.
   
   Could you check whether that patch can be included in this merge, or should I 
open an extra PR so we can finish FLIP-52?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-13333) Potentially NPE of preview plan functionality

2019-08-19 Thread TisonKun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun closed FLINK-13333.

Resolution: Not A Problem

[~xintongsong] this issue will be addressed once FLINK-13714 is merged, since 
we will no longer have a {{program}} field at all.

> Potentially NPE of preview plan functionality
> -
>
> Key: FLINK-13333
> URL: https://issues.apache.org/jira/browse/FLINK-13333
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Affects Versions: 1.10.0
>Reporter: TisonKun
>Priority: Major
>
> {{PackagedProgram#getPreviewPlan}} contains code as below
> {code:java}
> if (isUsingProgramEntryPoint()) {
>     previewPlan = Optimizer.createPreOptimizedPlan(getPlan());
> } else if (isUsingInteractiveMode()) {
>     // ...
>     getPlan().getJobId();
>     // ...
> }
> {code}
>  
> When the latter {{#getPlan}} is executed, it eventually calls 
> {{program.getPlan(options)}}, where {{program}} is null.
> To solve this problem, we can replace {{getPlan}} with {{env.getPlan}}, where 
> {{env}} is an instance of {{PreviewPlanEnvironment}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot edited a comment on issue #9487: [FLINK-13791][docs] Speed up sidenav by using group_by

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9487: [FLINK-13791][docs] Speed up sidenav 
by using group_by
URL: https://github.com/apache/flink/pull/9487#issuecomment-522782605
 
 
   ## CI report:
   
   * be58c370c8714b0ef9b9b8963479225d38627b7f : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123804733)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (FLINK-13750) Separate HA services between client-/ and server-side

2019-08-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848
 ] 

TisonKun edited comment on FLINK-13750 at 8/19/19 11:47 PM:


Hi [~Zentol] & [~till.rohrmann].

After an investigation I noticed that {{ClusterClient}} need not hold a field 
such as {{highAvailabilityServices}}. Toward the goal of making 
{{ClusterClient}} an interface, i.e., not an abstract class, we can push the 
initialization logic down into {{RestClusterClient}} and {{MiniClusterClient}}.

Here are two possible directions for the separation; I post them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a 
limited set of high-availability services regarded as client-side services, 
without introducing any new class or interface (a prototype can be found at 
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd).

pros:

- easy to implement
- in the custom HA scenario, users don't need to modify their code unless their 
implementation has an issue similar to FLINK-13500.

cons:

- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} knows the details of the Standalone and 
ZooKeeper implementations.

nit:
for the prototype, we might separate {{getDispatcherLeaderRetrievalService}} 
and {{getWebMonitorLeaderRetrievalService}}, while the downside is that we 
might initialize the {{CuratorFramework}} and custom HA services more than once.

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}}, which looks 
like


{code:java}
interface RetrieverOnlyHighAvailabilityService {
  LeaderRetrievalService getDispatcherLeaderRetrievalService();
  LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for the different high-availability backends.

pros:

- a clear concept of separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a 
{{RetrieverOnlyHighAvailabilityService}}, and only the latter knows the details.

cons:

- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every 
high-availability backend.
- in the {{MiniClusterClient}} scenario, we actually use the services passed from 
{{MiniCluster}}; we would either have to treat it as a special case or completely 
change the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.

nit:

in the current codebase it is not true that every {{ClusterClient}} shares the 
same retrieval requirements: only {{RestClusterClient}} needs 
{{getWebMonitorLeaderRetrievalService}}. At a more conceptual layer, the client 
should only communicate with the WebMonitor, and requests to the Dispatcher 
should be routed by the WebMonitor.


was (Author: tison):
Hi [~Zentol] & [~till.rohrmann].

After an investigation I noticed that {{ClusterClient}} need not hold a field 
such as {{highAvailabilityServices}}. Toward the goal of making 
{{ClusterClient}} an interface, i.e., not an abstract class, we can push the 
initialization logic down into {{RestClusterClient}} and {{MiniClusterClient}}.

Here are two possible directions for the separation; I post them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a 
limited set of high-availability services regarded as client-side services, 
without introducing any new class or interface (a prototype can be found at 
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd).

pros:

- easy to implement
- in the custom HA scenario, users don't need to modify their code unless their 
implementation has an issue similar to FLINK-13500.

cons:

- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} knows the details of the Standalone and 
ZooKeeper implementations.

nit: for the prototype, we might separate 
{{getDispatcherLeaderRetrievalService}} and 
{{getWebMonitorLeaderRetrievalService}}, while the downside is that we would 
initialize the {{CuratorFramework}} and custom HA services twice or more.

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}}, which looks 
like


{code:java}
interface RetrieverOnlyHighAvailabilityService {
  LeaderRetrievalService getDispatcherLeaderRetrievalService();
  LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for the different high-availability backends.

pros:

- a clear concept of separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a 
{{RetrieverOnlyHighAvailabilityService}}, and only the latter knows the details.

cons:

- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every 
high-availability backend.
- in the {{MiniClusterClient}} scenario, we actually use the services passed from 
{{MiniCluster}}; we would either have to treat it as a special case or completely 
change the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.

nit:

in the current codebase it is not true that every {{ClusterClient}} shares the 
same retrieval requirements: only {{RestClusterClient}} needs 
{{getWebMonitorLeaderRetrievalService}}. At a more conceptual layer, the client 
should only communicate with the WebMonitor, and requests to the Dispatcher 
should be routed by the WebMonitor.

[jira] [Commented] (FLINK-13750) Separate HA services between client-/ and server-side

2019-08-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848
 ] 

TisonKun commented on FLINK-13750:
--

Hi [~Zentol] & [~till.rohrmann].

After an investigation I noticed that {{ClusterClient}} need not hold a field 
such as {{highAvailabilityServices}}. Toward the goal of making 
{{ClusterClient}} an interface, i.e., not an abstract class, we can push the 
initialization logic down into {{RestClusterClient}} and {{MiniClusterClient}}.

Here are two possible directions for the separation; I post them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a 
limited set of high-availability services regarded as client-side services, 
without introducing any new class or interface (a prototype can be found at 
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd).

pros:

- easy to implement
- in the custom HA scenario, users don't need to modify their code unless their 
implementation has an issue similar to FLINK-13500.

cons:

- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} knows the details of the Standalone and 
ZooKeeper implementations.

nit: for the prototype, we might separate 
{{getDispatcherLeaderRetrievalService}} and 
{{getWebMonitorLeaderRetrievalService}}, while the downside is that we would 
initialize the {{CuratorFramework}} and custom HA services twice or more.

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}}, which looks 
like


{code:java}
interface RetrieverOnlyHighAvailabilityService {
  LeaderRetrievalService getDispatcherLeaderRetrievalService();
  LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for the different high-availability backends.

pros:

- a clear concept of separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a 
{{RetrieverOnlyHighAvailabilityService}}, and only the latter knows the details.

cons:

- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every 
high-availability backend.
- in the {{MiniClusterClient}} scenario, we actually use the services passed from 
{{MiniCluster}}; we would either have to treat it as a special case or completely 
change the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.

nit:

in the current codebase it is not true that every {{ClusterClient}} shares the 
same retrieval requirements: only {{RestClusterClient}} needs 
{{getWebMonitorLeaderRetrievalService}}. At a more conceptual layer, the client 
should only communicate with the WebMonitor, and requests to the Dispatcher 
should be routed by the WebMonitor.

> Separate HA services between client-/ and server-side
> -
>
> Key: FLINK-13750
> URL: https://issues.apache.org/jira/browse/FLINK-13750
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client, Runtime / Coordination
>Reporter: Chesnay Schepler
>Assignee: TisonKun
>Priority: Major
>
> Currently, we use the same {{HighAvailabilityServices}} on the client and 
> server. However, the client does not need several of the features that the 
> services currently provide (access to the blobstore or checkpoint metadata).
> Additionally, due to how these services are setup they also require the 
> client to have access to the blob storage, despite it never actually being 
> used, which can cause issues, like FLINK-13500.
> [~Tison] Would you be interested in this issue?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] zhijiangW commented on a change in pull request #9471: [FLINK-13754][task] Decouple OperatorChain from StreamStatusMaintainer

2019-08-19 Thread GitBox
zhijiangW commented on a change in pull request #9471: [FLINK-13754][task] 
Decouple OperatorChain from StreamStatusMaintainer
URL: https://github.com/apache/flink/pull/9471#discussion_r315451534
 
 

 ##
 File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
 ##
 @@ -605,6 +610,50 @@ boolean isSerializingTimestamps() {
 	return tc == TimeCharacteristic.EventTime | tc == TimeCharacteristic.IngestionTime;
 }
 
+	private void broadcastCheckpointBarrier(
 
 Review comment:
   I already updated the codes for addressing this issue.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9471: [FLINK-13754][task] Decouple OperatorChain from StreamStatusMaintainer

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9471: [FLINK-13754][task] Decouple 
OperatorChain from StreamStatusMaintainer
URL: https://github.com/apache/flink/pull/9471#issuecomment-522269622
 
 
   ## CI report:
   
   * 46356e9f2ac97632021b3450f2585ea8b6120175 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123609454)
   * 330c8be5df79465a8804b7059c104984c6ac43ad : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123644155)
   * df762cf9c977de44fda9ccd5f805e7784c7dff6f : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123807523)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zhijiangW commented on a change in pull request #9471: [FLINK-13754][task] Decouple OperatorChain from StreamStatusMaintainer

2019-08-19 Thread GitBox
zhijiangW commented on a change in pull request #9471: [FLINK-13754][task] 
Decouple OperatorChain from StreamStatusMaintainer
URL: https://github.com/apache/flink/pull/9471#discussion_r315286950
 
 

 ##
 File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/RecordWriterOutput.java
 ##
 @@ -49,18 +48,17 @@
 
 	private SerializationDelegate<StreamElement> serializationDelegate;
 
-	private final StreamStatusProvider streamStatusProvider;
-
 	private final OutputTag outputTag;
 
 	private final WatermarkGauge watermarkGauge = new WatermarkGauge();
 
+	private StreamStatus currentStatus;
+
 	@SuppressWarnings("unchecked")
 	public RecordWriterOutput(
 			RecordWriter<SerializationDelegate<StreamElement>> recordWriter,
 			TypeSerializer<OUT> outSerializer,
-			OutputTag outputTag,
-			StreamStatusProvider streamStatusProvider) {
 
 Review comment:
   Thanks for this investigation with @tzulitai .
   
   1. After further reviewing the whole related process: the current idle 
`StreamStatus` only blocks the emission of watermarks; it does not apply to 
`StreamRecord`/`LatencyMarker`.
   
   2. Source and non-source tasks use different mechanisms to control watermark 
emission based on stream status:
   
   - For source tasks, the source context references the 
`StreamStatusMaintainer` to mark the idle status if needed. The 
`RecordWriterOutput` then also references this `StreamStatusMaintainer` to check 
the status before actually emitting a watermark.
   
   - For non-source tasks, we have a higher-level `StatusWatermarkValve` component 
that handles watermarks together with `StreamStatus`. In this case, if the 
maintained status of some channel is idle, `Operator#processWatermark` is not 
actually called while handling the watermark, so the existing status check on the 
`RecordWriterOutput` side seems redundant for the non-source case.
   
   Overall, the current status check is still necessary for the source case. I 
agree that it seems more reasonable to put this check in an upper layer, like the 
current approach for non-source tasks. To do so, we would need to define a 
similar `StatusWatermarkValve` component referenced by the source context, so 
that watermark emission first goes through `StatusWatermarkValve` to check 
whether the status is idle. That would involve more refactoring work on the 
legacy source stack.
   
   So the only currently feasible way might be what is done in this PR. It seems 
redundant for non-source tasks, but we do not change the behavior, and the 
redundancy already existed before this refactoring.
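
   To make point 1 concrete, a minimal self-contained sketch (hypothetical names, 
not the actual `RecordWriterOutput` code) of a gate that drops watermarks while 
the status is idle but lets records pass:

```java
// Hypothetical stand-ins illustrating the idle-status gate on watermarks only.
enum StreamStatus { ACTIVE, IDLE }

interface StreamStatusProvider {
    StreamStatus getStreamStatus();
}

class WatermarkGate {
    private final StreamStatusProvider statusProvider;

    WatermarkGate(StreamStatusProvider statusProvider) {
        this.statusProvider = statusProvider;
    }

    void emitWatermark(long watermark) {
        // Watermarks are dropped while the source is idle ...
        if (statusProvider.getStreamStatus() == StreamStatus.ACTIVE) {
            System.out.println("emit watermark " + watermark);
        }
    }

    void emitRecord(String record) {
        // ... but records and latency markers pass through regardless of status.
        System.out.println("emit record " + record);
    }
}
```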


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13752) TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function

2019-08-19 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910819#comment-16910819
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-13752:
-

Cherry-picked for 1.9.0: 04e95278777611519f5d14813dec4cbc533e2934

> TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an 
> anonymous function
> 
>
> Key: FLINK-13752
> URL: https://issues.apache.org/jira/browse/FLINK-13752
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Yun Gao
>Assignee: Yun Gao
>Priority: Critical
> Fix For: 1.9.0
>
>
> When comparing the 1.8 and 1.9.0-rc2 on a test streaming job, we found that 
> the performance on 1.9.0-rc2 is much lower than that of 1.8. By comparing the 
> two versions, we found that the count of Full GC of TaskExecutor process on 
> 1.9.0-rc2 is much more than that on 1.8.
> A further analysis found that the difference is due to in 
> _TaskExecutor#setupResultPartitionBookkeeping_, the anonymous function in 
> _taskTermimationWithResourceCleanFuture_ has referenced the 
> _TaskDeploymentDescriptor_, since this function will be kept till the task is 
> terminated,  _TaskDeploymentDescriptor_ will also be kept referenced in the 
> closure and cannot be recycled by GC. In this job, _TaskDeploymentDescriptor_ 
> of some tasks are as large as 10M, and the total heap is about 113M, thus the 
> kept _TaskDeploymentDescriptors_ will cause relatively large impact on GC and 
> performance.
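
As a minimal illustration of the retention pattern described above (hypothetical 
names, not the actual TaskExecutor code): the first callback captures the whole 
descriptor, while the fix is to capture only the small value that is actually 
needed before building the future.

{code:java}
import java.util.concurrent.CompletableFuture;

public class ClosureRetentionDemo {
    static class TaskDeploymentDescriptor {
        final byte[] serializedJobInformation = new byte[10 * 1024 * 1024]; // ~10 MB
        final String taskName = "map-1";
    }

    public static void main(String[] args) {
        TaskDeploymentDescriptor tdd = new TaskDeploymentDescriptor();
        CompletableFuture<Void> termination = new CompletableFuture<>();

        // Problematic: the lambda captures 'tdd', keeping all ~10 MB reachable
        // until the task terminates and the callback finally runs.
        termination.thenRun(() -> System.out.println("cleaned up " + tdd.taskName));

        // Fix: capture only the small value that is actually needed.
        String taskName = tdd.taskName;
        termination.thenRun(() -> System.out.println("cleaned up " + taskName));

        termination.complete(null); // triggers both callbacks in this demo
    }
}
{code}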



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13231) Add a ratelimiter to pubsub source

2019-08-19 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910818#comment-16910818
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-13231:
-

Cherry-picked for 1.9.0: c2d9aeacede65724912ed9a2ef87a23181869aa8

> Add a ratelimiter to pubsub source
> --
>
> Key: FLINK-13231
> URL: https://issues.apache.org/jira/browse/FLINK-13231
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Google Cloud PubSub
>Reporter: Richard Deurwaarder
>Assignee: Richard Deurwaarder
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Replace MaxMessagesToAcknowledge limit by introducing a rate limiter. See: 
> [https://github.com/apache/flink/pull/6594#discussion_r300215868]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13699) Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME types

2019-08-19 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-13699:

Fix Version/s: (was: 1.9.1)
   1.9.0

> Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME 
> types
> 
>
> Key: FLINK-13699
> URL: https://issues.apache.org/jira/browse/FLINK-13699
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.9.0
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, in blink planner, we will convert DDL to {{TableSchema}} with new 
> type system, i.e. DataTypes.TIMESTAMP()/DATE()/TIME() whose underlying 
> TypeInformation are  Types.LOCAL_DATETIME/LOCAL_DATE/LOCAL_TIME. 
> However, this makes the existing connector implementations (Kafka, ES, CSV, 
> etc..) don't work because they only accept the old TypeInformations 
> (Types.SQL_TIMESTAMP/SQL_DATE/SQL_TIME).
> A simple solution is encode DataTypes.TIMESTAMP() as "TIMESTAMP" when 
> translating to properties. And will be converted back to the old 
> TypeInformation: Types.SQL_TIMESTAMP. This would fix all factories at once.
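
A rough sketch of the proposed decoding (the {{Types.SQL_*}} constants are 
Flink's real legacy TypeInformation; the helper itself and its name are 
illustrative, not the actual factory code):

{code:java}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;

public final class LegacyTypeMapping {
    // Hypothetical helper: decode the property string written for a DDL column
    // back into the old TypeInformation that the existing factories expect.
    public static TypeInformation<?> fromProperty(String type) {
        switch (type) {
            case "TIMESTAMP": return Types.SQL_TIMESTAMP; // not LOCAL_DATETIME
            case "DATE":      return Types.SQL_DATE;      // not LOCAL_DATE
            case "TIME":      return Types.SQL_TIME;      // not LOCAL_TIME
            default: throw new IllegalArgumentException("unsupported type: " + type);
        }
    }
}
{code}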



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13699) Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME types

2019-08-19 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910817#comment-16910817
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-13699:
-

Cherry-picked for 1.9.0: d8941711e51f3315f543399a1030dbcf2fb07434

> Fix TableFactory doesn't work with DDL when containing TIMESTAMP/DATE/TIME 
> types
> 
>
> Key: FLINK-13699
> URL: https://issues.apache.org/jira/browse/FLINK-13699
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.9.0
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, in blink planner, we will convert DDL to {{TableSchema}} with new 
> type system, i.e. DataTypes.TIMESTAMP()/DATE()/TIME() whose underlying 
> TypeInformation are  Types.LOCAL_DATETIME/LOCAL_DATE/LOCAL_TIME. 
> However, this makes the existing connector implementations (Kafka, ES, CSV, 
> etc..) don't work because they only accept the old TypeInformations 
> (Types.SQL_TIMESTAMP/SQL_DATE/SQL_TIME).
> A simple solution is encode DataTypes.TIMESTAMP() as "TIMESTAMP" when 
> translating to properties. And will be converted back to the old 
> TypeInformation: Types.SQL_TIMESTAMP. This would fix all factories at once.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13512) Kinesis connector missing jaxb-api dependency

2019-08-19 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910810#comment-16910810
 ] 

Yu Li commented on FLINK-13512:
---

Checked the license of 
[jaxb-api|https://mvnrepository.com/artifact/javax.xml.bind/jaxb-api/2.3.0] and 
it's CDDL 1.1. Checked the [Apache legal page|https://apache.org/legal/resolved.html] 
and confirmed that for software under the CDDL 1.1 license we may include 
binaries (binary-only, no source code), so it's OK to include it in our project.

> Kinesis connector missing jaxb-api dependency
> -
>
> Key: FLINK-13512
> URL: https://issues.apache.org/jira/browse/FLINK-13512
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kinesis
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> KPL makes use of {{javax.xml.bind.DatatypeConverter}} but does not declare a 
> dependency on {{jaxb-api}} and relies on the JDK containing this class. This 
> is no longer the case on Java 11; we have to add it as a dependency and 
> bundle it in the jar.
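
To see the failure mode concretely, a minimal program using the class in 
question; on Java 8 it runs out of the box, while on Java 11 it fails with 
{{NoClassDefFoundError}} unless jaxb-api is on the classpath:

{code:java}
import javax.xml.bind.DatatypeConverter;

public class JaxbDependencyDemo {
    public static void main(String[] args) {
        // javax.xml.bind shipped with the JDK up to Java 8, was deprecated in 9,
        // and was removed in Java 11 -- hence the explicit jaxb-api dependency.
        byte[] bytes = DatatypeConverter.parseHexBinary("cafebabe");
        System.out.println(DatatypeConverter.printBase64Binary(bytes));
    }
}
{code}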



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot commented on issue #9487: [FLINK-13791][docs] Speed up sidenav by using group_by

2019-08-19 Thread GitBox
flinkbot commented on issue #9487: [FLINK-13791][docs] Speed up sidenav by 
using group_by
URL: https://github.com/apache/flink/pull/9487#issuecomment-522782605
 
 
   ## CI report:
   
   * be58c370c8714b0ef9b9b8963479225d38627b7f : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123804733)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #9487: [FLINK-13791][docs] Speed up sidenav by using group_by

2019-08-19 Thread GitBox
flinkbot commented on issue #9487: [FLINK-13791][docs] Speed up sidenav by 
using group_by
URL: https://github.com/apache/flink/pull/9487#issuecomment-522781093
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit be58c370c8714b0ef9b9b8963479225d38627b7f (Mon Aug 19 
22:39:55 UTC 2019)
   
   **Warnings:**
* Documentation files were touched, but no `.zh.md` files: Update the Chinese 
documentation or file a Jira ticket.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13791) Speed up sidenav by using group_by

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13791:
---
Labels: pull-request-available  (was: )

> Speed up sidenav by using group_by
> --
>
> Key: FLINK-13791
> URL: https://issues.apache.org/jira/browse/FLINK-13791
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
>
> {{_includes/sidenav.html}} parses through {{pages_by_language}} over and over 
> again trying to find children when building the (recursive) side navigation. 
> We could do this once with a {{group_by}} instead.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-514437706
 
 
   ## CI report:
   
   * 5eb979da047c442c0205464c92b5bd9ee3a740dc : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120299964)
   * d7bf53a30514664925357bd5817305a02553d0a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120506936)
   * 02cca7fb6283b84a20ee019159ccb023ccffbd82 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120769129)
   * 5009b10d38eef92f25bfe4ff4608f2dd121ea9c6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120915709)
   * e3b272586d8f41d3800e86134730c4dc427952a6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120916220)
   * f1aee543a7aef88e3cf052f4d686ab0a8e5938e5 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120996260)
   * c66060dba290844085f90f554d447c6d7033779d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121131224)
   * 700e5c19a3d49197ef2b18a646f0b6e1bf783ba8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121174288)
   * 6f3fccea82189ef95d46f12212f6f7386fc11668 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123540519)
   * 829c9c0505b6f08bb68e20a34e0613d83ae21758 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123545553)
   * 6f4f9ad2b9840347bda3474fe18f4b6b0b870c01 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123789816)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] NicoK opened a new pull request #9487: [FLINK-13791][docs] Speed up sidenav by using group_by

2019-08-19 Thread GitBox
NicoK opened a new pull request #9487: [FLINK-13791][docs] Speed up sidenav by 
using group_by
URL: https://github.com/apache/flink/pull/9487
 
 
   ## What is the purpose of the change
   
   `_includes/sidenav.html` parses through `pages_by_language` over and over 
again trying to find children when building the (recursive) side navigation. By 
doing this once with a `group_by`, we can gain considerable savings in building 
the docs via `./build_docs.sh` without any change to the generated HTML pages:
   
   before: ~54s
   after: ~37s
   
   ## Brief change log
   
   Building on top of #9444, this PR adds:
- use `group_by` to create an easier-to-iterate array for determining page 
children
   
   ## Verifying this change
   
   I verified the changes in the generated HTML pages (nothing changed).
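
   The PR itself changes Liquid templates, but the optimization is 
language-neutral; a Java analogue (all names hypothetical) of replacing a 
per-node scan with a single `group_by`-style pass:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByDemo {
    static final class Page {
        final String url, parent; // parent == "" marks a top-level page
        Page(String url, String parent) { this.url = url; this.parent = parent; }
        public String toString() { return url; }
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(
            new Page("/ops", ""), new Page("/ops/state", "/ops"),
            new Page("/dev", ""), new Page("/dev/table", "/dev"));

        // One pass over all pages (the analogue of Liquid's group_by) ...
        Map<String, List<Page>> childrenByParent = pages.stream()
            .collect(Collectors.groupingBy(p -> p.parent));

        // ... so rendering a node's children is a map lookup, not a full rescan.
        System.out.println(childrenByParent.getOrDefault("/dev", List.of()));
    }
}
```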
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #8738: [FLINK-12845][sql-client] Execute multiple statements in command line…

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #8738: [FLINK-12845][sql-client] Execute 
multiple statements in command line…
URL: https://github.com/apache/flink/pull/8738#issuecomment-522743542
 
 
   ## CI report:
   
   * ca6e0d310325ecda459b5c29144161b35ee73278 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123788585)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-13791) Speed up sidenav by using group_by

2019-08-19 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-13791:
---

 Summary: Speed up sidenav by using group_by
 Key: FLINK-13791
 URL: https://issues.apache.org/jira/browse/FLINK-13791
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Nico Kruber
Assignee: Nico Kruber


{{_includes/sidenav.html}} parses through {{pages_by_language}} over and over 
again trying to find children when building the (recursive) side navigation. We 
could do this once with a {{group_by}} instead.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot edited a comment on issue #8706: [FLINK-12814][sql-client] Support a traditional and scrolling view of…

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #8706: [FLINK-12814][sql-client] Support a 
traditional and scrolling view of…
URL: https://github.com/apache/flink/pull/8706#issuecomment-522740816
 
 
   ## CI report:
   
   * f4a31e789e78ba3d5ab18ce50c9e8e697d3141d1 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123787386)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9486: [FLINK-13789] move prefix out of the format string

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9486: [FLINK-13789] move prefix out of the 
format string
URL: https://github.com/apache/flink/pull/9486#issuecomment-522731408
 
 
   ## CI report:
   
   * 6b1de55f21733476acacf0de3df5c859e6d49a48 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123783508)
   * 2549ae243b1b16f248b780a21c1ccd9ca6c065c8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123786288)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] carp84 commented on a change in pull request #9451: [FLINK-13718][hbase] Disable tests on Java 11

2019-08-19 Thread GitBox
carp84 commented on a change in pull request #9451: [FLINK-13718][hbase] 
Disable tests on Java 11
URL: https://github.com/apache/flink/pull/9451#discussion_r315416345
 
 

 ##
 File path: flink-connectors/flink-hbase/pom.xml
 ##
 @@ -340,6 +340,25 @@ under the License.



+		<profile>
+			<id>java11</id>
+			<activation>
+				<jdk>11</jdk>
+			</activation>
+
+			<build>
+				<plugins>
+					<plugin>
+						<groupId>org.apache.maven.plugins</groupId>
+						<artifactId>maven-surefire-plugin</artifactId>
+						<configuration>
+							<!-- ... -->
 
 Review comment:
   Minor: hive -> hbase; also suggest marking it as "currently doesn't support 
Java 11" since HBASE-21110 is still ongoing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-514437706
 
 
   ## CI report:
   
   * 5eb979da047c442c0205464c92b5bd9ee3a740dc : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120299964)
   * d7bf53a30514664925357bd5817305a02553d0a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120506936)
   * 02cca7fb6283b84a20ee019159ccb023ccffbd82 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120769129)
   * 5009b10d38eef92f25bfe4ff4608f2dd121ea9c6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120915709)
   * e3b272586d8f41d3800e86134730c4dc427952a6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120916220)
   * f1aee543a7aef88e3cf052f4d686ab0a8e5938e5 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120996260)
   * c66060dba290844085f90f554d447c6d7033779d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121131224)
   * 700e5c19a3d49197ef2b18a646f0b6e1bf783ba8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121174288)
   * 6f3fccea82189ef95d46f12212f6f7386fc11668 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123540519)
   * 829c9c0505b6f08bb68e20a34e0613d83ae21758 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123545553)
   * 6f4f9ad2b9840347bda3474fe18f4b6b0b870c01 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123789816)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #8727: [FLINK-12828][sql-client] Support -f option with a sql script file as…

2019-08-19 Thread GitBox
flinkbot commented on issue #8727: [FLINK-12828][sql-client] Support -f option 
with a sql script file as…
URL: https://github.com/apache/flink/pull/8727#issuecomment-522743669
 
 
   ## CI report:
   
   * a653c7903cb78b98aa6ddaca6001a587aa8fb7ce : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123788620)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #8738: [FLINK-12845][sql-client] Execute multiple statements in command line…

2019-08-19 Thread GitBox
flinkbot commented on issue #8738: [FLINK-12845][sql-client] Execute multiple 
statements in command line…
URL: https://github.com/apache/flink/pull/8738#issuecomment-522743542
 
 
   ## CI report:
   
   * ca6e0d310325ecda459b5c29144161b35ee73278 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123788585)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime.
 # We need to define the resolution order
 # Alternatively, we can try to match Hive built-in functions in Flink and 
eliminate the need to support them. The gap needs to be discovered first.

 

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime.

cc [~xuefuz] [~lirui] [~Terry1897]


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
> registered into in-memory function catalog at runtime.
>  # We need to define the resolution order
>  # Alternatively, we can try to match Hive built-in functions in Flink and 
> eliminate the need to support them. The gap needs to be discovered first.
>  
> cc [~xuefuz] [~lirui] [~Terry1897]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime.
 # We need to define the resolution order
 # Alternatively, we can try to match Hive built-in functions in Flink and 
eliminate the need to support them. I did a simple comparison: Flink 1.10.0 and 
Hive 2.3.4 have 56 built-in functions in common (of the same name); there are 
195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. Please see the 
attached files. According to my sampling of the 195 functions, some are 
straightforward to rewrite and some don't seem to be frequently used.

 

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime.
 # We need to define the resolution order
 # Alternatively, we can try to match Hive built-in functions in Flink and 
eliminate the need of supporting them. The gap needs to be discovered first.

 

cc [~xuefuz] [~lirui] [~Terry1897]


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
> registered into in-memory function catalog at runtime.
>  # We need to define the resolution order
>  # Alternatively, we can try to match Hive built-in functions in Flink and 
> eliminate the need to support them. I did a simple comparison: Flink 1.10.0 
> and Hive 2.3.4 have 56 built-in functions in common (of the same name); there 
> are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. Please see 
> the attached files. According to my sampling of the 195 functions, some are 
> straightforward to rewrite and some don't seem to be frequently used.
>  
> cc [~xuefuz] [~lirui] [~Terry1897]
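
One way to picture the resolution-order question in point 1 is a simple lookup 
chain; the sketch below is purely illustrative (all names, and the order itself, 
are assumptions rather than a decided design):

{code:java}
import java.util.List;
import java.util.Map;

public class FunctionResolutionDemo {
    public static void main(String[] args) {
        Map<String, String> flinkBuiltins = Map.of("upper", "Flink built-in");
        Map<String, String> hiveBuiltins = Map.of("parse_url", "Hive built-in (in-memory)");
        Map<String, String> catalogUdfs = Map.of("my_udf", "catalog UDF (metastore)");

        // Try Flink built-ins first, then Hive built-ins, then persisted UDFs.
        for (String name : List.of("upper", "parse_url", "my_udf", "missing")) {
            String resolved = flinkBuiltins.getOrDefault(name,
                    hiveBuiltins.getOrDefault(name,
                            catalogUdfs.getOrDefault(name, "not found")));
            System.out.println(name + " -> " + resolved);
        }
    }
}
{code}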



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13790) Support -e option with a sql script file as input

2019-08-19 Thread Bowen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910734#comment-16910734
 ] 

Bowen Li commented on FLINK-13790:
--

Hi [~docete], I assigned this Jira to you since it's very similar to 
FLINK-12828. Please let me know if you can take this. Thanks!

> Support -e option with a sql script file as input
> -
>
> Key: FLINK-13790
> URL: https://issues.apache.org/jira/browse/FLINK-13790
> Project: Flink
>  Issue Type: Sub-task
>  Components: Command Line Client
>Reporter: Bowen Li
>Assignee: Zhenghua Gao
>Priority: Major
> Fix For: 1.10.0
>
>
> We expect users to run SQL directly on the command line. Something like: 
> sql-client embedded -e "query in string", which will execute the given query 
> without entering interactive mode
> This is related to FLINK-12828.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13790) Support -e option with a sql script file as input

2019-08-19 Thread Bowen Li (Jira)
Bowen Li created FLINK-13790:


 Summary: Support -e option with a sql script file as input
 Key: FLINK-13790
 URL: https://issues.apache.org/jira/browse/FLINK-13790
 Project: Flink
  Issue Type: Sub-task
  Components: Command Line Client
Reporter: Bowen Li
Assignee: Zhenghua Gao
 Fix For: 1.10.0


We expect users to run SQL directly on the command line. Something like: 
sql-client embedded -e "query in string", which will execute the given query 
without entering interactive mode

This is related to FLINK-12828.
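
A minimal sketch (hypothetical flag handling, not the SQL Client's real CLI 
code) of how -e could take the SQL inline while -f reads it from a script file, 
either way skipping interactive mode:

{code:java}
import java.nio.file.Files;
import java.nio.file.Paths;

public class SqlClientOptionsDemo {
    public static void main(String[] args) throws Exception {
        String sql = null;
        for (int i = 0; i < args.length - 1; i++) {
            if ("-e".equals(args[i])) {
                sql = args[i + 1];                                            // statement passed inline
            } else if ("-f".equals(args[i])) {
                sql = new String(Files.readAllBytes(Paths.get(args[i + 1]))); // statements from a script file
            }
        }
        if (sql == null) {
            System.out.println("no -e/-f given: entering interactive mode");
        } else {
            System.out.println("executing without interactive mode:\n" + sql);
        }
    }
}
{code}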



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] sjwiesman commented on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-19 Thread GitBox
sjwiesman commented on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-522741985
 
 
   @NicoK Thank you for taking a look. I believe I have addressed all your 
comments, please take another look when you have a chance. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #8706: [FLINK-12814][sql-client] Support a traditional and scrolling view of…

2019-08-19 Thread GitBox
flinkbot commented on issue #8706: [FLINK-12814][sql-client] Support a 
traditional and scrolling view of…
URL: https://github.com/apache/flink/pull/8706#issuecomment-522740816
 
 
   ## CI report:
   
   * f4a31e789e78ba3d5ab18ce50c9e8e697d3141d1 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123787386)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] bowenli86 edited a comment on issue #8738: [FLINK-12845][sql-client] Execute multiple statements in command line…

2019-08-19 Thread GitBox
bowenli86 edited a comment on issue #8738: [FLINK-12845][sql-client] Execute 
multiple statements in command line…
URL: https://github.com/apache/flink/pull/8738#issuecomment-522740294
 
 
   @docete can you rebase this PR? This should target 1.10


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] bowenli86 commented on issue #8727: [FLINK-12828][sql-client] Support -f option with a sql script file as…

2019-08-19 Thread GitBox
bowenli86 commented on issue #8727: [FLINK-12828][sql-client] Support -f option 
with a sql script file as…
URL: https://github.com/apache/flink/pull/8727#issuecomment-522740406
 
 
   @docete can you rebase this PR? This should target 1.10


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] bowenli86 commented on issue #8738: [FLINK-12845][sql-client] Execute multiple statements in command line…

2019-08-19 Thread GitBox
bowenli86 commented on issue #8738: [FLINK-12845][sql-client] Execute multiple 
statements in command line…
URL: https://github.com/apache/flink/pull/8738#issuecomment-522740294
 
 
   @docete can you rebase this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-12845) Execute multiple statements in command line or sql script file

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-12845:
-
Fix Version/s: 1.10.0

> Execute multiple statements in command line or sql script file
> --
>
> Key: FLINK-12845
> URL: https://issues.apache.org/jira/browse/FLINK-12845
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Users may copy multiple statements and paste them into the SQL Client command 
> line GUI, or pass a script file (using the SOURCE command or the -f option); we 
> should parse and execute them one by one (like other SQL CLI applications)
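
A naive splitting sketch, for illustration only; a real implementation needs a 
proper parser, since ';' may appear inside string literals or comments:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class StatementSplitterDemo {
    // Naive split on ';' -- enough to show the execute-one-by-one loop,
    // but a real implementation must respect quotes and comments.
    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        String script = "CREATE TABLE t (a INT); INSERT INTO t VALUES (1); SELECT * FROM t;";
        for (String stmt : split(script)) {
            System.out.println("executing: " + stmt);
        }
    }
}
{code}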



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-12828) Support -f option with a sql script file as input

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-12828:
-
Fix Version/s: 1.10.0

> Support -f option with a sql script file as input
> -
>
> Key: FLINK-12828
> URL: https://issues.apache.org/jira/browse/FLINK-12828
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We expect users to run a script file directly from the command line. Something 
> like: sql-client embedded -f myscript.sql, which will execute the given file 
> without entering interactive mode



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (FLINK-12845) Execute multiple statements in command line or sql script file

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li reassigned FLINK-12845:


Assignee: Zhenghua Gao

> Execute multiple statements in command line or sql script file
> --
>
> Key: FLINK-12845
> URL: https://issues.apache.org/jira/browse/FLINK-12845
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Users may copy multiple statements and paste them into the SQL Client command 
> line GUI, or pass a script file (using the SOURCE command or the -f option); we 
> should parse and execute them one by one (like other SQL CLI applications)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot edited a comment on issue #9486: [FLINK-13789] move prefix out of the format string

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9486: [FLINK-13789] move prefix out of the 
format string
URL: https://github.com/apache/flink/pull/9486#issuecomment-522731408
 
 
   ## CI report:
   
   * 6b1de55f21733476acacf0de3df5c859e6d49a48 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123783508)
   * 2549ae243b1b16f248b780a21c1ccd9ca6c065c8 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123786288)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13197) support querying Hive's view in Flink

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13197:
-
Fix Version/s: 1.10.0

> support querying Hive's view in Flink
> -
>
> Key: FLINK-13197
> URL: https://issues.apache.org/jira/browse/FLINK-13197
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Bowen Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 1.10.0
>
>
> One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
> interoperability, that is, Flink should understand existing Hive meta-objects, 
> and Hive meta-objects created through Flink should be understood by Hive.
> Take the example of a Hive view v1 in HiveCatalog hc and database db. Unlike 
> an equivalent Flink view, whose full path in the expanded query should be 
> hc.db.v1, the Hive view's full path in the expanded query should be db.v1 
> so that Hive can understand it, no matter whether it was created by Hive or Flink.
> [~lirui] can you help to ensure that Flink can also query Hive's view in both 
> Flink planner and Blink planner?
> cc [~xuefuz]
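For context, a minimal Table API sketch of the scenario, assuming Flink 1.9-style APIs; the catalog, database, and view names follow the description, while the Hive conf dir and version are placeholders:

{code}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class QueryHiveView {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        // Register and use the HiveCatalog hc with default database db.
        tableEnv.registerCatalog("hc", new HiveCatalog("hc", "db", "/etc/hive/conf", "2.3.4"));
        tableEnv.useCatalog("hc");
        tableEnv.useDatabase("db");

        // For this to work, Flink must resolve v1's expanded query with the
        // Hive-style path db.v1 rather than hc.db.v1.
        Table result = tableEnv.sqlQuery("SELECT * FROM v1");
    }
}
{code}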



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13197) support querying Hive's view in Flink

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13197:
-
Description: 
One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
interoperability, that is, Flink should understand existing Hive meta-objects, 
and Hive meta-objects created through Flink should be understood by Hive.

Take the example of a Hive view v1 in HiveCatalog hc and database db. Unlike 
an equivalent Flink view, whose full path in the expanded query should be hc.db.v1, 
the Hive view's full path in the expanded query should be db.v1 so that Hive 
can understand it, no matter whether it was created by Hive or Flink.

[~lirui] can you help to ensure that Flink can also query Hive's view in both 
Flink planner and Blink planner?

cc [~xuefuz]

  was:
One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
interoperability, that is, Flink should understand existing Hive meta-objects, 
and Hive meta-objects created through Flink should be understood by Hive.

Take the example of a Hive view v1 in HiveCatalog hc and database db. Unlike 
an equivalent Flink view, whose full path in the expanded query should be hc.db.v1, 
the Hive view's full path in the expanded query should be db.v1 so that Hive 
can understand it, no matter whether it was created by Hive or Flink.

[~dawidwys] is working on FLINK-12905 to enable Flink to query CatalogView in 
legacy Flink planner.

[~lirui] can you help to ensure that Flink can also query Hive's view in both 
Flink planner and Blink planner?

cc [~xuefuz]


> support querying Hive's view in Flink
> -
>
> Key: FLINK-13197
> URL: https://issues.apache.org/jira/browse/FLINK-13197
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Bowen Li
>Assignee: Rui Li
>Priority: Major
>
> One goal of HiveCatalog and Hive integration is to enable Flink-Hive 
> interoperability, that is, Flink should understand existing Hive meta-objects, 
> and Hive meta-objects created through Flink should be understood by Hive.
> Take the example of a Hive view v1 in HiveCatalog hc and database db. Unlike 
> an equivalent Flink view, whose full path in the expanded query should be 
> hc.db.v1, the Hive view's full path in the expanded query should be db.v1 
> so that Hive can understand it, no matter whether it was created by Hive or Flink.
> [~lirui] can you help to ensure that Flink can also query Hive's view in both 
> Flink planner and Blink planner?
> cc [~xuefuz]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-12814) Support a traditional and scrolling view of result (non-interactive)

2019-08-19 Thread Bowen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910723#comment-16910723
 ] 

Bowen Li commented on FLINK-12814:
--

+1. We'd better aim this for release 1.10.

> Support a traditional and scrolling view of result (non-interactive)
> 
>
> Key: FLINK-12814
> URL: https://issues.apache.org/jira/browse/FLINK-12814
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.8.0
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In table mode, we want to introduce a non-interactive view (a so-called 
> FinalizedResult), which submits SQL statements (DQLs) in attach mode with a 
> user-defined timeout, fetches results until the job finishes/fails/times out or 
> is interrupted by the user (Ctrl+C), and outputs them in a non-interactive way 
> (the behavior in change-log mode is under discussion).
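A minimal sketch of such a FinalizedResult loop, assuming a hypothetical fetcher interface (none of these names are the SQL Client's actual API):

{code}
import java.time.Duration;
import java.time.Instant;
import java.util.List;

public class FinalizedResultPrinter {
    interface ResultFetcher {
        boolean isJobTerminal();       // job finished or failed
        List<String> fetchNextRows();  // next available rows, possibly empty
    }

    static void printAll(ResultFetcher fetcher, Duration timeout) {
        Instant deadline = Instant.now().plus(timeout);
        // Fetch until the job reaches a terminal state or the user-defined
        // timeout expires; Ctrl+C interrupts the loop via the JVM itself.
        while (Instant.now().isBefore(deadline)) {
            for (String row : fetcher.fetchNextRows()) {
                System.out.println(row); // plain scrolling output, no interactive view
            }
            if (fetcher.isJobTerminal()) {
                break;
            }
        }
    }
}
{code}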



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] bowenli86 commented on issue #8706: [FLINK-12814][sql-client] Support a traditional and scrolling view of…

2019-08-19 Thread GitBox
bowenli86 commented on issue #8706: [FLINK-12814][sql-client] Support a 
traditional and scrolling view of…
URL: https://github.com/apache/flink/pull/8706#issuecomment-522736547
 
 
   @docete can you please rebase this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] sjwiesman removed a comment on issue #9444: [FLINK-13726][docs] build docs with jekyll 4.0.0.pre.beta1

2019-08-19 Thread GitBox
sjwiesman removed a comment on issue #9444: [FLINK-13726][docs] build docs with 
jekyll 4.0.0.pre.beta1
URL: https://github.com/apache/flink/pull/9444#issuecomment-522735908
 
 
   Please add the new jekyll cache (`docs/.jekyll-cache/`) to Flink's 
`.gitignore`.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-12814) Support a traditional and scrolling view of result (non-interactive)

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-12814:
-
Fix Version/s: 1.10.0

> Support a traditional and scrolling view of result (non-interactive)
> 
>
> Key: FLINK-12814
> URL: https://issues.apache.org/jira/browse/FLINK-12814
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.8.0
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In table mode, we want to introduce a non-interactive view (a so-called 
> FinalizedResult), which submits SQL statements (DQLs) in attach mode with a 
> user-defined timeout, fetches results until the job finishes/fails/times out or 
> is interrupted by the user (Ctrl+C), and outputs them in a non-interactive way 
> (the behavior in change-log mode is under discussion).



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] sjwiesman edited a comment on issue #9444: [FLINK-13726][docs] build docs with jekyll 4.0.0.pre.beta1

2019-08-19 Thread GitBox
sjwiesman edited a comment on issue #9444: [FLINK-13726][docs] build docs with 
jekyll 4.0.0.pre.beta1
URL: https://github.com/apache/flink/pull/9444#issuecomment-522735908
 
 
   Please add the new jekyll cache (`docs/.jekyll-cache/`) to Flink's 
`.gitignore`.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] sjwiesman commented on issue #9444: [FLINK-13726][docs] build docs with jekyll 4.0.0.pre.beta1

2019-08-19 Thread GitBox
sjwiesman commented on issue #9444: [FLINK-13726][docs] build docs with jekyll 
4.0.0.pre.beta1
URL: https://github.com/apache/flink/pull/9444#issuecomment-522735908
 
 
   Please add `docs/.jekyll-cache/*` to Flink's `.gitignore`.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions are 
registered into an in-memory function catalog at runtime.

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions are 
registered into an in-memory function catalog at runtime, which makes it 
architecturally hard for Flink to integrate with.

The first and most basic option is to do it the hard way by integrating Hive's 
function registry, which can be architecturally hard.

The second option to support rich Hive built-in functions is to develop built-in 
functions in Flink with the same logic. I did a simple comparison: Flink 1.10.0 
and Hive 2.3.4 have 56 built-in functions in common (with the same name); there 
are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. Please see the 
attached files. According to my sampling of the 195 functions, some are 
straightforward to rewrite, and some don't seem to be frequently used.

Besides rewriting all of them, another option is for users to manually register 
those built-in functions in the Hive metastore, so Flink can load them through 
HiveCatalog at runtime.

Lastly, we can load and hold Hive built-in functions in an in-memory map of 
HiveCatalog, as if they came from Hive's function registry.

 

cc [~xuefuz] [~lirui] [~Terry1897]
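A minimal sketch of that last option, with illustrative names only (this is not HiveCatalog's actual API): hold the built-in function names in an in-memory map and consult it when a metastore lookup misses:

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class HiveBuiltinFunctionMap {
    private final Map<String, String> builtins = new HashMap<>();

    public HiveBuiltinFunctionMap() {
        // In this option the map would be populated from Hive's function
        // registry at runtime; one hand-picked entry here as an example.
        builtins.put("parse_url", "org.apache.hadoop.hive.ql.udf.UDFParseUrl");
    }

    /** Fallback lookup used when the function is not found in the metastore. */
    public Optional<String> lookupClassName(String functionName) {
        return Optional.ofNullable(builtins.get(functionName.toLowerCase()));
    }
}
{code}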


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions are 
> registered into an in-memory function catalog at runtime.
> cc [~xuefuz] [~lirui] [~Terry1897]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] flinkbot commented on issue #9486: [FLINK-13789] move prefix out of the format string

2019-08-19 Thread GitBox
flinkbot commented on issue #9486: [FLINK-13789] move prefix out of the format 
string
URL: https://github.com/apache/flink/pull/9486#issuecomment-522731408
 
 
   ## CI report:
   
   * 6b1de55f21733476acacf0de3df5c859e6d49a48 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123783508)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #9486: [FLINK-13789] move prefix out of the format string

2019-08-19 Thread GitBox
flinkbot commented on issue #9486: [FLINK-13789] move prefix out of the format 
string
URL: https://github.com/apache/flink/pull/9486#issuecomment-522729564
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 6b1de55f21733476acacf0de3df5c859e6d49a48 (Mon Aug 19 
19:55:57 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-13789).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13789) Transactional Id Generation fails due to user code impacting formatting string

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13789:
---
Labels: pull-request-available  (was: )

> Transactional Id Generation fails due to user code impacting formatting string
> --
>
> Key: FLINK-13789
> URL: https://issues.apache.org/jira/browse/FLINK-13789
> Project: Flink
>  Issue Type: Bug
>Reporter: Hao Dang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.2, 1.8.0, 1.8.1
>
>
> In [TransactionalIdsGenerator.java|#L94], the prefix contains the taskName of the 
> particular task, which could ultimately contain user code. If the user code 
> contains conversion specifiers like %, the string formatting could fail.
> For example, in Flink SQL a user could have a LIKE statement with a % 
> wildcard; the % wildcard will end up in the prefix and get mistreated during 
> formatting, causing the task to fail.
> I think we should move the prefix out of the string formatting.
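A minimal Java illustration of the failure mode and the proposed fix; the values are made up, and this is not the exact code in TransactionalIdsGenerator:

{code}
public class FormatPrefixExample {
    public static void main(String[] args) {
        // A taskName-derived prefix carrying a '%' from user SQL,
        // e.g. a LIKE '%foo' pattern.
        String prefix = "Select: name LIKE '%foo'";

        // Broken: the '%' in the prefix is parsed as a conversion specifier,
        // so String.format throws an IllegalFormatException at runtime:
        // String id = String.format(prefix + "-%d", 42);

        // Fixed: keep the user-influenced prefix out of the format string.
        String id = String.format("%s-%d", prefix, 42);
        System.out.println(id); // Select: name LIKE '%foo'-42
    }
}
{code}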



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] haodang opened a new pull request #9486: [FLINK-13789] move prefix out of the format string

2019-08-19 Thread GitBox
haodang opened a new pull request #9486: [FLINK-13789] move prefix out of the 
format string
URL: https://github.com/apache/flink/pull/9486
 
 
   
   
   ## What is the purpose of the change
   
   Fixing https://issues.apache.org/jira/browse/FLINK-13789 by moving `prefix` 
out of the format string.
   
   ## Brief change log
   
   - moving `prefix` out of the format string
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-13789) Transactional Id Generation fails due to user code impacting formatting string

2019-08-19 Thread Hao Dang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Dang updated FLINK-13789:
-
Fix Version/s: 1.7.2

> Transactional Id Generation fails due to user code impacting formatting string
> --
>
> Key: FLINK-13789
> URL: https://issues.apache.org/jira/browse/FLINK-13789
> Project: Flink
>  Issue Type: Bug
>Reporter: Hao Dang
>Priority: Major
> Fix For: 1.7.2, 1.8.0, 1.8.1
>
>
> In [TransactionalIdsGenerator.java|#L94], the prefix contains the taskName of the 
> particular task, which could ultimately contain user code. If the user code 
> contains conversion specifiers like %, the string formatting could fail.
> For example, in Flink SQL a user could have a LIKE statement with a % 
> wildcard; the % wildcard will end up in the prefix and get mistreated during 
> formatting, causing the task to fail.
> I think we should move the prefix out of the string formatting.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13789) Transactional Id Generation fails due to user code impacting formatting string

2019-08-19 Thread Hao Dang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Dang updated FLINK-13789:
-
Description: 
In [TransactionalIdsGenerator.java|#L94], the prefix contains the taskName of the 
particular task, which could ultimately contain user code. If the user code 
contains conversion specifiers like %, the string formatting could fail.

For example, in Flink SQL a user could have a LIKE statement with a % 
wildcard; the % wildcard will end up in the prefix and get mistreated during 
formatting, causing the task to fail.

I think we should move the prefix out of the string formatting.

  was:
In 
[TransactionalIdsGenerator.java|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/TransactionalIdsGenerator.java#L94],
 the prefix contains the taskName of the particular task, which could ultimately 
contain user code. If the user code contains conversion specifiers like %, the 
string formatting could fail.

For example, in Flink SQL a user could have a LIKE statement with a % 
wildcard; the % wildcard will end up in the prefix and get mistreated during 
formatting, causing the task to fail.


> Transactional Id Generation fails due to user code impacting formatting string
> --
>
> Key: FLINK-13789
> URL: https://issues.apache.org/jira/browse/FLINK-13789
> Project: Flink
>  Issue Type: Bug
>Reporter: Hao Dang
>Priority: Major
>
> In [TransactionalIdsGenerator.java|#L94], the prefix contains the taskName of the 
> particular task, which could ultimately contain user code. If the user code 
> contains conversion specifiers like %, the string formatting could fail.
> For example, in Flink SQL a user could have a LIKE statement with a % 
> wildcard; the % wildcard will end up in the prefix and get mistreated during 
> formatting, causing the task to fail.
> I think we should move the prefix out of the string formatting.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13789) Transactional Id Generation fails due to user code impacting formatting string

2019-08-19 Thread Hao Dang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Dang updated FLINK-13789:
-
Fix Version/s: 1.8.0
   1.8.1

> Transactional Id Generation fails due to user code impacting formatting string
> --
>
> Key: FLINK-13789
> URL: https://issues.apache.org/jira/browse/FLINK-13789
> Project: Flink
>  Issue Type: Bug
>Reporter: Hao Dang
>Priority: Major
> Fix For: 1.8.0, 1.8.1
>
>
> In [TransactionalIdsGenerator.java|#L94], the prefix contains the taskName of the 
> particular task, which could ultimately contain user code. If the user code 
> contains conversion specifiers like %, the string formatting could fail.
> For example, in Flink SQL a user could have a LIKE statement with a % 
> wildcard; the % wildcard will end up in the prefix and get mistreated during 
> formatting, causing the task to fail.
> I think we should move the prefix out of the string formatting.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13789) Transactional Id Generation fails due to user code impacting formatting string

2019-08-19 Thread Hao Dang (Jira)
Hao Dang created FLINK-13789:


 Summary: Transactional Id Generation fails due to user code 
impacting formatting string
 Key: FLINK-13789
 URL: https://issues.apache.org/jira/browse/FLINK-13789
 Project: Flink
  Issue Type: Bug
Reporter: Hao Dang


In 
[TransactionalIdsGenerator.java|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/TransactionalIdsGenerator.java#L94],
 the prefix contains the taskName of the particular task, which could ultimately 
contain user code. If the user code contains conversion specifiers like %, the 
string formatting could fail.

For example, in Flink SQL a user could have a LIKE statement with a % 
wildcard; the % wildcard will end up in the prefix and get mistreated during 
formatting, causing the task to fail.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] NicoK commented on issue #8887: [FLINK-12982][metrics] improve DescriptiveStatisticsHistogramStatistics performance

2019-08-19 Thread GitBox
NicoK commented on issue #8887: [FLINK-12982][metrics] improve 
DescriptiveStatisticsHistogramStatistics performance
URL: https://github.com/apache/flink/pull/8887#issuecomment-522722305
 
 
   FYI: I updated the code according to your comments and my findings


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] NicoK commented on a change in pull request #8888: [FLINK-12983][metrics] replace descriptive histogram's storage back-end

2019-08-19 Thread GitBox
NicoK commented on a change in pull request #8888: [FLINK-12983][metrics] 
replace descriptive histogram's storage back-end
URL: https://github.com/apache/flink/pull/8888#discussion_r315374624
 
 

 ##
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/DescriptiveStatisticsHistogram.java
 ##
 @@ -27,27 +27,63 @@
  */
 public class DescriptiveStatisticsHistogram implements org.apache.flink.metrics.Histogram {
 
-    private final DescriptiveStatistics descriptiveStatistics;
-
-    private long elementsSeen = 0L;
+    private final CircularDoubleArray descriptiveStatistics;
 
     public DescriptiveStatisticsHistogram(int windowSize) {
-        this.descriptiveStatistics = new DescriptiveStatistics(windowSize);
+        this.descriptiveStatistics = new CircularDoubleArray(windowSize);
     }
 
     @Override
     public void update(long value) {
-        elementsSeen += 1L;
         this.descriptiveStatistics.addValue(value);
     }
 
     @Override
     public long getCount() {
-        return this.elementsSeen;
+        return this.descriptiveStatistics.getElementsSeen();
     }
 
     @Override
     public HistogramStatistics getStatistics() {
         return new DescriptiveStatisticsHistogramStatistics(this.descriptiveStatistics);
     }
+
+    /**
+     * Fixed-size array that wraps around at the end and has a dynamic start position.
+     */
+    static class CircularDoubleArray {
+        private final double[] backingArray;
+        private int nextPos = 0;
+        private boolean fullSize = false;
+        private long elementsSeen = 0;
+
+        CircularDoubleArray(int windowSize) {
+            this.backingArray = new double[windowSize];
+        }
+
+        synchronized void addValue(double value) {
+            backingArray[nextPos] = value;
+            ++elementsSeen;
+            ++nextPos;
+            if (nextPos == backingArray.length) {
+                nextPos = 0;
+                fullSize = true;
+            }
+        }
+
+        synchronized double[] toUnsortedArray() {
+            final int size = getSize();
+            double[] result = new double[size];
+            System.arraycopy(backingArray, 0, result, 0, result.length);
 
 Review comment:
   `CircularDoubleArray` is a package-private class - I'm wondering who/which 
component would require a sorted array then. This circular array has special 
APIs (making clear that we return an unsorted array) and is only used by 
`org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics.CommonMetricsSnapshot`
 which does not require sorting. Whether sorting is required is basically 
defined via 
`org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics.CommonMetricsSnapshot`.
   
   Out of curiosity, I added this method and used it instead:
   ```
   public synchronized double[] toArray() {
       if (fullSize) {
           // Window is full: the oldest element sits at nextPos, so copy the
           // two wrapped segments back into insertion order.
           final double[] result = new double[backingArray.length];
           final int firstLength = result.length - nextPos;
           System.arraycopy(backingArray, nextPos, result, 0, firstLength);
           System.arraycopy(backingArray, 0, result, firstLength, nextPos);
           return result;
       } else {
           // Window not yet full: elements 0..nextPos-1 are already in order.
           final double[] result = new double[nextPos];
           System.arraycopy(backingArray, 0, result, 0, nextPos);
           return result;
       }
   }
   ```
   
   First quick results are as follows and show the benefit for not working with 
a sorted array if we don't need it (and are requesting the histogram a lot):
   ```
   Flink 1.9
   Benchmark                                     Mode  Cnt      Score     Error   Units
   HistogramBenchmarks.descriptiveHistogram     thrpt   30     89.224 ±   1.974  ops/ms
   HistogramBenchmarks.descriptiveHistogramAdd  thrpt   30  56034.903 ± 939.263  ops/ms

   Flink 1.10 + FLINK-12983
   Benchmark                                     Mode  Cnt       Score      Error   Units
   HistogramBenchmarks.descriptiveHistogram     thrpt   30     280.240 ±    3.747  ops/ms
   HistogramBenchmarks.descriptiveHistogramAdd  thrpt   30  207176.894 ± 2417.831  ops/ms

   Flink 1.10 + FLINK-12983 + sortedArray
   Benchmark                                     Mode  Cnt       Score      Error   Units
   HistogramBenchmarks.descriptiveHistogram     thrpt   30     239.000 ±    1.506  ops/ms
   HistogramBenchmarks.descriptiveHistogramAdd  thrpt   30  210352.180 ± 1135.394  ops/ms
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" should include Flink built-in functions' names

2019-08-19 Thread GitBox
flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" 
should include Flink built-in functions' names
URL: https://github.com/apache/flink/pull/9457#issuecomment-521829752
 
 
   ## CI report:
   
   * 55c0e5843e029f022ff59fe14a9e6c1d2c5ac69e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123443311)
   * 006236fff94d0204223a2c3b89f621da3248f6a4 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123444248)
   * 726259a0a1bfb2061f77a82c586d9b3a4c70abb6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123572731)
   * 34cf90b067d6de0c0ffd72ac380fe66d07725a88 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123766959)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10231) Add a view SQL DDL

2019-08-19 Thread Bowen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910677#comment-16910677
 ] 

Bowen Li commented on FLINK-10231:
--

Hi [~winipanda] , are you still interested in moving this forward? I think it's 
time to start some discussions and action now as most pre-requisites are 
mature/maturing.

> Add a view SQL DDL
> --
>
> Key: FLINK-10231
> URL: https://issues.apache.org/jira/browse/FLINK-10231
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: TANG Wen-hui
>Priority: Major
> Fix For: 1.10.0
>
>
> FLINK-10163 added initial view support for the SQL Client. However, for 
> supporting the [full definition of 
> views|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/AlterView]
>  (with schema, comments, etc.) we need to support native support for views in 
> the Table API.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-10231) Add a view SQL DDL

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10231:
-
Fix Version/s: 1.10.0

> Add a view SQL DDL
> --
>
> Key: FLINK-10231
> URL: https://issues.apache.org/jira/browse/FLINK-10231
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: TANG Wen-hui
>Priority: Major
> Fix For: 1.10.0
>
>
> FLINK-10163 added initial view support for the SQL Client. However, for 
> supporting the [full definition of 
> views|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/AlterView]
>  (with schema, comments, etc.) we need to support native support for views in 
> the Table API.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-7151) Add a function SQL DDL

2019-08-19 Thread Bowen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910674#comment-16910674
 ] 

Bowen Li commented on FLINK-7151:
-

Hi [~suez1224] , are you still interested in moving this forward?

BTW, shall we differentiate DDL for function and DDL for temp function? I think 
they probably should be two JIRAs

> Add a function SQL DDL
> --
>
> Key: FLINK-7151
> URL: https://issues.apache.org/jira/browse/FLINK-7151
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: yuemeng
>Assignee: Shuyi Chen
>Priority: Major
>
> Based on CREATE TEMPORARY FUNCTION and table DDL, we can register a UDF, UDAF, 
> or UDTF using SQL:
> {code}
> CREATE TEMPORARY function 'TOPK' AS 
> 'com..aggregate.udaf.distinctUdaf.topk.ITopKUDAF';
> INSERT INTO db_sink SELECT id, TOPK(price, 5, 'DESC') FROM kafka_source GROUP 
> BY id;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (FLINK-13643) Document the workaround for users with a different minor Hive version

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-13643.

Resolution: Fixed

merged in master: eb0436b4971a0a939395f570b4abbaa9ee508903  1.9.1: 
598a71dfc5a0379e7f650589acce6aad108dddf2

> Document the workaround for users with a different minor Hive version
> -
>
> Key: FLINK-13643
> URL: https://issues.apache.org/jira/browse/FLINK-13643
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Xuefu Zhang
>Assignee: Terry Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We officially support two Hive versions. However, we can tell users how to 
> work around the limitation if their Hive version is only minorly different.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13643) Document the workaround for users with a different minor Hive version

2019-08-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13643:
-
Fix Version/s: (was: 1.9.0)
   1.9.1
   1.10.0

> Document the workaround for users with a different minor Hive version
> -
>
> Key: FLINK-13643
> URL: https://issues.apache.org/jira/browse/FLINK-13643
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Xuefu Zhang
>Assignee: Terry Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We officially support two Hive versions. However, we can tell users how to 
> work around the limitation if their Hive version is only minorly different.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [flink] asfgit closed pull request #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version

2019-08-19 Thread GitBox
asfgit closed pull request #9447: [FLINK-13643][docs]Document the workaround 
for users with a different minor Hive version
URL: https://github.com/apache/flink/pull/9447
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] bowenli86 commented on a change in pull request #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version

2019-08-19 Thread GitBox
bowenli86 commented on a change in pull request #9447: 
[FLINK-13643][docs]Document the workaround for users with a different minor 
Hive version
URL: https://github.com/apache/flink/pull/9447#discussion_r315367004
 
 

 ##
 File path: docs/dev/table/hive/index.zh.md
 ##
 @@ -40,7 +40,16 @@ You do not need to modify your existing Hive Metastore or 
change the data placem
 
 ## Supported Hive Version's
 
-Flink supports Hive `2.3.4` and `1.2.1` and relies on Hive's compatibility 
guarantee's for other versions.
+Flink supports Hive `2.3.4` and `1.2.1` and relies on Hive's compatibility 
guarantee's for other minor versions.
+
+If you use a different minor Hive version such as `1.2.2` or `2.3.1`, it 
should also be ok to 
+chose the closest version `1.2.1` (for `1.2.2`) or `2.3.4` (for `2.3.1`) to 
workaround. For 
 
 Review comment:
   ```suggestion
   choose the closest version `1.2.1` (for `1.2.2`) or `2.3.4` (for `2.3.1`) to 
workaround. For 
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

