[jira] [Assigned] (FLINK-14590) Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
[ https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hequn Cheng reassigned FLINK-14590:
-----------------------------------

Assignee: Hequn Cheng

> Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14590
>                 URL: https://issues.apache.org/jira/browse/FLINK-14590
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>            Reporter: Wei Zhong
>            Assignee: Hequn Cheng
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Assume we enter a flink directory with the following structure:
> {code:java}
> flink/
>   bin/
>     flink
>     pyflink-shell.sh
>     python-gateway-server.sh
>     ...
>   bad_case/
>     word_count.py
>     data.txt
>   lib/...
>   opt/...
> {code}
> And word_count.py contains this piece of code:
> {code:java}
> t_config = TableConfig()
> env = StreamExecutionEnvironment.get_execution_environment()
> t_env = StreamTableEnvironment.create(env, t_config)
> env._j_stream_execution_environment.registerCachedFile("data", "bad_case/data.txt")
> with open("bad_case/data.txt", "r") as f:
>     content = f.read()
> elements = [(word, 1) for word in content.split(" ")]
> t_env.from_elements(elements, ["word", "count"])
> {code}
> Then we enter the "flink" directory and run:
> {code:java}
> bin/flink run -py bad_case/word_count.py
> {code}
> The program fails at the line 'with open("bad_case/data.txt", "r") as f:'.
> This is because the working directory of the Java process is the current directory, while the working directory of the Python process is a temporary directory.
> So there is no problem when a relative path is used in an API call to the Java process, but a relative path used anywhere else, such as in native file access, fails, because the working directory of the Python process has been changed to a temporary directory that is not known to users.
> I think this will cause some confusion for users, especially after we support dependency management. It would be great if we unified the working directory of the Java process and the Python process.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
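Until the working directories are unified, user code can sidestep the failure described above by resolving data files against an explicit base directory instead of the process working directory. A minimal sketch, assuming the job passes the base directory in (the `JOB_BASE_DIR` variable and helper name are hypothetical, not part of any Flink API):

```python
import os


def resolve_data_path(relative_path):
    """Resolve a data file against an explicit base directory, not the
    process working directory (which PyFlink may change to a temp dir)."""
    # Prefer an explicitly provided base directory; fall back to the
    # directory the script file itself lives in.
    base = os.environ.get("JOB_BASE_DIR") or os.path.dirname(os.path.abspath(__file__))
    return os.path.join(base, relative_path)


# Usage inside word_count.py:
# with open(resolve_data_path("data.txt"), "r") as f:
#     content = f.read()
```

Note that if the launcher copies the script itself into a temporary directory, `__file__` points there too, so passing an absolute base directory explicitly is the safer option.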
[jira] [Assigned] (FLINK-14590) Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
[ https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hequn Cheng reassigned FLINK-14590:
-----------------------------------

Assignee: Wei Zhong (was: Hequn Cheng)

> Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
> (The quoted issue description is identical to the previous message and is omitted here.)
[GitHub] [flink] flinkbot edited a comment on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot edited a comment on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558027199

## CI report:

* 0b265d192e2a6024e5817317be0317136208ccaf : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138004313)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526

## CI report:

* 160416cea63f157c0264c24812e2b2eef90e8c1d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137989900)
[jira] [Commented] (FLINK-13590) flink-on-yarn sometimes could create many little files that are xxx-taskmanager-conf.yaml
[ https://issues.apache.org/jira/browse/FLINK-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981350#comment-16981350 ]

Yang Wang commented on FLINK-13590:
-----------------------------------

This Jira has been fixed via [FLINK-13184|https://issues.apache.org/jira/browse/FLINK-13184]: the \{uuid}-taskmanager-conf.yaml files will no longer be created on HDFS; dynamic properties are used instead. The fix has also been picked to release-1.8 and release-1.9. [~shu_wen...@qq.com] could we close this JIRA?

> flink-on-yarn sometimes could create many little files that are xxx-taskmanager-conf.yaml
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-13590
>                 URL: https://issues.apache.org/jira/browse/FLINK-13590
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>            Reporter: shuwenjun
>            Priority: Major
>         Attachments: taskmanager-conf-yaml.png
>
> Both 1.7.2 and 1.8.0 are used, and both can create many small files. These files are the taskmanager configuration files: whenever the flink session applies for a new container, one of these files is created. I don't know why the flink session sometimes applies for containers again and again. Also, when a container is lost, shouldn't it delete its taskmanager-conf.yaml?
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350021027

## File path: flink-python/dev/lint-python.sh

@@ -626,6 +723,12 @@ while getopts "hfi:e:l" arg; do ;; esac done
+# decides whether to skip check stage

Review comment: Add a blank line before this line?
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350021854

## File path: flink-python/dev/lint-python.sh

@@ -303,7 +352,7 @@ function install_environment() {
# step-3 install python environment whcih includes
# 3.5 3.6 3.7
print_function "STEP" "installing python environment..."

Review comment: Move this print into the `if`? Because the installation may not be executed if the check returns false. Same for the other commands.
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350022355

## File path: flink-python/dev/lint-python.sh

@@ -589,16 +659,40 @@ get_all_supported_checks EXCLUDE_CHECKS="" INCLUDE_CHECKS=""
+
+SUPPORTED_INSTALLATION_COMPONENTS=()
+# search all supported install functions and put them into SUPPORTED_INSTALLATION_COMPONENTS array
+get_all_supported_install_components
+
+INSTALLATION_COMPONENTS=()

Review comment: Add a blank line after this line.
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350024300

## File path: flink-python/dev/lint-python.sh

@@ -105,6 +123,29 @@ function check_valid_stage() { return 1 }
+function parse_component_args() {
+local REAL_COMPONENTS=()
+for component in ${INSTALLATION_COMPONENTS[@]}; do
+# because all other components depends on conda, the install of conda is
+# required component.
+if [[ "$component" == "basic" ]] || [[ "$component" == "miniconda" ]]; then
+continue
+fi
+if [[ "$component" == "all" ]]; then
+component="environment"
+fi
+if [[ `contains_element "${SUPPORTED_INSTALLATION_COMPONENTS[*]}" "${component}"` = true ]]; then
+REAL_COMPONENTS+=(${component})
+else
+echo "unknown install component ${component}"

Review comment: Also print the components that are supported?
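The suggestion in the review above — listing the supported values alongside the "unknown install component" error — is a common CLI pattern. A sketch of the idea in Python (the actual script is bash; the component names below are illustrative, not the real list):

```python
# Hypothetical list of supported install components, mirroring the
# SUPPORTED_INSTALLATION_COMPONENTS array in lint-python.sh.
SUPPORTED_INSTALLATION_COMPONENTS = ["basic", "miniconda", "environment", "all"]


def parse_component(component):
    """Validate one install component, listing the supported ones on failure."""
    if component not in SUPPORTED_INSTALLATION_COMPONENTS:
        raise ValueError(
            "unknown install component '%s', supported components are: %s"
            % (component, ", ".join(SUPPORTED_INSTALLATION_COMPONENTS)))
    return component
```

The error message now tells the user what to type instead of only what was wrong.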
[jira] [Commented] (FLINK-13758) failed to submit JobGraph when registered hdfs file in DistributedCache
[ https://issues.apache.org/jira/browse/FLINK-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981342#comment-16981342 ]

Arvid Heise commented on FLINK-13758:
-------------------------------------

If FLINK-14908 is indeed a duplicate, then this also affects Flink 1.9 and most likely 1.10.

> failed to submit JobGraph when registered hdfs file in DistributedCache
> -----------------------------------------------------------------------
>
>                 Key: FLINK-13758
>                 URL: https://issues.apache.org/jira/browse/FLINK-13758
>             Project: Flink
>          Issue Type: Bug
>          Components: Command Line Client
>    Affects Versions: 1.6.3, 1.6.4, 1.7.2, 1.8.0, 1.8.1
>            Reporter: luoguohao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When using HDFS files in the DistributedCache, submitting the JobGraph fails and exception stack traces appear in the log file after a while; if the DistributedCache file is a local file, everything works fine.
[GitHub] [flink] flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
URL: https://github.com/apache/flink/pull/10126#issuecomment-551413525

## CI report:

* ea5abdfce3ab26a0a196fd7d73a78de109d71bc3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/135594994)
* 25c2e648e4c72c110ec8519392d085bb88ccd023 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138001762)
[GitHub] [flink] flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149

## CI report:

* afdf3960fae62476f377aff7c27fac4f4779283a : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137987427)
[GitHub] [flink] flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558027199

## CI report:

* 0b265d192e2a6024e5817317be0317136208ccaf : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
URL: https://github.com/apache/flink/pull/10268#issuecomment-555964030

## CI report:

* 72414aad5f654b834205103f23fbbfc3d5466748 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137370738)
* 30d167184e68f6a44fc4f5d58228577c916d63d2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137496415)
* f41e0aaf005a489fc1df5c85511ff632ed9402a7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137522333)
* 29cc047cd4d65ecd9d47606843ee4893f765e8bc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706167)
* 9e6bb01366f7aaa7aacb8f74104213dc9d97ff25 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137729165)
* b0271e6b12a9e951074adbb50d7a7110736d61bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137737105)
* 724494d11181f6185df8de83e825e5b1d636a415 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137741018)
* f1c9a89535ada7e5d3b5a155fe34cac5ce1dc928 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137902784)
* b7b0adea530f8a8273085990cc583128c2e90fc8 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138001736)
[jira] [Updated] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
[ https://issues.apache.org/jira/browse/FLINK-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14936: Description: A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). Here are two cases that may need this method in near future: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883] was: A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). In the near future, there are two cases that may need this method: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883] > Introduce MemoryManager#computeMemorySize to calculate managed memory of an > operator from a fraction > > > Key: FLINK-14936 > URL: https://issues.apache.org/jira/browse/FLINK-14936 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.10.0 >Reporter: Zhu Zhu >Priority: Major > > A MemoryManager#computeMemorySize(double fraction) is needed to calculate > managed memory from a fraction. > It can be helpful for operators to get the memory size it can reserve and for > further #reserveMemory. (Similar to #computeNumberOfPages). > Here are two cases that may need this method in near future: > 1. 
[Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E]
> 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]
[jira] [Updated] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
[ https://issues.apache.org/jira/browse/FLINK-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-14936:
----------------------------

Description:
A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). In the near future, there are two cases that may need this method: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]

was:
I'd propose to MemoryManager#computeMemorySize(double fraction) which calculates managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages).

> Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14936
>                 URL: https://issues.apache.org/jira/browse/FLINK-14936
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>
> A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction.
> It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages).
> Here are two cases that may need this method in near future:
> 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E]
> 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]
[GitHub] [flink] zjffdu commented on a change in pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
zjffdu commented on a change in pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List URL: https://github.com/apache/flink/pull/10306#discussion_r350016903 ## File path: flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/table/utils/TableResultUtils.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.utils; + +import org.apache.flink.annotation.Experimental; +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.JobExecutionResult; +import org.apache.flink.api.common.accumulators.Accumulator; +import org.apache.flink.api.common.accumulators.SerializedListAccumulator; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; +import org.apache.flink.api.java.Utils; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.DataStreamSink; +import org.apache.flink.table.api.Table; +import org.apache.flink.table.api.TableEnvironment; +import org.apache.flink.table.api.TableSchema; +import org.apache.flink.table.api.internal.TableImpl; +import org.apache.flink.table.sinks.StreamTableSink; +import org.apache.flink.table.sinks.TableSink; +import org.apache.flink.table.types.DataType; +import org.apache.flink.types.Row; +import org.apache.flink.util.AbstractID; + +import java.util.List; + +/** + * A collection of utilities for fetching table results. + * + * NOTE: Methods in this utility class are experimental and can only be used for demonstration or testing + * small table results. Please DO NOT use them in production or on large tables. + */ +@Experimental +public class TableResultUtils { + + /** +* Convert Flink table to Java list. 
+ *
+ * @param table Flink table to convert
+ * @return Converted Java list
+ */
+ public static List<Row> tableResultToList(Table table) {
+     final TableEnvironment tEnv = ((TableImpl) table).getTableEnvironment();
+
+     final String id = new AbstractID().toString();
+     final TypeSerializer<Row> serializer = table.getSchema().toRowType().createSerializer(new ExecutionConfig());
+     final Utils.CollectHelper<Row> outputFormat = new Utils.CollectHelper<>(id, serializer);
+     final TableResultSink sink = new TableResultSink(table, outputFormat);
+
+     tEnv.registerTableSink("tableResultSink", sink);

Review comment: Hard-coded table sink name? I am afraid it will cause an error when this method is called a second time.
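The reviewer's concern — registering a sink under a fixed name fails on repeated calls — is typically resolved by generating a unique name per invocation, much as the snippet already does for the accumulator id via `AbstractID`. A sketch of the pattern in Python (names are illustrative; the actual fix would be in the Java utility):

```python
import uuid


def unique_sink_name(prefix="tableResultSink"):
    """Generate a collision-free sink name so repeated calls don't clash
    in the catalog of registered sinks."""
    return "%s_%s" % (prefix, uuid.uuid4().hex)


# Each call registers under a fresh name, e.g.:
# t_env.register_table_sink(unique_sink_name(), sink)
```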
[GitHub] [flink] flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558020653

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks

Last check on commit 0b265d192e2a6024e5817317be0317136208ccaf (Mon Nov 25 07:05:49 UTC 2019)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-13943) Provide api to convert flink table to java List (e.g. Table#collect)
[ https://issues.apache.org/jira/browse/FLINK-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-13943:
-----------------------------------

Labels: pull-request-available (was: )

> Provide api to convert flink table to java List (e.g. Table#collect)
> --------------------------------------------------------------------
>
>                 Key: FLINK-13943
>                 URL: https://issues.apache.org/jira/browse/FLINK-13943
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Jeff Zhang
>            Assignee: Caizhi Weng
>            Priority: Major
>              Labels: pull-request-available
>
> It would be nice to convert a flink table to a java List so that I can do other data manipulation on the client side after executing the flink job. With the flink planner I can convert a flink table to a DataSet and use DataSet#collect, but for the blink planner there is no such api.
> EDIT from FLINK-14807:
> Currently it is very inconvenient for users to fetch the data of a flink job unless they specify a sink explicitly and then fetch the data from that sink via its api (e.g. write to an hdfs sink, then read the data back from hdfs). However, most of the time users just want to get the data and do whatever processing they want, so it is very necessary for flink to provide an api like Table#collect for this purpose.
> Other apis such as Table#head and Table#print would also be helpful.
[GitHub] [flink] TsReaper opened a new pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
TsReaper opened a new pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306

## What is the purpose of the change

Some users would like to convert a Flink table to a Java List for demo or testing purposes. This PR introduces a utility method `TableResultUtils#tableResultToList` to convert Flink tables to lists. As the new utility method is experimental, we are currently not going to introduce a new `Table#collect` API. If users find this utility method satisfying to use, we will then propose a FLIP for `Table#collect`.

## Brief change log

- Introduce `TableResultUtils#tableResultToList`

## Verifying this change

This change added tests and can be verified as follows: run the newly added `TableResultUtilsITCase`.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
[GitHub] [flink] flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
URL: https://github.com/apache/flink/pull/10126#issuecomment-551413525

## CI report:

* ea5abdfce3ab26a0a196fd7d73a78de109d71bc3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/135594994)
* 25c2e648e4c72c110ec8519392d085bb88ccd023 : UNKNOWN
[jira] [Created] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
Zhu Zhu created FLINK-14936: --- Summary: Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction Key: FLINK-14936 URL: https://issues.apache.org/jira/browse/FLINK-14936 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Affects Versions: 1.10.0 Reporter: Zhu Zhu I'd propose to introduce MemoryManager#computeMemorySize(double fraction), which calculates the managed memory size from a fraction. It can help operators determine the memory size they can reserve, for a subsequent #reserveMemory call (similar to #computeNumberOfPages). -- This message was sent by Atlassian Jira (v8.3.4#803005)
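The proposed `computeMemorySize(double fraction)` is essentially fraction-of-budget arithmetic, analogous to the existing `#computeNumberOfPages`. A hedged sketch — the class name, field, and validation below are assumptions for illustration, not Flink's actual `MemoryManager` code:

```java
// Illustrative stand-in for the proposed MemoryManager#computeMemorySize:
// derive the number of managed-memory bytes an operator may reserve from
// its fraction of the total managed memory budget.
public class ManagedMemorySketch {

    private final long totalManagedMemoryBytes;

    public ManagedMemorySketch(long totalManagedMemoryBytes) {
        this.totalManagedMemoryBytes = totalManagedMemoryBytes;
    }

    public long computeMemorySize(double fraction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException(
                "fraction must be in (0, 1], but was " + fraction);
        }
        // Round down so the sum of all operators' reservations
        // never exceeds the total budget.
        return (long) Math.floor(totalManagedMemoryBytes * fraction);
    }
}
```

An operator would then pass the returned size to `#reserveMemory`; rounding down keeps concurrent reservations within the overall budget.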
[GitHub] [flink] flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526 ## CI report: * 160416cea63f157c0264c24812e2b2eef90e8c1d : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137989900)
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137985445)
[GitHub] [flink] flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-555964030 ## CI report: * 72414aad5f654b834205103f23fbbfc3d5466748 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137370738) * 30d167184e68f6a44fc4f5d58228577c916d63d2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137496415) * f41e0aaf005a489fc1df5c85511ff632ed9402a7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137522333) * 29cc047cd4d65ecd9d47606843ee4893f765e8bc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706167) * 9e6bb01366f7aaa7aacb8f74104213dc9d97ff25 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137729165) * b0271e6b12a9e951074adbb50d7a7110736d61bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137737105) * 724494d11181f6185df8de83e825e5b1d636a415 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137741018) * f1c9a89535ada7e5d3b5a155fe34cac5ce1dc928 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137902784) * b7b0adea530f8a8273085990cc583128c2e90fc8 : UNKNOWN
[GitHub] [flink] wangyang0918 commented on issue #10193: [FLINK-13938][yarn] Use pre-uploaded flink binary to accelerate flink submission
wangyang0918 commented on issue #10193: [FLINK-13938][yarn] Use pre-uploaded flink binary to accelerate flink submission URL: https://github.com/apache/flink/pull/10193#issuecomment-558013369 @walterddr Your assumption 1 is right. For assumption 2, the pre-uploaded flink binary is located at a generic location on any kind of DFS (usually HDFS), not in the Yarn distributed cache. We register the pre-uploaded flink jars as Yarn public cache so that they can be shared by different applications, and they will not be downloaded every time. For #10187, @TisonKun will create a separate JIRA. It focuses on supporting the shipping of directories located on HDFS.
[jira] [Commented] (FLINK-14930) OSS Filesystem Uses Wrong Shading Prefix
[ https://issues.apache.org/jira/browse/FLINK-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981322#comment-16981322 ] Konstantin Knauf commented on FLINK-14930: -- [~chesnay] Could you assign me to this? > OSS Filesystem Uses Wrong Shading Prefix > > > Key: FLINK-14930 > URL: https://issues.apache.org/jira/browse/FLINK-14930 > Project: Flink > Issue Type: Bug > Components: FileSystems >Affects Versions: 1.9.1 >Reporter: Konstantin Knauf >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The relevant classes (CredentialsProviders) are relocated to > {{org.apache.flink.fs.osshadoop.shaded.}} not > {{org.apache.flink.fs.shaded.hadoop3.}}.
[jira] [Assigned] (FLINK-14552) Enable partition statistics in blink planner
[ https://issues.apache.org/jira/browse/FLINK-14552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14552: --- Assignee: Jingsong Lee > Enable partition statistics in blink planner > > > Key: FLINK-14552 > URL: https://issues.apache.org/jira/browse/FLINK-14552 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > > We need to update statistics after partition pruning in > PushPartitionIntoTableSourceScanRule.
[jira] [Assigned] (FLINK-14543) Support partition for temporary table
[ https://issues.apache.org/jira/browse/FLINK-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14543: --- Assignee: Jingsong Lee > Support partition for temporary table > - > > Key: FLINK-14543 > URL: https://issues.apache.org/jira/browse/FLINK-14543 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h >
[GitHub] [flink] flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526 ## CI report: * 160416cea63f157c0264c24812e2b2eef90e8c1d : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149 ## CI report: * afdf3960fae62476f377aff7c27fac4f4779283a : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137987427)
[GitHub] [flink] wuchong commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
wuchong commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558010632 I agree with @JingsongLi . We can add some tests for high-precision timestamps as a group key, as a distinct key, and as a time attribute.
[GitHub] [flink] JingsongLi commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
JingsongLi commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558009679 Can you add more tests for timestamp types with precision greater than 3? For example, in batch mode: shuffle with a timestamp key, join with a timestamp key, aggregate with a timestamp key, and an aggregate function applied to a timestamp field. I am not 100% sure these scenarios work. @wuchong What do you think about streaming mode?
[GitHub] [flink] JingsongLi commented on a change in pull request #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
JingsongLi commented on a change in pull request #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#discussion_r350004855 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/codegen/CodeGenUtils.scala ## @@ -590,8 +597,7 @@ object CodeGenUtils { case DOUBLE => s"$arrayTerm.setNullDouble($index)" case TIME_WITHOUT_TIME_ZONE => s"$arrayTerm.setNullInt($index)" case DATE => s"$arrayTerm.setNullInt($index)" -case TIMESTAMP_WITHOUT_TIME_ZONE | - TIMESTAMP_WITH_LOCAL_TIME_ZONE => s"$arrayTerm.setNullLong($index)" +case TIMESTAMP_WITH_LOCAL_TIME_ZONE => s"$arrayTerm.setNullLong($index)" Review comment: Yes, it is what I mean.
[GitHub] [flink] docete commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
docete commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558008007 > Looks like your base is broken, rebase to master to let the tests pass? OK. Would rebase to master soon.
[GitHub] [flink] KurtYoung commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
KurtYoung commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558006414 Looks like your base is broken, rebase to master to let the tests pass?
[GitHub] [flink] klion26 commented on a change in pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 commented on a change in pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#discussion_r350003208 ## File path: docs/ops/state/checkpoints.md ## @@ -63,6 +63,24 @@ files. The meta data file and data files are stored in the directory that is configured via `state.checkpoints.dir` in the configuration files, and also can be specified per job in the code.

+The current checkpoint directory layout, which was introduced by FLINK-8531, is as follows:
+
+{% highlight yaml %}
+/user-defined-checkpoint-dir
+|
++ --shared/
++ --taskowned/
++ --chk-1/
++ --chk-2/
++ --chk-3/
+...
+{% endhighlight %}
+
+The **SHARED** directory is for state that is possibly part of multiple checkpoints, **TASKOWNED** is for state that must never be dropped by the JobManager, and **EXCLUSIVE** is for state that belongs to one checkpoint only.

Review comment: The description order follows the order of the checkpoint directory layout above.
[GitHub] [flink] flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558005628 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 160416cea63f157c0264c24812e2b2eef90e8c1d (Mon Nov 25 06:08:20 UTC 2019) ✅no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r349973287 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); Review comment: originFIle -> originFile
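The core transformation the `AnswerFormatter` above performs — once the vim recipe has normalized tabs to spaces — is turning whitespace-separated answer-set rows into '|'-delimited ones. A minimal sketch of that step (the class and method names are illustrative, not the tool's actual API; the real formatter additionally handles fixed-width columns and header lines):

```java
// Collapse runs of whitespace in an answer-set row and rejoin the columns
// with the '|' delimiter used for TPC-DS result comparison.
public class DelimiterSketch {

    public static String toBarDelimited(String line) {
        // trim() first so leading/trailing padding does not produce empty columns
        return String.join("|", line.trim().split("\\s+"));
    }
}
```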
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r349973626 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); + File destFile = new File(destDir + FILE_SEPARATOR +
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r34496 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); + File destFile = new File(destDir + FILE_SEPARATOR +
[GitHub] [flink] klion26 commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558005061 cc @pnowojski cc @wuchong for the translated Chinese version
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137983497) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14892) Add documentation for checkpoint directory layout
[ https://issues.apache.org/jira/browse/FLINK-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14892: --- Labels: pull-request-available (was: ) > Add documentation for checkpoint directory layout > - > > Key: FLINK-14892 > URL: https://issues.apache.org/jira/browse/FLINK-14892 > Project: Flink > Issue Type: Improvement > Components: Documentation, Runtime / Checkpointing > Reporter: Congxian Qiu(klion26) > Assignee: Congxian Qiu(klion26) > Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > > In FLINK-8531, we changed the checkpoint directory layout to > {code:java} > /user-defined-checkpoint-dir > | > + --shared/ > + --taskowned/ > + --chk-1/ > + --chk-2/ > + --chk-3/ > ... > {code} > However, the directory layout is not currently described in the documentation, and I found > some users confused about it, such as [1][2], so I propose to add a > description of the checkpoint directory layout to the documentation, maybe > on the page {{checkpoints#DirectoryStructure}}[3] > [1] > [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-checkpointing-behavior-td30749.html#a30751] > [2] > [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Apache-Flink-Operator-name-and-uuid-best-practices-td31031.html] > [3] > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/checkpoints.html#directory-structure] -- This message was sent by Atlassian Jira (v8.3.4#803005)
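The layout quoted in the ticket can be illustrated with a small self-contained sketch (hypothetical helper, not Flink code) that materializes the described structure under a user-defined checkpoint directory:

```python
import os
import tempfile

def create_checkpoint_layout(checkpoint_dir, num_checkpoints=3):
    """Create the directory layout described in FLINK-8531:
    shared/ and taskowned/, plus one chk-<id>/ dir per retained checkpoint."""
    for sub in ("shared", "taskowned"):
        os.makedirs(os.path.join(checkpoint_dir, sub), exist_ok=True)
    for i in range(1, num_checkpoints + 1):
        os.makedirs(os.path.join(checkpoint_dir, f"chk-{i}"), exist_ok=True)
    return sorted(os.listdir(checkpoint_dir))

root = tempfile.mkdtemp()
print(create_checkpoint_layout(root))
# ['chk-1', 'chk-2', 'chk-3', 'shared', 'taskowned']
```

In the real layout, `shared/` holds state that may be referenced by multiple checkpoints (incremental checkpoints), `taskowned/` holds state owned by tasks rather than the JobManager, and each `chk-<id>/` holds the exclusive state of one checkpoint.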
[GitHub] [flink] wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r350002500 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,208 @@ + + + +http://maven.apache.org/POM/4.0.0; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd;> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 + + + + + + + + org.apache.flink + flink-clients_${scala.binary.version} + ${project.version} + provided + + + + org.apache.flink + flink-runtime_${scala.binary.version} + ${project.version} + provided + + + + io.fabric8 + kubernetes-client + ${kubernetes.client.version} + + + + com.squareup.okhttp3 + okhttp + + + com.fasterxml.jackson.core + jackson-core + + + com.fasterxml.jackson.core + jackson-databind + Review comment: There are multiple transitive dependencies on `jackson-core` and `jackson-databind` from `kubernetes-client`, so we just exclude them and use a fixed version. This avoids a dependency convergence check failure. For `jackson-dataformat-yaml`, there is only one dependency and it will also be shaded, so I think we do not need to exclude it or add a direct dependency.
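The review comment above is about dependency convergence: `kubernetes-client` pulls in `jackson-core`/`jackson-databind` transitively at more than one version, so the PR excludes them and declares a single pinned version. As a rough model of that resolution strategy (hypothetical data and function, not the real Flink or Maven machinery), "exclude and pin" replaces whatever transitive versions exist with one fixed choice:

```python
def resolve_with_pin(transitive_versions, pins):
    """Model of 'exclude the transitive dep, declare a fixed version':
    any artifact listed in `pins` resolves to the pinned version;
    otherwise convergence requires all transitive versions to agree."""
    resolved = {}
    for artifact, versions in transitive_versions.items():
        if artifact in pins:
            resolved[artifact] = pins[artifact]          # pinned: conflict ignored
        elif len(set(versions)) == 1:
            resolved[artifact] = versions[0]             # already convergent
        else:
            raise ValueError(f"convergence failure for {artifact}: {sorted(set(versions))}")
    return resolved

deps = {"jackson-core": ["2.9.8", "2.10.1"], "okhttp": ["3.12.0"]}
print(resolve_with_pin(deps, {"jackson-core": "2.10.1"}))
# {'jackson-core': '2.10.1', 'okhttp': '3.12.0'}
```

Without the pin, the conflicting `jackson-core` versions would fail the convergence check, which mirrors why the exclusion plus a direct dependency is used in the pom.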
[GitHub] [flink] klion26 opened a new pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 opened a new pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305 ## What is the purpose of the change In FLINK-8531, we changed the checkpoint directory layout to ``` /user-defined-checkpoint-dir | + --shared/ + --taskowned/ + --chk-1/ + --chk-2/ + --chk-3/ ... ``` However, the directory layout is not currently described in the documentation, and some users on the mailing list were confused about it; this PR adds a description of the checkpoint directory layout to the documentation. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable)
[GitHub] [flink] flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149 ## CI report: * afdf3960fae62476f377aff7c27fac4f4779283a : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137985445)
[GitHub] [flink] flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558002278 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit afdf3960fae62476f377aff7c27fac4f4779283a (Mon Nov 25 05:54:50 UTC 2019) **Warnings:** * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-14838).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. 
For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-14838) Cleanup the description about container number config option in Scala and python shell doc
[ https://issues.apache.org/jira/browse/FLINK-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14838: --- Labels: pull-request-available (was: ) > Cleanup the description about container number config option in Scala and > python shell doc > -- > > Key: FLINK-14838 > URL: https://issues.apache.org/jira/browse/FLINK-14838 > Project: Flink > Issue Type: Improvement > Components: Documentation > Reporter: vinoyang > Priority: Major > Labels: pull-request-available > > Currently, the config option {{-n}} for Flink on Yarn has not been supported > since Flink 1.8. FLINK-12362 did the cleanup job for this config option. > However, the Scala shell and Python docs still contain some description of > {{-n}}, which may confuse users. This issue is used to track the cleanup > work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] yanghua opened a new pull request #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
yanghua opened a new pull request #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304 ## What is the purpose of the change *This pull request cleans up the description of the container number config option in the Scala and Python shell docs* ## Brief change log - *Cleanup the description about container number config option in Scala and python shell doc* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / **not documented**)
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137983497)
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137979329)
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401 ## CI report: * dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080) * 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499) * 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355) * 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854) * e9c30c1d7e3097774ef019d228069e34ac407852 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137979314)
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : UNKNOWN
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349991863 ## File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java ## @@ -469,18 +475,57 @@ public void sqlUpdate(String stmt) { createTableOperation.getCatalogTable(), createTableOperation.getTableIdentifier(), createTableOperation.isIgnoreIfExists()); + } else if (operation instanceof CreateDatabaseOperation) { + CreateDatabaseOperation createDatabaseOperation = (CreateDatabaseOperation) operation; + catalogManager.createDatabase(createDatabaseOperation.getCatalogName(), + createDatabaseOperation.getDatabaseName(), + createDatabaseOperation.getCatalogDatabase(), + createDatabaseOperation.isIgnoreIfExists(), + false); } else if (operation instanceof DropTableOperation) { DropTableOperation dropTableOperation = (DropTableOperation) operation; catalogManager.dropTable( dropTableOperation.getTableIdentifier(), dropTableOperation.isIfExists()); - } else if (operation instanceof UseCatalogOperation) { - UseCatalogOperation useCatalogOperation = (UseCatalogOperation) operation; - catalogManager.setCurrentCatalog(useCatalogOperation.getCatalogName()); + } else if (operation instanceof DropDatabaseOperation) { + DropDatabaseOperation dropDatabaseOperation = (DropDatabaseOperation) operation; + catalogManager.dropDatabase( + dropDatabaseOperation.getCatalogName(), + dropDatabaseOperation.getDatabaseName(), + dropDatabaseOperation.isIfExists(), + dropDatabaseOperation.isRestrict(), + false); + } else if (operation instanceof AlterDatabaseOperation) { + AlterDatabaseOperation alterDatabaseOperation = (AlterDatabaseOperation) operation; + catalogManager.alterDatabase( + alterDatabaseOperation.getCatalogName(), + alterDatabaseOperation.getDatabaseName(), + 
alterDatabaseOperation.getCatalogDatabase(), + false); + } else if (operation instanceof UseOperation) { + applyUseOperation((UseOperation) operation); } else { throw new TableException( "Unsupported SQL query! sqlUpdate() only accepts a single SQL statements of " + - "type INSERT, CREATE TABLE, DROP TABLE, USE CATALOG"); + "type INSERT, CREATE TABLE, DROP TABLE, USE CATALOG, USE [catalog.]database, " + + "CREATE DATABASE, DROP DATABASE, ALTER DATABASE"); Review comment: Yes, but for now we can't infer the type of an unknown operation. We can do a refactor later.
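The diff above extends an instanceof-based dispatch chain in `sqlUpdate`. A toy model of that pattern (class names mirror the Java ones but this is an illustrative sketch, not Flink code) shows how each new operation type gets its own branch and anything unrecognized falls through to the error:

```python
class Operation: pass
class CreateDatabaseOperation(Operation): pass
class DropDatabaseOperation(Operation): pass
class AlterDatabaseOperation(Operation): pass

def dispatch(op):
    # Mirrors the chain of instanceof checks in TableEnvironmentImpl.sqlUpdate:
    # one branch per supported operation, catch-all error at the end.
    if isinstance(op, CreateDatabaseOperation):
        return "create database"
    if isinstance(op, DropDatabaseOperation):
        return "drop database"
    if isinstance(op, AlterDatabaseOperation):
        return "alter database"
    raise TypeError("Unsupported SQL query!")

print(dispatch(CreateDatabaseOperation()))  # create database
```

The reviewer's point about "unknown" operations corresponds to the final branch: the chain can only name the types it already knows, so the error message has to be updated by hand each time a branch is added, which is why a later refactor is suggested.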
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349991483 ## File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/CatalogManager.java ## @@ -455,6 +457,94 @@ public void createTable(CatalogBaseTable table, ObjectIdentifier objectIdentifie "CreateTable"); } + /** +* Creates a database in a given fully qualified path. +* @param catalogName +* @param databaseName +* @param database +* @param ignoreIfExists If false exception will be thrown if a database exists in the given path. +*/ + public void createDatabase(String catalogName, + String databaseName, + CatalogDatabase database, + boolean ignoreIfExists, + boolean ignoreNoCatalog) { Review comment: It provides the ability to ignore the exception if there is no corresponding catalog, matching the behavior of the other catalog methods. I think we can keep it so that it's more convenient to use later.
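The two flags discussed here have distinct scopes: `ignoreIfExists` suppresses the error when the database already exists, while `ignoreNoCatalog` suppresses the error when the catalog itself is missing. A minimal sketch of those semantics (a hypothetical simplification, not the actual `CatalogManager` implementation):

```python
def create_database(catalogs, catalog_name, db_name, ignore_if_exists, ignore_no_catalog):
    """catalogs: dict mapping catalog name -> set of database names."""
    if catalog_name not in catalogs:
        if ignore_no_catalog:
            return  # silently skip when the catalog is missing
        raise KeyError(f"no catalog named {catalog_name}")
    if db_name in catalogs[catalog_name]:
        if ignore_if_exists:
            return  # database already there, treated as success
        raise ValueError(f"database {db_name} already exists")
    catalogs[catalog_name].add(db_name)

cats = {"hive": {"default"}}
create_database(cats, "hive", "sales", ignore_if_exists=False, ignore_no_catalog=False)
print(sorted(cats["hive"]))  # ['default', 'sales']
```

This also illustrates the reviewer's rationale: a caller who is not sure the catalog exists can pass `ignore_no_catalog=True` instead of wrapping the call in its own existence check.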
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019 ## CI report: * 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666) * 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675) * 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895) * 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192) * aaba84a4188eab863f1504c8bec045a51aa4ff73 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137975262)
[GitHub] [flink] JingsongLi commented on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-557985620 @KurtYoung @lirui-apache Hope you take a look again.
[GitHub] [flink] flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject
flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject URL: https://github.com/apache/flink/pull/10300#issuecomment-557891612 ## CI report: * dae41b86e4710d0c83d3f764e6196513db26a2c8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137940914) * 95bb102bbc2cb241ab66efbf0870f2a9d9e515d8 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137975257)
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349987842 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/client/HiveMetastoreClientWrapper.java ## @@ -149,6 +149,11 @@ public void dropDatabase(String name, boolean deleteData, boolean ignoreIfNotExi client.dropDatabase(name, deleteData, ignoreIfNotExists); } + public void dropDatabase(String name, boolean deleteData, boolean ignoreIfNotExists, boolean cascade) + throws NoSuchObjectException, InvalidOperationException, MetaException, TException { + client.dropDatabase(name, deleteData, ignoreIfNotExists, cascade); Review comment: Yes, I checked Hive versions in the range [1.0.0, 3.1.2]; all of them support this method.
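The `cascade` flag being wrapped here controls whether a database that still contains objects may be dropped: without cascade the drop is rejected, with cascade the contents go too. A minimal model of those semantics (a hypothetical sketch, not the Hive client itself):

```python
def drop_database(databases, name, ignore_if_not_exists, cascade):
    """databases: dict mapping database name -> list of tables it contains."""
    if name not in databases:
        if ignore_if_not_exists:
            return  # missing database tolerated when the flag is set
        raise KeyError(f"no database named {name}")
    if databases[name] and not cascade:
        raise ValueError(f"database {name} is not empty; use cascade to drop it")
    del databases[name]  # cascade drops the database together with its tables

dbs = {"sales": ["orders"], "tmp": []}
drop_database(dbs, "tmp", ignore_if_not_exists=False, cascade=False)
print(sorted(dbs))  # ['sales']
```

The non-cascade path corresponds to the RESTRICT behavior referenced elsewhere in this PR's `DropDatabaseOperation` (`isRestrict`): an empty database drops either way, a non-empty one only with cascade.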
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137979329)
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401

## CI report:

* dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080)
* 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499)
* 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355)
* 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854)
* e9c30c1d7e3097774ef019d228069e34ac407852 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137979314)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349986777 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java ## @@ -178,27 +171,6 @@ public void setStaticPartition(Map partitionSpec) { } } - private void validatePartitionSpec() { Review comment: The planner already verifies it. It should be verified by the framework.
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349986703 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java ## @@ -86,46 +83,36 @@ public HiveTableSink(JobConf jobConf, ObjectPath tablePath, CatalogTable table) @Override public OutputFormat getOutputFormat() { - List partitionColumns = getPartitionFieldNames(); - boolean isPartitioned = partitionColumns != null && !partitionColumns.isEmpty(); - boolean isDynamicPartition = isPartitioned && partitionColumns.size() > staticPartitionSpec.size(); + String[] partitionColumns = getPartitionFieldNames().toArray(new String[0]); String dbName = tablePath.getDatabaseName(); String tableName = tablePath.getObjectName(); try (HiveMetastoreClientWrapper client = HiveMetastoreClientFactory.create(new HiveConf(jobConf, HiveConf.class), hiveVersion)) { Table table = client.getTable(dbName, tableName); StorageDescriptor sd = table.getSd(); - // here we use the sdLocation to store the output path of the job, which is always a staging dir - String sdLocation = sd.getLocation(); - HiveTablePartition hiveTablePartition; - if (isPartitioned) { - validatePartitionSpec(); - if (isDynamicPartition) { - List path = new ArrayList<>(2); - path.add(sd.getLocation()); - if (!staticPartitionSpec.isEmpty()) { - path.add(Warehouse.makePartName(staticPartitionSpec, false)); - } - sdLocation = String.join(Path.SEPARATOR, path); - } else { - List partitions = client.listPartitions(dbName, tableName, - new ArrayList<>(staticPartitionSpec.values()), (short) 1); - sdLocation = !partitions.isEmpty() ? 
partitions.get(0).getSd().getLocation() : - sd.getLocation() + Path.SEPARATOR + Warehouse.makePartName(staticPartitionSpec, true); - } - - sd.setLocation(toStagingDir(sdLocation, jobConf)); - hiveTablePartition = new HiveTablePartition(sd, new LinkedHashMap<>(staticPartitionSpec)); - } else { - sd.setLocation(toStagingDir(sdLocation, jobConf)); - hiveTablePartition = new HiveTablePartition(sd, null); - } - return new HiveTableOutputFormat( - jobConf, - tablePath, - catalogTable, - hiveTablePartition, - HiveReflectionUtils.getTableMetadata(hiveShim, table), - overwrite); + + FileSystemOutputFormat.Builder builder = new FileSystemOutputFormat.Builder<>(); + builder.setColumnNames(tableSchema.getFieldNames()); + builder.setDefaultPartName(jobConf.get(HiveConf.ConfVars.DEFAULTPARTITIONNAME.varname, + HiveConf.ConfVars.DEFAULTPARTITIONNAME.defaultStrVal)); + builder.setDynamicGrouped(dynamicGrouping); + builder.setPartitionColumns(partitionColumns); + builder.setFileSystemFactory(new HiveFileSystemFactory(jobConf)); + builder.setFormatFactory(new HiveOutputFormatFactory( + jobConf, + sd.getOutputFormat(), + sd.getSerdeInfo(), + tableSchema, + partitionColumns, + HiveReflectionUtils.getTableMetadata(hiveShim, table), + hiveVersion)); + builder.setMetaStoreFactory( + new HiveMetaStoreFactory(jobConf, hiveVersion, dbName, tableName)); + builder.setOverwrite(overwrite); + builder.setStaticPartitions(staticPartitionSpec); + builder.setTmpPath(new
[GitHub] [flink] godfreyhe commented on a change in pull request #10174: [FLINK-14625][table-planner-blink] Add a rule to eliminate cross join as much as possible without statistics
godfreyhe commented on a change in pull request #10174: [FLINK-14625][table-planner-blink] Add a rule to eliminate cross join as much as possible without statistics URL: https://github.com/apache/flink/pull/10174#discussion_r349981321 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/optimize/program/FlinkBatchProgram.scala ## @@ -139,24 +140,24 @@ object FlinkBatchProgram { // join reorder val deprecatedJoinReorderEnabled = config.getBoolean(OptimizerConfigOptions.TABLE_OPTIMIZER_JOIN_REORDER_ENABLED) -val joinReorderMode = JoinReorderMode.withName( - config.getString(OptimizerConfigOptions.TABLE_OPTIMIZER_JOIN_REORDER_MODE)) +val joinReorderMode = JoinReorderStrategy.valueOf( Review comment: rename to `joinReorderStrategy` ?
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349985201 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemCommitter.java ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.io.Serializable; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +import static org.apache.flink.table.filesystem.FileSystemUtils.searchPartSpecAndPaths; +import static org.apache.flink.table.filesystem.TempFileManager.deleteCheckpoint; +import static org.apache.flink.table.filesystem.TempFileManager.headCheckpoints; +import static org.apache.flink.table.filesystem.TempFileManager.listTaskTemporaryPaths; + +/** + * File system file committer implementation. It move all files to output path from temporary path. 
+ * + * In a checkpoint: + * 1.Every task will invoke {@link #createFileManagerAndCleanDir} to initialization, it returns + * a path generator to generate path for task writing. And clean the temporary path of task. + * 2.After writing done for this checkpoint, need invoke {@link #commitUpToCheckpoint(long)}, + * will move the temporary files to real output path. + * + * Batch is a special case of Streaming, which has only one checkpoint. + * + * Data consistency: + * 1.For task failure: will launch a new task and invoke {@link #createFileManagerAndCleanDir}, + * this will clean previous temporary files (This simple design can make it easy to delete the + * invalid temporary directory of the task, but it also causes that our directory does not + * support the same task to start multiple backups to run). + * 2.For job master commit failure when overwrite: this may result in unfinished intermediate + * results, but if we try to run job again, the final result must be correct (because the + * intermediate result will be overwritten). + * 3.For job master commit failure when append: This can lead to inconsistent data. But, + * considering that the commit action is a single point of execution, and only moves files and + * updates metadata, it will be faster, so the probability of inconsistency is relatively small. + * + * See: + * {@link TempFileManager}. + * {@link FileSystemLoader}. 
+ */ +@Internal +public class FileSystemCommitter implements Serializable { + + private static final long serialVersionUID = 1L; + + private final FileSystemFactory factory; + private final MetaStoreFactory metaStoreFactory; + private final boolean overwrite; + private final Path tmpPath; + private final LinkedHashMap staticPartitions; + private final int partitionColumnSize; + + public FileSystemCommitter( + FileSystemFactory factory, + MetaStoreFactory metaStoreFactory, + boolean overwrite, + Path tmpPath, + LinkedHashMap staticPartitions, + int partitionColumnSize) { + this.factory = factory; + this.metaStoreFactory = metaStoreFactory; + this.overwrite = overwrite; + this.tmpPath = tmpPath; + this.staticPartitions = staticPartitions; + this.partitionColumnSize = partitionColumnSize; + } + + /** +* For committing job's output after successful batch job completion or one checkpoint finish +* for streaming job. Should move all files to final output paths. +* +* NOTE: According to checkpoint notify mechanism of Flink, checkpoint may fail and be +* abandoned, so this method should commit all checkpoint ids that less than current +* checkpoint id (Includes failure checkpoints). +*/ + public
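The commit protocol described in the javadoc above (each task stages files in a temporary directory, and a committer later moves every checkpoint up to the notified checkpoint id into the final output path) can be sketched with plain `java.nio.file` operations. This is a hypothetical, simplified stand-in for Flink's `FileSystemCommitter`, assuming a `cp-<id>/` staging layout; it is not the actual implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class CommitSketch {

    /**
     * Move all staged files of checkpoints with id <= checkpointId from
     * tmpDir into outDir. Committing every id up to the current one covers
     * checkpoints whose success notification was lost.
     */
    public static void commitUpToCheckpoint(Path tmpDir, Path outDir, long checkpointId)
            throws IOException {
        Files.createDirectories(outDir);
        try (Stream<Path> dirs = Files.list(tmpDir)) {
            for (Path cpDir : (Iterable<Path>) dirs::iterator) {
                String name = cpDir.getFileName().toString();
                if (!name.startsWith("cp-")) {
                    continue; // not a staging directory
                }
                long id = Long.parseLong(name.substring(3));
                if (id > checkpointId) {
                    continue; // not yet notified; leave it staged
                }
                try (Stream<Path> files = Files.list(cpDir)) {
                    for (Path f : (Iterable<Path>) files::iterator) {
                        // move the staged file to its final output location
                        Files.move(f, outDir.resolve(f.getFileName()),
                                StandardCopyOption.REPLACE_EXISTING);
                    }
                }
                Files.delete(cpDir); // drop the now-empty staging directory
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("staging");
        Path out = Files.createTempDirectory("output");
        Files.createDirectories(tmp.resolve("cp-1"));
        Files.createDirectories(tmp.resolve("cp-2"));
        Files.writeString(tmp.resolve("cp-1").resolve("part-0"), "a");
        Files.writeString(tmp.resolve("cp-2").resolve("part-0"), "b");

        commitUpToCheckpoint(tmp, out, 1L);

        // cp-1 is committed to the output path; cp-2 stays staged
        System.out.println(Files.exists(out.resolve("part-0")));
        System.out.println(Files.exists(tmp.resolve("cp-2").resolve("part-0")));
    }
}
```

Batch jobs are the one-checkpoint special case of this: a single commit at job end moves everything at once.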
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349985134 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveMetaStoreFactory.java ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper; +import org.apache.flink.table.catalog.hive.util.HiveTableUtil; +import org.apache.flink.table.filesystem.MetaStoreFactory; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.NoSuchObjectException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.StorageDescriptor; +import org.apache.hadoop.mapred.JobConf; +import org.apache.thrift.TException; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Hive {@link MetaStoreFactory}, use {@link HiveMetastoreClientWrapper} to communicate with + * hive meta store. + */ +public class HiveMetaStoreFactory implements MetaStoreFactory { + + private static final long serialVersionUID = 1L; + + private final JobConfWrapper conf; + private final String hiveVersion; + private final String database; + private final String tableName; + + HiveMetaStoreFactory( + JobConf conf, + String hiveVersion, + String database, + String tableName) { + this.conf = new JobConfWrapper(conf); + this.hiveVersion = hiveVersion; + this.database = database; + this.tableName = tableName; + } + + @Override + public HiveTableMetaStore createTableMetaStore() throws Exception { + return new HiveTableMetaStore(); + } + + private class HiveTableMetaStore implements TableMetaStore { + + private HiveMetastoreClientWrapper client; + private StorageDescriptor sd; + + private HiveTableMetaStore() throws TException { + client = HiveMetastoreClientFactory.create( + new HiveConf(conf.conf(), HiveConf.class), hiveVersion); + sd = client.getTable(database, tableName).getSd(); + } + + @Override + public Path getLocationPath() { + 
return new Path(sd.getLocation()); + } + + @Override + public Optional getPartition( + LinkedHashMap partSpec) throws Exception { + try { + return Optional.of(new Path(client.getPartition( + database, + tableName, + new ArrayList<>(partSpec.values())) + .getSd().getLocation())); + } catch (NoSuchObjectException ignore) { + return Optional.empty(); + } + } + + @Override + public void createPartition(LinkedHashMap partSpec, Path path) throws Exception { + StorageDescriptor newSd = new StorageDescriptor(sd); + newSd.setLocation(path.toString()); + Partition partition = HiveTableUtil.createHivePartition(database, tableName, + new ArrayList<>(partSpec.values()), newSd, new HashMap<>()); + partition.setValues(new ArrayList<>(partSpec.values())); + client.add_partition(partition); Review comment: These two interface are
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349984971 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveMetaStoreFactory.java ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper; +import org.apache.flink.table.catalog.hive.util.HiveTableUtil; +import org.apache.flink.table.filesystem.MetaStoreFactory; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.NoSuchObjectException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.StorageDescriptor; +import org.apache.hadoop.mapred.JobConf; +import org.apache.thrift.TException; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Hive {@link MetaStoreFactory}, use {@link HiveMetastoreClientWrapper} to communicate with + * hive meta store. + */ +public class HiveMetaStoreFactory implements MetaStoreFactory { + + private static final long serialVersionUID = 1L; + + private final JobConfWrapper conf; + private final String hiveVersion; + private final String database; + private final String tableName; + + HiveMetaStoreFactory( + JobConf conf, + String hiveVersion, + String database, + String tableName) { + this.conf = new JobConfWrapper(conf); + this.hiveVersion = hiveVersion; + this.database = database; + this.tableName = tableName; + } + + @Override + public HiveTableMetaStore createTableMetaStore() throws Exception { + return new HiveTableMetaStore(); + } + + private class HiveTableMetaStore implements TableMetaStore { + + private HiveMetastoreClientWrapper client; + private StorageDescriptor sd; + + private HiveTableMetaStore() throws TException { + client = HiveMetastoreClientFactory.create( + new HiveConf(conf.conf(), HiveConf.class), hiveVersion); + sd = client.getTable(database, tableName).getSd(); + } + + @Override + public Path getLocationPath() { + 
return new Path(sd.getLocation()); + } + + @Override + public Optional getPartition( + LinkedHashMap partSpec) throws Exception { + try { + return Optional.of(new Path(client.getPartition( Review comment: Will invoke `PartitionLoader.loadNonPartition`, never reach here.
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349984854 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/MetaStoreFactory.java ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.Catalog; + +import java.io.Closeable; +import java.io.Serializable; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Meta store factory to create {@link TableMetaStore}. Meta store may need contains connection + * to remote, so we should not create too frequently. + */ +@Internal +public interface MetaStoreFactory extends Serializable { + + /** +* Create a {@link TableMetaStore}. 
+*/ + TableMetaStore createTableMetaStore() throws Exception; Review comment: `TableMetaStoreFactory` already specifies the DB name and table name. We don't want invokers to have to obtain the DB name and table name every time and everywhere; that would be pointless, since a factory exists only for a single table.
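The design point argued above — binding the factory to a single database and table at construction time, so call sites never thread (db, table) pairs around — can be sketched as follows. All names here are hypothetical illustrations, not Flink's actual `MetaStoreFactory` interfaces.

```java
import java.io.Serializable;
import java.util.Optional;

public class PerTableFactorySketch {

    /** Answers questions about exactly one table; no (db, table) arguments. */
    interface TableMetaStore {
        Optional<String> partitionLocation(String partSpec);
    }

    /** Pre-bound to one table, so createTableMetaStore() needs no arguments. */
    static final class SingleTableMetaStoreFactory implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String database;
        private final String table;

        SingleTableMetaStoreFactory(String database, String table) {
            this.database = database;
            this.table = table;
        }

        TableMetaStore createTableMetaStore() {
            // the returned store only ever serves this.database / this.table
            return partSpec -> Optional.of(
                    "/warehouse/" + database + "/" + table + "/" + partSpec);
        }
    }

    public static void main(String[] args) {
        TableMetaStore store =
                new SingleTableMetaStoreFactory("db1", "orders").createTableMetaStore();
        System.out.println(store.partitionLocation("dt=2019-11-25").get());
    }
}
```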
[jira] [Commented] (FLINK-13995) Fix shading of the licence information of netty
[ https://issues.apache.org/jira/browse/FLINK-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981282#comment-16981282 ] Hequn Cheng commented on FLINK-13995: - [~chesnay] Hi, since legal problems are always important, I think this is also a blocker for 1.8.3?

> Fix shading of the licence information of netty
> ---
>
> Key: FLINK-13995
> URL: https://issues.apache.org/jira/browse/FLINK-13995
> Project: Flink
> Issue Type: Bug
> Components: BuildSystem / Shaded
> Affects Versions: 1.8.0
> Reporter: Arvid Heise
> Assignee: Chesnay Schepler
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0, 1.8.3, 1.9.2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The license filter isn't actually filtering anything. It should be META-INF/license/**.
> The first filter seems to be outdated btw.
> Multiple modules affected.
> {code:xml}
> <filter>
>   <artifact>io.netty:netty</artifact>
>   <excludes>
>     <exclude>META-INF/maven/io.netty/**</exclude>
>     <exclude>META-INF/license</exclude>
>     <exclude>META-INF/NOTICE.txt</exclude>
>   </excludes>
> </filter>
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349983145 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemFactory.java ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; + +import java.io.IOException; +import java.io.Serializable; +import java.net.URI; + +/** + * A factory to create file systems. + */ +@Internal +public interface FileSystemFactory extends Serializable { Review comment: I don't like to introduce `getScheme` in `FileSystemFactory`. And it is not serializable.
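The serializability concern discussed here follows a common Flink pattern: the factory is the object that is shipped to tasks (hence it extends `Serializable`), while the file system it creates is constructed locally on each task and never serialized. A minimal self-contained sketch, where `FakeFileSystem` and `FakeFsFactory` are hypothetical stand-ins rather than Flink classes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class FactorySketch {

    /** Non-serializable resource, created per task from the factory. */
    static final class FakeFileSystem {
        final String scheme;
        FakeFileSystem(String scheme) { this.scheme = scheme; }
    }

    /** Serializable factory: only its small configuration travels over the wire. */
    interface FsFactory extends Serializable {
        FakeFileSystem create();
    }

    static final class FakeFsFactory implements FsFactory {
        private static final long serialVersionUID = 1L;
        private final String scheme;
        FakeFsFactory(String scheme) { this.scheme = scheme; }
        @Override
        public FakeFileSystem create() { return new FakeFileSystem(scheme); }
    }

    /** Simulate shipping the factory to a task: serialize, then deserialize. */
    public static FsFactory ship(FsFactory factory) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(factory);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (FsFactory) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        FsFactory shipped = ship(new FakeFsFactory("hdfs"));
        // the FakeFileSystem itself is built on the "task" side, post-deserialization
        System.out.println(shipped.create().scheme);
    }
}
```

The design trade-off in the thread is whether such a factory should also expose a `getScheme()`-style method, or stay a pure creator of file systems.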
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401

## CI report:

* dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080)
* 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499)
* 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355)
* 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854)
* e9c30c1d7e3097774ef019d228069e34ac407852 : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : UNKNOWN
[jira] [Commented] (FLINK-14933) translate "Data Streaming Fault Tolerance" page in to Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981280#comment-16981280 ] Zhangcheng Hu commented on FLINK-14933: --- ok,thx > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349983145 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemFactory.java ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; + +import java.io.IOException; +import java.io.Serializable; +import java.net.URI; + +/** + * A factory to create file systems. + */ +@Internal +public interface FileSystemFactory extends Serializable { Review comment: I don't like to introduce `getScheme` in `FileSystemFactory`. But I am OK to use it too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349982946 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemUtils.java ## @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.api.TableException; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Utils for file system. + */ +public class FileSystemUtils { + + private static final Pattern PARTITION_NAME_PATTERN = Pattern.compile("([^/]+)=([^/]+)"); + + private static final BitSet CHAR_TO_ESCAPE = new BitSet(128); + static { + for (char c = 0; c < ' '; c++) { + CHAR_TO_ESCAPE.set(c); + } + + /* +* ASCII 01-1F are HTTP control characters that need to be escaped. +* \u000A and \u000D are \n and \r, respectively. 
+*/ + char[] clist = new char[] {'\u0001', '\u0002', '\u0003', '\u0004', + '\u0005', '\u0006', '\u0007', '\u0008', '\u0009', '\n', '\u000B', + '\u000C', '\r', '\u000E', '\u000F', '\u0010', '\u0011', '\u0012', + '\u0013', '\u0014', '\u0015', '\u0016', '\u0017', '\u0018', '\u0019', + '\u001A', '\u001B', '\u001C', '\u001D', '\u001E', '\u001F', + '"', '#', '%', '\'', '*', '/', ':', '=', '?', '\\', '\u007F', '{', + '[', ']', '^'}; + + for (char c : clist) { + CHAR_TO_ESCAPE.set(c); + } + } + + private static boolean needsEscaping(char c) { + return c < CHAR_TO_ESCAPE.size() && CHAR_TO_ESCAPE.get(c); + } + + /** +* Make partition path from partition spec. +* +* @param partitionSpec The partition spec. +* @return An escaped, valid partition name. +*/ + public static String generatePartName(LinkedHashMap partitionSpec) { + StringBuilder suffixBuf = new StringBuilder(); + int i = 0; + for (Map.Entry e : partitionSpec.entrySet()) { + if (i > 0) { + suffixBuf.append(Path.SEPARATOR); + } + suffixBuf.append(escapePathName(e.getKey())); + suffixBuf.append('='); + suffixBuf.append(escapePathName(e.getValue())); + i++; + } + suffixBuf.append(Path.SEPARATOR); + return suffixBuf.toString(); + } + + /** +* Escapes a path name. +* @param path The path to escape. +* @return An escaped path name. +*/ + private static String escapePathName(String path) { + + // __DEFAULT_NULL__ is the system default value for null and empty string. Review comment: Wrong comment, I will remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
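The partition-name generation discussed in the diff above can be illustrated with a small standalone sketch. This is a simplified, hypothetical version of `generatePartName` (it deliberately skips the `CHAR_TO_ESCAPE`-based escaping that the real `FileSystemUtils` applies to keys and values):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartNameSketch {

    // Simplified sketch: each key=value pair of the partition spec becomes
    // one directory level, in the spec's insertion order, ending with '/'.
    // The real Flink utility additionally escapes special characters.
    static String generatePartName(LinkedHashMap<String, String> partitionSpec) {
        StringBuilder buf = new StringBuilder();
        for (Map.Entry<String, String> e : partitionSpec.entrySet()) {
            buf.append(e.getKey()).append('=').append(e.getValue()).append('/');
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> spec = new LinkedHashMap<>();
        spec.put("p0", "1");
        spec.put("p1", "2");
        System.out.println(generatePartName(spec)); // p0=1/p1=2/
    }
}
```

A `LinkedHashMap` is used so that the output directory levels follow the declared partition column order.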
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349982669 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemCommitter.java ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.io.Serializable; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +import static org.apache.flink.table.filesystem.FileSystemUtils.searchPartSpecAndPaths; +import static org.apache.flink.table.filesystem.TempFileManager.deleteCheckpoint; +import static org.apache.flink.table.filesystem.TempFileManager.headCheckpoints; +import static org.apache.flink.table.filesystem.TempFileManager.listTaskTemporaryPaths; + +/** + * File system file committer implementation. It move all files to output path from temporary path. 
+ * + * In a checkpoint: + * 1. Every task invokes {@link #createFileManagerAndCleanDir} for initialization; it returns + * a path generator that produces paths for the task's writes, and it cleans the task's temporary path. + * 2. After writing for this checkpoint is done, {@link #commitUpToCheckpoint(long)} must be invoked; + * it moves the temporary files to the real output path. + * + * Batch is a special case of streaming, with only one checkpoint. + * + * Data consistency: + * 1. On task failure: a new task is launched and invokes {@link #createFileManagerAndCleanDir}, + * which cleans the previous temporary files (this simple design makes it easy to delete the + * invalid temporary directory of the task, but it also means the directory layout does not + * support running multiple backup attempts of the same task). + * 2. On job master commit failure during overwrite: this may leave unfinished intermediate + * results, but rerunning the job yields a correct final result (because the + * intermediate results are overwritten). + * 3. On job master commit failure during append: this can lead to inconsistent data. However, + * since the commit action is a single point of execution that only moves files and + * updates metadata, it is fast, so the probability of inconsistency is relatively small. + * + * See: + * {@link TempFileManager}. + * {@link FileSystemLoader}. 
+ */ +@Internal +public class FileSystemCommitter implements Serializable { + + private static final long serialVersionUID = 1L; + + private final FileSystemFactory factory; + private final MetaStoreFactory metaStoreFactory; + private final boolean overwrite; + private final Path tmpPath; + private final LinkedHashMap staticPartitions; + private final int partitionColumnSize; + + public FileSystemCommitter( + FileSystemFactory factory, + MetaStoreFactory metaStoreFactory, + boolean overwrite, + Path tmpPath, + LinkedHashMap staticPartitions, + int partitionColumnSize) { + this.factory = factory; + this.metaStoreFactory = metaStoreFactory; + this.overwrite = overwrite; + this.tmpPath = tmpPath; + this.staticPartitions = staticPartitions; + this.partitionColumnSize = partitionColumnSize; + } + + /** +* For committing job's output after successful batch job completion or one checkpoint finish +* for streaming job. Should move all files to final output paths. +* +* NOTE: According to checkpoint notify mechanism of Flink, checkpoint may fail and be +* abandoned, so this method should commit all checkpoint ids that less than current +* checkpoint id (Includes failure checkpoints). +*/ + public
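The commit semantics quoted above (commit every checkpoint id up to the current one, because checkpoint notifications may be lost) can be sketched as follows. `checkpointsToCommit` is a hypothetical helper for illustration, not part of the actual `FileSystemCommitter` API:

```java
import java.util.ArrayList;
import java.util.List;

public class CommitSketch {

    // Sketch of the "commit up to checkpoint" rule: because checkpoint
    // notifications can be lost or abandoned, committing checkpoint N must
    // also pick up every pending earlier checkpoint directory (cp-1 ... cp-N).
    static List<Long> checkpointsToCommit(List<Long> pendingCheckpointIds, long upToCheckpointId) {
        List<Long> toCommit = new ArrayList<>();
        for (long id : pendingCheckpointIds) {
            if (id <= upToCheckpointId) {
                toCommit.add(id); // in the real committer: move cp-<id>/ files to the output path
            }
        }
        return toCommit;
    }

    public static void main(String[] args) {
        // Checkpoint 2 finishes, but checkpoint 1's notification was lost:
        System.out.println(checkpointsToCommit(List.of(1L, 2L, 3L), 2L)); // [1, 2]
    }
}
```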
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019 ## CI report: * 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666) * 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675) * 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895) * 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192) * aaba84a4188eab863f1504c8bec045a51aa4ff73 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137975262) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-14933) translate "Data Streaming Fault Tolerance" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981278#comment-16981278 ] Jark Wu commented on FLINK-14933: - Hi [~Zhangcheng], I assigned this issue to you. Feel free to open pull request. > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-13943) Provide api to convert flink table to java List (e.g. Table#collect)
[ https://issues.apache.org/jira/browse/FLINK-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981276#comment-16981276 ] Jark Wu commented on FLINK-13943: - Just want to confirm that we don't want to support \{{collect()}} for streaming jobs in this version? > Provide api to convert flink table to java List (e.g. Table#collect) > > > Key: FLINK-13943 > URL: https://issues.apache.org/jira/browse/FLINK-13943 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Jeff Zhang >Assignee: Caizhi Weng >Priority: Major > > It would be nice to convert flink table to java List so that I can do other > data manipulation in client side after execution flink job. For flink > planner, I can convert flink table to DataSet and use DataSet#collect, but > for blink planner, there's no such api. > EDIT from FLINK-14807: > Currently, it is very unconvinient for user to fetch data of flink job unless > specify sink expclitly and then fetch data from this sink via its api (e.g. > write to hdfs sink, then read data from hdfs). However, most of time user > just want to get the data and do whatever processing he want. So it is very > necessary for flink to provide api Table#collect for this purpose. > Other apis such as Table#head, Table#print is also helpful. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349980532 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** + * Manage temporary files for writing files. Use special rules to organize directories + * for temporary files. + * + * Temporary file directory contains the following directory parts: + * 1.temporary base path directory. + * 2.checkpoint id directory. + * 3.task id directory. + * 4.directories to specify partitioning. + * 5.data files. + * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName. 
+ */ +@Internal +public class TempFileManager { + + private static final String CHECKPOINT_DIR_PREFIX = "cp-"; + private static final String TASK_DIR_PREFIX = "task-"; + + private final int taskNumber; + private final long checkpointId; + private final Path taskTmpDir; + + private transient int nameCounter = 0; + + public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) { + checkArgument(checkpointId != -1, "checkpoint id start with 0."); + this.taskNumber = taskNumber; + this.checkpointId = checkpointId; + this.taskTmpDir = new Path( + new Path(temporaryPath, checkpointName(checkpointId)), + TASK_DIR_PREFIX + taskNumber); + } + + public Path getTaskTemporaryPath() { Review comment: This is just for committer to clean task temporary dir, but I think we can move this `clean` to `FileManager`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
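The directory layout described in the `TempFileManager` javadoc above (`/tmp/cp-1/task-0/p0=1/p1=2/fileName`) can be sketched with a minimal path builder. The `cp-`/`task-` prefixes follow the constants quoted in the diff; everything else is illustrative:

```java
public class TempPathSketch {

    // Builds the task-level temporary directory described in the javadoc:
    // <temporaryBasePath>/cp-<checkpointId>/task-<taskNumber>
    // Partition directories and data file names are appended below this level.
    static String taskTmpDir(String temporaryBasePath, long checkpointId, int taskNumber) {
        return temporaryBasePath + "/cp-" + checkpointId + "/task-" + taskNumber;
    }

    public static void main(String[] args) {
        System.out.println(taskTmpDir("/tmp", 1, 0)); // /tmp/cp-1/task-0
    }
}
```

Scoping each checkpoint and task to its own directory is what lets a restarted task simply delete and recreate its directory, as discussed in the committer thread above.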
[jira] [Assigned] (FLINK-14933) translate "Data Streaming Fault Tolerance" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14933: --- Assignee: Zhangcheng Hu > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349980532 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** + * Manage temporary files for writing files. Use special rules to organize directories + * for temporary files. + * + * Temporary file directory contains the following directory parts: + * 1.temporary base path directory. + * 2.checkpoint id directory. + * 3.task id directory. + * 4.directories to specify partitioning. + * 5.data files. + * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName. 
+ */ +@Internal +public class TempFileManager { + + private static final String CHECKPOINT_DIR_PREFIX = "cp-"; + private static final String TASK_DIR_PREFIX = "task-"; + + private final int taskNumber; + private final long checkpointId; + private final Path taskTmpDir; + + private transient int nameCounter = 0; + + public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) { + checkArgument(checkpointId != -1, "checkpoint id start with 0."); + this.taskNumber = taskNumber; + this.checkpointId = checkpointId; + this.taskTmpDir = new Path( + new Path(temporaryPath, checkpointName(checkpointId)), + TASK_DIR_PREFIX + taskNumber); + } + + public Path getTaskTemporaryPath() { Review comment: This is just for committer to clean task temporary dir, but I think we should let it not public. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject
flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject URL: https://github.com/apache/flink/pull/10300#issuecomment-557891612 ## CI report: * dae41b86e4710d0c83d3f764e6196513db26a2c8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137940914) * 95bb102bbc2cb241ab66efbf0870f2a9d9e515d8 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137975257) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10291: [FLINK-14899] [table-planner-blink] Fix unexpected plan when PROCTIME() is defined in query
flinkbot edited a comment on issue #10291: [FLINK-14899] [table-planner-blink] Fix unexpected plan when PROCTIME() is defined in query URL: https://github.com/apache/flink/pull/10291#issuecomment-557488395 ## CI report: * c3082d201afec00fff9e25d46059b3789c9d28d5 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/137741065) * 76a6169e77815b25b5fb265a706ab222d2395d87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137744705) * 365b5e6cc7b3953877838d371d971d8a3cbb3966 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137973852) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] godfreyhe commented on a change in pull request #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
godfreyhe commented on a change in pull request #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#discussion_r349980293 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/physical/batch/EnforceLocalAggRuleBase.scala ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.physical.batch + +import org.apache.flink.table.api.TableException +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.`trait`.FlinkRelDistribution +import org.apache.flink.table.planner.plan.nodes.FlinkConventions +import org.apache.flink.table.planner.plan.nodes.physical.batch.{BatchExecExchange, BatchExecExpand, BatchExecGroupAggregateBase, BatchExecHashAggregate, BatchExecLocalHashAggregate, BatchExecLocalSortAggregate, BatchExecSortAggregate} +import org.apache.flink.table.planner.plan.utils.{AggregateUtil, FlinkRelOptUtil} +import org.apache.flink.table.runtime.types.LogicalTypeDataTypeConverter.fromDataTypeToLogicalType + +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall, RelOptRuleOperand} +import org.apache.calcite.rel.RelNode +import org.apache.calcite.rex.RexUtil +import org.apache.calcite.tools.RelBuilder +import org.apache.calcite.util.Util + +import scala.collection.JavaConversions._ + +/** + * Planner rule that writes one phase aggregate to two phase aggregate, + * when the following conditions are met: + * 1. there is no local aggregate, + * 2. the aggregate has non-empty grouping and two phase aggregate strategy is enabled, + * 3. the input is [[BatchExecExpand]] and there is at least one expand row + * which the columns for grouping are all constant. 
+ */ +abstract class EnforceLocalAggRuleBase( +operand: RelOptRuleOperand, +description: String) + extends RelOptRule(operand, description) + with BatchExecAggRuleBase { + + protected def getBatchExecExpand(call: RelOptRuleCall): BatchExecExpand + + override def matches(call: RelOptRuleCall): Boolean = { +val agg: BatchExecGroupAggregateBase = call.rel(0) +val expand = getBatchExecExpand(call) + +val tableConfig = FlinkRelOptUtil.getTableConfigFromContext(agg) +val aggFunctions = agg.getAggCallToAggFunction.map(_._2).toArray +val enableTwoPhaseAgg = isTwoPhaseAggWorkable(aggFunctions, tableConfig) + +val grouping = agg.getGrouping +// if all group columns in a expand row are constant, this row will be shuffled to +// a single node. (shuffle keys are grouping) +// add local aggregate to greatly reduce the output data +val hasConstantRow = expand.projects.exists { + project => +val groupingColumns = grouping.map(i => project.get(i)) +groupingColumns.forall(RexUtil.isConstant) +} + +grouping.nonEmpty && enableTwoPhaseAgg && hasConstantRow + } + + protected def createLocalAgg( Review comment: actually, the logic of creating `localAgg` can be partially reused, and the logic of creating `globalAgg` can hardly be reused (calling the constructor directly) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
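The motivation for the rule above — when all grouping columns of an expand row are constant, every such row is shuffled to a single node, so a local aggregate should pre-reduce the data first — can be illustrated with a count aggregate split into a local and a global phase. This is a hand-rolled sketch of the two-phase idea, not the planner's actual operators:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseAggSketch {

    // Phase 1 (local aggregate): each task pre-aggregates its own partition,
    // so it ships at most one record per key instead of every input row.
    static Map<String, Long> localCount(List<String> partition) {
        Map<String, Long> acc = new HashMap<>();
        for (String key : partition) {
            acc.merge(key, 1L, Long::sum);
        }
        return acc;
    }

    // Phase 2 (global aggregate): the single downstream task that receives the
    // constant-group records only has to merge the small pre-aggregated maps.
    static Map<String, Long> globalCount(List<Map<String, Long>> localResults) {
        Map<String, Long> acc = new HashMap<>();
        for (Map<String, Long> local : localResults) {
            local.forEach((k, v) -> acc.merge(k, v, Long::sum));
        }
        return acc;
    }

    public static void main(String[] args) {
        Map<String, Long> task0 = localCount(List.of("a", "a", "b"));
        Map<String, Long> task1 = localCount(List.of("a", "b"));
        // Only 4 pre-aggregated records cross the shuffle instead of 5 rows.
        System.out.println(globalCount(List.of(task0, task1)));
    }
}
```

The benefit grows with skew: for a ROLLUP/CUBE row whose grouping columns are all constant, the local phase collapses an arbitrarily large partition into a single record per key before the forced single-node shuffle.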
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349979952 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemLoader.java ## @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.filesystem.MetaStoreFactory.TableMetaStore; +import org.apache.flink.util.Preconditions; + +import java.io.Closeable; +import java.io.IOException; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Optional; + +import static org.apache.flink.table.filesystem.FileSystemUtils.generatePartName; +import static org.apache.flink.table.filesystem.FileSystemUtils.listStatusWithoutHidden; + +/** + * Loader to temporary files to final output path and meta store. According to overwrite, + * the loader will delete the previous data. 
+ * + * This provides two loading interfaces: + * 1. {@link #loadPartition}: loads temporary partitioned files; if a partition is new, + * it is created in the meta store. + * 2. {@link #loadNonPartition}: just renames all files to the final output path. + */ +@Internal +public class FileSystemLoader implements Closeable { Review comment: But I am OK to `PartitionLoader`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14574) flink-s3-fs-hadoop doesn't work with plugins mechanism
[ https://issues.apache.org/jira/browse/FLINK-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] clay updated FLINK-14574:
-
Description: As reported by a user via [mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/No-FileSystem-for-scheme-quot-file-quot-for-S3A-in-and-state-processor-api-in-1-9-td30704.html]:
{noformat}
We've added flink-s3-fs-hadoop library to plugins folder and trying to bootstrap state to S3 using S3A protocol. The following exception happens (unless hadoop library is put to lib folder instead of plugins). Looks like S3A filesystem is trying to use "local" filesystem for temporary files and fails:

java.lang.Exception: Could not write timer service of MapPartition (d2976134f80849779b7a94b7e6218476) (4/4) to checkpoint state stream.
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:466)
	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:89)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:399)
	at org.apache.flink.state.api.output.SnapshotUtils.snapshot(SnapshotUtils.java:59)
	at org.apache.flink.state.api.output.operators.KeyedStateBootstrapOperator.endInput(KeyedStateBootstrapOperator.java:84)
	at org.apache.flink.state.api.output.BoundedStreamTask.performDefaultAction(BoundedStreamTask.java:85)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
	at org.apache.flink.state.api.output.BoundedOneInputStreamTaskRunner.mapPartition(BoundedOneInputStreamTaskRunner.java:76)
	at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
	at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
	at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not open output stream for state backend
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:367)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.flush(FsCheckpointStreamFactory.java:234)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.write(FsCheckpointStreamFactory.java:209)
	at org.apache.flink.runtime.state.NonClosingCheckpointOutputStream.write(NonClosingCheckpointOutputStream.java:61)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:401)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
	at org.apache.flink.util.LinkedOptionalMapSerializer.lambda$writeOptionalMap$0(LinkedOptionalMapSerializer.java:58)
	at org.apache.flink.util.LinkedOptionalMap.forEach(LinkedOptionalMap.java:163)
	at org.apache.flink.util.LinkedOptionalMapSerializer.writeOptionalMap(LinkedOptionalMapSerializer.java:57)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshotData.writeKryoRegistrations(KryoSerializerSnapshotData.java:141)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshotData.writeSnapshotData(KryoSerializerSnapshotData.java:128)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshot.writeSnapshot(KryoSerializerSnapshot.java:72)
	at org.apache.flink.api.common.typeutils.TypeSerializerSnapshot.writeVersionedSnapshot(TypeSerializerSnapshot.java:153)
	at org.apache.flink.streaming.api.operators.InternalTimersSnapshotReaderWriters$InternalTimersSnapshotWriterV2.writeKeyAndNamespaceSerializers(InternalTimersSnapshotReaderWriters.java:199)
	at org.apache.flink.streaming.api.operators.InternalTimersSnapshotReaderWriters$AbstractInternalTimersSnapshotWriter.writeTimersSnapshot(InternalTimersSnapshotReaderWriters.java:117)
	at org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy.write(InternalTimerServiceSerializationProxy.java:101)
	at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.snapshotStateForKeyGroup(InternalTimeServiceManager.java:139)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:462)
	... 14 common frames omitted
Caused by:
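The failure above can be reduced to a classloading sketch: each plugin runs with an isolated classloader, so a filesystem registered only on the main classpath (such as the local "file" scheme) is not visible from inside the plugin. The following standalone illustration is hypothetical — `SchemeRegistrySketch`, `register`, and `lookup` are invented names, not Flink's actual plugin or FileSystem API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a per-classloader filesystem registry.
// A plugin's isolated registry only knows the schemes it registered itself,
// so looking up "file" from inside the plugin fails.
public class SchemeRegistrySketch {

    private final Map<String, String> schemes = new HashMap<>();

    // Register a filesystem implementation name for a URI scheme.
    void register(String scheme, String implName) {
        schemes.put(scheme, implName);
    }

    // Resolve a scheme; unknown schemes fail the same way the report shows.
    String lookup(String scheme) {
        String impl = schemes.get(scheme);
        if (impl == null) {
            throw new IllegalStateException("No FileSystem for scheme \"" + scheme + "\"");
        }
        return impl;
    }
}
```

Under this model, putting the library in `lib/` works because the registration then happens in the main classloader, where the "file" scheme is also visible.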
[jira] [Resolved] (FLINK-14595) Move flink-orc to flink-formats from flink-connectors
[ https://issues.apache.org/jira/browse/FLINK-14595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu resolved FLINK-14595.
-
Resolution: Fixed
1.10.0: 320240e2c412c15c7fa91649a0faa78018e74d86

> Move flink-orc to flink-formats from flink-connectors
> -
>
> Key: FLINK-14595
> URL: https://issues.apache.org/jira/browse/FLINK-14595
> Project: Flink
> Issue Type: Bug
> Components: Connectors / ORC
> Reporter: Jingsong Lee
> Assignee: Jingsong Lee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> We already have a parent module for formats, and the other formats
> (flink-avro, flink-json, flink-parquet, flink-csv, flink-sequence-file)
> have been put under flink-formats. flink-orc is a format too, so we can
> move it to flink-formats.
>
> In theory, there should be no compatibility problem: only the parent module
> needs to be changed, and no other changes are needed.
>
> Discuss thread:
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-flink-orc-to-flink-formats-from-flink-connectors-td34438.html]
> Vote thread:
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Move-flink-orc-to-flink-formats-from-flink-connectors-td34496.html]
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] sunhaibotb commented on a change in pull request #10151: [FLINK-14231] Handle the processing-time timers before closing operator to properly support endInput
sunhaibotb commented on a change in pull request #10151: [FLINK-14231] Handle the processing-time timers before closing operator to properly support endInput URL: https://github.com/apache/flink/pull/10151#discussion_r349979144
## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
##
@@ -205,6 +206,8 @@
 private Long syncSavepointId = null;
+ private final Map, ProcessingTimeServiceImpl> processingTimeServices;

Review comment:
> map value: It's better to specify an interface, not a concrete implementation, to depend on it less.

The methods `quiesce` and `getTimersDoneFutureAfterQuiescing` need to be added to the processing time service, but the `ProcessingTimeService` interface is user-oriented, so we don't want to change it for runtime needs. On the other hand, `StreamTask` already knows about `ProcessingTimeServiceImpl`, because it uses `ProcessingTimeServiceImpl` to build instances in `#getProcessingTimeService`. So it should be acceptable to use the concrete implementation here.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
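The trade-off discussed in this review can be sketched as follows. This is a hypothetical illustration, not Flink's actual class hierarchy — the name `InternalProcessingTimeService` is invented; the point is that runtime-only methods such as `quiesce()` can live on an internal type without widening the user-facing interface:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of keeping runtime-only lifecycle methods off the
// user-facing interface. All names below are invented for illustration.
public class TimerServiceSketch {

    // User-facing interface: stays small and stable.
    interface ProcessingTimeService {
        long getCurrentProcessingTime();
    }

    // Internal extension: carries the runtime-only methods the review mentions.
    interface InternalProcessingTimeService extends ProcessingTimeService {
        void quiesce();
        CompletableFuture<Void> getTimersDoneFutureAfterQuiescing();
    }

    // Concrete implementation the runtime (e.g. a StreamTask) can depend on directly.
    static class ProcessingTimeServiceImpl implements InternalProcessingTimeService {
        private final CompletableFuture<Void> timersDone = new CompletableFuture<>();

        @Override
        public long getCurrentProcessingTime() {
            return System.currentTimeMillis();
        }

        @Override
        public void quiesce() {
            // Stop accepting new timers; here we simply mark completion.
            timersDone.complete(null);
        }

        @Override
        public CompletableFuture<Void> getTimersDoneFutureAfterQuiescing() {
            return timersDone;
        }
    }
}
```

Either approach (internal sub-interface or depending on the concrete class) keeps `ProcessingTimeService` itself unchanged for users.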
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349978969
## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.filesystem;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.FileStatus;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Manage temporary files for writing files. Use special rules to organize directories
+ * for temporary files.
+ *
+ * Temporary file directory contains the following directory parts:
+ * 1. temporary base path directory.
+ * 2. checkpoint id directory.
+ * 3. task id directory.
+ * 4. directories to specify partitioning.
+ * 5. data files.
+ * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName.
+ */
+@Internal
+public class TempFileManager {
+
+	private static final String CHECKPOINT_DIR_PREFIX = "cp-";
+	private static final String TASK_DIR_PREFIX = "task-";
+
+	private final int taskNumber;
+	private final long checkpointId;
+	private final Path taskTmpDir;
+
+	private transient int nameCounter = 0;
+
+	public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) {
+		checkArgument(checkpointId != -1, "checkpoint id start with 0.");
+		this.taskNumber = taskNumber;
+		this.checkpointId = checkpointId;
+		this.taskTmpDir = new Path(
+			new Path(temporaryPath, checkpointName(checkpointId)),
+			TASK_DIR_PREFIX + taskNumber);
+	}
+
+	public Path getTaskTemporaryPath() {
+		return taskTmpDir;
+	}
+
+	/**
+	 * Generate a new path with directories.
+	 */
+	public Path generateTempFile(String... directories) throws Exception {
+		Path parentPath = taskTmpDir;
+		for (String dir : directories) {
+			parentPath = new Path(parentPath, dir);
+		}
+		return new Path(parentPath, newFileName());
+	}
+
+	private String newFileName() {
+		return String.format(
+			checkpointName(checkpointId) + "-" + taskName(taskNumber) + "-file-%d",

Review comment:
Finally, these files will be moved to the final directory, so with this information we can reduce name conflicts.
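Based on the javadoc and the format string in the diff above, the generated layout can be reproduced as a standalone sketch. The helper `tempFilePath` below is invented for illustration, and it assumes `checkpointName` and `taskName` simply prepend "cp-" and "task-" to the ids:

```java
// Hypothetical standalone sketch of the temp-file naming scheme from the
// javadoc above, e.g. /tmp/cp-1/task-0/p0=1/p1=2/cp-1-task-0-file-0
public class TempPathSketch {

    static String tempFilePath(String base, long checkpointId, int taskNumber,
                               int fileCounter, String... partitionDirs) {
        StringBuilder sb = new StringBuilder(base)
                .append("/cp-").append(checkpointId)      // checkpoint id directory
                .append("/task-").append(taskNumber);      // task id directory
        for (String dir : partitionDirs) {                 // partition directories
            sb.append('/').append(dir);
        }
        // File name embeds the checkpoint and task ids, which reduces name
        // conflicts when files are later moved to the final directory.
        sb.append("/cp-").append(checkpointId)
          .append("-task-").append(taskNumber)
          .append("-file-").append(fileCounter);
        return sb.toString();
    }
}
```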
[GitHub] [flink] openinx commented on a change in pull request #10270: [FLINK-14672][sql-client] Make Executor stateful in sql client
openinx commented on a change in pull request #10270: [FLINK-14672][sql-client] Make Executor stateful in sql client URL: https://github.com/apache/flink/pull/10270#discussion_r349978430
## File path: flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/ResultStore.java
##
@@ -63,7 +63,7 @@
 public ResultStore(Configuration flinkConfig) {
 if (env.getExecution().inStreamingMode()) {
 // determine gateway address (and port if possible)
- final InetAddress gatewayAddress = getGatewayAddress(env.getDeployment());
+ InetAddress gatewayAddress = getGatewayAddress(env.getDeployment());

Review comment:
I think it's an unintentional change; will revert it.
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019
## CI report:
* 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666)
* 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675)
* 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895)
* 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192)
* aaba84a4188eab863f1504c8bec045a51aa4ff73 : UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
[GitHub] [flink] klion26 commented on issue #10292: [FLINK-14928][docs] Fix the broken links in documentation of page systemFunctions
klion26 commented on issue #10292: [FLINK-14928][docs] Fix the broken links in documentation of page systemFunctions URL: https://github.com/apache/flink/pull/10292#issuecomment-557970148 @carp84 @wuchong thanks for the review and merging.
[GitHub] [flink] wuchong merged pull request #10277: [FLINK-14595][orc] Move flink-orc to flink-formats from flink-connectors
wuchong merged pull request #10277: [FLINK-14595][orc] Move flink-orc to flink-formats from flink-connectors URL: https://github.com/apache/flink/pull/10277
[jira] [Updated] (FLINK-14712) Improve back-pressure reporting mechanism
[ https://issues.apache.org/jira/browse/FLINK-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-14712:
---
Description:
h4. (1) The current monitoring is heavy-weight.
* Backpressure monitoring works by repeatedly taking stack trace samples of the running tasks.

h4. (2) It is difficult to find out which vertex is the source of backpressure.
* Users need to know both the current vertex's and the upstream network metrics to judge whether the current vertex is the source of backpressure. Today they have to record the relevant information themselves.

h3. Proposed Changes
1. Expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
2. Show the vertex that is the source of backpressure for the job.
3. Expose network metrics in IOMetricsInfo:
* SubTask
** pool usage: outPoolUsage, inputExclusiveBuffersUsage, inputFloatingBuffersUsage.
*** Show if the subtask is not back pressured but is causing backpressure (full input, empty output).
*** By comparing exclusive/floating buffer usage, show whether all channels are back-pressured or only some of them.
** back-pressured: shows whether it is back pressured.
* Vertex
** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg
** back-pressured: shows whether it is back pressured (merged over all its subtasks)

was:
h4. (1) The current monitoring is heavy-weight.
* Backpressure monitoring works by repeatedly taking stack trace samples of the running tasks.

h4. (2) It is difficult to find out which vertex is the source of backpressure.
* Users need to know both the current vertex's and the upstream network metrics to judge whether the current vertex is the source of backpressure. Today they have to record the relevant information themselves.

h3. Proposed Changes
1. Expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
2. Show the vertex that is the source of backpressure for the job.
3. Expose network pool usage in IOMetricsInfo:
# if the subtask is not back pressured but is causing backpressure (full input, empty output)
# by comparing exclusive/floating buffer usage, whether all channels are back-pressured or only some of them
{code:java}
public final class IOMetricsInfo {
    private final float outPoolUsage;
    private final float inputExclusiveBuffersUsage;
    private final float inputFloatingBuffersUsage;
}
{code}
JobDetailsInfo.JobVertexDetailsInfo merges using Math.max. (PS: outPoolUsage comes from the upstream side.)

> Improve back-pressure reporting mechanism
> -
>
> Key: FLINK-14712
> URL: https://issues.apache.org/jira/browse/FLINK-14712
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
> Attachments: image-2019-11-12-14-30-16-130.png
>
> h4. (1) The current monitoring is heavy-weight.
> * Backpressure monitoring works by repeatedly taking stack trace samples
> of the running tasks.
> h4. (2) It is difficult to find out which vertex is the source of
> backpressure.
> * Users need to know both the current vertex's and the upstream network
> metrics to judge whether the current vertex is the source of backpressure.
> Today they have to record the relevant information themselves.
> h3. Proposed Changes
> 1. Expose the new mechanism implemented in FLINK-14472 as an "is
> back-pressured" metric.
> 2. Show the vertex that is the source of backpressure for the job.
> 3. Expose network metrics in IOMetricsInfo:
> * SubTask
> ** pool usage: outPoolUsage, inputExclusiveBuffersUsage,
> inputFloatingBuffersUsage.
> *** Show if the subtask is not back pressured but is causing backpressure
> (full input, empty output).
> *** By comparing exclusive/floating buffer usage, show whether all channels
> are back-pressured or only some of them.
> ** back-pressured: shows whether it is back pressured.
> * Vertex
> ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> ** back-pressured: shows whether it is back pressured (merged over all its
> subtasks)
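The merge step in the proposal above ("JobDetailsInfo.JobVertexDetailsInfo merge use Math.max") can be sketched as a small standalone example. This is a hypothetical illustration, not Flink's actual implementation; it assumes the vertex-level value is simply the maximum over the subtask gauges:

```java
// Hypothetical sketch of merging per-subtask network gauges into a
// vertex-level view with Math.max, as the proposal describes.
public class PoolUsageMerge {

    // Vertex-level pool usage: the worst (highest) usage across subtasks.
    static float mergeMax(float[] subtaskUsages) {
        float max = 0f;
        for (float usage : subtaskUsages) {
            max = Math.max(max, usage);
        }
        return max;
    }

    // Vertex-level back-pressure flag: true if any subtask is back pressured.
    static boolean mergeBackPressured(boolean[] subtaskFlags) {
        for (boolean flag : subtaskFlags) {
            if (flag) {
                return true;
            }
        }
        return false;
    }
}
```

Taking the max (rather than an average) surfaces a single hot subtask that would otherwise be hidden by many idle ones.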
[jira] [Commented] (FLINK-14838) Cleanup the description about container number config option in Scala and python shell doc
[ https://issues.apache.org/jira/browse/FLINK-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981268#comment-16981268 ] vinoyang commented on FLINK-14838:
--
OK, will open a PR soon.

> Cleanup the description about container number config option in Scala and
> python shell doc
> --
>
> Key: FLINK-14838
> URL: https://issues.apache.org/jira/browse/FLINK-14838
> Project: Flink
> Issue Type: Improvement
> Components: Documentation
> Reporter: vinoyang
> Priority: Major
>
> Currently, the config option {{-n}} for Flink on YARN is no longer supported
> since Flink 1.8. FLINK-12362 did the cleanup job for this config option.
> However, the Scala shell and Python docs still contain some description of
> {{-n}}, which may confuse users. This issue is used to track the cleanup
> work.
[jira] [Updated] (FLINK-14815) Expose network metric in IOMetricsInfo
[ https://issues.apache.org/jira/browse/FLINK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-14815:
---
Description:
* SubTask
** pool usage: outPoolUsage, inputExclusiveBuffersUsage, inputFloatingBuffersUsage.
*** Show if the subtask is not back pressured but is causing backpressure (full input, empty output).
*** By comparing exclusive/floating buffer usage, show whether all channels are back-pressured or only some of them.
** back-pressured: shows whether it is back pressured.
* Vertex
** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg
** back-pressured: shows whether it is back pressured (merged over all its subtasks)

was:
* If the subtask is not back pressured but is causing backpressure (full input, empty output)
* By comparing exclusive/floating buffer usage, whether all channels are back-pressured or only some of them
{code:java}
public final class IOMetricsInfo {
    private final float outPoolUsage;
    private final float inputExclusiveBuffersUsage;
    private final float inputFloatingBuffersUsage;
}
{code}
JobDetailsInfo.JobVertexDetailsInfo merges using Math.max. (PS: outPoolUsage comes from the upstream side.)

> Expose network metric in IOMetricsInfo
> --
>
> Key: FLINK-14815
> URL: https://issues.apache.org/jira/browse/FLINK-14815
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
>
> * SubTask
> ** pool usage: outPoolUsage, inputExclusiveBuffersUsage,
> inputFloatingBuffersUsage.
> *** Show if the subtask is not back pressured but is causing backpressure
> (full input, empty output).
> *** By comparing exclusive/floating buffer usage, show whether all channels
> are back-pressured or only some of them.
> ** back-pressured: shows whether it is back pressured.
> * Vertex
> ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> ** back-pressured: shows whether it is back pressured (merged over all its
> subtasks)