[jira] [Assigned] (FLINK-14590) Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
[ https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hequn Cheng reassigned FLINK-14590:
-----------------------------------

Assignee: Hequn Cheng

> Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14590
>                 URL: https://issues.apache.org/jira/browse/FLINK-14590
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>            Reporter: Wei Zhong
>            Assignee: Hequn Cheng
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Assume we enter a flink directory with the following structure:
> {code:java}
> flink/
>   bin/
>     flink
>     pyflink-shell.sh
>     python-gateway-server.sh
>     ...
>   bad_case/
>     word_count.py
>     data.txt
>   lib/...
>   opt/...
> {code}
> And word_count.py contains this piece of code:
> {code:java}
> t_config = TableConfig()
> env = StreamExecutionEnvironment.get_execution_environment()
> t_env = StreamTableEnvironment.create(env, t_config)
> env._j_stream_execution_environment.registerCachedFile("data", "bad_case/data.txt")
> with open("bad_case/data.txt", "r") as f:
>     content = f.read()
> elements = [(word, 1) for word in content.split(" ")]
> t_env.from_elements(elements, ["word", "count"])
> {code}
> Then we enter the "flink" directory and run:
> {code:java}
> bin/flink run -py bad_case/word_count.py
> {code}
> The program fails at the line 'with open("bad_case/data.txt", "r") as f:'.
> This is because the working directory of the Java process is the current directory, while the working directory of the Python process is a temporary directory.
> So there is no problem when a relative path is used in an API call to the Java process, but a relative path used anywhere else, such as in native file access, fails, because the working directory of the Python process has been changed to a temporary directory that is not known to users.
> I think this will cause some confusion for users, especially after we support dependency management. It would be great if we unified the working directory of the Java process and the Python process.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
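Until the working directories are unified, user code can sidestep the failure described above by resolving data files against an explicit base directory instead of the process working directory. A minimal sketch, assuming the job passes the base directory in (the `JOB_BASE_DIR` variable and helper name are hypothetical, not part of any Flink API):

```python
import os


def resolve_data_path(relative_path):
    """Resolve a data file against an explicit base directory, not the
    process working directory (which PyFlink may change to a temp dir)."""
    # Prefer an explicitly provided base directory; fall back to the
    # directory the script file itself lives in.
    base = os.environ.get("JOB_BASE_DIR") or os.path.dirname(os.path.abspath(__file__))
    return os.path.join(base, relative_path)


# Usage inside word_count.py:
# with open(resolve_data_path("data.txt"), "r") as f:
#     content = f.read()
```

Note that if the launcher copies the script itself into a temporary directory, `__file__` points there too, so passing an absolute base directory explicitly is the safer option.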
[jira] [Assigned] (FLINK-14590) Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
[ https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hequn Cheng reassigned FLINK-14590:
-----------------------------------

Assignee: Wei Zhong (was: Hequn Cheng)

> Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
> (The quoted issue description is identical to the previous message and is omitted here.)
[GitHub] [flink] flinkbot edited a comment on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot edited a comment on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558027199

## CI report:

* 0b265d192e2a6024e5817317be0317136208ccaf : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138004313)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526

## CI report:

* 160416cea63f157c0264c24812e2b2eef90e8c1d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137989900)
[jira] [Commented] (FLINK-13590) flink-on-yarn sometimes could create many little files that are xxx-taskmanager-conf.yaml
[ https://issues.apache.org/jira/browse/FLINK-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981350#comment-16981350 ]

Yang Wang commented on FLINK-13590:
-----------------------------------

This Jira has been fixed via [FLINK-13184|https://issues.apache.org/jira/browse/FLINK-13184]: the \{uuid}-taskmanager-conf.yaml files will no longer be created on HDFS; dynamic properties are used instead. The fix has also been picked to release-1.8 and release-1.9. [~shu_wen...@qq.com] could we close this JIRA?

> flink-on-yarn sometimes could create many little files that are xxx-taskmanager-conf.yaml
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-13590
>                 URL: https://issues.apache.org/jira/browse/FLINK-13590
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>            Reporter: shuwenjun
>            Priority: Major
>         Attachments: taskmanager-conf-yaml.png
>
> Both 1.7.2 and 1.8.0 are used, and both can create many small files. These files are the taskmanager configuration files: whenever the flink session applies for a new container, one of these files is created. I don't know why the flink session sometimes applies for containers again and again. Also, when a container is lost, shouldn't it delete its taskmanager-conf.yaml?
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350021027

## File path: flink-python/dev/lint-python.sh

@@ -626,6 +723,12 @@ while getopts "hfi:e:l" arg; do ;; esac done
+# decides whether to skip check stage

Review comment: Add a blank line before this line?
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350021854

## File path: flink-python/dev/lint-python.sh

@@ -303,7 +352,7 @@ function install_environment() {
# step-3 install python environment whcih includes
# 3.5 3.6 3.7
print_function "STEP" "installing python environment..."

Review comment: Move this print into the `if`? Because the installation may not be executed if the check returns false. Same for the other commands.
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350022355

## File path: flink-python/dev/lint-python.sh

@@ -589,16 +659,40 @@ get_all_supported_checks EXCLUDE_CHECKS="" INCLUDE_CHECKS=""
+
+SUPPORTED_INSTALLATION_COMPONENTS=()
+# search all supported install functions and put them into SUPPORTED_INSTALLATION_COMPONENTS array
+get_all_supported_install_components
+
+INSTALLATION_COMPONENTS=()

Review comment: Add a blank line after this line.
[GitHub] [flink] hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
hequn8128 commented on a change in pull request #10103: [FLINK-14506][python][build] Improve the release script for Python API release package
URL: https://github.com/apache/flink/pull/10103#discussion_r350024300

## File path: flink-python/dev/lint-python.sh

@@ -105,6 +123,29 @@ function check_valid_stage() { return 1 }
+function parse_component_args() {
+local REAL_COMPONENTS=()
+for component in ${INSTALLATION_COMPONENTS[@]}; do
+# because all other components depends on conda, the install of conda is
+# required component.
+if [[ "$component" == "basic" ]] || [[ "$component" == "miniconda" ]]; then
+continue
+fi
+if [[ "$component" == "all" ]]; then
+component="environment"
+fi
+if [[ `contains_element "${SUPPORTED_INSTALLATION_COMPONENTS[*]}" "${component}"` = true ]]; then
+REAL_COMPONENTS+=(${component})
+else
+echo "unknown install component ${component}"

Review comment: Also print the components that are supported?
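The suggestion in the review above — listing the supported values alongside the "unknown install component" error — is a common CLI pattern. A sketch of the idea in Python (the actual script is bash; the component names below are illustrative, not the real list):

```python
# Hypothetical list of supported install components, mirroring the
# SUPPORTED_INSTALLATION_COMPONENTS array in lint-python.sh.
SUPPORTED_INSTALLATION_COMPONENTS = ["basic", "miniconda", "environment", "all"]


def parse_component(component):
    """Validate one install component, listing the supported ones on failure."""
    if component not in SUPPORTED_INSTALLATION_COMPONENTS:
        raise ValueError(
            "unknown install component '%s', supported components are: %s"
            % (component, ", ".join(SUPPORTED_INSTALLATION_COMPONENTS)))
    return component
```

The error message now tells the user what to type instead of only what was wrong.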
[jira] [Commented] (FLINK-13758) failed to submit JobGraph when registered hdfs file in DistributedCache
[ https://issues.apache.org/jira/browse/FLINK-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981342#comment-16981342 ]

Arvid Heise commented on FLINK-13758:
-------------------------------------

If FLINK-14908 is indeed a duplicate, then this also affects Flink 1.9 and most likely 1.10.

> failed to submit JobGraph when registered hdfs file in DistributedCache
> -----------------------------------------------------------------------
>
>                 Key: FLINK-13758
>                 URL: https://issues.apache.org/jira/browse/FLINK-13758
>             Project: Flink
>          Issue Type: Bug
>          Components: Command Line Client
>    Affects Versions: 1.6.3, 1.6.4, 1.7.2, 1.8.0, 1.8.1
>            Reporter: luoguohao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When using HDFS files in the DistributedCache, submitting the JobGraph fails and exception stack traces appear in the log file after a while; if the DistributedCache file is a local file, everything works fine.
[GitHub] [flink] flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
URL: https://github.com/apache/flink/pull/10126#issuecomment-551413525

## CI report:

* ea5abdfce3ab26a0a196fd7d73a78de109d71bc3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/135594994)
* 25c2e648e4c72c110ec8519392d085bb88ccd023 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138001762)
[GitHub] [flink] flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149

## CI report:

* afdf3960fae62476f377aff7c27fac4f4779283a : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137987427)
[GitHub] [flink] flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558027199

## CI report:

* 0b265d192e2a6024e5817317be0317136208ccaf : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
URL: https://github.com/apache/flink/pull/10268#issuecomment-555964030

## CI report:

* 72414aad5f654b834205103f23fbbfc3d5466748 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137370738)
* 30d167184e68f6a44fc4f5d58228577c916d63d2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137496415)
* f41e0aaf005a489fc1df5c85511ff632ed9402a7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137522333)
* 29cc047cd4d65ecd9d47606843ee4893f765e8bc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706167)
* 9e6bb01366f7aaa7aacb8f74104213dc9d97ff25 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137729165)
* b0271e6b12a9e951074adbb50d7a7110736d61bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137737105)
* 724494d11181f6185df8de83e825e5b1d636a415 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137741018)
* f1c9a89535ada7e5d3b5a155fe34cac5ce1dc928 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137902784)
* b7b0adea530f8a8273085990cc583128c2e90fc8 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/138001736)
[jira] [Updated] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
[ https://issues.apache.org/jira/browse/FLINK-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14936: Description: A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). Here are two cases that may need this method in near future: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883] was: A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). In the near future, there are two cases that may need this method: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883] > Introduce MemoryManager#computeMemorySize to calculate managed memory of an > operator from a fraction > > > Key: FLINK-14936 > URL: https://issues.apache.org/jira/browse/FLINK-14936 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.10.0 >Reporter: Zhu Zhu >Priority: Major > > A MemoryManager#computeMemorySize(double fraction) is needed to calculate > managed memory from a fraction. > It can be helpful for operators to get the memory size it can reserve and for > further #reserveMemory. (Similar to #computeNumberOfPages). > Here are two cases that may need this method in near future: > 1. 
[Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E]
> 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]
[jira] [Updated] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
[ https://issues.apache.org/jira/browse/FLINK-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-14936:
----------------------------

Description:
A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages). In the near future, there are two cases that may need this method: 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E] 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]

was:
I'd propose to MemoryManager#computeMemorySize(double fraction) which calculates managed memory from a fraction. It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages).

> Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14936
>                 URL: https://issues.apache.org/jira/browse/FLINK-14936
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>
> A MemoryManager#computeMemorySize(double fraction) is needed to calculate managed memory from a fraction.
> It can be helpful for operators to get the memory size it can reserve and for further #reserveMemory. (Similar to #computeNumberOfPages).
> Here are two cases that may need this method in near future:
> 1. [Python operator memory management|https://lists.apache.org/thread.html/dd4dedeb9354c2ee559cd2f15629c719853915b5efb31a0eafee9361@%3Cdev.flink.apache.org%3E]
> 2. [Statebackend memory management|https://issues.apache.org/jira/browse/FLINK-14883]
[GitHub] [flink] zjffdu commented on a change in pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
zjffdu commented on a change in pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List URL: https://github.com/apache/flink/pull/10306#discussion_r350016903 ## File path: flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/table/utils/TableResultUtils.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.utils; + +import org.apache.flink.annotation.Experimental; +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.JobExecutionResult; +import org.apache.flink.api.common.accumulators.Accumulator; +import org.apache.flink.api.common.accumulators.SerializedListAccumulator; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; +import org.apache.flink.api.java.Utils; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.DataStreamSink; +import org.apache.flink.table.api.Table; +import org.apache.flink.table.api.TableEnvironment; +import org.apache.flink.table.api.TableSchema; +import org.apache.flink.table.api.internal.TableImpl; +import org.apache.flink.table.sinks.StreamTableSink; +import org.apache.flink.table.sinks.TableSink; +import org.apache.flink.table.types.DataType; +import org.apache.flink.types.Row; +import org.apache.flink.util.AbstractID; + +import java.util.List; + +/** + * A collection of utilities for fetching table results. + * + * NOTE: Methods in this utility class are experimental and can only be used for demonstration or testing + * small table results. Please DO NOT use them in production or on large tables. + */ +@Experimental +public class TableResultUtils { + + /** +* Convert Flink table to Java list. 
+ *
+ * @param table Flink table to convert
+ * @return Converted Java list
+ */
+ public static List<Row> tableResultToList(Table table) {
+     final TableEnvironment tEnv = ((TableImpl) table).getTableEnvironment();
+
+     final String id = new AbstractID().toString();
+     final TypeSerializer<Row> serializer = table.getSchema().toRowType().createSerializer(new ExecutionConfig());
+     final Utils.CollectHelper<Row> outputFormat = new Utils.CollectHelper<>(id, serializer);
+     final TableResultSink sink = new TableResultSink(table, outputFormat);
+
+     tEnv.registerTableSink("tableResultSink", sink);

Review comment: Hard-coded table sink name? I am afraid it will cause an error when this method is called a second time.
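The reviewer's concern — registering a sink under a fixed name fails on repeated calls — is typically resolved by generating a unique name per invocation, much as the snippet already does for the accumulator id via `AbstractID`. A sketch of the pattern in Python (names are illustrative; the actual fix would be in the Java utility):

```python
import uuid


def unique_sink_name(prefix="tableResultSink"):
    """Generate a collision-free sink name so repeated calls don't clash
    in the catalog of registered sinks."""
    return "%s_%s" % (prefix, uuid.uuid4().hex)


# Each call registers under a fresh name, e.g.:
# t_env.register_table_sink(unique_sink_name(), sink)
```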
[GitHub] [flink] flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
flinkbot commented on issue #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306#issuecomment-558020653

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks

Last check on commit 0b265d192e2a6024e5817317be0317136208ccaf (Mon Nov 25 07:05:49 UTC 2019)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-13943) Provide api to convert flink table to java List (e.g. Table#collect)
[ https://issues.apache.org/jira/browse/FLINK-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-13943:
-----------------------------------

Labels: pull-request-available (was: )

> Provide api to convert flink table to java List (e.g. Table#collect)
> --------------------------------------------------------------------
>
>                 Key: FLINK-13943
>                 URL: https://issues.apache.org/jira/browse/FLINK-13943
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Jeff Zhang
>            Assignee: Caizhi Weng
>            Priority: Major
>              Labels: pull-request-available
>
> It would be nice to convert a flink table to a java List so that I can do other data manipulation on the client side after executing the flink job. With the flink planner I can convert a flink table to a DataSet and use DataSet#collect, but for the blink planner there is no such api.
> EDIT from FLINK-14807:
> Currently it is very inconvenient for users to fetch the data of a flink job unless they specify a sink explicitly and then fetch the data from that sink via its api (e.g. write to an hdfs sink, then read the data back from hdfs). However, most of the time users just want to get the data and do whatever processing they want, so it is very necessary for flink to provide an api like Table#collect for this purpose.
> Other apis such as Table#head and Table#print would also be helpful.
[GitHub] [flink] TsReaper opened a new pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
TsReaper opened a new pull request #10306: [FLINK-13943][table-api] Provide utility method to convert Flink table to Java List
URL: https://github.com/apache/flink/pull/10306

## What is the purpose of the change

Some users would like to convert a Flink table to a Java List for demo or testing purposes. This PR introduces a utility method `TableResultUtils#tableResultToList` to convert Flink tables to lists. As the new utility method is experimental, we are currently not going to introduce a new `Table#collect` API. If users find this utility method satisfying to use, we will then propose a FLIP for `Table#collect`.

## Brief change log

- Introduce `TableResultUtils#tableResultToList`

## Verifying this change

This change added tests and can be verified as follows: run the newly added `TableResultUtilsITCase`.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
[GitHub] [flink] flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
flinkbot edited a comment on issue #10126: [FLINK-14590][python] Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
URL: https://github.com/apache/flink/pull/10126#issuecomment-551413525

## CI report:

* ea5abdfce3ab26a0a196fd7d73a78de109d71bc3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/135594994)
* 25c2e648e4c72c110ec8519392d085bb88ccd023 : UNKNOWN
[jira] [Created] (FLINK-14936) Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction
Zhu Zhu created FLINK-14936: --- Summary: Introduce MemoryManager#computeMemorySize to calculate managed memory of an operator from a fraction Key: FLINK-14936 URL: https://issues.apache.org/jira/browse/FLINK-14936 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Affects Versions: 1.10.0 Reporter: Zhu Zhu I'd propose to introduce MemoryManager#computeMemorySize(double fraction), which calculates the managed memory size from a fraction. It can help operators determine the memory size they can reserve, for a subsequent #reserveMemory call (similar to #computeNumberOfPages). -- This message was sent by Atlassian Jira (v8.3.4#803005)
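The proposed `computeMemorySize(double fraction)` is essentially fraction-of-budget arithmetic, analogous to the existing `#computeNumberOfPages`. A hedged sketch — the class name, field, and validation below are assumptions for illustration, not Flink's actual `MemoryManager` code:

```java
// Illustrative stand-in for the proposed MemoryManager#computeMemorySize:
// derive the number of managed-memory bytes an operator may reserve from
// its fraction of the total managed memory budget.
public class ManagedMemorySketch {

    private final long totalManagedMemoryBytes;

    public ManagedMemorySketch(long totalManagedMemoryBytes) {
        this.totalManagedMemoryBytes = totalManagedMemoryBytes;
    }

    public long computeMemorySize(double fraction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException(
                "fraction must be in (0, 1], but was " + fraction);
        }
        // Round down so the sum of all operators' reservations
        // never exceeds the total budget.
        return (long) Math.floor(totalManagedMemoryBytes * fraction);
    }
}
```

An operator would then pass the returned size to `#reserveMemory`; rounding down keeps concurrent reservations within the overall budget.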
[GitHub] [flink] flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot edited a comment on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526 ## CI report: * 160416cea63f157c0264c24812e2b2eef90e8c1d : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137989900)
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137985445)
[GitHub] [flink] flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
flinkbot edited a comment on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-555964030 ## CI report: * 72414aad5f654b834205103f23fbbfc3d5466748 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137370738) * 30d167184e68f6a44fc4f5d58228577c916d63d2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137496415) * f41e0aaf005a489fc1df5c85511ff632ed9402a7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137522333) * 29cc047cd4d65ecd9d47606843ee4893f765e8bc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706167) * 9e6bb01366f7aaa7aacb8f74104213dc9d97ff25 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137729165) * b0271e6b12a9e951074adbb50d7a7110736d61bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137737105) * 724494d11181f6185df8de83e825e5b1d636a415 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137741018) * f1c9a89535ada7e5d3b5a155fe34cac5ce1dc928 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137902784) * b7b0adea530f8a8273085990cc583128c2e90fc8 : UNKNOWN
[GitHub] [flink] wangyang0918 commented on issue #10193: [FLINK-13938][yarn] Use pre-uploaded flink binary to accelerate flink submission
wangyang0918 commented on issue #10193: [FLINK-13938][yarn] Use pre-uploaded flink binary to accelerate flink submission URL: https://github.com/apache/flink/pull/10193#issuecomment-558013369 @walterddr Your assumption 1 is right. For assumption 2, the pre-uploaded flink binary is located at a generic location on any kind of DFS (usually HDFS), not in the Yarn distributed cache. We register the pre-uploaded flink jars as Yarn public cache so that they can be shared by different applications, and they will not be downloaded every time. For #10187, @TisonKun will create a separate JIRA. It focuses on supporting the shipping of directories located on HDFS.
[jira] [Commented] (FLINK-14930) OSS Filesystem Uses Wrong Shading Prefix
[ https://issues.apache.org/jira/browse/FLINK-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981322#comment-16981322 ] Konstantin Knauf commented on FLINK-14930: -- [~chesnay] Could you assign me to this? > OSS Filesystem Uses Wrong Shading Prefix > > > Key: FLINK-14930 > URL: https://issues.apache.org/jira/browse/FLINK-14930 > Project: Flink > Issue Type: Bug > Components: FileSystems >Affects Versions: 1.9.1 >Reporter: Konstantin Knauf >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The relevant classes (CredentialsProviders) are relocated to > {{org.apache.flink.fs.osshadoop.shaded.}} not > {{org.apache.flink.fs.shaded.hadoop3.}}.
[jira] [Assigned] (FLINK-14552) Enable partition statistics in blink planner
[ https://issues.apache.org/jira/browse/FLINK-14552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14552: --- Assignee: Jingsong Lee > Enable partition statistics in blink planner > > > Key: FLINK-14552 > URL: https://issues.apache.org/jira/browse/FLINK-14552 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > > We need to update statistics after partition pruning in > PushPartitionIntoTableSourceScanRule.
[jira] [Assigned] (FLINK-14543) Support partition for temporary table
[ https://issues.apache.org/jira/browse/FLINK-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14543: --- Assignee: Jingsong Lee > Support partition for temporary table > - > > Key: FLINK-14543 > URL: https://issues.apache.org/jira/browse/FLINK-14543 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h >
[GitHub] [flink] flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558011526 ## CI report: * 160416cea63f157c0264c24812e2b2eef90e8c1d : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot edited a comment on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149 ## CI report: * afdf3960fae62476f377aff7c27fac4f4779283a : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137987427)
[GitHub] [flink] wuchong commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
wuchong commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558010632 I agree with @JingsongLi . We can add some tests for high-precision timestamps as a group key, as a distinct key, and as a time attribute.
[GitHub] [flink] JingsongLi commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
JingsongLi commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558009679 Can you add more tests for timestamp types with precision greater than 3? For example, in batch mode: shuffle with a timestamp key, join with a timestamp key, aggregate with a timestamp key, and an aggregate function applied to a timestamp field. I am not 100% sure these scenarios work. @wuchong What do you think about streaming mode?
[GitHub] [flink] JingsongLi commented on a change in pull request #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
JingsongLi commented on a change in pull request #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#discussion_r350004855 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/codegen/CodeGenUtils.scala ## @@ -590,8 +597,7 @@ object CodeGenUtils { case DOUBLE => s"$arrayTerm.setNullDouble($index)" case TIME_WITHOUT_TIME_ZONE => s"$arrayTerm.setNullInt($index)" case DATE => s"$arrayTerm.setNullInt($index)" -case TIMESTAMP_WITHOUT_TIME_ZONE | - TIMESTAMP_WITH_LOCAL_TIME_ZONE => s"$arrayTerm.setNullLong($index)" +case TIMESTAMP_WITH_LOCAL_TIME_ZONE => s"$arrayTerm.setNullLong($index)" Review comment: Yes, it is what I mean.
[GitHub] [flink] docete commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
docete commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558008007 > Looks like your base is broken, rebase to master to let the tests pass? OK. Would rebase to master soon.
[GitHub] [flink] KurtYoung commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner
KurtYoung commented on issue #10268: [Flink-14599][table-planner-blink] Support precision of TimestampType in blink planner URL: https://github.com/apache/flink/pull/10268#issuecomment-558006414 Looks like your base is broken, rebase to master to let the tests pass?
[GitHub] [flink] klion26 commented on a change in pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 commented on a change in pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#discussion_r350003208 ## File path: docs/ops/state/checkpoints.md ## @@ -63,6 +63,24 @@ files. The meta data file and data files are stored in the directory that is configured via `state.checkpoints.dir` in the configuration files, and also can be specified per job in the code.

+The current checkpoint directory layout, which was introduced by FLINK-8531, is as follows:
+
+{% highlight yaml %}
+/user-defined-checkpoint-dir
+|
++ --shared/
++ --taskowned/
++ --chk-1/
++ --chk-2/
++ --chk-3/
+...
+{% endhighlight %}
+
+The **SHARED** directory is for state that is possibly part of multiple checkpoints, **TASKOWNED** is for state that must never be dropped by the JobManager, and **EXCLUSIVE** is for state that belongs to one checkpoint only.

Review comment: The description order follows the order of the checkpoint directory layout above.
[GitHub] [flink] flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
flinkbot commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558005628 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 160416cea63f157c0264c24812e2b2eef90e8c1d (Mon Nov 25 06:08:20 UTC 2019) ✅no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r349973287 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); Review comment: originFIle -> originFile
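The core transformation the `AnswerFormatter` above performs — once the vim recipe has normalized tabs to spaces — is turning whitespace-separated answer-set rows into '|'-delimited ones. A minimal sketch of that step (the class and method names are illustrative, not the tool's actual API; the real formatter additionally handles fixed-width columns and header lines):

```java
// Collapse runs of whitespace in an answer-set row and rejoin the columns
// with the '|' delimiter used for TPC-DS result comparison.
public class DelimiterSketch {

    public static String toBarDelimited(String line) {
        // trim() first so leading/trailing padding does not produce empty columns
        return String.join("|", line.trim().split("\\s+"));
    }
}
```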
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r349973626 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); + File destFile = new File(destDir + FILE_SEPARATOR +
[GitHub] [flink] KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries
KurtYoung commented on a change in pull request #10239: [FLINK-11491][end2end] Support all TPC-DS queries URL: https://github.com/apache/flink/pull/10239#discussion_r34496 ## File path: flink-end-to-end-tests/flink-tpcds-test/src/main/java/org/apache/flink/table/tpcds/utils/AnswerFormatter.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.tpcds.utils; + +import org.apache.flink.api.java.utils.ParameterTool; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileReader; +import java.io.FileWriter; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Answer set format tool class. convert delimiter from spaces or tabs to bar('|') in TPC-DS answer set. + * before convert, need to format TPC-DS result as following: + * 1. split answer set which has multi query results to multi answer set, includes query14, 23, 24, 39. + * 2. replace tabs by spaces in answer set by vim. + * (1) cd answer_set directory + * (2) vim 1.ans with command model, + * :set ts=8 + * :set noexpandtab + * :%retab! + * :args ./*.ans + * :argdo %retab! 
|update + * (3) save and quit vim. + */ +public class AnswerFormatter { + + private static final int SPACE_BETWEEN_COL = 1; + private static final String RESULT_HEAD_STRING_BAR = "|"; + private static final String RESULT_HEAD_STRING_DASH = "--"; + private static final String RESULT_HEAD_STRING_SPACE = " "; + private static final String COL_DELIMITER = "|"; + private static final String ANSWER_FILE_SUFFIX = ".ans"; + private static final String REGEX_SPLIT_BAR = "\\|"; + private static final String FILE_SEPARATOR = "/"; + + /** +* 1.flink keeps NULLS_FIRST in ASC order, keeps NULLS_LAST in DESC order, +* choose corresponding answer set file here. +* 2.for query 8、14a、18、70、77, decimal precision of answer set is too low +* and unreasonable, compare result with result from SQL server, they can +* strictly match. +*/ + private static final List ORIGIN_ANSWER_FILE = Arrays.asList( + "1", "2", "3", "4", "5_NULLS_FIRST", "6_NULLS_FIRST", "7", "8_SQL_SERVER", "9", "10", + "11", "12", "13", "14a_SQL_SERVER", "14b_NULLS_FIRST", "15_NULLS_FIRST", "16", + "17", "18_SQL_SERVER", "19", "20_NULLS_FIRST", "21_NULLS_FIRST", "22_NULLS_FIRST", + "23a_NULLS_FIRST", "23b_NULLS_FIRST", "24a", "24b", "25", "26", "27_NULLS_FIRST", + "28", "29", "30", "31", "32", "33", "34_NULLS_FIRST", "35_NULLS_FIRST", "36_NULLS_FIRST", + "37", "38", "39a", "39b", "40", "41", "42", "43", "44", "45", "46_NULLS_FIRST", + "47", "48", "49", "50", "51", "52", "53", "54", "55", "56_NULLS_FIRST", "57", + "58", "59", "60", "61", "62_NULLS_FIRST", "63", "64", "65_NULLS_FIRST", "66_NULLS_FIRST", + "67_NULLS_FIRST", "68_NULLS_FIRST", "69", "70_SQL_SERVER", "71_NULLS_LAST", "72_NULLS_FIRST", "73", + "74", "75", "76_NULLS_FIRST", "77_SQL_SERVER", "78", "79_NULLS_FIRST", "80_NULLS_FIRST", + "81", "82", "83", "84", "85", "86_NULLS_FIRST", "87", "88", "89", "90", + "91", "92", "93_NULLS_FIRST", "94", "95", "96", "97", "98_NULLS_FIRST", "99_NULLS_FIRST" + ); + + public static void main(String[] args) throws Exception { + 
ParameterTool params = ParameterTool.fromArgs(args); + String originDir = params.getRequired("originDir"); + String destDir = params.getRequired("destDir"); + for (int i = 0; i < ORIGIN_ANSWER_FILE.size(); i++) { + String file = ORIGIN_ANSWER_FILE.get(i); + String originFileName = file + ANSWER_FILE_SUFFIX; + String destFileName = file.split("_")[0] + ANSWER_FILE_SUFFIX; + File originFIle = new File(originDir + FILE_SEPARATOR + originFileName); + File destFile = new File(destDir + FILE_SEPARATOR +
[GitHub] [flink] klion26 commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 commented on issue #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305#issuecomment-558005061 cc @pnowojski cc @wuchong for the translated Chinese version
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137983497) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14892) Add documentation for checkpoint directory layout
[ https://issues.apache.org/jira/browse/FLINK-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14892: --- Labels: pull-request-available (was: ) > Add documentation for checkpoint directory layout > - > > Key: FLINK-14892 > URL: https://issues.apache.org/jira/browse/FLINK-14892 > Project: Flink > Issue Type: Improvement > Components: Documentation, Runtime / Checkpointing > Reporter: Congxian Qiu(klion26) > Assignee: Congxian Qiu(klion26) > Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > > In FLINK-8531, we changed the checkpoint directory layout to > {code:java} > /user-defined-checkpoint-dir > | > + --shared/ > + --taskowned/ > + --chk-1/ > + --chk-2/ > + --chk-3/ > ... > {code} > However, the directory layout is not currently described in the documentation, and I found > some users confused about it, such as [1][2], so I propose to add a > description of the checkpoint directory layout to the documentation, maybe > on the page {{checkpoints#DirectoryStructure}}[3] > [1] > [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-checkpointing-behavior-td30749.html#a30751] > [2] > [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Apache-Flink-Operator-name-and-uuid-best-practices-td31031.html] > [3] > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/checkpoints.html#directory-structure] -- This message was sent by Atlassian Jira (v8.3.4#803005)
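The layout quoted in the ticket can be illustrated with a small self-contained sketch (hypothetical helper, not Flink code) that materializes the described structure under a user-defined checkpoint directory:

```python
import os
import tempfile

def create_checkpoint_layout(checkpoint_dir, num_checkpoints=3):
    """Create the directory layout described in FLINK-8531:
    shared/ and taskowned/, plus one chk-<id>/ dir per retained checkpoint."""
    for sub in ("shared", "taskowned"):
        os.makedirs(os.path.join(checkpoint_dir, sub), exist_ok=True)
    for i in range(1, num_checkpoints + 1):
        os.makedirs(os.path.join(checkpoint_dir, f"chk-{i}"), exist_ok=True)
    return sorted(os.listdir(checkpoint_dir))

root = tempfile.mkdtemp()
print(create_checkpoint_layout(root))
# ['chk-1', 'chk-2', 'chk-3', 'shared', 'taskowned']
```

In the real layout, `shared/` holds state that may be referenced by multiple checkpoints (incremental checkpoints), `taskowned/` holds state owned by tasks rather than the JobManager, and each `chk-<id>/` holds the exclusive state of one checkpoint.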
[GitHub] [flink] wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r350002500 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,208 @@ + + + +http://maven.apache.org/POM/4.0.0; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd;> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 + + + + + + + + org.apache.flink + flink-clients_${scala.binary.version} + ${project.version} + provided + + + + org.apache.flink + flink-runtime_${scala.binary.version} + ${project.version} + provided + + + + io.fabric8 + kubernetes-client + ${kubernetes.client.version} + + + + com.squareup.okhttp3 + okhttp + + + com.fasterxml.jackson.core + jackson-core + + + com.fasterxml.jackson.core + jackson-databind + Review comment: There are multiple transitive dependencies on `jackson-core` and `jackson-databind` from `kubernetes-client`, so we just exclude them and use a fixed version. This avoids a dependency convergence check failure. For `jackson-dataformat-yaml`, there is only one dependency and it will also be shaded, so I think we do not need to exclude it or add a direct dependency.
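The review comment above is about dependency convergence: `kubernetes-client` pulls in `jackson-core`/`jackson-databind` transitively at more than one version, so the PR excludes them and declares a single pinned version. As a rough model of that resolution strategy (hypothetical data and function, not the real Flink or Maven machinery), "exclude and pin" replaces whatever transitive versions exist with one fixed choice:

```python
def resolve_with_pin(transitive_versions, pins):
    """Model of 'exclude the transitive dep, declare a fixed version':
    any artifact listed in `pins` resolves to the pinned version;
    otherwise convergence requires all transitive versions to agree."""
    resolved = {}
    for artifact, versions in transitive_versions.items():
        if artifact in pins:
            resolved[artifact] = pins[artifact]          # pinned: conflict ignored
        elif len(set(versions)) == 1:
            resolved[artifact] = versions[0]             # already convergent
        else:
            raise ValueError(f"convergence failure for {artifact}: {sorted(set(versions))}")
    return resolved

deps = {"jackson-core": ["2.9.8", "2.10.1"], "okhttp": ["3.12.0"]}
print(resolve_with_pin(deps, {"jackson-core": "2.10.1"}))
# {'jackson-core': '2.10.1', 'okhttp': '3.12.0'}
```

Without the pin, the conflicting `jackson-core` versions would fail the convergence check, which mirrors why the exclusion plus a direct dependency is used in the pom.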
[GitHub] [flink] klion26 opened a new pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout
klion26 opened a new pull request #10305: [FLINK-14892][docs] Add documentation for checkpoint directory layout URL: https://github.com/apache/flink/pull/10305 ## What is the purpose of the change In FLINK-8531, we changed the checkpoint directory layout to ``` /user-defined-checkpoint-dir | + --shared/ + --taskowned/ + --chk-1/ + --chk-2/ + --chk-3/ ... ``` However, the directory layout is not currently described in the documentation, and some users on the mailing list were confused about it; this PR adds a description of the checkpoint directory layout to the documentation. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable)
[GitHub] [flink] flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558004149 ## CI report: * afdf3960fae62476f377aff7c27fac4f4779283a : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137985445)
[GitHub] [flink] flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
flinkbot commented on issue #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304#issuecomment-558002278 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit afdf3960fae62476f377aff7c27fac4f4779283a (Mon Nov 25 05:54:50 UTC 2019) **Warnings:** * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-14838).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. 
For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-14838) Cleanup the description about container number config option in Scala and python shell doc
[ https://issues.apache.org/jira/browse/FLINK-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14838: --- Labels: pull-request-available (was: ) > Cleanup the description about container number config option in Scala and > python shell doc > -- > > Key: FLINK-14838 > URL: https://issues.apache.org/jira/browse/FLINK-14838 > Project: Flink > Issue Type: Improvement > Components: Documentation > Reporter: vinoyang > Priority: Major > Labels: pull-request-available > > Currently, the config option {{-n}} for Flink on Yarn has not been supported > since Flink 1.8. FLINK-12362 did the cleanup job for this config option. > However, the Scala shell and Python docs still contain some description of > {{-n}}, which may confuse users. This issue is used to track the cleanup > work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] yanghua opened a new pull request #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc
yanghua opened a new pull request #10304: [FLINK-14838] Cleanup the description about container number config option in Scala and python shell doc URL: https://github.com/apache/flink/pull/10304 ## What is the purpose of the change *This pull request cleans up the description of the container number config option in the Scala and Python shell docs* ## Brief change log - *Cleanup the description about container number config option in Scala and python shell doc* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / **not documented**)
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137983497)
[GitHub] [flink] flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
flinkbot edited a comment on issue #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#issuecomment-557762350 ## CI report: * 7c5e46d1f794cb01be5088f5f5d29aec5850bf7f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137872169) * 9d98382a4a56dfb2df7cc38cac34db0b68a27b93 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137873009) * 6942e402272d95e745e26cb6409dbdb2e0a4496e : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137979329)
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401 ## CI report: * dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080) * 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499) * 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355) * 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854) * e9c30c1d7e3097774ef019d228069e34ac407852 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137979314)
[GitHub] [flink] flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
flinkbot edited a comment on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-539910284 ## CI report: * e66114b1aa73a82b4c6bcf5c3a16baedb598f8c3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131098600) * b7887760a3c3d28ca88eb31800ebd61084a520fc : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131249622) * 19ff8f1384bf24b469fa6cac0566d603a332b31d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133329923) * 98c83564a56ddcc30095e83458384a106170443c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/133770151) * 501f1c3266693d0cbb1b8b1c0195f354756b7526 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134385291) * ae434f5b0a094cbe4dda971dd0d69d7264b50155 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134525931) * 2ab2896c8b1a1334dbb097d4a0c89cffdbdbbca0 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134804805) * ebcdd089caed98b9fef94445781c06c6689fbe60 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134806371) * a818c7d8c5c08f1498533a85165f93abda769f50 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134807944) * 9084a7edd3b9e0553c396154ef96dca15b580ffd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/134814001) * 74a10525ee8fde96179cba59d26a8c2bf9698160 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134816617) * 22f56474fd27ec1cbe79f962607bc5bddfb6e067 : UNKNOWN * 5b35025a2848de85628ac2071f4fc64c02b557a1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/13482) * 1a42be32e9aadf2328d7e4c2ee7487621e835658 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/134835537) * 15d9f43362e702dd1b857b2e91d5c8b9812fe160 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134839586) * 300540862c55ca11b467191389463ebf025aebc4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134999837) * 
bc77f17dab4013a5f2ba82d9d9e73b4b9e1be8ac : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135224808) * 0ea64d42dca7fb6f981ee73ebdde11ff93d1e6bd : UNKNOWN
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349991863 ## File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java ## @@ -469,18 +475,57 @@ public void sqlUpdate(String stmt) { createTableOperation.getCatalogTable(), createTableOperation.getTableIdentifier(), createTableOperation.isIgnoreIfExists()); + } else if (operation instanceof CreateDatabaseOperation) { + CreateDatabaseOperation createDatabaseOperation = (CreateDatabaseOperation) operation; + catalogManager.createDatabase(createDatabaseOperation.getCatalogName(), + createDatabaseOperation.getDatabaseName(), + createDatabaseOperation.getCatalogDatabase(), + createDatabaseOperation.isIgnoreIfExists(), + false); } else if (operation instanceof DropTableOperation) { DropTableOperation dropTableOperation = (DropTableOperation) operation; catalogManager.dropTable( dropTableOperation.getTableIdentifier(), dropTableOperation.isIfExists()); - } else if (operation instanceof UseCatalogOperation) { - UseCatalogOperation useCatalogOperation = (UseCatalogOperation) operation; - catalogManager.setCurrentCatalog(useCatalogOperation.getCatalogName()); + } else if (operation instanceof DropDatabaseOperation) { + DropDatabaseOperation dropDatabaseOperation = (DropDatabaseOperation) operation; + catalogManager.dropDatabase( + dropDatabaseOperation.getCatalogName(), + dropDatabaseOperation.getDatabaseName(), + dropDatabaseOperation.isIfExists(), + dropDatabaseOperation.isRestrict(), + false); + } else if (operation instanceof AlterDatabaseOperation) { + AlterDatabaseOperation alterDatabaseOperation = (AlterDatabaseOperation) operation; + catalogManager.alterDatabase( + alterDatabaseOperation.getCatalogName(), + alterDatabaseOperation.getDatabaseName(), + 
alterDatabaseOperation.getCatalogDatabase(), + false); + } else if (operation instanceof UseOperation) { + applyUseOperation((UseOperation) operation); } else { throw new TableException( "Unsupported SQL query! sqlUpdate() only accepts a single SQL statements of " + - "type INSERT, CREATE TABLE, DROP TABLE, USE CATALOG"); + "type INSERT, CREATE TABLE, DROP TABLE, USE CATALOG, USE [catalog.]database, " + + "CREATE DATABASE, DROP DATABASE, ALTER DATABASE"); Review comment: Yes, but for now we can't infer the type of an unknown operation. We can do a refactor later.
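The diff above extends an instanceof-based dispatch chain in `sqlUpdate`. A toy model of that pattern (class names mirror the Java ones but this is an illustrative sketch, not Flink code) shows how each new operation type gets its own branch and anything unrecognized falls through to the error:

```python
class Operation: pass
class CreateDatabaseOperation(Operation): pass
class DropDatabaseOperation(Operation): pass
class AlterDatabaseOperation(Operation): pass

def dispatch(op):
    # Mirrors the chain of instanceof checks in TableEnvironmentImpl.sqlUpdate:
    # one branch per supported operation, catch-all error at the end.
    if isinstance(op, CreateDatabaseOperation):
        return "create database"
    if isinstance(op, DropDatabaseOperation):
        return "drop database"
    if isinstance(op, AlterDatabaseOperation):
        return "alter database"
    raise TypeError("Unsupported SQL query!")

print(dispatch(CreateDatabaseOperation()))  # create database
```

The reviewer's point about "unknown" operations corresponds to the final branch: the chain can only name the types it already knows, so the error message has to be updated by hand each time a branch is added, which is why a later refactor is suggested.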
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349991483 ## File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/CatalogManager.java ## @@ -455,6 +457,94 @@ public void createTable(CatalogBaseTable table, ObjectIdentifier objectIdentifie "CreateTable"); } + /** +* Creates a database in a given fully qualified path. +* @param catalogName +* @param databaseName +* @param database +* @param ignoreIfExists If false exception will be thrown if a database exists in the given path. +*/ + public void createDatabase(String catalogName, + String databaseName, + CatalogDatabase database, + boolean ignoreIfExists, + boolean ignoreNoCatalog) { Review comment: It provides the ability to ignore the exception if there is no corresponding catalog, matching the behavior of the other catalog methods. I think we can keep it so that it's more convenient to use later.
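The two flags discussed here have distinct scopes: `ignoreIfExists` suppresses the error when the database already exists, while `ignoreNoCatalog` suppresses the error when the catalog itself is missing. A minimal sketch of those semantics (a hypothetical simplification, not the actual `CatalogManager` implementation):

```python
def create_database(catalogs, catalog_name, db_name, ignore_if_exists, ignore_no_catalog):
    """catalogs: dict mapping catalog name -> set of database names."""
    if catalog_name not in catalogs:
        if ignore_no_catalog:
            return  # silently skip when the catalog is missing
        raise KeyError(f"no catalog named {catalog_name}")
    if db_name in catalogs[catalog_name]:
        if ignore_if_exists:
            return  # database already there, treated as success
        raise ValueError(f"database {db_name} already exists")
    catalogs[catalog_name].add(db_name)

cats = {"hive": {"default"}}
create_database(cats, "hive", "sales", ignore_if_exists=False, ignore_no_catalog=False)
print(sorted(cats["hive"]))  # ['default', 'sales']
```

This also illustrates the reviewer's rationale: a caller who is not sure the catalog exists can pass `ignore_no_catalog=True` instead of wrapping the call in its own existence check.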
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019 ## CI report: * 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666) * 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675) * 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895) * 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192) * aaba84a4188eab863f1504c8bec045a51aa4ff73 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137975262)
[GitHub] [flink] JingsongLi commented on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on issue #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#issuecomment-557985620 @KurtYoung @lirui-apache Hope you take a look again.
[GitHub] [flink] flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject
flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject URL: https://github.com/apache/flink/pull/10300#issuecomment-557891612 ## CI report: * dae41b86e4710d0c83d3f764e6196513db26a2c8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137940914) * 95bb102bbc2cb241ab66efbf0870f2a9d9e515d8 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137975257)
[GitHub] [flink] zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner
zjuwangg commented on a change in pull request #10296: [FLINK-14691][table]Add use/create/drop/alter database operation and support it in flink/blink planner URL: https://github.com/apache/flink/pull/10296#discussion_r349987842 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/client/HiveMetastoreClientWrapper.java ## @@ -149,6 +149,11 @@ public void dropDatabase(String name, boolean deleteData, boolean ignoreIfNotExi client.dropDatabase(name, deleteData, ignoreIfNotExists); } + public void dropDatabase(String name, boolean deleteData, boolean ignoreIfNotExists, boolean cascade) + throws NoSuchObjectException, InvalidOperationException, MetaException, TException { + client.dropDatabase(name, deleteData, ignoreIfNotExists, cascade); Review comment: Yes, I checked Hive versions in the range [1.0.0, 3.1.2]; all of them support this method.
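The `cascade` flag being wrapped here controls whether a database that still contains objects may be dropped: without cascade the drop is rejected, with cascade the contents go too. A minimal model of those semantics (a hypothetical sketch, not the Hive client itself):

```python
def drop_database(databases, name, ignore_if_not_exists, cascade):
    """databases: dict mapping database name -> list of tables it contains."""
    if name not in databases:
        if ignore_if_not_exists:
            return  # missing database tolerated when the flag is set
        raise KeyError(f"no database named {name}")
    if databases[name] and not cascade:
        raise ValueError(f"database {name} is not empty; use cascade to drop it")
    del databases[name]  # cascade drops the database together with its tables

dbs = {"sales": ["orders"], "tmp": []}
drop_database(dbs, "tmp", ignore_if_not_exists=False, cascade=False)
print(sorted(dbs))  # ['sales']
```

The non-cascade path corresponds to the RESTRICT behavior referenced elsewhere in this PR's `DropDatabaseOperation` (`isRestrict`): an empty database drops either way, a non-empty one only with cascade.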
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137979329)
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401

## CI report:

* dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080)
* 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499)
* 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355)
* 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854)
* e9c30c1d7e3097774ef019d228069e34ac407852 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137979314)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349986777 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java ## @@ -178,27 +171,6 @@ public void setStaticPartition(Map partitionSpec) { } } - private void validatePartitionSpec() { Review comment: The planner already verifies it. It should be verified by the framework.
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349986703 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java ## @@ -86,46 +83,36 @@ public HiveTableSink(JobConf jobConf, ObjectPath tablePath, CatalogTable table) @Override public OutputFormat getOutputFormat() { - List partitionColumns = getPartitionFieldNames(); - boolean isPartitioned = partitionColumns != null && !partitionColumns.isEmpty(); - boolean isDynamicPartition = isPartitioned && partitionColumns.size() > staticPartitionSpec.size(); + String[] partitionColumns = getPartitionFieldNames().toArray(new String[0]); String dbName = tablePath.getDatabaseName(); String tableName = tablePath.getObjectName(); try (HiveMetastoreClientWrapper client = HiveMetastoreClientFactory.create(new HiveConf(jobConf, HiveConf.class), hiveVersion)) { Table table = client.getTable(dbName, tableName); StorageDescriptor sd = table.getSd(); - // here we use the sdLocation to store the output path of the job, which is always a staging dir - String sdLocation = sd.getLocation(); - HiveTablePartition hiveTablePartition; - if (isPartitioned) { - validatePartitionSpec(); - if (isDynamicPartition) { - List path = new ArrayList<>(2); - path.add(sd.getLocation()); - if (!staticPartitionSpec.isEmpty()) { - path.add(Warehouse.makePartName(staticPartitionSpec, false)); - } - sdLocation = String.join(Path.SEPARATOR, path); - } else { - List partitions = client.listPartitions(dbName, tableName, - new ArrayList<>(staticPartitionSpec.values()), (short) 1); - sdLocation = !partitions.isEmpty() ? 
partitions.get(0).getSd().getLocation() : - sd.getLocation() + Path.SEPARATOR + Warehouse.makePartName(staticPartitionSpec, true); - } - - sd.setLocation(toStagingDir(sdLocation, jobConf)); - hiveTablePartition = new HiveTablePartition(sd, new LinkedHashMap<>(staticPartitionSpec)); - } else { - sd.setLocation(toStagingDir(sdLocation, jobConf)); - hiveTablePartition = new HiveTablePartition(sd, null); - } - return new HiveTableOutputFormat( - jobConf, - tablePath, - catalogTable, - hiveTablePartition, - HiveReflectionUtils.getTableMetadata(hiveShim, table), - overwrite); + + FileSystemOutputFormat.Builder builder = new FileSystemOutputFormat.Builder<>(); + builder.setColumnNames(tableSchema.getFieldNames()); + builder.setDefaultPartName(jobConf.get(HiveConf.ConfVars.DEFAULTPARTITIONNAME.varname, + HiveConf.ConfVars.DEFAULTPARTITIONNAME.defaultStrVal)); + builder.setDynamicGrouped(dynamicGrouping); + builder.setPartitionColumns(partitionColumns); + builder.setFileSystemFactory(new HiveFileSystemFactory(jobConf)); + builder.setFormatFactory(new HiveOutputFormatFactory( + jobConf, + sd.getOutputFormat(), + sd.getSerdeInfo(), + tableSchema, + partitionColumns, + HiveReflectionUtils.getTableMetadata(hiveShim, table), + hiveVersion)); + builder.setMetaStoreFactory( + new HiveMetaStoreFactory(jobConf, hiveVersion, dbName, tableName)); + builder.setOverwrite(overwrite); + builder.setStaticPartitions(staticPartitionSpec); + builder.setTmpPath(new
[GitHub] [flink] godfreyhe commented on a change in pull request #10174: [FLINK-14625][table-planner-blink] Add a rule to eliminate cross join as much as possible without statistics
godfreyhe commented on a change in pull request #10174: [FLINK-14625][table-planner-blink] Add a rule to eliminate cross join as much as possible without statistics URL: https://github.com/apache/flink/pull/10174#discussion_r349981321 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/optimize/program/FlinkBatchProgram.scala ## @@ -139,24 +140,24 @@ object FlinkBatchProgram { // join reorder val deprecatedJoinReorderEnabled = config.getBoolean(OptimizerConfigOptions.TABLE_OPTIMIZER_JOIN_REORDER_ENABLED) -val joinReorderMode = JoinReorderMode.withName( - config.getString(OptimizerConfigOptions.TABLE_OPTIMIZER_JOIN_REORDER_MODE)) +val joinReorderMode = JoinReorderStrategy.valueOf( Review comment: rename to `joinReorderStrategy` ?
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349985201 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemCommitter.java ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.io.Serializable; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +import static org.apache.flink.table.filesystem.FileSystemUtils.searchPartSpecAndPaths; +import static org.apache.flink.table.filesystem.TempFileManager.deleteCheckpoint; +import static org.apache.flink.table.filesystem.TempFileManager.headCheckpoints; +import static org.apache.flink.table.filesystem.TempFileManager.listTaskTemporaryPaths; + +/** + * File system file committer implementation. It move all files to output path from temporary path. 
+ * + * In a checkpoint: + * 1.Every task will invoke {@link #createFileManagerAndCleanDir} to initialization, it returns + * a path generator to generate path for task writing. And clean the temporary path of task. + * 2.After writing done for this checkpoint, need invoke {@link #commitUpToCheckpoint(long)}, + * will move the temporary files to real output path. + * + * Batch is a special case of Streaming, which has only one checkpoint. + * + * Data consistency: + * 1.For task failure: will launch a new task and invoke {@link #createFileManagerAndCleanDir}, + * this will clean previous temporary files (This simple design can make it easy to delete the + * invalid temporary directory of the task, but it also causes that our directory does not + * support the same task to start multiple backups to run). + * 2.For job master commit failure when overwrite: this may result in unfinished intermediate + * results, but if we try to run job again, the final result must be correct (because the + * intermediate result will be overwritten). + * 3.For job master commit failure when append: This can lead to inconsistent data. But, + * considering that the commit action is a single point of execution, and only moves files and + * updates metadata, it will be faster, so the probability of inconsistency is relatively small. + * + * See: + * {@link TempFileManager}. + * {@link FileSystemLoader}. 
+ */ +@Internal +public class FileSystemCommitter implements Serializable { + + private static final long serialVersionUID = 1L; + + private final FileSystemFactory factory; + private final MetaStoreFactory metaStoreFactory; + private final boolean overwrite; + private final Path tmpPath; + private final LinkedHashMap staticPartitions; + private final int partitionColumnSize; + + public FileSystemCommitter( + FileSystemFactory factory, + MetaStoreFactory metaStoreFactory, + boolean overwrite, + Path tmpPath, + LinkedHashMap staticPartitions, + int partitionColumnSize) { + this.factory = factory; + this.metaStoreFactory = metaStoreFactory; + this.overwrite = overwrite; + this.tmpPath = tmpPath; + this.staticPartitions = staticPartitions; + this.partitionColumnSize = partitionColumnSize; + } + + /** +* For committing job's output after successful batch job completion or one checkpoint finish +* for streaming job. Should move all files to final output paths. +* +* NOTE: According to checkpoint notify mechanism of Flink, checkpoint may fail and be +* abandoned, so this method should commit all checkpoint ids that less than current +* checkpoint id (Includes failure checkpoints). +*/ + public
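The commit protocol described in the javadoc above (each task stages files in a temporary directory, and a committer later moves every checkpoint up to the notified checkpoint id into the final output path) can be sketched with plain `java.nio.file` operations. This is a hypothetical, simplified stand-in for Flink's `FileSystemCommitter`, assuming a `cp-<id>/` staging layout; it is not the actual implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class CommitSketch {

    /**
     * Move all staged files of checkpoints with id <= checkpointId from
     * tmpDir into outDir. Committing every id up to the current one covers
     * checkpoints whose success notification was lost.
     */
    public static void commitUpToCheckpoint(Path tmpDir, Path outDir, long checkpointId)
            throws IOException {
        Files.createDirectories(outDir);
        try (Stream<Path> dirs = Files.list(tmpDir)) {
            for (Path cpDir : (Iterable<Path>) dirs::iterator) {
                String name = cpDir.getFileName().toString();
                if (!name.startsWith("cp-")) {
                    continue; // not a staging directory
                }
                long id = Long.parseLong(name.substring(3));
                if (id > checkpointId) {
                    continue; // not yet notified; leave it staged
                }
                try (Stream<Path> files = Files.list(cpDir)) {
                    for (Path f : (Iterable<Path>) files::iterator) {
                        // move the staged file to its final output location
                        Files.move(f, outDir.resolve(f.getFileName()),
                                StandardCopyOption.REPLACE_EXISTING);
                    }
                }
                Files.delete(cpDir); // drop the now-empty staging directory
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("staging");
        Path out = Files.createTempDirectory("output");
        Files.createDirectories(tmp.resolve("cp-1"));
        Files.createDirectories(tmp.resolve("cp-2"));
        Files.writeString(tmp.resolve("cp-1").resolve("part-0"), "a");
        Files.writeString(tmp.resolve("cp-2").resolve("part-0"), "b");

        commitUpToCheckpoint(tmp, out, 1L);

        // cp-1 is committed to the output path; cp-2 stays staged
        System.out.println(Files.exists(out.resolve("part-0")));
        System.out.println(Files.exists(tmp.resolve("cp-2").resolve("part-0")));
    }
}
```

Batch jobs are the one-checkpoint special case of this: a single commit at job end moves everything at once.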
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349985134 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveMetaStoreFactory.java ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper; +import org.apache.flink.table.catalog.hive.util.HiveTableUtil; +import org.apache.flink.table.filesystem.MetaStoreFactory; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.NoSuchObjectException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.StorageDescriptor; +import org.apache.hadoop.mapred.JobConf; +import org.apache.thrift.TException; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Hive {@link MetaStoreFactory}, use {@link HiveMetastoreClientWrapper} to communicate with + * hive meta store. + */ +public class HiveMetaStoreFactory implements MetaStoreFactory { + + private static final long serialVersionUID = 1L; + + private final JobConfWrapper conf; + private final String hiveVersion; + private final String database; + private final String tableName; + + HiveMetaStoreFactory( + JobConf conf, + String hiveVersion, + String database, + String tableName) { + this.conf = new JobConfWrapper(conf); + this.hiveVersion = hiveVersion; + this.database = database; + this.tableName = tableName; + } + + @Override + public HiveTableMetaStore createTableMetaStore() throws Exception { + return new HiveTableMetaStore(); + } + + private class HiveTableMetaStore implements TableMetaStore { + + private HiveMetastoreClientWrapper client; + private StorageDescriptor sd; + + private HiveTableMetaStore() throws TException { + client = HiveMetastoreClientFactory.create( + new HiveConf(conf.conf(), HiveConf.class), hiveVersion); + sd = client.getTable(database, tableName).getSd(); + } + + @Override + public Path getLocationPath() { + 
return new Path(sd.getLocation()); + } + + @Override + public Optional getPartition( + LinkedHashMap partSpec) throws Exception { + try { + return Optional.of(new Path(client.getPartition( + database, + tableName, + new ArrayList<>(partSpec.values())) + .getSd().getLocation())); + } catch (NoSuchObjectException ignore) { + return Optional.empty(); + } + } + + @Override + public void createPartition(LinkedHashMap partSpec, Path path) throws Exception { + StorageDescriptor newSd = new StorageDescriptor(sd); + newSd.setLocation(path.toString()); + Partition partition = HiveTableUtil.createHivePartition(database, tableName, + new ArrayList<>(partSpec.values()), newSd, new HashMap<>()); + partition.setValues(new ArrayList<>(partSpec.values())); + client.add_partition(partition); Review comment: These two interface are
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349984971 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveMetaStoreFactory.java ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory; +import org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper; +import org.apache.flink.table.catalog.hive.util.HiveTableUtil; +import org.apache.flink.table.filesystem.MetaStoreFactory; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.NoSuchObjectException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.StorageDescriptor; +import org.apache.hadoop.mapred.JobConf; +import org.apache.thrift.TException; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Hive {@link MetaStoreFactory}, use {@link HiveMetastoreClientWrapper} to communicate with + * hive meta store. + */ +public class HiveMetaStoreFactory implements MetaStoreFactory { + + private static final long serialVersionUID = 1L; + + private final JobConfWrapper conf; + private final String hiveVersion; + private final String database; + private final String tableName; + + HiveMetaStoreFactory( + JobConf conf, + String hiveVersion, + String database, + String tableName) { + this.conf = new JobConfWrapper(conf); + this.hiveVersion = hiveVersion; + this.database = database; + this.tableName = tableName; + } + + @Override + public HiveTableMetaStore createTableMetaStore() throws Exception { + return new HiveTableMetaStore(); + } + + private class HiveTableMetaStore implements TableMetaStore { + + private HiveMetastoreClientWrapper client; + private StorageDescriptor sd; + + private HiveTableMetaStore() throws TException { + client = HiveMetastoreClientFactory.create( + new HiveConf(conf.conf(), HiveConf.class), hiveVersion); + sd = client.getTable(database, tableName).getSd(); + } + + @Override + public Path getLocationPath() { + 
return new Path(sd.getLocation()); + } + + @Override + public Optional getPartition( + LinkedHashMap partSpec) throws Exception { + try { + return Optional.of(new Path(client.getPartition( Review comment: Will invoke `PartitionLoader.loadNonPartition`, never reach here.
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349984854 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/MetaStoreFactory.java ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.catalog.Catalog; + +import java.io.Closeable; +import java.io.Serializable; +import java.util.LinkedHashMap; +import java.util.Optional; + +/** + * Meta store factory to create {@link TableMetaStore}. Meta store may need contains connection + * to remote, so we should not create too frequently. + */ +@Internal +public interface MetaStoreFactory extends Serializable { + + /** +* Create a {@link TableMetaStore}. 
+*/ + TableMetaStore createTableMetaStore() throws Exception; Review comment: `TableMetaStoreFactory` already specifies the DB name and table name. We don't want invokers to have to obtain the DB name and table name every time and everywhere; that would be pointless, since a factory exists only for a single table.
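The design point argued above — binding the factory to a single database and table at construction time, so call sites never thread (db, table) pairs around — can be sketched as follows. All names here are hypothetical illustrations, not Flink's actual `MetaStoreFactory` interfaces.

```java
import java.io.Serializable;
import java.util.Optional;

public class PerTableFactorySketch {

    /** Answers questions about exactly one table; no (db, table) arguments. */
    interface TableMetaStore {
        Optional<String> partitionLocation(String partSpec);
    }

    /** Pre-bound to one table, so createTableMetaStore() needs no arguments. */
    static final class SingleTableMetaStoreFactory implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String database;
        private final String table;

        SingleTableMetaStoreFactory(String database, String table) {
            this.database = database;
            this.table = table;
        }

        TableMetaStore createTableMetaStore() {
            // the returned store only ever serves this.database / this.table
            return partSpec -> Optional.of(
                    "/warehouse/" + database + "/" + table + "/" + partSpec);
        }
    }

    public static void main(String[] args) {
        TableMetaStore store =
                new SingleTableMetaStoreFactory("db1", "orders").createTableMetaStore();
        System.out.println(store.partitionLocation("dt=2019-11-25").get());
    }
}
```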
[jira] [Commented] (FLINK-13995) Fix shading of the licence information of netty
[ https://issues.apache.org/jira/browse/FLINK-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981282#comment-16981282 ] Hequn Cheng commented on FLINK-13995: - [~chesnay] Hi, since legal problems are always important, I think this is also a blocker for 1.8.3?

> Fix shading of the licence information of netty
> ---
>
> Key: FLINK-13995
> URL: https://issues.apache.org/jira/browse/FLINK-13995
> Project: Flink
> Issue Type: Bug
> Components: BuildSystem / Shaded
> Affects Versions: 1.8.0
> Reporter: Arvid Heise
> Assignee: Chesnay Schepler
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0, 1.8.3, 1.9.2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The license filter isn't actually filtering anything. It should be META-INF/license/**.
> The first filter seems to be outdated btw.
> Multiple modules affected.
> {code:xml}
> <filter>
>   <artifact>io.netty:netty</artifact>
>   <excludes>
>     <exclude>META-INF/maven/io.netty/**</exclude>
>     <exclude>META-INF/license</exclude>
>     <exclude>META-INF/NOTICE.txt</exclude>
>   </excludes>
> </filter>
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349983145 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemFactory.java ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; + +import java.io.IOException; +import java.io.Serializable; +import java.net.URI; + +/** + * A factory to create file systems. + */ +@Internal +public interface FileSystemFactory extends Serializable { Review comment: I don't like to introduce `getScheme` in `FileSystemFactory`. And it is not serializable.
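The serializability concern discussed here follows a common Flink pattern: the factory is the object that is shipped to tasks (hence it extends `Serializable`), while the file system it creates is constructed locally on each task and never serialized. A minimal self-contained sketch, where `FakeFileSystem` and `FakeFsFactory` are hypothetical stand-ins rather than Flink classes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class FactorySketch {

    /** Non-serializable resource, created per task from the factory. */
    static final class FakeFileSystem {
        final String scheme;
        FakeFileSystem(String scheme) { this.scheme = scheme; }
    }

    /** Serializable factory: only its small configuration travels over the wire. */
    interface FsFactory extends Serializable {
        FakeFileSystem create();
    }

    static final class FakeFsFactory implements FsFactory {
        private static final long serialVersionUID = 1L;
        private final String scheme;
        FakeFsFactory(String scheme) { this.scheme = scheme; }
        @Override
        public FakeFileSystem create() { return new FakeFileSystem(scheme); }
    }

    /** Simulate shipping the factory to a task: serialize, then deserialize. */
    public static FsFactory ship(FsFactory factory) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(factory);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (FsFactory) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        FsFactory shipped = ship(new FakeFsFactory("hdfs"));
        // the FakeFileSystem itself is built on the "task" side, post-deserialization
        System.out.println(shipped.create().scheme);
    }
}
```

The design trade-off in the thread is whether such a factory should also expose a `getScheme()`-style method, or stay a pure creator of file systems.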
[GitHub] [flink] flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
flinkbot edited a comment on issue #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#issuecomment-556037401

## CI report:

* dfd530a162b1912b00f5740ca07f828b44dcd4bd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137401080)
* 366928af4c6384dfeebf26ac04dac6a15b725ba4 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137498499)
* 63595b41fa8062be41de37d6abe7cb752c9de280 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137530355)
* 488c22fd0bb6ab21f262594d5f1b7897273b3f28 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762854)
* e9c30c1d7e3097774ef019d228069e34ac407852 : UNKNOWN
[GitHub] [flink] flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector
flinkbot edited a comment on issue #10022: [FLINK-14135][hive][orc] Introduce orc ColumnarRow reader for hive connector URL: https://github.com/apache/flink/pull/10022#issuecomment-547278354 ## CI report: * 73e9fbb7c4edb8628b93e9a5b17271e57a8b8f14 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/133947525) * b86c5db1ab9a081f0fec52f15fdad3b4f4d61939 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/134797047) * 77aa3cb348f97c8e2999bafcee9e7169b96e983e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135586640) * 3e65b8dfcdbcb91902d7c9b122b3bdb36333b227 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135597364) * 48e164d35f68ea4d11318c1481586ea2beefe386 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135625521) * c8bf33b2b0bc62f779e6ea68e4dc4c3d18432f87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135891066) * 619180562ffd3549bc1e9a2fd33f87f42d61a772 : UNKNOWN * 111a1761ee7b728ba5bd7fa606879ca018eb2e15 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135898529) * 3c7a40e08a555d26b27dc2738e19ccb600446cb9 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/135922821) * 4c43bc9d58bbfabcf6661c1a93e5315a6cee1269 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136073480) * 82352ba9729d9c9f15872964753fc4d55459efe1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136076198) * 1e35c8e1496d1aa26bc0da8edfe913c8b7c4cfc3 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136253883) * 5c0d680d44622e5a371fb501316b4e7b6c875440 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/136255756) * 2907034544549016ac5afce4b136972264adb152 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136522618) * 79e9f7f73eed2478e72575d40df7e7f1cabdf1ad : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/136634189) * c600ec7fed19f38701933fab6fb2822193ebbec1 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/136636034) * 4dcb68db6b75621f17d2bebc69b75ddaa2fb2017 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137762868) * 31a92829a40ecdffe94f9b4e537327c7e9bef0b3 : UNKNOWN
[jira] [Commented] (FLINK-14933) translate "Data Streaming Fault Tolerance" page in to Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981280#comment-16981280 ] Zhangcheng Hu commented on FLINK-14933: --- ok,thx > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349983145 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemFactory.java ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; + +import java.io.IOException; +import java.io.Serializable; +import java.net.URI; + +/** + * A factory to create file systems. + */ +@Internal +public interface FileSystemFactory extends Serializable { Review comment: I don't like to introduce `getScheme` in `FileSystemFactory`. But I am OK to use it too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349982946 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemUtils.java ## @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.api.TableException; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Utils for file system. + */ +public class FileSystemUtils { + + private static final Pattern PARTITION_NAME_PATTERN = Pattern.compile("([^/]+)=([^/]+)"); + + private static final BitSet CHAR_TO_ESCAPE = new BitSet(128); + static { + for (char c = 0; c < ' '; c++) { + CHAR_TO_ESCAPE.set(c); + } + + /* +* ASCII 01-1F are HTTP control characters that need to be escaped. +* \u000A and \u000D are \n and \r, respectively. 
+*/ + char[] clist = new char[] {'\u0001', '\u0002', '\u0003', '\u0004', + '\u0005', '\u0006', '\u0007', '\u0008', '\u0009', '\n', '\u000B', + '\u000C', '\r', '\u000E', '\u000F', '\u0010', '\u0011', '\u0012', + '\u0013', '\u0014', '\u0015', '\u0016', '\u0017', '\u0018', '\u0019', + '\u001A', '\u001B', '\u001C', '\u001D', '\u001E', '\u001F', + '"', '#', '%', '\'', '*', '/', ':', '=', '?', '\\', '\u007F', '{', + '[', ']', '^'}; + + for (char c : clist) { + CHAR_TO_ESCAPE.set(c); + } + } + + private static boolean needsEscaping(char c) { + return c < CHAR_TO_ESCAPE.size() && CHAR_TO_ESCAPE.get(c); + } + + /** +* Make partition path from partition spec. +* +* @param partitionSpec The partition spec. +* @return An escaped, valid partition name. +*/ + public static String generatePartName(LinkedHashMap partitionSpec) { + StringBuilder suffixBuf = new StringBuilder(); + int i = 0; + for (Map.Entry e : partitionSpec.entrySet()) { + if (i > 0) { + suffixBuf.append(Path.SEPARATOR); + } + suffixBuf.append(escapePathName(e.getKey())); + suffixBuf.append('='); + suffixBuf.append(escapePathName(e.getValue())); + i++; + } + suffixBuf.append(Path.SEPARATOR); + return suffixBuf.toString(); + } + + /** +* Escapes a path name. +* @param path The path to escape. +* @return An escaped path name. +*/ + private static String escapePathName(String path) { + + // __DEFAULT_NULL__ is the system default value for null and empty string. Review comment: Wrong comment, I will remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
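The partition-name generation discussed in the diff above can be illustrated with a small standalone sketch. This is a simplified, hypothetical version of `generatePartName` (it deliberately skips the `CHAR_TO_ESCAPE`-based escaping that the real `FileSystemUtils` applies to keys and values):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartNameSketch {

    // Simplified sketch: each key=value pair of the partition spec becomes
    // one directory level, in the spec's insertion order, ending with '/'.
    // The real Flink utility additionally escapes special characters.
    static String generatePartName(LinkedHashMap<String, String> partitionSpec) {
        StringBuilder buf = new StringBuilder();
        for (Map.Entry<String, String> e : partitionSpec.entrySet()) {
            buf.append(e.getKey()).append('=').append(e.getValue()).append('/');
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> spec = new LinkedHashMap<>();
        spec.put("p0", "1");
        spec.put("p1", "2");
        System.out.println(generatePartName(spec)); // p0=1/p1=2/
    }
}
```

A `LinkedHashMap` is used so that the output directory levels follow the declared partition column order.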
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349982669 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemCommitter.java ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.io.Serializable; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +import static org.apache.flink.table.filesystem.FileSystemUtils.searchPartSpecAndPaths; +import static org.apache.flink.table.filesystem.TempFileManager.deleteCheckpoint; +import static org.apache.flink.table.filesystem.TempFileManager.headCheckpoints; +import static org.apache.flink.table.filesystem.TempFileManager.listTaskTemporaryPaths; + +/** + * File system file committer implementation. It move all files to output path from temporary path. 
+ * + * In a checkpoint: + * 1. Every task invokes {@link #createFileManagerAndCleanDir} for initialization; it returns + * a path generator that produces paths for the task's writes, and it cleans the task's temporary path. + * 2. After writing for this checkpoint is done, {@link #commitUpToCheckpoint(long)} must be invoked; + * it moves the temporary files to the real output path. + * + * Batch is a special case of streaming, with only one checkpoint. + * + * Data consistency: + * 1. On task failure: a new task is launched and invokes {@link #createFileManagerAndCleanDir}, + * which cleans the previous temporary files (this simple design makes it easy to delete the + * invalid temporary directory of the task, but it also means the directory layout does not + * support running multiple backup attempts of the same task). + * 2. On job master commit failure during overwrite: this may leave unfinished intermediate + * results, but rerunning the job yields a correct final result (because the + * intermediate results are overwritten). + * 3. On job master commit failure during append: this can lead to inconsistent data. However, + * since the commit action is a single point of execution that only moves files and + * updates metadata, it is fast, so the probability of inconsistency is relatively small. + * + * See: + * {@link TempFileManager}. + * {@link FileSystemLoader}. 
+ */ +@Internal +public class FileSystemCommitter implements Serializable { + + private static final long serialVersionUID = 1L; + + private final FileSystemFactory factory; + private final MetaStoreFactory metaStoreFactory; + private final boolean overwrite; + private final Path tmpPath; + private final LinkedHashMap staticPartitions; + private final int partitionColumnSize; + + public FileSystemCommitter( + FileSystemFactory factory, + MetaStoreFactory metaStoreFactory, + boolean overwrite, + Path tmpPath, + LinkedHashMap staticPartitions, + int partitionColumnSize) { + this.factory = factory; + this.metaStoreFactory = metaStoreFactory; + this.overwrite = overwrite; + this.tmpPath = tmpPath; + this.staticPartitions = staticPartitions; + this.partitionColumnSize = partitionColumnSize; + } + + /** +* For committing job's output after successful batch job completion or one checkpoint finish +* for streaming job. Should move all files to final output paths. +* +* NOTE: According to checkpoint notify mechanism of Flink, checkpoint may fail and be +* abandoned, so this method should commit all checkpoint ids that less than current +* checkpoint id (Includes failure checkpoints). +*/ + public
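The commit semantics quoted above (commit every checkpoint id up to the current one, because checkpoint notifications may be lost) can be sketched as follows. `checkpointsToCommit` is a hypothetical helper for illustration, not part of the actual `FileSystemCommitter` API:

```java
import java.util.ArrayList;
import java.util.List;

public class CommitSketch {

    // Sketch of the "commit up to checkpoint" rule: because checkpoint
    // notifications can be lost or abandoned, committing checkpoint N must
    // also pick up every pending earlier checkpoint directory (cp-1 ... cp-N).
    static List<Long> checkpointsToCommit(List<Long> pendingCheckpointIds, long upToCheckpointId) {
        List<Long> toCommit = new ArrayList<>();
        for (long id : pendingCheckpointIds) {
            if (id <= upToCheckpointId) {
                toCommit.add(id); // in the real committer: move cp-<id>/ files to the output path
            }
        }
        return toCommit;
    }

    public static void main(String[] args) {
        // Checkpoint 2 finishes, but checkpoint 1's notification was lost:
        System.out.println(checkpointsToCommit(List.of(1L, 2L, 3L), 2L)); // [1, 2]
    }
}
```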
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019 ## CI report: * 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666) * 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675) * 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895) * 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192) * aaba84a4188eab863f1504c8bec045a51aa4ff73 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137975262) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-14933) translate "Data Streaming Fault Tolerance" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981278#comment-16981278 ] Jark Wu commented on FLINK-14933: - Hi [~Zhangcheng], I assigned this issue to you. Feel free to open pull request. > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-13943) Provide api to convert flink table to java List (e.g. Table#collect)
[ https://issues.apache.org/jira/browse/FLINK-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981276#comment-16981276 ] Jark Wu commented on FLINK-13943: - Just want to confirm that we don't want to support \{{collect()}} for streaming jobs in this version? > Provide api to convert flink table to java List (e.g. Table#collect) > > > Key: FLINK-13943 > URL: https://issues.apache.org/jira/browse/FLINK-13943 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Jeff Zhang >Assignee: Caizhi Weng >Priority: Major > > It would be nice to convert flink table to java List so that I can do other > data manipulation in client side after execution flink job. For flink > planner, I can convert flink table to DataSet and use DataSet#collect, but > for blink planner, there's no such api. > EDIT from FLINK-14807: > Currently, it is very unconvinient for user to fetch data of flink job unless > specify sink expclitly and then fetch data from this sink via its api (e.g. > write to hdfs sink, then read data from hdfs). However, most of time user > just want to get the data and do whatever processing he want. So it is very > necessary for flink to provide api Table#collect for this purpose. > Other apis such as Table#head, Table#print is also helpful. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349980532 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** + * Manage temporary files for writing files. Use special rules to organize directories + * for temporary files. + * + * Temporary file directory contains the following directory parts: + * 1.temporary base path directory. + * 2.checkpoint id directory. + * 3.task id directory. + * 4.directories to specify partitioning. + * 5.data files. + * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName. 
+ */ +@Internal +public class TempFileManager { + + private static final String CHECKPOINT_DIR_PREFIX = "cp-"; + private static final String TASK_DIR_PREFIX = "task-"; + + private final int taskNumber; + private final long checkpointId; + private final Path taskTmpDir; + + private transient int nameCounter = 0; + + public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) { + checkArgument(checkpointId != -1, "checkpoint id start with 0."); + this.taskNumber = taskNumber; + this.checkpointId = checkpointId; + this.taskTmpDir = new Path( + new Path(temporaryPath, checkpointName(checkpointId)), + TASK_DIR_PREFIX + taskNumber); + } + + public Path getTaskTemporaryPath() { Review comment: This is just for committer to clean task temporary dir, but I think we can move this `clean` to `FileManager`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
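The directory layout described in the `TempFileManager` javadoc above (`/tmp/cp-1/task-0/p0=1/p1=2/fileName`) can be sketched with a minimal path builder. The `cp-`/`task-` prefixes follow the constants quoted in the diff; everything else is illustrative:

```java
public class TempPathSketch {

    // Builds the task-level temporary directory described in the javadoc:
    // <temporaryBasePath>/cp-<checkpointId>/task-<taskNumber>
    // Partition directories and data file names are appended below this level.
    static String taskTmpDir(String temporaryBasePath, long checkpointId, int taskNumber) {
        return temporaryBasePath + "/cp-" + checkpointId + "/task-" + taskNumber;
    }

    public static void main(String[] args) {
        System.out.println(taskTmpDir("/tmp", 1, 0)); // /tmp/cp-1/task-0
    }
}
```

Scoping each checkpoint and task to its own directory is what lets a restarted task simply delete and recreate its directory, as discussed in the committer thread above.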
[jira] [Assigned] (FLINK-14933) translate "Data Streaming Fault Tolerance" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-14933: --- Assignee: Zhangcheng Hu > translate "Data Streaming Fault Tolerance" page in to Chinese > - > > Key: FLINK-14933 > URL: https://issues.apache.org/jira/browse/FLINK-14933 > Project: Flink > Issue Type: Task > Components: chinese-translation >Reporter: Zhangcheng Hu >Assignee: Zhangcheng Hu >Priority: Minor > > The page url is > [https://ci.apache.org/projects/flink/flink-docs-release-1.9/internals/stream_checkpointing.html] > The markdown file is located in > flink/docs/internals/stream_checkpointing.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349980532 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** + * Manage temporary files for writing files. Use special rules to organize directories + * for temporary files. + * + * Temporary file directory contains the following directory parts: + * 1.temporary base path directory. + * 2.checkpoint id directory. + * 3.task id directory. + * 4.directories to specify partitioning. + * 5.data files. + * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName. 
+ */ +@Internal +public class TempFileManager { + + private static final String CHECKPOINT_DIR_PREFIX = "cp-"; + private static final String TASK_DIR_PREFIX = "task-"; + + private final int taskNumber; + private final long checkpointId; + private final Path taskTmpDir; + + private transient int nameCounter = 0; + + public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) { + checkArgument(checkpointId != -1, "checkpoint id start with 0."); + this.taskNumber = taskNumber; + this.checkpointId = checkpointId; + this.taskTmpDir = new Path( + new Path(temporaryPath, checkpointName(checkpointId)), + TASK_DIR_PREFIX + taskNumber); + } + + public Path getTaskTemporaryPath() { Review comment: This is just for committer to clean task temporary dir, but I think we should let it not public. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject
flinkbot edited a comment on issue #10300: [FLINK-14926] [state backends] Make sure no resource leak of RocksObject URL: https://github.com/apache/flink/pull/10300#issuecomment-557891612 ## CI report: * dae41b86e4710d0c83d3f764e6196513db26a2c8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137940914) * 95bb102bbc2cb241ab66efbf0870f2a9d9e515d8 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/137975257) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10291: [FLINK-14899] [table-planner-blink] Fix unexpected plan when PROCTIME() is defined in query
flinkbot edited a comment on issue #10291: [FLINK-14899] [table-planner-blink] Fix unexpected plan when PROCTIME() is defined in query URL: https://github.com/apache/flink/pull/10291#issuecomment-557488395 ## CI report: * c3082d201afec00fff9e25d46059b3789c9d28d5 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/137741065) * 76a6169e77815b25b5fb265a706ab222d2395d87 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137744705) * 365b5e6cc7b3953877838d371d971d8a3cbb3966 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/137973852) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] godfreyhe commented on a change in pull request #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case
godfreyhe commented on a change in pull request #10271: [FLINK-14874] [table-planner-blink] add local aggregate to solve data skew for ROLLUP/CUBE case URL: https://github.com/apache/flink/pull/10271#discussion_r349980293 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/physical/batch/EnforceLocalAggRuleBase.scala ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.physical.batch + +import org.apache.flink.table.api.TableException +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.`trait`.FlinkRelDistribution +import org.apache.flink.table.planner.plan.nodes.FlinkConventions +import org.apache.flink.table.planner.plan.nodes.physical.batch.{BatchExecExchange, BatchExecExpand, BatchExecGroupAggregateBase, BatchExecHashAggregate, BatchExecLocalHashAggregate, BatchExecLocalSortAggregate, BatchExecSortAggregate} +import org.apache.flink.table.planner.plan.utils.{AggregateUtil, FlinkRelOptUtil} +import org.apache.flink.table.runtime.types.LogicalTypeDataTypeConverter.fromDataTypeToLogicalType + +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall, RelOptRuleOperand} +import org.apache.calcite.rel.RelNode +import org.apache.calcite.rex.RexUtil +import org.apache.calcite.tools.RelBuilder +import org.apache.calcite.util.Util + +import scala.collection.JavaConversions._ + +/** + * Planner rule that writes one phase aggregate to two phase aggregate, + * when the following conditions are met: + * 1. there is no local aggregate, + * 2. the aggregate has non-empty grouping and two phase aggregate strategy is enabled, + * 3. the input is [[BatchExecExpand]] and there is at least one expand row + * which the columns for grouping are all constant. 
+ */ +abstract class EnforceLocalAggRuleBase( +operand: RelOptRuleOperand, +description: String) + extends RelOptRule(operand, description) + with BatchExecAggRuleBase { + + protected def getBatchExecExpand(call: RelOptRuleCall): BatchExecExpand + + override def matches(call: RelOptRuleCall): Boolean = { +val agg: BatchExecGroupAggregateBase = call.rel(0) +val expand = getBatchExecExpand(call) + +val tableConfig = FlinkRelOptUtil.getTableConfigFromContext(agg) +val aggFunctions = agg.getAggCallToAggFunction.map(_._2).toArray +val enableTwoPhaseAgg = isTwoPhaseAggWorkable(aggFunctions, tableConfig) + +val grouping = agg.getGrouping +// if all group columns in a expand row are constant, this row will be shuffled to +// a single node. (shuffle keys are grouping) +// add local aggregate to greatly reduce the output data +val hasConstantRow = expand.projects.exists { + project => +val groupingColumns = grouping.map(i => project.get(i)) +groupingColumns.forall(RexUtil.isConstant) +} + +grouping.nonEmpty && enableTwoPhaseAgg && hasConstantRow + } + + protected def createLocalAgg( Review comment: actually, the logic of creating `localAgg` can be partially reused, and the logic of creating `globalAgg` can hardly be reused (calling the constructor directly) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
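The motivation for the rule above — when all grouping columns of an expand row are constant, every such row is shuffled to a single node, so a local aggregate should pre-reduce the data first — can be illustrated with a count aggregate split into a local and a global phase. This is a hand-rolled sketch of the two-phase idea, not the planner's actual operators:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseAggSketch {

    // Phase 1 (local aggregate): each task pre-aggregates its own partition,
    // so it ships at most one record per key instead of every input row.
    static Map<String, Long> localCount(List<String> partition) {
        Map<String, Long> acc = new HashMap<>();
        for (String key : partition) {
            acc.merge(key, 1L, Long::sum);
        }
        return acc;
    }

    // Phase 2 (global aggregate): the single downstream task that receives the
    // constant-group records only has to merge the small pre-aggregated maps.
    static Map<String, Long> globalCount(List<Map<String, Long>> localResults) {
        Map<String, Long> acc = new HashMap<>();
        for (Map<String, Long> local : localResults) {
            local.forEach((k, v) -> acc.merge(k, v, Long::sum));
        }
        return acc;
    }

    public static void main(String[] args) {
        Map<String, Long> task0 = localCount(List.of("a", "a", "b"));
        Map<String, Long> task1 = localCount(List.of("a", "b"));
        // Only 4 pre-aggregated records cross the shuffle instead of 5 rows.
        System.out.println(globalCount(List.of(task0, task1)));
    }
}
```

The benefit grows with skew: for a ROLLUP/CUBE row whose grouping columns are all constant, the local phase collapses an arbitrarily large partition into a single record per key before the forced single-node shuffle.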
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349979952 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemLoader.java ## @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.filesystem; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.FileStatus; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.filesystem.MetaStoreFactory.TableMetaStore; +import org.apache.flink.util.Preconditions; + +import java.io.Closeable; +import java.io.IOException; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Optional; + +import static org.apache.flink.table.filesystem.FileSystemUtils.generatePartName; +import static org.apache.flink.table.filesystem.FileSystemUtils.listStatusWithoutHidden; + +/** + * Loader to temporary files to final output path and meta store. According to overwrite, + * the loader will delete the previous data. 
+ * + * This provides two loading interfaces: + * 1. {@link #loadPartition}: loads temporary partitioned files; if a partition is new, + * it is created in the meta store. + * 2. {@link #loadNonPartition}: just renames all files to the final output path. + */ +@Internal +public class FileSystemLoader implements Closeable { Review comment: But I am OK to `PartitionLoader`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14574) flink-s3-fs-hadoop doesn't work with plugins mechanism
[ https://issues.apache.org/jira/browse/FLINK-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] clay updated FLINK-14574:
-
Description: As reported by a user via [mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/No-FileSystem-for-scheme-quot-file-quot-for-S3A-in-and-state-processor-api-in-1-9-td30704.html]:
{noformat}
We've added flink-s3-fs-hadoop library to plugins folder and trying to bootstrap state to S3 using S3A protocol. The following exception happens (unless hadoop library is put to lib folder instead of plugins). Looks like S3A filesystem is trying to use "local" filesystem for temporary files and fails:

java.lang.Exception: Could not write timer service of MapPartition (d2976134f80849779b7a94b7e6218476) (4/4) to checkpoint state stream.
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:466)
	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:89)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:399)
	at org.apache.flink.state.api.output.SnapshotUtils.snapshot(SnapshotUtils.java:59)
	at org.apache.flink.state.api.output.operators.KeyedStateBootstrapOperator.endInput(KeyedStateBootstrapOperator.java:84)
	at org.apache.flink.state.api.output.BoundedStreamTask.performDefaultAction(BoundedStreamTask.java:85)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
	at org.apache.flink.state.api.output.BoundedOneInputStreamTaskRunner.mapPartition(BoundedOneInputStreamTaskRunner.java:76)
	at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
	at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
	at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not open output stream for state backend
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:367)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.flush(FsCheckpointStreamFactory.java:234)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.write(FsCheckpointStreamFactory.java:209)
	at org.apache.flink.runtime.state.NonClosingCheckpointOutputStream.write(NonClosingCheckpointOutputStream.java:61)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:401)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
	at org.apache.flink.util.LinkedOptionalMapSerializer.lambda$writeOptionalMap$0(LinkedOptionalMapSerializer.java:58)
	at org.apache.flink.util.LinkedOptionalMap.forEach(LinkedOptionalMap.java:163)
	at org.apache.flink.util.LinkedOptionalMapSerializer.writeOptionalMap(LinkedOptionalMapSerializer.java:57)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshotData.writeKryoRegistrations(KryoSerializerSnapshotData.java:141)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshotData.writeSnapshotData(KryoSerializerSnapshotData.java:128)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshot.writeSnapshot(KryoSerializerSnapshot.java:72)
	at org.apache.flink.api.common.typeutils.TypeSerializerSnapshot.writeVersionedSnapshot(TypeSerializerSnapshot.java:153)
	at org.apache.flink.streaming.api.operators.InternalTimersSnapshotReaderWriters$InternalTimersSnapshotWriterV2.writeKeyAndNamespaceSerializers(InternalTimersSnapshotReaderWriters.java:199)
	at org.apache.flink.streaming.api.operators.InternalTimersSnapshotReaderWriters$AbstractInternalTimersSnapshotWriter.writeTimersSnapshot(InternalTimersSnapshotReaderWriters.java:117)
	at org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy.write(InternalTimerServiceSerializationProxy.java:101)
	at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.snapshotStateForKeyGroup(InternalTimeServiceManager.java:139)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:462)
	... 14 common frames omitted
Caused by:
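The failure above can be reduced to a classloading sketch: each plugin runs with an isolated classloader, so a filesystem registered only on the main classpath (such as the local "file" scheme) is not visible from inside the plugin. The following standalone illustration is hypothetical — `SchemeRegistrySketch`, `register`, and `lookup` are invented names, not Flink's actual plugin or FileSystem API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a per-classloader filesystem registry.
// A plugin's isolated registry only knows the schemes it registered itself,
// so looking up "file" from inside the plugin fails.
public class SchemeRegistrySketch {

    private final Map<String, String> schemes = new HashMap<>();

    // Register a filesystem implementation name for a URI scheme.
    void register(String scheme, String implName) {
        schemes.put(scheme, implName);
    }

    // Resolve a scheme; unknown schemes fail the same way the report shows.
    String lookup(String scheme) {
        String impl = schemes.get(scheme);
        if (impl == null) {
            throw new IllegalStateException("No FileSystem for scheme \"" + scheme + "\"");
        }
        return impl;
    }
}
```

Under this model, putting the library in `lib/` works because the registration then happens in the main classloader, where the "file" scheme is also visible.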
[jira] [Resolved] (FLINK-14595) Move flink-orc to flink-formats from flink-connectors
[ https://issues.apache.org/jira/browse/FLINK-14595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu resolved FLINK-14595.
-
Resolution: Fixed
1.10.0: 320240e2c412c15c7fa91649a0faa78018e74d86

> Move flink-orc to flink-formats from flink-connectors
> -
>
> Key: FLINK-14595
> URL: https://issues.apache.org/jira/browse/FLINK-14595
> Project: Flink
> Issue Type: Bug
> Components: Connectors / ORC
> Reporter: Jingsong Lee
> Assignee: Jingsong Lee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> We already have a parent module for formats, and the other formats
> (flink-avro, flink-json, flink-parquet, flink-csv, flink-sequence-file)
> have been put under flink-formats. flink-orc is a format too, so we can
> move it to flink-formats.
>
> In theory, there should be no compatibility problem: only the parent module
> needs to be changed, and no other changes are needed.
>
> Discuss thread:
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-flink-orc-to-flink-formats-from-flink-connectors-td34438.html]
> Vote thread:
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Move-flink-orc-to-flink-formats-from-flink-connectors-td34496.html]
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] sunhaibotb commented on a change in pull request #10151: [FLINK-14231] Handle the processing-time timers before closing operator to properly support endInput
sunhaibotb commented on a change in pull request #10151: [FLINK-14231] Handle the processing-time timers before closing operator to properly support endInput URL: https://github.com/apache/flink/pull/10151#discussion_r349979144
## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
##
@@ -205,6 +206,8 @@
 private Long syncSavepointId = null;
+ private final Map, ProcessingTimeServiceImpl> processingTimeServices;

Review comment:
> map value: It's better to specify an interface, not a concrete implementation, to depend on it less.

The methods `quiesce` and `getTimersDoneFutureAfterQuiescing` need to be added to the processing time service, but the `ProcessingTimeService` interface is user-oriented, so we don't want to change it for runtime needs. On the other hand, `StreamTask` already knows about `ProcessingTimeServiceImpl`, because it uses `ProcessingTimeServiceImpl` to build instances in `#getProcessingTimeService`. So it should be acceptable to use the concrete implementation here.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
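The trade-off discussed in this review can be sketched as follows. This is a hypothetical illustration, not Flink's actual class hierarchy — the name `InternalProcessingTimeService` is invented; the point is that runtime-only methods such as `quiesce()` can live on an internal type without widening the user-facing interface:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of keeping runtime-only lifecycle methods off the
// user-facing interface. All names below are invented for illustration.
public class TimerServiceSketch {

    // User-facing interface: stays small and stable.
    interface ProcessingTimeService {
        long getCurrentProcessingTime();
    }

    // Internal extension: carries the runtime-only methods the review mentions.
    interface InternalProcessingTimeService extends ProcessingTimeService {
        void quiesce();
        CompletableFuture<Void> getTimersDoneFutureAfterQuiescing();
    }

    // Concrete implementation the runtime (e.g. a StreamTask) can depend on directly.
    static class ProcessingTimeServiceImpl implements InternalProcessingTimeService {
        private final CompletableFuture<Void> timersDone = new CompletableFuture<>();

        @Override
        public long getCurrentProcessingTime() {
            return System.currentTimeMillis();
        }

        @Override
        public void quiesce() {
            // Stop accepting new timers; here we simply mark completion.
            timersDone.complete(null);
        }

        @Override
        public CompletableFuture<Void> getTimersDoneFutureAfterQuiescing() {
            return timersDone;
        }
    }
}
```

Either approach (internal sub-interface or depending on the concrete class) keeps `ProcessingTimeService` itself unchanged for users.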
[GitHub] [flink] JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch
JingsongLi commented on a change in pull request #9864: [FLINK-14254][table] Introduce FileSystemOutputFormat for batch URL: https://github.com/apache/flink/pull/9864#discussion_r349978969
## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/TempFileManager.java
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.filesystem;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.FileStatus;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Manage temporary files for writing files. Use special rules to organize directories
+ * for temporary files.
+ *
+ * Temporary file directory contains the following directory parts:
+ * 1. temporary base path directory.
+ * 2. checkpoint id directory.
+ * 3. task id directory.
+ * 4. directories to specify partitioning.
+ * 5. data files.
+ * eg: /tmp/cp-1/task-0/p0=1/p1=2/fileName.
+ */
+@Internal
+public class TempFileManager {
+
+	private static final String CHECKPOINT_DIR_PREFIX = "cp-";
+	private static final String TASK_DIR_PREFIX = "task-";
+
+	private final int taskNumber;
+	private final long checkpointId;
+	private final Path taskTmpDir;
+
+	private transient int nameCounter = 0;
+
+	public TempFileManager(Path temporaryPath, int taskNumber, long checkpointId) {
+		checkArgument(checkpointId != -1, "checkpoint id start with 0.");
+		this.taskNumber = taskNumber;
+		this.checkpointId = checkpointId;
+		this.taskTmpDir = new Path(
+			new Path(temporaryPath, checkpointName(checkpointId)),
+			TASK_DIR_PREFIX + taskNumber);
+	}
+
+	public Path getTaskTemporaryPath() {
+		return taskTmpDir;
+	}
+
+	/**
+	 * Generate a new path with directories.
+	 */
+	public Path generateTempFile(String... directories) throws Exception {
+		Path parentPath = taskTmpDir;
+		for (String dir : directories) {
+			parentPath = new Path(parentPath, dir);
+		}
+		return new Path(parentPath, newFileName());
+	}
+
+	private String newFileName() {
+		return String.format(
+			checkpointName(checkpointId) + "-" + taskName(taskNumber) + "-file-%d",

Review comment:
Finally, these files will be moved to the final directory, so with this information we can reduce name conflicts.
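Based on the javadoc and the format string in the diff above, the generated layout can be reproduced as a standalone sketch. The helper `tempFilePath` below is invented for illustration, and it assumes `checkpointName` and `taskName` simply prepend "cp-" and "task-" to the ids:

```java
// Hypothetical standalone sketch of the temp-file naming scheme from the
// javadoc above, e.g. /tmp/cp-1/task-0/p0=1/p1=2/cp-1-task-0-file-0
public class TempPathSketch {

    static String tempFilePath(String base, long checkpointId, int taskNumber,
                               int fileCounter, String... partitionDirs) {
        StringBuilder sb = new StringBuilder(base)
                .append("/cp-").append(checkpointId)      // checkpoint id directory
                .append("/task-").append(taskNumber);      // task id directory
        for (String dir : partitionDirs) {                 // partition directories
            sb.append('/').append(dir);
        }
        // File name embeds the checkpoint and task ids, which reduces name
        // conflicts when files are later moved to the final directory.
        sb.append("/cp-").append(checkpointId)
          .append("-task-").append(taskNumber)
          .append("-file-").append(fileCounter);
        return sb.toString();
    }
}
```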
[GitHub] [flink] openinx commented on a change in pull request #10270: [FLINK-14672][sql-client] Make Executor stateful in sql client
openinx commented on a change in pull request #10270: [FLINK-14672][sql-client] Make Executor stateful in sql client URL: https://github.com/apache/flink/pull/10270#discussion_r349978430
## File path: flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/ResultStore.java
##
@@ -63,7 +63,7 @@
 public ResultStore(Configuration flinkConfig) {
 if (env.getExecution().inStreamingMode()) {
 // determine gateway address (and port if possible)
- final InetAddress gatewayAddress = getGatewayAddress(env.getDeployment());
+ InetAddress gatewayAddress = getGatewayAddress(env.getDeployment());

Review comment:
I think it's an unintentional change; will revert it.
[GitHub] [flink] flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource
flinkbot edited a comment on issue #10210: [FLINK-14800][hive] Introduce parallelism inference for HiveSource URL: https://github.com/apache/flink/pull/10210#issuecomment-554238019
## CI report:
* 642c74d85bdc002a43307814307cc5fb4cba923c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136653666)
* 28557e594494e4df238542a33fad3da60c4cdb23 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136663675)
* 600dd204a87bfb69656b20247759efe1095428d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/136957895)
* 6ac5c0321d415ba19b2f9512faae1ed729dcc85c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/137706192)
* aaba84a4188eab863f1504c8bec045a51aa4ff73 : UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
[GitHub] [flink] klion26 commented on issue #10292: [FLINK-14928][docs] Fix the broken links in documentation of page systemFunctions
klion26 commented on issue #10292: [FLINK-14928][docs] Fix the broken links in documentation of page systemFunctions URL: https://github.com/apache/flink/pull/10292#issuecomment-557970148 @carp84 @wuchong thanks for the review and merging.
[GitHub] [flink] wuchong merged pull request #10277: [FLINK-14595][orc] Move flink-orc to flink-formats from flink-connectors
wuchong merged pull request #10277: [FLINK-14595][orc] Move flink-orc to flink-formats from flink-connectors URL: https://github.com/apache/flink/pull/10277
[jira] [Updated] (FLINK-14712) Improve back-pressure reporting mechanism
[ https://issues.apache.org/jira/browse/FLINK-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-14712:
---
Description:
h4. (1) The current monitoring is heavy-weight.
* Backpressure monitoring works by repeatedly taking stack trace samples of the running tasks.

h4. (2) It is difficult to find out which vertex is the source of backpressure.
* Users need to know both the current vertex's and the upstream network metrics to judge whether the current vertex is the source of backpressure. Today they have to record the relevant information themselves.

h3. Proposed Changes
1. Expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
2. Show the vertex that is the source of backpressure for the job.
3. Expose network metrics in IOMetricsInfo:
* SubTask
** pool usage: outPoolUsage, inputExclusiveBuffersUsage, inputFloatingBuffersUsage.
*** Show if the subtask is not back pressured but is causing backpressure (full input, empty output).
*** By comparing exclusive/floating buffer usage, show whether all channels are back-pressured or only some of them.
** back-pressured: shows whether it is back pressured.
* Vertex
** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg
** back-pressured: shows whether it is back pressured (merged over all its subtasks)

was:
h4. (1) The current monitoring is heavy-weight.
* Backpressure monitoring works by repeatedly taking stack trace samples of the running tasks.

h4. (2) It is difficult to find out which vertex is the source of backpressure.
* Users need to know both the current vertex's and the upstream network metrics to judge whether the current vertex is the source of backpressure. Today they have to record the relevant information themselves.

h3. Proposed Changes
1. Expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
2. Show the vertex that is the source of backpressure for the job.
3. Expose network pool usage in IOMetricsInfo:
# if the subtask is not back pressured but is causing backpressure (full input, empty output)
# by comparing exclusive/floating buffer usage, whether all channels are back-pressured or only some of them
{code:java}
public final class IOMetricsInfo {
    private final float outPoolUsage;
    private final float inputExclusiveBuffersUsage;
    private final float inputFloatingBuffersUsage;
}
{code}
JobDetailsInfo.JobVertexDetailsInfo merges using Math.max. (PS: outPoolUsage comes from the upstream side.)

> Improve back-pressure reporting mechanism
> -
>
> Key: FLINK-14712
> URL: https://issues.apache.org/jira/browse/FLINK-14712
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
> Attachments: image-2019-11-12-14-30-16-130.png
>
> h4. (1) The current monitoring is heavy-weight.
> * Backpressure monitoring works by repeatedly taking stack trace samples
> of the running tasks.
> h4. (2) It is difficult to find out which vertex is the source of
> backpressure.
> * Users need to know both the current vertex's and the upstream network
> metrics to judge whether the current vertex is the source of backpressure.
> Today they have to record the relevant information themselves.
> h3. Proposed Changes
> 1. Expose the new mechanism implemented in FLINK-14472 as an "is
> back-pressured" metric.
> 2. Show the vertex that is the source of backpressure for the job.
> 3. Expose network metrics in IOMetricsInfo:
> * SubTask
> ** pool usage: outPoolUsage, inputExclusiveBuffersUsage,
> inputFloatingBuffersUsage.
> *** Show if the subtask is not back pressured but is causing backpressure
> (full input, empty output).
> *** By comparing exclusive/floating buffer usage, show whether all channels
> are back-pressured or only some of them.
> ** back-pressured: shows whether it is back pressured.
> * Vertex
> ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> ** back-pressured: shows whether it is back pressured (merged over all its
> subtasks)
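The merge step in the proposal above ("JobDetailsInfo.JobVertexDetailsInfo merge use Math.max") can be sketched as a small standalone example. This is a hypothetical illustration, not Flink's actual implementation; it assumes the vertex-level value is simply the maximum over the subtask gauges:

```java
// Hypothetical sketch of merging per-subtask network gauges into a
// vertex-level view with Math.max, as the proposal describes.
public class PoolUsageMerge {

    // Vertex-level pool usage: the worst (highest) usage across subtasks.
    static float mergeMax(float[] subtaskUsages) {
        float max = 0f;
        for (float usage : subtaskUsages) {
            max = Math.max(max, usage);
        }
        return max;
    }

    // Vertex-level back-pressure flag: true if any subtask is back pressured.
    static boolean mergeBackPressured(boolean[] subtaskFlags) {
        for (boolean flag : subtaskFlags) {
            if (flag) {
                return true;
            }
        }
        return false;
    }
}
```

Taking the max (rather than an average) surfaces a single hot subtask that would otherwise be hidden by many idle ones.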
[jira] [Commented] (FLINK-14838) Cleanup the description about container number config option in Scala and python shell doc
[ https://issues.apache.org/jira/browse/FLINK-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981268#comment-16981268 ] vinoyang commented on FLINK-14838:
--
OK, will open a PR soon.

> Cleanup the description about container number config option in Scala and
> python shell doc
> --
>
> Key: FLINK-14838
> URL: https://issues.apache.org/jira/browse/FLINK-14838
> Project: Flink
> Issue Type: Improvement
> Components: Documentation
> Reporter: vinoyang
> Priority: Major
>
> Currently, the config option {{-n}} for Flink on YARN is no longer supported
> since Flink 1.8. FLINK-12362 did the cleanup job for this config option.
> However, the Scala shell and Python docs still contain some description of
> {{-n}}, which may confuse users. This issue is used to track the cleanup
> work.
[jira] [Updated] (FLINK-14815) Expose network metric in IOMetricsInfo
[ https://issues.apache.org/jira/browse/FLINK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-14815:
---
Description:
* SubTask
** pool usage: outPoolUsage, inputExclusiveBuffersUsage, inputFloatingBuffersUsage.
*** Show if the subtask is not back pressured but is causing backpressure (full input, empty output).
*** By comparing exclusive/floating buffer usage, show whether all channels are back-pressured or only some of them.
** back-pressured: shows whether it is back pressured.
* Vertex
** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg
** back-pressured: shows whether it is back pressured (merged over all its subtasks)

was:
* If the subtask is not back pressured but is causing backpressure (full input, empty output)
* By comparing exclusive/floating buffer usage, whether all channels are back-pressured or only some of them
{code:java}
public final class IOMetricsInfo {
    private final float outPoolUsage;
    private final float inputExclusiveBuffersUsage;
    private final float inputFloatingBuffersUsage;
}
{code}
JobDetailsInfo.JobVertexDetailsInfo merges using Math.max. (PS: outPoolUsage comes from the upstream side.)

> Expose network metric in IOMetricsInfo
> --
>
> Key: FLINK-14815
> URL: https://issues.apache.org/jira/browse/FLINK-14815
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
>
> * SubTask
> ** pool usage: outPoolUsage, inputExclusiveBuffersUsage,
> inputFloatingBuffersUsage.
> *** Show if the subtask is not back pressured but is causing backpressure
> (full input, empty output).
> *** By comparing exclusive/floating buffer usage, show whether all channels
> are back-pressured or only some of them.
> ** back-pressured: shows whether it is back pressured.
> * Vertex
> ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> ** back-pressured: shows whether it is back pressured (merged over all its
> subtasks)