[jira] [Created] (FLINK-17935) Logs could not show up when deploying Flink on Yarn via "--executor"
Yang Wang created FLINK-17935: - Summary: Logs could not show up when deploying Flink on Yarn via "--executor" Key: FLINK-17935 URL: https://issues.apache.org/jira/browse/FLINK-17935 Project: Flink Issue Type: Bug Affects Versions: 1.11.0, 1.12.0 Reporter: Yang Wang Fix For: 1.11.0 {code:java} ./bin/flink run -d -p 5 -e yarn-per-job examples/streaming/WindowJoin.jar{code} When we use {{-e/--executor}} to specify the deploy target as Yarn per-job, the logs do not show up. The root cause is that we do not set the logging files in {{ExecutorCLI}}; we only do it in {{FlinkYarnSessionCli}}. If we use {{-m yarn-cluster}}, everything works well. Maybe we should move {{setLogConfigFileInConfig}} to {{YarnClusterDescriptor}} to avoid this problem. cc [~kkl0u] -- This message was sent by Atlassian Jira (v8.3.4#803005)
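The log-config discovery the ticket refers to can be sketched as follows. This is a minimal illustration, not Flink's actual API: the class and method names below are hypothetical stand-ins for the logic that {{FlinkYarnSessionCli}} performs but {{ExecutorCLI}} reportedly skips, i.e. locate a {{log4j.properties}} in the configuration directory and fall back to {{logback.xml}}.

```java
import java.io.File;
import java.util.Optional;

// Hypothetical sketch of log-config discovery (illustrative names only,
// not Flink's real API): prefer log4j.properties from the conf directory,
// fall back to logback.xml, return empty if neither exists -- which is
// roughly the situation that leaves the Yarn containers without logs.
public class LogConfigSketch {
    static Optional<String> discoverLogConfig(String confDir) {
        File log4j = new File(confDir, "log4j.properties");
        if (log4j.exists()) {
            return Optional.of(log4j.getAbsolutePath());
        }
        File logback = new File(confDir, "logback.xml");
        return logback.exists() ? Optional.of(logback.getAbsolutePath()) : Optional.empty();
    }
}
```

Moving such discovery into a single place used by every CLI path (as the ticket suggests for {{YarnClusterDescriptor}}) would make the behavior independent of which command-line entry point was used.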
[jira] [Closed] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory
[ https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-17889. Resolution: Fixed master: ec0288c6df4edf63ef6601cde2a7b45eaa85cda3 release-1.11: 643f4a1283b82bcbd9cc08878bcee298e10d2bcd > flink-connector-hive jar contains wrong class in its SPI config file > org.apache.flink.table.factories.TableFactory > -- > > Key: FLINK-17889 > URL: https://issues.apache.org/jira/browse/FLINK-17889 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.0 >Reporter: Jeff Zhang >Assignee: Jingsong Lee >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > These 2 classes are in flink-connector-hive jar's SPI config file > {code:java} > org.apache.flink.orc.OrcFileSystemFormatFactory > License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code} > Due to this issue, I get the following exception in zeppelin side. > {code:java} > Caused by: java.util.ServiceConfigurationError: > org.apache.flink.table.factories.TableFactory: Provider > org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: > java.util.ServiceConfigurationError: > org.apache.flink.table.factories.TableFactory: Provider > org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at > java.util.ServiceLoader.fail(ServiceLoader.java:239) at > java.util.ServiceLoader.access$300(ServiceLoader.java:185) at > java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at > java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at > java.util.ServiceLoader$1.next(ServiceLoader.java:480) at > java.util.Iterator.forEachRemaining(Iterator.java:116) at > org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214) > ... 35 more {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
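The "not a subtype" failure above is the standard {{ServiceLoader}} behavior: every class listed in a {{META-INF/services/<interface>}} file must implement that interface. The classes below are stand-ins that reproduce the check in isolation; they are not the real Flink or ORC classes.

```java
import java.util.ServiceConfigurationError;

// Minimal reproduction of ServiceLoader's subtype check. TableFactory,
// OrcFormatFactory and GoodFactory are stand-ins, not the real classes.
public class SpiSubtypeCheck {
    interface TableFactory {}                            // the SPI interface
    static class OrcFormatFactory {}                     // listed but NOT a subtype
    static class GoodFactory implements TableFactory {}  // correctly listed provider

    static Object loadProvider(Class<?> service, Class<?> provider) {
        if (!service.isAssignableFrom(provider)) {
            // mirrors java.util.ServiceLoader's failure mode seen in the stack trace
            throw new ServiceConfigurationError(
                    service.getName() + ": Provider " + provider.getName() + " not a subtype");
        }
        try {
            return provider.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new ServiceConfigurationError(e.getMessage(), e);
        }
    }
}
```

This is why a stale or wrongly shaded SPI config file in a bundled jar breaks factory discovery at runtime rather than at build time.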
[jira] [Commented] (FLINK-17189) Table with processing time attribute can not be read from Hive catalog
[ https://issues.apache.org/jira/browse/FLINK-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116449#comment-17116449 ] Jingsong Lee commented on FLINK-17189: -- Feel free to re-open this if you think we need a fix in release-1.10 too. > Table with processing time attribute can not be read from Hive catalog > -- > > Key: FLINK-17189 > URL: https://issues.apache.org/jira/browse/FLINK-17189 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem, Table SQL / Planner >Affects Versions: 1.10.1 >Reporter: Timo Walther >Assignee: Jingsong Lee >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > DDL: > {code} > CREATE TABLE PROD_LINEITEM ( > L_ORDERKEY INTEGER, > L_PARTKEY INTEGER, > L_SUPPKEY INTEGER, > L_LINENUMBER INTEGER, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_CURRENCY STRING, > L_RETURNFLAG STRING, > L_LINESTATUS STRING, > L_ORDERTIME TIMESTAMP(3), > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING, > WATERMARK FOR L_ORDERTIME AS L_ORDERTIME - INTERVAL '5' MINUTE, > L_PROCTIME AS PROCTIME() > ) WITH ( > 'connector.type' = 'kafka', > 'connector.version' = 'universal', > 'connector.topic' = 'Lineitem', > 'connector.properties.zookeeper.connect' = 'not-needed', > 'connector.properties.bootstrap.servers' = 'kafka:9092', > 'connector.startup-mode' = 'earliest-offset', > 'format.type' = 'csv', > 'format.field-delimiter' = '|' > ); > {code} > Query: > {code} > SELECT * FROM prod_lineitem; > {code} > Result: > {code} > [ERROR] Could not execute SQL statement. 
Reason: > java.lang.AssertionError: Conversion to relational algebra failed to preserve > datatypes: > validated type: > RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER > L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, > DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME > ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIMESTAMP(3) NOT NULL > L_PROCTIME) NOT NULL > converted type: > RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER > L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, > DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME > ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIME > ATTRIBUTE(PROCTIME) NOT NULL L_PROCTIME) NOT NULL > rel: > LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], > L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], > L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], > L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], > L_PROCTIME=[$15]) > LogicalWatermarkAssigner(rowtime=[L_ORDERTIME], watermark=[-($11, > 30:INTERVAL MINUTE)]) > LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], > L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], > L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], > L_PROCTIME=[PROCTIME()]) > LogicalTableScan(table=[[hcat, default, prod_lineitem, source: > [KafkaTableSource(L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, > L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_CURRENCY, L_RETURNFLAG, L_LINESTATUS, > L_ORDERTIME, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT)]]]) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-17189) Table with processing time attribute can not be read from Hive catalog
[ https://issues.apache.org/jira/browse/FLINK-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-17189. Fix Version/s: (was: 1.10.2) Resolution: Fixed master: f08c8309738e519d246aeda163ef1b1f5e7855c7 release-1.11: 86d2777e89b87c842e9fc242757103656e807154 > Table with processing time attribute can not be read from Hive catalog > -- > > Key: FLINK-17189 > URL: https://issues.apache.org/jira/browse/FLINK-17189 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem, Table SQL / Planner >Affects Versions: 1.10.1 >Reporter: Timo Walther >Assignee: Jingsong Lee >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > DDL: > {code} > CREATE TABLE PROD_LINEITEM ( > L_ORDERKEY INTEGER, > L_PARTKEY INTEGER, > L_SUPPKEY INTEGER, > L_LINENUMBER INTEGER, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_CURRENCY STRING, > L_RETURNFLAG STRING, > L_LINESTATUS STRING, > L_ORDERTIME TIMESTAMP(3), > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING, > WATERMARK FOR L_ORDERTIME AS L_ORDERTIME - INTERVAL '5' MINUTE, > L_PROCTIME AS PROCTIME() > ) WITH ( > 'connector.type' = 'kafka', > 'connector.version' = 'universal', > 'connector.topic' = 'Lineitem', > 'connector.properties.zookeeper.connect' = 'not-needed', > 'connector.properties.bootstrap.servers' = 'kafka:9092', > 'connector.startup-mode' = 'earliest-offset', > 'format.type' = 'csv', > 'format.field-delimiter' = '|' > ); > {code} > Query: > {code} > SELECT * FROM prod_lineitem; > {code} > Result: > {code} > [ERROR] Could not execute SQL statement. 
Reason: > java.lang.AssertionError: Conversion to relational algebra failed to preserve > datatypes: > validated type: > RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER > L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, > DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME > ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIMESTAMP(3) NOT NULL > L_PROCTIME) NOT NULL > converted type: > RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER > L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, > DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME > ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIME > ATTRIBUTE(PROCTIME) NOT NULL L_PROCTIME) NOT NULL > rel: > LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], > L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], > L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], > L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], > L_PROCTIME=[$15]) > LogicalWatermarkAssigner(rowtime=[L_ORDERTIME], watermark=[-($11, > 30:INTERVAL MINUTE)]) > LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], > L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], > L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], > L_PROCTIME=[PROCTIME()]) > LogicalTableScan(table=[[hcat, default, prod_lineitem, source: > [KafkaTableSource(L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, > L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_CURRENCY, L_RETURNFLAG, L_LINESTATUS, > L_ORDERTIME, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT)]]]) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17934) StreamingFileWriter should set chainingStrategy
[ https://issues.apache.org/jira/browse/FLINK-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17934: - Description: {{StreamingFileWriter}} should be eagerly chained whenever possible. (was: We should let {{StreamingFileWriter}} should be eagerly chained whenever possible.) > StreamingFileWriter should set chainingStrategy > --- > > Key: FLINK-17934 > URL: https://issues.apache.org/jira/browse/FLINK-17934 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Priority: Blocker > Fix For: 1.11.0 > > > {{StreamingFileWriter}} should be eagerly chained whenever possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17934) StreamingFileWriter should set chainingStrategy
Jingsong Lee created FLINK-17934: Summary: StreamingFileWriter should set chainingStrategy Key: FLINK-17934 URL: https://issues.apache.org/jira/browse/FLINK-17934 Project: Flink Issue Type: Bug Components: Connectors / FileSystem Reporter: Jingsong Lee Fix For: 1.11.0 We should let {{StreamingFileWriter}} be eagerly chained whenever possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
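The proposed fix amounts to one line in the operator's constructor. The sketch below uses simplified stand-ins that mirror Flink's names ({{ChainingStrategy}}, the operator base class); it is not the real API, only an illustration of requesting eager chaining.

```java
// Simplified stand-ins mirroring Flink's naming, not the real classes:
// an operator that should be eagerly chained sets ChainingStrategy.ALWAYS.
public class ChainingSketch {
    enum ChainingStrategy { ALWAYS, HEAD, NEVER }

    static class AbstractStreamOperator {
        protected ChainingStrategy chainingStrategy = ChainingStrategy.HEAD;  // default
        ChainingStrategy getChainingStrategy() { return chainingStrategy; }
    }

    static class StreamingFileWriter extends AbstractStreamOperator {
        StreamingFileWriter() {
            // the one-line change the ticket asks for: request eager chaining
            this.chainingStrategy = ChainingStrategy.ALWAYS;
        }
    }
}
```

Without this, the writer would fall back to the base class's default strategy and could end up in its own task chain, adding needless serialization boundaries.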
[jira] [Commented] (FLINK-17303) Return TableResult for Python TableEnvironment
[ https://issues.apache.org/jira/browse/FLINK-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116405#comment-17116405 ] Dian Fu commented on FLINK-17303: - [~hequn8128] Could we also backport this to 1.11.0? Reasons: - It was originally in 1.11.0 and we only reverted it because the added tests were not stable, so I think it should be fine to add it back to 1.11.0 after we fix the test stability issue. [~zjwang] What are your thoughts? - Without this PR, the usability of a few APIs such as "execute_sql" suffers a lot, as they will have to return a Java TableResult object instead of a Python TableResult object. Python users are not aware of this and have to look up the Java docs to learn how to use these Python APIs. Besides, it introduces potential backward-compatibility issues between 1.11 and 1.12. > Return TableResult for Python TableEnvironment > -- > > Key: FLINK-17303 > URL: https://issues.apache.org/jira/browse/FLINK-17303 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: godfrey he >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > [FLINK-16366|https://issues.apache.org/jira/browse/FLINK-16366] supports > executing a statement and returning a {{TableResult}} object, which can get > the {{JobClient}} (to associate with the submitted Flink job), collect the execution > result, or print the execution result. In Python, we should also introduce a > Python TableResult class to stay consistent with Java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116402#comment-17116402 ] Jingsong Lee commented on FLINK-17925: -- I am OK with any of "proctime", "processing-time" and "process-time". To be consistent with "partition-time", I choose "process-time". > Fix Filesystem options to default values and types > -- > > Key: FLINK-17925 > URL: https://issues.apache.org/jira/browse/FLINK-17925 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Blocker > Fix For: 1.11.0 > > > Fix Filesystem options: > * Throw an unsupported exception when using the metastore commit policy for a > filesystem table; the filesystem connector has an empty implementation of > {{TableMetaStoreFactory}}. We should avoid users configuring this policy. > * The default value of "sink.partition-commit.trigger" should be "process-time". > It is hard for users to figure out what is wrong when they don't have a watermark. We > can set "sink.partition-commit.trigger" to "process-time" for a better > out-of-the-box experience. > * The type of "sink.rolling-policy.file-size" should be MemoryType. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17677) FLINK_LOG_PREFIX recommended in docs is not always available
[ https://issues.apache.org/jira/browse/FLINK-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116396#comment-17116396 ] Xintong Song commented on FLINK-17677: -- Hi [~RocMarshal], thanks for your response. I'm not sure whether I have understood your proposal correctly. Are you suggesting having separate configuration options for log file paths per deployment mode? If yes, could you explain the benefit of per-deployment configuration options compared to setting the common option differently for each deployment? > FLINK_LOG_PREFIX recommended in docs is not always available > > > Key: FLINK-17677 > URL: https://issues.apache.org/jira/browse/FLINK-17677 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.9.3, 1.10.1, 1.11.0 >Reporter: Xintong Song >Priority: Major > Fix For: 1.11.0, 1.10.2, 1.9.4 > > > The [Application Profiling & > Debugging|https://ci.apache.org/projects/flink/flink-docs-master/monitoring/application_profiling.html] > documentation recommends using the script variable {{FLINK_LOG_PREFIX}} for > defining log file paths. However, this variable is only available in > standalone mode. This is a bit misleading for users of other deployments (see > this > [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Memory-analyze-on-AWS-EMR-td35036.html]). > I propose to replace {{FLINK_LOG_PREFIX}} with a general representation > {{}}, and add a separate section to discuss how to set the log > path (e.g., use {{FLINK_LOG_PREFIX}} with standalone deployments and > {{}} with Yarn deployments). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116391#comment-17116391 ] Zili Chen edited comment on FLINK-17565 at 5/26/20, 3:23 AM: - [~felixzheng] That will be appreciated and you can assign it to [~zjwang] for coordinating the release process. was (Author: tison): [~felixzheng] That will be appreciated and you can assign [~zjwang] for coordinating the release process. > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116391#comment-17116391 ] Zili Chen commented on FLINK-17565: --- [~felixzheng] That will be appreciated and you can assign [~zjwang] for coordinating the release process. > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116389#comment-17116389 ] Canbin Zheng commented on FLINK-17565: -- Thank you [~zjwang]. [~tison] Should I make another PR for backporting this bugfix to 1.11? > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17572) Remove checkpoint alignment buffered metric from webui
[ https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116388#comment-17116388 ] Zhijiang commented on FLINK-17572: -- Already assigned to you, welcome to contribute! > Remove checkpoint alignment buffered metric from webui > -- > > Key: FLINK-17572 > URL: https://issues.apache.org/jira/browse/FLINK-17572 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.11.0 >Reporter: Yingjie Cao >Assignee: Nicholas Jiang >Priority: Minor > Fix For: 1.12.0 > > > After FLINK-16404, we never cache buffers during checkpoint barrier alignment, > so the checkpoint alignment buffered metric will always be 0; we should > remove it directly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17572) Remove checkpoint alignment buffered metric from webui
[ https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang reassigned FLINK-17572: Assignee: Nicholas Jiang > Remove checkpoint alignment buffered metric from webui > -- > > Key: FLINK-17572 > URL: https://issues.apache.org/jira/browse/FLINK-17572 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.11.0 >Reporter: Yingjie Cao >Assignee: Nicholas Jiang >Priority: Minor > Fix For: 1.12.0 > > > After FLINK-16404, we never cache buffers during checkpoint barrier alignment, > so the checkpoint alignment buffered metric will always be 0; we should > remove it directly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-17909: - Description: Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a HashMap. However, this leads to many bugs when users switch to a persisted catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, FLINK-16021. That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the important path of serialization and deserialization of meta data. We miss some important meta information (PK, time attributes) during serialization and deserialization, which leads to bugs. So I propose to add a {{TestCatalog}} that holds the serialized meta data to cover the serialization and deserialization path. The serialized meta data may be in the {{Map}} properties format. was: Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a HashMap. However, this leads to many bugs when users switch to a persisted catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, FLINK-16021. That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the important path of serialization and deserialization of meta data. We miss some important meta information (PK, time attributes) during serialization and deserialization, which leads to bugs. So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} to cover the serialization and deserialization path. The serialized meta data may be in the {{Map}} properties format. We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly used in demo/experiment/testing, so I think it's fine. 
> Add a TestCatalog to hold the serialized meta data to uncover more potential > bugs > - > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a > HashMap. However, this leads to many bugs when users switch to a persisted > catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We miss > some important meta information (PK, time attributes) during serialization and > deserialization, which leads to bugs. > So I propose to add a {{TestCatalog}} that holds the serialized meta data to > cover the serialization and deserialization path. The serialized meta data may > be in the {{Map}} properties format. -- This message was sent by Atlassian Jira (v8.3.4#803005)
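The idea behind the proposed {{TestCatalog}} can be sketched as below. All names and key formats are hypothetical (the ticket does not specify them): the catalog keeps table metadata only in its serialized {{Map<String, String>}} form and rebuilds it on every read, so any field lost during (de)serialization surfaces in tests instead of being silently preserved by live in-memory objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a catalog that stores only serialized metadata,
// forcing the serialization and deserialization paths to run on every access.
public class SerializingCatalogSketch {
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void createTable(String name, Map<String, String> serializedMeta) {
        // store the serialized (properties-map) form only, never the live object
        tables.put(name, new HashMap<>(serializedMeta));
    }

    Map<String, String> getTable(String name) {
        // every read goes through the "deserialization" path via a fresh copy
        return new HashMap<>(tables.get(name));
    }
}
```

A table whose primary key or time attribute is dropped by the serializer would then come back incomplete from {{getTable}}, failing tests that an object-holding in-memory catalog would pass.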
[jira] [Commented] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116387#comment-17116387 ] Jark Wu commented on FLINK-17909: - OK. I renamed the title and description to introduce a `TestCatalog` which will hold the serialized meta data. I prefer to have better testing coverage in the table module rather than moving tests into connector modules. > Add a TestCatalog to hold the serialized meta data to uncover more potential > bugs > - > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a > HashMap. However, this leads to many bugs when users switch to a persisted > catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We miss > some important meta information (PK, time attributes) during serialization and > deserialization, which leads to bugs. > So I propose to add a {{TestCatalog}} that holds the serialized meta data to > cover the serialization and deserialization path. The serialized meta data may > be in the {{Map}} properties format. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116387#comment-17116387 ] Jark Wu edited comment on FLINK-17909 at 5/26/20, 2:57 AM: --- OK. I renamed the title and description to introduce a `TestCatalog` which will hold the serialized meta data. I prefer to have a better testing coverage in table module, rather than moving tests into connector modules. was (Author: jark): OK. I renamed the title and description to introduce a `TestCatalog` which will hold the serialized meta data. I prefer to have a better testing coverage in table module, rather moving tests into connector modules. > Add a TestCatalog to hold the serialized meta data to uncover more potential > bugs > - > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a > HashMap. However, this leads to many bugs when users switch to a persisted > catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We miss > some important meta information (PK, time attributes) during serialization and > deserialization, which leads to bugs. > So I propose to add a {{TestCatalog}} that holds the serialized meta data to > cover the serialization and deserialization path. The serialized meta data may > be in the {{Map}} properties format. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-17909: Summary: Add a TestCatalog to hold the serialized meta data to uncover more potential bugs (was: Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs) > Add a TestCatalog to hold the serialized meta data to uncover more potential > bugs > - > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a > HashMap. However, this leads to many bugs when users switch to a persisted > catalog, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We miss > some important meta information (PK, time attributes) during serialization and > deserialization, which leads to bugs. > So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} > to cover the serialization and deserialization path. The serialized meta data > may be in the {{Map}} properties format. > We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly > used in demo/experiment/testing, so I think it's fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116373#comment-17116373 ] Zhijiang commented on FLINK-17565: -- Since this fix resolves some known bugs, it is in compliance with the release rule. Makes sense to me. :) > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17677) FLINK_LOG_PREFIX recommended in docs is not always available
[ https://issues.apache.org/jira/browse/FLINK-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116372#comment-17116372 ] RocMarshal commented on FLINK-17677: Hi, [~xintongsong] What do you think of this idea: configure the log path through a configuration item specific to each deployment mode, so that the settings of different deployment modes are isolated from each other. If no valid value is configured for the current deployment mode, fall back to reading a common configuration item that all deployment modes are aware of. Thank you. Best, Roc > FLINK_LOG_PREFIX recommended in docs is not always available > > > Key: FLINK-17677 > URL: https://issues.apache.org/jira/browse/FLINK-17677 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.9.3, 1.10.1, 1.11.0 >Reporter: Xintong Song >Priority: Major > Fix For: 1.11.0, 1.10.2, 1.9.4 > > > The [Application Profiling & > Debugging|https://ci.apache.org/projects/flink/flink-docs-master/monitoring/application_profiling.html] > documentation recommend to use the script variable {{FLINK_LOG_PREFIX}} for > defining log file paths. However, this variable is only available in > standalone mode. This is a bit misleading for users of other deployments (see > this > [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Memory-analyze-on-AWS-EMR-td35036.html]). > I propose to replace {{FLINK_LOG_PREFIX}} with a general representation > {{}}, and add a separate section to discuss how to set the log > path (e.g., use {{FLINK_LOG_PREFIX}} with standalone deployments and > {{}} with Yarn deployments). -- This message was sent by Atlassian Jira (v8.3.4#803005)
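The fallback idea proposed in the comment above can be sketched as follows. This is a minimal illustration, not Flink code; the key scheme (`<mode>.log.path`, `common.log.path`) is a hypothetical stand-in for whatever configuration items would actually be chosen.

```java
import java.util.Map;
import java.util.Optional;

// Sketch of the proposed resolution order: prefer a deployment-mode-specific
// log-path key, and fall back to a common key shared by all deployment modes.
public class LogPathResolver {
    static Optional<String> resolveLogPath(String mode, Map<String, String> conf) {
        // Hypothetical per-mode key, e.g. "yarn.log.path" or "standalone.log.path".
        String specific = conf.get(mode + ".log.path");
        if (specific != null) {
            return Optional.of(specific);
        }
        // Fallback: a common key that every deployment mode is aware of.
        return Optional.ofNullable(conf.get("common.log.path"));
    }
}
```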
[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service
[ https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116369#comment-17116369 ] Canbin Zheng commented on FLINK-17230: -- [~zjwang] The scope of this problem is that end-users would get the wrong Endpoint in HA setups and thus cannot talk to the JobManager from the client side via the retrieved Endpoint. Hence, it would be nice to backport this PR to 1.11. > Fix incorrect returned address of Endpoint for the ClusterIP Service > > > Key: FLINK-17230 > URL: https://issues.apache.org/jira/browse/FLINK-17230 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0, 1.10.1 >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > At the moment, when the type of the external Service is set to {{ClusterIP}}, > we return an incorrect address > {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for > the Endpoint. > This ticket aims to fix this bug by returning > {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
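The one-line fix described in the ticket above can be illustrated with a minimal sketch. The helper methods below are simplified stand-ins for the real `KubernetesUtils` methods (the `-rest` suffix is an assumption for illustration, not taken from the ticket text):

```java
// Sketch of the FLINK-17230 endpoint fix; names mirror KubernetesUtils
// but are simplified stand-ins, not the actual Flink API.
public class EndpointSketch {
    static String getInternalServiceName(String clusterId) { return clusterId; }
    static String getRestServiceName(String clusterId) { return clusterId + "-rest"; }

    // Before the fix: the internal (headless) service name was returned,
    // which the client cannot use to reach the REST endpoint.
    static String buggyEndpoint(String clusterId, String namespace) {
        return getInternalServiceName(clusterId) + "." + namespace;
    }

    // After the fix: the rest service name is returned, which is the one
    // actually reachable for a ClusterIP service.
    static String fixedEndpoint(String clusterId, String namespace) {
        return getRestServiceName(clusterId) + "." + namespace;
    }

    public static void main(String[] args) {
        System.out.println(fixedEndpoint("my-cluster", "default")); // my-cluster-rest.default
    }
}
```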
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116363#comment-17116363 ] Canbin Zheng commented on FLINK-17565: -- [~zjwang] 1.11 supports application mode on Kubernetes, and as FLINK-17566 describes, backporting this PR to 1.11 makes application mode more robust, so end-users interested in Kubernetes setups can benefit from it. > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink
[ https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116362#comment-17116362 ] Dian Fu commented on FLINK-17883: - Yes, I think so. We could close it after verifying that it works in 1.11. I'll verify it. > Unable to configure write mode for FileSystem() connector in PyFlink > > > Key: FLINK-17883 > URL: https://issues.apache.org/jira/browse/FLINK-17883 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.10.1 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Major > > As a user of PyFlink, I'm getting the following exception: > {code} > File or directory /tmp/output already exists. Existing files and directories > are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite > existing files and directories. > {code} > I would like to be able to configure writeMode = OVERWRITE for the FileSystem > connector. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116360#comment-17116360 ] Jingsong Lee commented on FLINK-17909: -- +1 for [~twalthr]'s proposal. We can set a `TestCatalog` in our `BatchTestBase`, `StreamingTestBase`, etc. We should not disturb {{GenericInMemoryCatalog}} just for testing. > Make the GenericInMemoryCatalog to hold the serialized meta data to uncover > more potential bugs > --- > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} hold the meta objects in > HashMap. However, this lead to many bugs when users switch to some persisted > catalogs, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We missed > some important meta information (PK, time attributes) when serialization and > deserialization which lead to bugs. > So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} > to cover the serialization and deserializtion path. The serialized meta data > may be in the {{Map}} properties format. > We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly > used in demo/experiment/testing, so I think it's fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
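The `TestCatalog` idea discussed above can be sketched minimally. This is not Flink API; it only illustrates the principle of storing tables as serialized properties (a `Map<String, String>`) so that every read exercises the deserialization path, instead of handing back live in-memory objects:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (illustrative, not Flink code) of a catalog that holds only
// serialized table properties. Anything that does not survive the properties
// round trip is lost here, surfacing bugs that a plain in-memory HashMap of
// live objects would hide (e.g. missing PK or time-attribute information).
public class TestCatalogSketch {
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    public void createTable(String name, Map<String, String> properties) {
        // Defensive copy simulates "serialize on write".
        tables.put(name, new HashMap<>(properties));
    }

    public Map<String, String> getTable(String name) {
        // Copy again simulates "deserialize on read".
        return new HashMap<>(tables.get(name));
    }
}
```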
[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service
[ https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116359#comment-17116359 ] Yang Wang commented on FLINK-17230: --- This is a bugfix for wrongly returning a rest endpoint when the service exposed type is {{ClusterIP}}. I think it needs to be picked into branch 1.11. [~zjwang] Do you have any concerns about integrating this PR into branch 1.11? > Fix incorrect returned address of Endpoint for the ClusterIP Service > > > Key: FLINK-17230 > URL: https://issues.apache.org/jira/browse/FLINK-17230 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0, 1.10.1 >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > At the moment, when the type of the external Service is set to {{ClusterIP}}, > we return an incorrect address > {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for > the Endpoint. > This ticket aims to fix this bug by returning > {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17395) Add the sign and shasum logic for PyFlink wheel packages
[ https://issues.apache.org/jira/browse/FLINK-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116358#comment-17116358 ] Dian Fu commented on FLINK-17395: - Merged the follow-up "Don't remove the dist directory during clean" via - master: 8f992e8e868b846cf7fe8de23923358fc6b50721 - release-1.11: 0285cb7db39b5f9ba4aeb6cd3848cab6db04a333 > Add the sign and shasum logic for PyFlink wheel packages > > > Key: FLINK-17395 > URL: https://issues.apache.org/jira/browse/FLINK-17395 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Release System >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > > Add the sign and sha logic for PyFlink wheel packages -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116355#comment-17116355 ] Yang Wang commented on FLINK-17565: --- This ticket will fix the following usability bugs and should be included in 1.11.0. * Make Flink on K8s work on Java 8u252, FLINK-17416 * Fix the potential resource leak when the job reaches the final state, FLINK-17566 [~zjwang] Do you have any concerns about merging this into branch 1.11? > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17572) Remove checkpoint alignment buffered metric from webui
[ https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107123#comment-17107123 ] Nicholas Jiang edited comment on FLINK-17572 at 5/26/20, 1:58 AM: -- [~zjwang]I think this just removes alignmentBuffered in Checkpoint Statistics. Therefore, could u please assign this issue to me? was (Author: nicholasjiang): [~kevin.cyj]I think this just removes alignmentBuffered in Checkpoint Statistics. Therefore, could u please assign this issue to me? > Remove checkpoint alignment buffered metric from webui > -- > > Key: FLINK-17572 > URL: https://issues.apache.org/jira/browse/FLINK-17572 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.11.0 >Reporter: Yingjie Cao >Priority: Minor > Fix For: 1.12.0 > > > After FLINK-16404, we never cache buffers while checkpoint barrier alignment, > so the checkpoint alignment buffered metric will be always 0, we should > remove it directly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-17924) HBase connector:we can write data to HBase table, but con`t read data from HBase
[ https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-17924. --- Fix Version/s: (was: 1.11.0) Resolution: Duplicate > HBase connector:we can write data to HBase table, but con`t read data from > HBase > > > Key: FLINK-17924 > URL: https://issues.apache.org/jira/browse/FLINK-17924 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0 >Reporter: pengnengsong >Priority: Major > > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > EnviromentSettings streamSettings = > EnviromentSetting.newInstance().useBlinkPlanner().inStreamingMode().build(); > StreamTableEnviroment tEnv = StreamTableEnviroment.create(env, > streamSettings); > CREATE TABLE source( > rowkey INT, > f1 ROW, > f2 ROW > )WITH( > 'connector.type' = 'hbase', > 'connector.version' = '1.4.3', > 'connector.table-name' = 'test_source:flink', > 'connector.zookeeper.quorum' = ':2181', > 'connector.zookeeper.znode.parent' = '/test/hbase' > ) > SELECT * FROM source > > In HBaseRowInputformat.Java when execute the configure method, this.conf is > always null. The default hbase configuration information is created, and the > configuration parameters in with are not in effect. > private void connectToTable(){ > if(this.conf ==null) > { this.conf = HBaseConfiguration.create(); } > ... > } -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource
[ https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-17932: --- Assignee: Leonard Xu > Most of hbase config Options can not work in HBaseTableSource > - > > Key: FLINK-17932 > URL: https://issues.apache.org/jira/browse/FLINK-17932 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > > All hbase config Options can not work in HBaseTableSource, because field > `conf` in HBaseRowInputFormat is not serializable, and HBaseTableSource will > failback to load hbase-site.xml from classpath. I.E, many table config > Options that user defined in DDL is useless and HbaseTableSource only load > config options from classpath. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource
[ https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-17932: Fix Version/s: 1.11.0 > Most of hbase config Options can not work in HBaseTableSource > - > > Key: FLINK-17932 > URL: https://issues.apache.org/jira/browse/FLINK-17932 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Fix For: 1.11.0 > > > All hbase config Options can not work in HBaseTableSource, because field > `conf` in HBaseRowInputFormat is not serializable, and HBaseTableSource will > failback to load hbase-site.xml from classpath. I.E, many table config > Options that user defined in DDL is useless and HbaseTableSource only load > config options from classpath. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17924) HBase connector:we can write data to HBase table, but con`t read data from HBase
[ https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-17924: Fix Version/s: 1.11.0 > HBase connector:we can write data to HBase table, but con`t read data from > HBase > > > Key: FLINK-17924 > URL: https://issues.apache.org/jira/browse/FLINK-17924 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0 >Reporter: pengnengsong >Priority: Major > Fix For: 1.11.0 > > > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > EnviromentSettings streamSettings = > EnviromentSetting.newInstance().useBlinkPlanner().inStreamingMode().build(); > StreamTableEnviroment tEnv = StreamTableEnviroment.create(env, > streamSettings); > CREATE TABLE source( > rowkey INT, > f1 ROW, > f2 ROW > )WITH( > 'connector.type' = 'hbase', > 'connector.version' = '1.4.3', > 'connector.table-name' = 'test_source:flink', > 'connector.zookeeper.quorum' = ':2181', > 'connector.zookeeper.znode.parent' = '/test/hbase' > ) > SELECT * FROM source > > In HBaseRowInputformat.Java when execute the configure method, this.conf is > always null. The default hbase configuration information is created, and the > configuration parameters in with are not in effect. > private void connectToTable(){ > if(this.conf ==null) > { this.conf = HBaseConfiguration.create(); } > ... > } -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17924) HBase connector:we can write data to HBase table, but con`t read data from HBase
[ https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116353#comment-17116353 ] Jark Wu commented on FLINK-17924: - Yes, I think this is a bug. > HBase connector:we can write data to HBase table, but con`t read data from > HBase > > > Key: FLINK-17924 > URL: https://issues.apache.org/jira/browse/FLINK-17924 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0 >Reporter: pengnengsong >Priority: Major > Fix For: 1.11.0 > > > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > EnviromentSettings streamSettings = > EnviromentSetting.newInstance().useBlinkPlanner().inStreamingMode().build(); > StreamTableEnviroment tEnv = StreamTableEnviroment.create(env, > streamSettings); > CREATE TABLE source( > rowkey INT, > f1 ROW, > f2 ROW > )WITH( > 'connector.type' = 'hbase', > 'connector.version' = '1.4.3', > 'connector.table-name' = 'test_source:flink', > 'connector.zookeeper.quorum' = ':2181', > 'connector.zookeeper.znode.parent' = '/test/hbase' > ) > SELECT * FROM source > > In HBaseRowInputformat.Java when execute the configure method, this.conf is > always null. The default hbase configuration information is created, and the > configuration parameters in with are not in effect. > private void connectToTable(){ > if(this.conf ==null) > { this.conf = HBaseConfiguration.create(); } > ... > } -- This message was sent by Atlassian Jira (v8.3.4#803005)
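The failure mode reported in FLINK-17924/FLINK-17932 above boils down to a non-serializable field (Hadoop's `Configuration`) being lost when the input format is shipped to the cluster, so `connectToTable()` falls back to defaults from the classpath. A common workaround pattern is to carry the user's options in a serializable map and rebuild the configuration on open. The sketch below is illustrative only; the class and field names are hypothetical, not the actual `HBaseRowInputFormat` code:

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Sketch of the workaround: the transient field stands in for the
// non-serializable Hadoop Configuration, which arrives as null on the
// task manager; the serializable map carries the DDL options across.
public class ConfShippingSketch implements Serializable {
    private transient Map<String, String> conf;              // lost after ser/de
    private final Map<String, String> serializableProps = new HashMap<>();

    public void set(String key, String value) {
        serializableProps.put(key, value);
    }

    // Called on the task manager after deserialization.
    public void open() {
        if (conf == null) {
            conf = new HashMap<>();          // stands in for HBaseConfiguration.create()
            conf.putAll(serializableProps);  // re-apply the user's DDL options
        }
    }

    public String get(String key) { return conf.get(key); }
}
```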
[jira] [Assigned] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-17909: --- Assignee: (was: Jark Wu) > Make the GenericInMemoryCatalog to hold the serialized meta data to uncover > more potential bugs > --- > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} hold the meta objects in > HashMap. However, this lead to many bugs when users switch to some persisted > catalogs, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We missed > some important meta information (PK, time attributes) when serialization and > deserialization which lead to bugs. > So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} > to cover the serialization and deserializtion path. The serialized meta data > may be in the {{Map}} properties format. > We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly > used in demo/experiment/testing, so I think it's fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116352#comment-17116352 ] Jark Wu commented on FLINK-17925: - Should it be "processing-time"? > Fix Filesystem options to default values and types > -- > > Key: FLINK-17925 > URL: https://issues.apache.org/jira/browse/FLINK-17925 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Blocker > Fix For: 1.11.0 > > > Fix Filesystem options: > * Throws unsupported exception when using metastore commit policy for > filesystem table, Filesystem connector has an empty implementation in > {{TableMetaStoreFactory}}. We should avoid user configuring this policy. > * Default value of "sink.partition-commit.trigger" should be "process-time". > Users are hard to figure out what is wrong when they don't have watermark. We > can set "sink.partition-commit.trigger" to "process-time" to have better > out-of-box experience. > * The type of "sink.rolling-policy.file-size" should be MemoryType. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink
[ https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116347#comment-17116347 ] Nicholas Jiang commented on FLINK-17883: [~dian.fu], as you mentioned, this is already supported via the INSERT OVERWRITE statement; therefore, could this issue be closed? > Unable to configure write mode for FileSystem() connector in PyFlink > > > Key: FLINK-17883 > URL: https://issues.apache.org/jira/browse/FLINK-17883 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.10.1 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Major > > As a user of PyFlink, I'm getting the following exception: > {code} > File or directory /tmp/output already exists. Existing files and directories > are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite > existing files and directories. > {code} > I would like to be able to configure writeMode = OVERWRITE for the FileSystem > connector. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16057) Performance regression in ContinuousFileReaderOperator
[ https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116275#comment-17116275 ] Roman Khachatryan commented on FLINK-16057: --- I don't know the reason. My first guess was that the benchmark is wrong, but I couldn't spot any issues. Can you please take a look: [https://github.com/dataArtisans/flink-benchmarks/pull/62] ? I also repeated the non-io-benchmark (that generates numbers without reading files): {code:java} old: "Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit" "org.apache.flink.benchmark.ContinuousFileReaderOperatorBenchmark.readFileSplit","thrpt",1,30,18193.296713,1288.369575,"ops/ms" new: "Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit" "org.apache.flink.benchmark.ContinuousFileReaderOperatorBenchmark.readFileSplit","thrpt",1,30,12491.844935,165.684694,"ops/ms" {code} > Performance regression in ContinuousFileReaderOperator > -- > > Key: FLINK-16057 > URL: https://issues.apache.org/jira/browse/FLINK-16057 > Project: Flink > Issue Type: Bug > Components: API / DataStream, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > After switching CFRO to a single-threaded execution model performance > regression was expected to be about 15-20% (benchmarked in November). > But after merging to master it turned out to be about 50%. > > One reason is that the chaining strategy isn't set by default in CFRO factory. > Without that even reading and outputting all records of a split in a single > mail action doesn't reverse the regression (only about half). > However, with strategy set AND batching enabled fixes the regression > (starting from batch size 6). > Though batching can't be used in practice because it can significantly delay > checkpointing. 
> > Another approach would be to process one record and the repeat until > defaultMailboxActionAvailable OR haveNewMail. > This reverses regression and even improves the performance by about 50% > compared to the old version. > > The final solution could also be FLIP-27. > > Other things tried (didn't help): > * CFRO rework without subsequent commits (removing checkpoint lock) > * different batch sizes, including the whole split, without chaining > strategy fixed - partial improvement only > * disabling close > * disabling checkpointing > * disabling output (serialization) > * using LinkedList instead of PriorityQueue > -- This message was sent by Atlassian Jira (v8.3.4#803005)
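The "process one record and repeat until other mail arrives" approach from the ticket above can be sketched as a simple loop. This is illustrative only, not Flink's mailbox code; `Mailbox` here is a hypothetical stand-in for the `defaultMailboxActionAvailable OR haveNewMail` check:

```java
import java.util.List;
import java.util.Queue;

// Sketch of the proposed reader loop: keep emitting records from the current
// split within one mail action, yielding only when the mailbox has other work
// (e.g. a checkpoint barrier), so checkpointing is not delayed by large batches.
public class ReaderLoopSketch {
    interface Mailbox { boolean hasMail(); }

    static int processSplit(Queue<Integer> split, Mailbox mailbox, List<Integer> out) {
        int processed = 0;
        while (!split.isEmpty()) {
            out.add(split.poll());   // process exactly one record...
            processed++;
            if (mailbox.hasMail()) {
                break;               // ...then yield promptly if other mail is pending
            }
        }
        return processed;
    }
}
```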
[jira] [Closed] (FLINK-17870) dependent jars are missing to be shipped to cluster in scala shell
[ https://issues.apache.org/jira/browse/FLINK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas closed FLINK-17870. -- Resolution: Fixed Merged on master with ea54a54be1ea5629ccbd2421aa6f91b19870d40c on release-1.11 with 714369797781e1a49aabf9e45f6ce8d09cb6336a and on release-1.10 with 492f9de2b9040a4ca6c45d5060bc776448682f07 > dependent jars are missing to be shipped to cluster in scala shell > -- > > Key: FLINK-17870 > URL: https://issues.apache.org/jira/browse/FLINK-17870 > Project: Flink > Issue Type: Bug > Components: Scala Shell >Affects Versions: 1.11.0 >Reporter: Jeff Zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.10.2, 1.12.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17870) dependent jars are missing to be shipped to cluster in scala shell
[ https://issues.apache.org/jira/browse/FLINK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-17870: --- Fix Version/s: 1.12.0 1.10.2 1.11.0 > dependent jars are missing to be shipped to cluster in scala shell > -- > > Key: FLINK-17870 > URL: https://issues.apache.org/jira/browse/FLINK-17870 > Project: Flink > Issue Type: Bug > Components: Scala Shell >Affects Versions: 1.11.0 >Reporter: Jeff Zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.10.2, 1.12.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-15012) Checkpoint directory not cleaned up
[ https://issues.apache.org/jira/browse/FLINK-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116169#comment-17116169 ] Stephan Ewen commented on FLINK-15012: -- I think there is a big difference between the working/temp directories and the checkpoint directories. The working/temp directories can be cleaned up after processes shut down, because no data in them will ever be needed. The checkpoint directories may contain retained checkpoints or savepoints that are still relevant. I think we should not ever try to delete these with things like "shutdown hooks". I understand that job cancellation should remove the job's empty parent checkpoint directories. That makes sense. And [~yunta] proposed an issue to fix this. I would question whether we should try and do anything about the {{stop-cluster.sh}} behavior. This is forceful wiping of the cluster rather than proper shutdown, so left-over data is to be expected. And, in my mind, the caution to not accidentally delete a still-needed checkpoint is more important than making the "hard stop" as nice as possible (cleanup wise). > Checkpoint directory not cleaned up > --- > > Key: FLINK-15012 > URL: https://issues.apache.org/jira/browse/FLINK-15012 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.9.1 >Reporter: Nico Kruber >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > Time Spent: 10m > Remaining Estimate: 0h > > I started a Flink cluster with 2 TMs using {{start-cluster.sh}} and the > following config (in addition to the default {{flink-conf.yaml}}) > {code:java} > state.checkpoints.dir: file:///path/to/checkpoints/ > state.backend: rocksdb {code} > After submitting a job with checkpoints enabled (every 5s), checkpoints show > up, e.g. 
> {code:java} > bb969f842bbc0ecc3b41b7fbe23b047b/ > ├── chk-2 > │ ├── 238969e1-6949-4b12-98e7-1411c186527c > │ ├── 2702b226-9cfc-4327-979d-e5508ab2e3d5 > │ ├── 4c51cb24-6f71-4d20-9d4c-65ed6e826949 > │ ├── e706d574-c5b2-467a-8640-1885ca252e80 > │ └── _metadata > ├── shared > └── taskowned {code} > If I shut down the cluster via {{stop-cluster.sh}}, these files will remain > on disk and not be cleaned up. > In contrast, if I cancel the job, at least {{chk-2}} will be deleted, but > still leaving the (empty) directories. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17933) TaskManager was terminated on Yarn - investigate
[ https://issues.apache.org/jira/browse/FLINK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Khachatryan updated FLINK-17933: -- Description: When running a job on Yarn cluster (load testing) some jobs result in failures. Initial symptoms are no bytes written/transferred in CSV and failures in logs: {code:java} 2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map -> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING to FAILED.{code} It turned out that all such failures were caused by "Connection reset" from a single IP, except for one "Leadership lost" error (another IP). Connection reset was likely caused by TM receiving SIGTERM (container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229): {code:java} 2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code} Other TMs received SIGTERM one minute later (all logs were uploaded at the same time though). >From the JM it looked like this: {code:java} 2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Trigger heartbeat request. 2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_05. 2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_06. 2020-05-17 10:02:23,592 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_04. 2020-05-17 10:02:23,595 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_03. 2020-05-17 10:02:23,598 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_02. 
2020-05-17 10:02:23,725 DEBUG org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received acknowledge message for checkpoint 12 from task 459efd2ad8fe2ffe7fffe28530064fe1 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at container_1589453804748_0118_01_02 @ ip-172-31-43-69.eu-central-1.compute.internal (dataPort=44625). 2020-05-17 10:02:29,103 DEBUG org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received acknowledge message for checkpoint 12 from task 266a9326be7e3ec669cce2e6a97ae5b0 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at container_1589453804748_0118_01_05 @ ip-172-31-42-229.eu-central-1.compute.internal (dataPort=37329). 2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:3] has failed, address is now gated for [50] ms. Reason: [Disassociated] 2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:42567] has failed, address is now gated for [50] ms. Reason: [Disassociated] 2020-05-17 10:02:32,900 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Map -> Flat Map (87/160) (cb77c7002503baa74baf73a3a100c2f2) switched from RUNNING to FAILED. org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: readAddress(..) failed: Connection reset by peer (connection to 'ip-172-31-42-229.eu-central-1.compute.internal/172.31.42.229:37329'){code} There are also JobManager heartbeat timeouts but they don't correlate with the issue. was: When running a job on Yarn cluster (load testing) some jobs result in failures. 
Initial symptoms are 0 bytes written/transferred in CSV and failures in logs: {code:java} 2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map -> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING to FAILED.{code} I turned out that all such failures were caused by "Connection reset" from a single IP, except for on "Leadership lost" error. Connection reset was likely caused by TM receiving SIGTERM (container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229): {code:java} 2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code} Other TMs received SIGTERM one minute later (all logs were uploaded at the same time though). >From the JM it looked like this: {code:java} 2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Trigger heartbeat request. 2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_05. 2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_06.
[jira] [Updated] (FLINK-17933) TaskManager was terminated on Yarn - investigate
[ https://issues.apache.org/jira/browse/FLINK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Khachatryan updated FLINK-17933: -- Description: When running a job on Yarn cluster (load testing) some jobs result in failures. Initial symptoms are 0 bytes written/transferred in CSV and failures in logs: {code:java} 2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map -> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING to FAILED.{code} I turned out that all such failures were caused by "Connection reset" from a single IP, except for on "Leadership lost" error. Connection reset was likely caused by TM receiving SIGTERM (container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229): {code:java} 2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code} Other TMs received SIGTERM one minute later (all logs were uploaded at the same time though). >From the JM it looked like this: {code:java} 2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Trigger heartbeat request. 2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_05. 2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_06. 2020-05-17 10:02:23,592 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_04. 2020-05-17 10:02:23,595 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_03. 2020-05-17 10:02:23,598 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - Received heartbeat from container_1589453804748_0118_01_02. 
2020-05-17 10:02:23,725 DEBUG org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received acknowledge message for checkpoint 12 from task 459efd2ad8fe2ffe7fffe28530064fe1 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at container_1589453804748_0118_01_02 @ ip-172-31-43-69.eu-central-1.compute.internal (dataPort=44625). 2020-05-17 10:02:29,103 DEBUG org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received acknowledge message for checkpoint 12 from task 266a9326be7e3ec669cce2e6a97ae5b0 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at container_1589453804748_0118_01_05 @ ip-172-31-42-229.eu-central-1.compute.internal (dataPort=37329). 2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:3] has failed, address is now gated for [50] ms. Reason: [Disassociated] 2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:42567] has failed, address is now gated for [50] ms. Reason: [Disassociated] 2020-05-17 10:02:32,900 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Map -> Flat Map (87/160) (cb77c7002503baa74baf73a3a100c2f2) switched from RUNNING to FAILED. org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: readAddress(..) failed: Connection reset by peer (connection to 'ip-172-31-42-229.eu-central-1.compute.internal/172.31.42.229:37329'){code} There are also JobManager heartbeat timeouts but they don't correlate with the issue. 
> TaskManager was terminated on Yarn - investigate > > > Key: FLINK-17933 > URL: https://issues.apache.org/jira/browse/FLINK-17933 > Project: Flink > Issue Type: Task > Components: Deployment / YARN, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Major > > When running a job on Yarn cluster (load testing) some jobs result in > failures. > Initial symptoms are 0 bytes written/transferred in CSV and failures in logs: > > {code:java} > 2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - > Map -> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from > RUNNING to FAILED.{code} > > > It turned out that all such failures were caused by "Connection reset" from a > single IP, except for one "Leadership lost" error. > Connection reset was likely caused by TM receiving SIGTERM > (container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229): > > {code:java} > 2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] > - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code} > > Other TMs received SIGTERM one minute later (all logs were
[jira] [Updated] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zili Chen updated FLINK-17565: -- Fix Version/s: 1.12.0 > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2
[ https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116156#comment-17116156 ] Zili Chen commented on FLINK-17565: --- Just the same as FLINK-17230. *Quote* [~zjwang] [~pnowojski] this issue seems like a bugfix which could be included in 1.11.0. [~fly_in_gis] & I have reviewed the PR and I merged it into current master(1.12). [~felixzheng] [~fly_in_gis] if you think it is nice to be included in 1.11.0, please explain to our release manage, i.e., [~zjwang] & [~pnowojski]. > Bump fabric8 version from 4.5.2 to 4.9.2 > > > Key: FLINK-17565 > URL: https://issues.apache.org/jira/browse/FLINK-17565 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0 > > > Currently, we are using a version of 4.5.2, it's better that we upgrade it to > 4.9.2, some of the reasons are as follows: > # It removed the use of reapers manually doing cascade deletion of > resources, leave it up to Kubernetes APIServer, which solves the issue of > FLINK-17566, more info: > [https://github.com/fabric8io/kubernetes-client/issues/1880] > # It introduced a regression in building Quantity values in 4.7.0, release > note [https://github.com/fabric8io/kubernetes-client/issues/1953]. > # It provided better support for K8s 1.17, release note: > [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0]. > For more release notes, please refer to [fabric8 > releases|https://github.com/fabric8io/kubernetes-client/releases]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service
[ https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116155#comment-17116155 ] Zili Chen commented on FLINK-17230: --- [~zjwang] [~pnowojski] this issue seems like a bugfix which could be included in 1.11.0. [~fly_in_gis] & I have reviewed the PR and I merged it into current master(1.12). [~felixzheng] [~fly_in_gis] if you think it should be included in 1.11.0, please explain to our release managers, i.e., [~zjwang] & [~pnowojski]. > Fix incorrect returned address of Endpoint for the ClusterIP Service > > > Key: FLINK-17230 > URL: https://issues.apache.org/jira/browse/FLINK-17230 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0, 1.10.1 >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > At the moment, when the type of the external Service is set to {{ClusterIP}}, > we return an incorrect address > {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for > the Endpoint. > This ticket aims to fix this bug by returning > {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service
[ https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zili Chen updated FLINK-17230: -- Fix Version/s: 1.12.0 > Fix incorrect returned address of Endpoint for the ClusterIP Service > > > Key: FLINK-17230 > URL: https://issues.apache.org/jira/browse/FLINK-17230 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0, 1.10.1 >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > At the moment, when the type of the external Service is set to {{ClusterIP}}, > we return an incorrect address > {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for > the Endpoint. > This ticket aims to fix this bug by returning > {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17933) TaskManager was terminated on Yarn - investigate
Roman Khachatryan created FLINK-17933: - Summary: TaskManager was terminated on Yarn - investigate Key: FLINK-17933 URL: https://issues.apache.org/jira/browse/FLINK-17933 Project: Flink Issue Type: Task Components: Deployment / YARN, Runtime / Task Affects Versions: 1.11.0 Reporter: Roman Khachatryan Assignee: Roman Khachatryan -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-12675) Event time synchronization in Kafka consumer
[ https://issues.apache.org/jira/browse/FLINK-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116153#comment-17116153 ] Stephan Ewen commented on FLINK-12675: -- The existing Kafka Connector is already super complex (partially due to some bad architecture) and very hard to keep stable. We have been burned a bit in the past with adding more non-trivial features. Given that this is probably the most important connector, I would favor not adding more features to it at this point. Can you fork the Kafka Connector and add the event-time alignment to that fork? That way we don't affect the connector in the core. Putting the connector onto https://flink-packages.org would be a way to share it with more users. For the new Source Interface, I would be up for starting a discussion about what a generic event-time alignment mechanism would look like, implemented in the {{SourceReaderBase}} and {{SplitReader}} classes. Something like an extended {{SplitReader}} interface that can handle callbacks in the form of {{suspendSplitReading(splitId)}} and {{resumeSplitReading(splitId)}} or so. > Event time synchronization in Kafka consumer > > > Key: FLINK-12675 > URL: https://issues.apache.org/jira/browse/FLINK-12675 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka >Reporter: Thomas Weise >Assignee: Akshay Aggarwal >Priority: Major > Labels: pull-request-available > Attachments: 0001-Kafka-event-time-alignment.patch > > > Integrate the source watermark tracking into the Kafka consumer and implement > the sync mechanism (different consumer model, compared to Kinesis). -- This message was sent by Atlassian Jira (v8.3.4#803005)
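[Editor's note] The suspend/resume callbacks sketched in the comment above could be wired to a small per-split watermark tracker. The following is a hedged, self-contained sketch; the interface name, the `maxDrift` threshold, and the tracker class are illustrative assumptions, not the actual FLIP-27 API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical extended SplitReader contract, as suggested in the comment.
interface AlignedSplitReader {
    void suspendSplitReading(String splitId);
    void resumeSplitReading(String splitId);
}

// Tracks the local watermark of each split and suspends any split that runs
// more than maxDrift ahead of the slowest split, resuming it once the slow
// split catches up. A SourceReaderBase-style component could own this.
class WatermarkAlignmentTracker {
    private final Map<String, Long> splitWatermarks = new HashMap<>();
    private final Set<String> suspended = new HashSet<>();
    private final long maxDrift;
    private final AlignedSplitReader reader;

    WatermarkAlignmentTracker(AlignedSplitReader reader, long maxDrift) {
        this.reader = reader;
        this.maxDrift = maxDrift;
    }

    void onWatermark(String splitId, long watermark) {
        splitWatermarks.put(splitId, watermark);
        long min = splitWatermarks.values().stream().min(Long::compare).orElse(Long.MIN_VALUE);
        for (Map.Entry<String, Long> e : splitWatermarks.entrySet()) {
            boolean tooFarAhead = e.getValue() > min + maxDrift;
            if (tooFarAhead && suspended.add(e.getKey())) {
                reader.suspendSplitReading(e.getKey());   // pause the fast split
            } else if (!tooFarAhead && suspended.remove(e.getKey())) {
                reader.resumeSplitReading(e.getKey());    // slow split caught up
            }
        }
    }
}
```

A Kafka-style reader would call `onWatermark` whenever a split emits a new watermark; keeping the policy in one tracker is what makes the mechanism connector-agnostic.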
[jira] [Commented] (FLINK-17820) Memory threshold is ignored for channel state
[ https://issues.apache.org/jira/browse/FLINK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116150#comment-17116150 ] Stephan Ewen commented on FLINK-17820: -- It is true, the assumption that {{DataOutputStream}} does not buffer is a fragile point, even if an unlikely one (my feeling is too much existing code would be broken if one SDK made the implementation behave differently for such a core class). That said, it should be possible to flush the stream without voiding the usability of the class. If from the initial options, we don't like (1), then, thinking about it a bit more, I would go for option (2) (adjust {{FsCheckpointStateOutputStream}}). I did a quick search through the code and it looks like we can drop the assumption that {{FsCheckpointStateOutputStream}} creates the file in {{flush()}} It seems not used in the production code (though possibly in tests). How about this? - rename the {{flush()}} method to {{flushToFile()}}, including all existing calls to that method within the class and in relevant tests. - override {{flush()}} as a no-op. > Memory threshold is ignored for channel state > - > > Key: FLINK-17820 > URL: https://issues.apache.org/jira/browse/FLINK-17820 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.0 > > > Config parameter state.backend.fs.memory-threshold is ignored for channel > state. Causing each subtask to have a file per checkpoint. Regardless of the > size of channel state (of this subtask). > This also causes slow cleanup and delays the next checkpoint. > > The problem is that {{ChannelStateCheckpointWriter.finishWriteAndResult}} > calls flush(); which actually flushes the data on disk. > > From FSDataOutputStream.flush Javadoc: > A completed flush does not mean that the data is necessarily persistent. 
Data > persistence can only be assumed after calls to close() or sync(). > > Possible solutions: > 1. not to flush in {{ChannelStateCheckpointWriter.finishWriteAndResult}} (which > can lead to data loss in a wrapping stream). > 2. change {{FsCheckpointStateOutputStream.flush}} behavior > 3. wrap {{FsCheckpointStateOutputStream}} to prevent flush -- This message was sent by Atlassian Jira (v8.3.4#803005)
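[Editor's note] Option (2) above — rename `flush()` to `flushToFile()` and make `flush()` a no-op — can be illustrated with a simplified stand-in for `FsCheckpointStateOutputStream` (the buffer size, the in-memory "file", and the `bytesInFile` helper are illustrative, not the actual Flink class):

```java
import java.io.ByteArrayOutputStream;

// Simplified sketch: flush() becomes a no-op so a wrapping stream (e.g. a
// DataOutputStream) cannot force small channel state onto disk, while
// flushToFile() keeps the old behavior for callers that need the file.
class SketchCheckpointStateOutputStream extends java.io.OutputStream {
    private final byte[] buffer = new byte[64]; // stands in for the memory threshold
    private int pos = 0;
    private final ByteArrayOutputStream file = new ByteArrayOutputStream(); // stands in for the FS file

    @Override
    public void write(int b) {
        if (pos == buffer.length) {
            flushToFile(); // only spill when the memory threshold is exceeded
        }
        buffer[pos++] = (byte) b;
    }

    /** Previously flush(): spills the in-memory buffer to the backing file. */
    public void flushToFile() {
        file.write(buffer, 0, pos);
        pos = 0;
    }

    /** Now a no-op: data below the memory threshold stays in the buffer. */
    @Override
    public void flush() {
        // intentionally empty
    }

    int bytesInFile() {
        return file.size();
    }
}
```

With this shape, state under the threshold only reaches the file on `close()`/`flushToFile()`, which is the behavior the ticket wants for channel state.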
[jira] [Commented] (FLINK-16057) Performance regression in ContinuousFileReaderOperator
[ https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116143#comment-17116143 ] Piotr Nowojski commented on FLINK-16057: Why is that the case? Why should the new version be twice as fast? It sounds strange, like we are missing something important. > Performance regression in ContinuousFileReaderOperator > -- > > Key: FLINK-16057 > URL: https://issues.apache.org/jira/browse/FLINK-16057 > Project: Flink > Issue Type: Bug > Components: API / DataStream, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > After switching CFRO to a single-threaded execution model, the performance > regression was expected to be about 15-20% (benchmarked in November). > But after merging to master it turned out to be about 50%. > > One reason is that the chaining strategy isn't set by default in CFRO factory. > Without that, even reading and outputting all records of a split in a single > mail action doesn't reverse the regression (only about half). > However, setting the chaining strategy AND enabling batching fixes the regression > (starting from batch size 6). > Though batching can't be used in practice because it can significantly delay > checkpointing. > > Another approach would be to process one record and then repeat until > defaultMailboxActionAvailable OR haveNewMail. > This reverses the regression and even improves the performance by about 50% > compared to the old version. > > The final solution could also be FLIP-27. 
> > Other things tried (didn't help): > * CFRO rework without subsequent commits (removing checkpoint lock) > * different batch sizes, including the whole split, without chaining > strategy fixed - partial improvement only > * disabling close > * disabling checkpointing > * disabling output (serialization) > * using LinkedList instead of PriorityQueue > -- This message was sent by Atlassian Jira (v8.3.4#803005)
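[Editor's note] The "process one record and then repeat until defaultMailboxActionAvailable OR haveNewMail" approach described in the issue above can be sketched with a toy mailbox; only the `haveNewMail` name comes from the comment, everything else (class, record counting, re-enqueueing) is an illustrative assumption:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the mailbox loop: read records in a tight loop, but check after
// each record whether other mail is pending; if so, yield by re-enqueueing a
// continuation so checkpoints and other actions are not delayed.
class MailboxLoopSketch {
    final Deque<Runnable> mailbox = new ArrayDeque<>();
    int processed = 0;
    private int remaining;

    MailboxLoopSketch(int records) {
        this.remaining = records;
    }

    boolean haveNewMail() {
        return !mailbox.isEmpty();
    }

    // stands in for reading one record from the current split
    private boolean processOneRecord() {
        processed++;
        return --remaining > 0;
    }

    void readSplitAction() {
        boolean more;
        do {
            more = processOneRecord();
        } while (more && !haveNewMail()); // keep going only while nothing else needs the mailbox
        if (more) {
            mailbox.add(this::readSplitAction); // yield: let pending mail run, continue later
        }
    }
}
```

This avoids the per-record enqueue/dequeue round trip (the source of the regression) while still bounding how long the mailbox is blocked.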
[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource
[ https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-17932: --- Description: All hbase config Options can not work in HBaseTableSource, because field `conf` in HBaseRowInputFormat is not serializable, and HBaseTableSource will fall back to loading hbase-site.xml from the classpath. I.e., many table config options that the user defined in DDL are useless and HBaseTableSource only loads config options from the classpath. was: All hbase config Options can not work in HBaseTableSource, because field `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will failback to load hbase-site.xml from classpath. I.E, many table config Options that user defined in DDL is useless and HbaseTableSource only load config options from classpath. > Most of hbase config Options can not work in HBaseTableSource > - > > Key: FLINK-17932 > URL: https://issues.apache.org/jira/browse/FLINK-17932 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Leonard Xu >Priority: Major > > All hbase config Options can not work in HBaseTableSource, because field > `conf` in HBaseRowInputFormat is not serializable, and HBaseTableSource will > fall back to loading hbase-site.xml from the classpath. I.e., many table config > options that the user defined in DDL are useless and HBaseTableSource only loads > config options from the classpath. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource
[ https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-17932: --- Description: All hbase config Options can not work in HBaseTableSource, because field `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will failback to load hbase-site.xml from classpath. I.E, many table config Options that user defined in DDL is useless and HbaseTableSource only load config options from classpath. was: All hbase config Options can not work in HBaseTableSource, because field `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will fallback to load hbase-site.xml from classpath. I.E, many table config Options that user defined in DDL is useless and HbaseTableSource only load config options from classpath. > Most of hbase config Options can not work in HBaseTableSource > - > > Key: FLINK-17932 > URL: https://issues.apache.org/jira/browse/FLINK-17932 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Leonard Xu >Priority: Major > > All hbase config Options can not work in HBaseTableSource, because field > `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will > failback to load hbase-site.xml from classpath. I.E, many table config > Options that user defined in DDL is useless and HbaseTableSource only load > config options from classpath. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource
[ https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-17932: --- Summary: Most of hbase config Options can not work in HBaseTableSource (was: All hbase config Options can not work in HBaseTableSource) > Most of hbase config Options can not work in HBaseTableSource > - > > Key: FLINK-17932 > URL: https://issues.apache.org/jira/browse/FLINK-17932 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Leonard Xu >Priority: Major > > All hbase config Options can not work in HBaseTableSource, because field > `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will > fallback to load hbase-site.xml from classpath. I.E, many table config > Options that user defined in DDL is useless and HbaseTableSource only load > config options from classpath. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17932) All hbase config Options can not work in HBaseTableSource
Leonard Xu created FLINK-17932: -- Summary: All hbase config Options can not work in HBaseTableSource Key: FLINK-17932 URL: https://issues.apache.org/jira/browse/FLINK-17932 Project: Flink Issue Type: Bug Components: Connectors / HBase Affects Versions: 1.10.0, 1.11.0, 1.12.0 Reporter: Leonard Xu All hbase config Options can not work in HBaseTableSource, because field `conf` in HBaseRowInputFormat is not serializable, and HBaseTableSource will fall back to loading hbase-site.xml from the classpath. I.e., many table config options that the user defined in DDL are useless and HBaseTableSource only loads config options from the classpath. -- This message was sent by Atlassian Jira (v8.3.4#803005)
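[Editor's note] The usual fix for this class of bug — a non-serializable configuration field being silently lost when the input format is shipped to the task — is to keep the options in a serializable form and rebuild the configuration on the task side. A hedged sketch (class and method names are illustrative stand-ins, not the actual Flink/HBase API; the plain `Map` stands in for Hadoop's `Configuration`):

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Instead of holding a transient, non-serializable Configuration, the format
// captures the DDL options into a serializable Map at planning time and
// rebuilds the runtime configuration in open(), so the user's options survive
// the trip to the task instead of falling back to hbase-site.xml only.
class SerializableConfigInputFormat implements Serializable {
    private final Map<String, String> configEntries = new HashMap<>();
    private transient Map<String, String> runtimeConf; // stands in for org.apache.hadoop.conf.Configuration

    SerializableConfigInputFormat(Map<String, String> ddlOptions) {
        configEntries.putAll(ddlOptions); // captured at planning time, survives Java serialization
    }

    void open() {
        // rebuild on the task side instead of reading only the classpath config
        runtimeConf = new HashMap<>(configEntries);
    }

    String get(String key) {
        return runtimeConf.get(key);
    }
}
```

The key point is that `configEntries` is a plain serializable field, so whatever the user put in the DDL reaches `open()` on the task manager.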
[jira] [Commented] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs
[ https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116124#comment-17116124 ] Timo Walther commented on FLINK-17909: -- I don't know if this really solves the root cause. It just adds unnecessary computation. I think we should rather test catalogs such as the HiveCatalog and CatalogTable implementations better instead. Doing this serialization for a `TestCatalog` sounds better to me than doing it for the general default catalog of the table environment. > Make the GenericInMemoryCatalog to hold the serialized meta data to uncover > more potential bugs > --- > > Key: FLINK-17909 > URL: https://issues.apache.org/jira/browse/FLINK-17909 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API, Table SQL / Planner >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Fix For: 1.11.0 > > > Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in a > HashMap. However, this leads to many bugs when users switch to some persisted > catalogs, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868, > FLINK-16021. > That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the > important path of serialization and deserialization of meta data. We miss > some important meta information (PK, time attributes) during serialization > and deserialization, which leads to bugs. > So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} > to cover the serialization and deserialization path. The serialized meta data > may be in the {{Map}} properties format. > We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly > used in demo/experiment/testing, so I think it's fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
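[Editor's note] The proposal in the issue above — storing only the serialized `Map<String, String>` form so lossy round trips surface in-memory — can be sketched with toy stand-ins. `FakeCatalogTable`, its deliberately lossy serialization, and `PropertiesBackedCatalog` are all illustrative assumptions, not Flink classes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy table whose toProperties()/fromProperties() round trip drops the primary
// key, modeling the class of bug (lost PK, lost time attributes) the proposal
// is meant to uncover.
class FakeCatalogTable {
    final String schema;
    final String primaryKey;

    FakeCatalogTable(String schema, String primaryKey) {
        this.schema = schema;
        this.primaryKey = primaryKey;
    }

    Map<String, String> toProperties() {
        Map<String, String> props = new HashMap<>();
        props.put("schema", schema); // bug under test: primaryKey is never written
        return props;
    }

    static FakeCatalogTable fromProperties(Map<String, String> props) {
        return new FakeCatalogTable(props.get("schema"), props.get("pk"));
    }
}

// The catalog keeps only the serialized form and re-deserializes on every
// read, so a lossy round trip is visible in plain in-memory tests, just as it
// would be against a persisted catalog like the Hive Metastore.
class PropertiesBackedCatalog {
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void createTable(String path, FakeCatalogTable table) {
        tables.put(path, table.toProperties()); // store only the properties map
    }

    FakeCatalogTable getTable(String path) {
        return FakeCatalogTable.fromProperties(tables.get(path));
    }
}
```

Reading back a table created with a primary key now returns one without it, which is exactly the failure mode that previously only appeared with persisted catalogs.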
[jira] [Reopened] (FLINK-17842) Performance regression on 19.05.2020
[ https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reopened FLINK-17842: > Performance regression on 19.05.2020 > > > Key: FLINK-17842 > URL: https://issues.apache.org/jira/browse/FLINK-17842 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > There is a noticeable performance regression in many benchmarks: > http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2 > that happened on May 19th, probably between 260ef2c and 2f18138 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type
[ https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17115951#comment-17115951 ] Kostas Kloudas edited comment on FLINK-17891 at 5/25/20, 3:21 PM: -- Hi [~tangshangwen], as [~fly_in_gis] said, could you describe what exactly is your scenario? was (Author: kkl0u): Hi [~tangshangwen], as [~fly_in_gis] said, could you describe what exactly is your scenario? I am asking because the fix does not seem to fixing something. You are just removing an option that has to be set to something for correct execution. > FlinkYarnSessionCli sets wrong execution.target type > - > > Key: FLINK-17891 > URL: https://issues.apache.org/jira/browse/FLINK-17891 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.11.0 >Reporter: Shangwen Tang >Priority: Major > Attachments: image-2020-05-23-00-59-32-702.png, > image-2020-05-23-01-00-19-549.png > > > I submitted a flink session job at the local YARN cluster, and I found that > the *execution.target* is of the wrong type, which should be of yarn-session > type > !image-2020-05-23-00-59-32-702.png|width=545,height=75! > !image-2020-05-23-01-00-19-549.png|width=544,height=94! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17842) Performance regression on 19.05.2020
[ https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116033#comment-17116033 ] Piotr Nowojski commented on FLINK-17842: Performance regression was only partially solved: http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 seems fine, but those are still showing some slow down: http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2 http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2 > Performance regression on 19.05.2020 > > > Key: FLINK-17842 > URL: https://issues.apache.org/jira/browse/FLINK-17842 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > There is a noticeable performance regression in many benchmarks: > http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2 > that happened on May 19th, probably between 260ef2c and 2f18138 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17842) Performance regression on 19.05.2020
[ https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-17842: -- Assignee: (was: Piotr Nowojski) > Performance regression on 19.05.2020 > > > Key: FLINK-17842 > URL: https://issues.apache.org/jira/browse/FLINK-17842 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > There is a noticeable performance regression in many benchmarks: > http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2 > that happened on May 19th, probably between 260ef2c and 2f18138 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17842) Performance regression on 19.05.2020
[ https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116033#comment-17116033 ] Piotr Nowojski edited comment on FLINK-17842 at 5/25/20, 1:48 PM: -- Performance regression was only partially solved. Some benchmarks seem fine: http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 but some are still showing a slowdown: http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2 http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2 was (Author: pnowojski): Performance regression was only partially solved: http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 seems fine, but those are still showing some slow down: http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2 http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2 > Performance regression on 19.05.2020 > > > Key: FLINK-17842 > URL: https://issues.apache.org/jira/browse/FLINK-17842 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > There is a noticeable performance regression in many benchmarks: > http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2 > http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2 > that happened on May 19th, probably between 260ef2c and 2f18138 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints
[ https://issues.apache.org/jira/browse/FLINK-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116021#comment-17116021 ] Roman Khachatryan edited comment on FLINK-17928 at 5/25/20, 2:15 PM: - The numbers were obtained using:
{code:java}
writtenBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  writtenBytes=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
    stats2=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
    >&2 echo "vertex $vertex $stats $stats2"
    writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s '[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0'`))
  done
  echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}

transferredBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  numBytesOut=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
    >&2 echo "vertex $vertex $stats"
    numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 0'`))
  done
  echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
}
{code}
This is how it's called:
{code:java}
function experiment() {
  name=$1
  nodes=$2
  slots=$3
  dur=$4
  start_job ${nodes} ${slots}
  duration ${dur}
  # writtenBytes="`writtenBytes` `writtenBytes` `writtenBytes` `writtenBytes` `writtenBytes` `writtenBytes` `writtenBytes`"
  transferredBytes="`transferredBytes` `transferredBytes` `transferredBytes` `transferredBytes` `transferredBytes` `transferredBytes` `transferredBytes`"
  APPID=`kill_on_yarn`
  getLogsFor ${name} ${APPID}
  analyzeLogs ${name} ${APPID}
  truncate -s -1 ${LOG}
  echo ";$transferredBytes" >> ${LOG}
}
{code}
That is, those are transferred bytes, not written bytes (see the commented-out line in experiment()).

was (Author: roman_khachatryan): The numbers were obtained using:
{code:java}
writtenBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  writtenBytes=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
    stats2=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
    >&2 echo "vertex $vertex $stats $stats2"
    writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s '[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0'`))
  done
  echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}

transferredBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  numBytesOut=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
    >&2 echo "vertex $vertex $stats"
    numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 0'`))
  done
  echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
}
{code}
> Incorrect state size reported when using unaligned checkpoints > --- > > Key: FLINK-17928 > URL: https://issues.apache.org/jira/browse/FLINK-17928 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Priority: Critical > 
Fix For: 1.11.0 > > > Even when checkpoints on HDFS are between 100-300MBs, the reported state size > is in orders of magnitude larger with values like: > {noformat} > 1GiB 1.5TiB 2.0TiB 2.1TiB 2.1TiB > 148GiB 148GiB 148GiB 148GiB 148GiB 148GiB > {noformat} > it's probably because we have multiple > {{Collection}}, and each of the individual handle > returns the same value from
[jira] [Comment Edited] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints
[ https://issues.apache.org/jira/browse/FLINK-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116021#comment-17116021 ] Roman Khachatryan edited comment on FLINK-17928 at 5/25/20, 2:11 PM: - The numbers were obtained using:
{code:java}
writtenBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  writtenBytes=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
    stats2=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
    >&2 echo "vertex $vertex $stats $stats2"
    writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s '[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0'`))
  done
  echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}

transferredBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  numBytesOut=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
    >&2 echo "vertex $vertex $stats"
    numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 0'`))
  done
  echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
}
{code}
was (Author: roman_khachatryan): The numbers were obtained using:
{code:java}
transferredBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  numBytesOut=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
    >&2 echo "vertex $vertex $stats"
    numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 0'`))
  done
  echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
}
{code}
> Incorrect state size reported when using unaligned checkpoints > --- > > Key: FLINK-17928 > URL: https://issues.apache.org/jira/browse/FLINK-17928 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Priority: Critical > Fix For: 1.11.0 > > > Even when checkpoints on HDFS are between 100-300MBs, the reported state size > is in orders of magnitude larger with values like: > {noformat} > 1GiB 1.5TiB 2.0TiB 2.1TiB 2.1TiB > 148GiB 148GiB 148GiB 148GiB 148GiB 148GiB > {noformat} > it's probably because we have multiple > {{Collection}}, and each of the individual handle > returns the same value from {{AbstractChannelStateHandle#getStateSize}} - the > full size of the spilled data, ignoring that only small portion of those data > belong to a single input channel/result subpartition. In other words {{ > org.apache.flink.runtime.state.AbstractChannelStateHandle#getStateSize}} > should be taking the offsets into account and return only the size of the > data that belong exclusively to this handle. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17931) Document fromValues
Dawid Wysakowicz created FLINK-17931: Summary: Document fromValues Key: FLINK-17931 URL: https://issues.apache.org/jira/browse/FLINK-17931 Project: Flink Issue Type: Sub-task Components: Documentation, Table SQL / API Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17917) ResourceInformationReflector#getExternalResources should ignore the external resource with a value of 0
[ https://issues.apache.org/jira/browse/FLINK-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-17917: - Assignee: Yangze Guo > ResourceInformationReflector#getExternalResources should ignore the external > resource with a value of 0 > --- > > Key: FLINK-17917 > URL: https://issues.apache.org/jira/browse/FLINK-17917 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Yangze Guo >Assignee: Yangze Guo >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > *Background*: In FLINK-17390, we leverage > {{WorkerSpecContainerResourceAdapter.InternalContainerResource}} to handle > container matching logic. In FLINK-17407, we introduce external resources in > {{WorkerSpecContainerResourceAdapter.InternalContainerResource}}. > On containers returned by Yarn, we try to get the corresponding worker specs > by: > - Convert the container to {{InternalContainerResource}} > - Get the WorkerResourceSpec from {{containerResourceToWorkerSpecs}} map. > *Problem*: Container mismatch could happen in the below scenario: > - Flink does not allocate any external resources, the {{externalResources}} > of {{InternalContainerResource}} is an empty map. > - The returned container contains all the resources (with a value of 0) > defined in Yarn's {{resource-types.xml}}. The {{externalResources}} of > {{InternalContainerResource}} has one or more entries with a value of 0. > - These two {{InternalContainerResource}} do not match. > To solve this problem, we could ignore all the external resources with a > value of 0 in "ResourceInformationReflector#getExternalResources". > cc [~trohrmann] Could you assign this to me? -- This message was sent by Atlassian Jira (v8.3.4#803005)
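The filtering proposed in FLINK-17917 can be sketched as follows. This is a hedged, standalone illustration, not the actual `ResourceInformationReflector` code: `ignoreZeroValues` is a hypothetical stand-in for the behavior the ticket asks for, and the resource names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class ExternalResourceFilter {

    // Hypothetical sketch: drop external resources reported with a value of 0,
    // so that a request with no external resources matches a Yarn container
    // that lists every type from resource-types.xml with a zero value.
    static Map<String, Long> ignoreZeroValues(Map<String, Long> externalResources) {
        Map<String, Long> filtered = new HashMap<>();
        for (Map.Entry<String, Long> entry : externalResources.entrySet()) {
            if (entry.getValue() != 0L) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, Long> fromYarn = new HashMap<>();
        fromYarn.put("yarn.io/gpu", 0L);   // defined in resource-types.xml, not requested
        fromYarn.put("yarn.io/fpga", 2L);  // actually allocated
        System.out.println(ignoreZeroValues(fromYarn)); // {yarn.io/fpga=2}
    }
}
```

After filtering, the container's external-resource map and Flink's empty request both normalize to maps without zero-valued entries, so the two `InternalContainerResource` instances can match.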
[jira] [Updated] (FLINK-17931) Document fromValues clause
[ https://issues.apache.org/jira/browse/FLINK-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz updated FLINK-17931: - Summary: Document fromValues clause (was: Document fromValues) > Document fromValues clause > -- > > Key: FLINK-17931 > URL: https://issues.apache.org/jira/browse/FLINK-17931 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory
flinkbot edited a comment on pull request #12320: URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182 ## CI report: * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2142) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese
flinkbot edited a comment on pull request #12313: URL: https://github.com/apache/flink/pull/12313#issuecomment-633389597 ## CI report: * c225fa8b2a774a8579501e136004a9d64db89244 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2108) * 07ab8aa279380cc6584c9602a1c344dfb2294074 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2144) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12318: [FLINK-17867][hive][test] Add hdfs dependency to hive-3.1.1 test
flinkbot edited a comment on pull request #12318: URL: https://github.com/apache/flink/pull/12318#issuecomment-633479850 ## CI report: * c395d57a1a2a57b8f9970b64d8241e033ecdd295 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2130) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12078: [FLINK-17610][state] Align the behavior of result of internal map state to return empty iterator
flinkbot edited a comment on pull request #12078: URL: https://github.com/apache/flink/pull/12078#issuecomment-626611802 ## CI report: * e3ffb15cc38bdbbf1f5a11014782d081edaecea6 UNKNOWN * 55be45331331ec6f336d3872b8616b899d936730 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2120) * c87fc3764b52f86b03ee7f0381271c4240671ca1 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions
flinkbot commented on pull request #12322: URL: https://github.com/apache/flink/pull/12322#issuecomment-633576098 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit a53dd539ae9d381671c5b0503a7bd002358fd065 (Mon May 25 13:35:35 UTC 2020) ✅no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dawidwys commented on pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions
dawidwys commented on pull request #12322: URL: https://github.com/apache/flink/pull/12322#issuecomment-633574947 Do you mind having a look @azagrebin ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-17929) Fix invalid liquid expressions
[ https://issues.apache.org/jira/browse/FLINK-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-17929: --- Labels: pull-request-available (was: ) > Fix invalid liquid expressions > -- > > Key: FLINK-17929 > URL: https://issues.apache.org/jira/browse/FLINK-17929 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.11.0 >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > > The .ID expression in ops/deployment/docker.md should be escaped, > otherwise it is not properly rendered. > {code} > Generating... > Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a > valid expression in "{{.ID}}" in ops/deployment/docker.md > Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a > valid expression in "{{.ID}}" in ops/deployment/docker.md > Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a > valid expression in "{{.ID}}" in ops/deployment/docker.zh.md > Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a > valid expression in "{{.ID}}" in ops/deployment/docker.zh.md > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dawidwys opened a new pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions
dawidwys opened a new pull request #12322: URL: https://github.com/apache/flink/pull/12322 It is a trivial bugfix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints
[ https://issues.apache.org/jira/browse/FLINK-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116021#comment-17116021 ] Roman Khachatryan commented on FLINK-17928: --- The numbers were obtained using:
{code:java}
transferredBytes() {
  rest_url="http://`hostname`:20888/proxy/`get_aid`"
  job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
  vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
  numBytesOut=0
  >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
  for vertex in "${vertex_ids[@]}"; do
    stats=`curl $rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
    >&2 echo "vertex $vertex $stats"
    numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 0'`))
  done
  echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
}
{code}
> Incorrect state size reported when using unaligned checkpoints > --- > > Key: FLINK-17928 > URL: https://issues.apache.org/jira/browse/FLINK-17928 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Piotr Nowojski >Priority: Critical > Fix For: 1.11.0 > > > Even when checkpoints on HDFS are between 100-300MBs, the reported state size > is in orders of magnitude larger with values like: > {noformat} > 1GiB 1.5TiB 2.0TiB 2.1TiB 2.1TiB > 148GiB 148GiB 148GiB 148GiB 148GiB 148GiB > {noformat} > it's probably because we have multiple > {{Collection}}, and each of the individual handle > returns the same value from {{AbstractChannelStateHandle#getStateSize}} - the > full size of the spilled data, ignoring that only small portion of those data > belong to a single input channel/result subpartition. In other words {{ > org.apache.flink.runtime.state.AbstractChannelStateHandle#getStateSize}} > should be taking the offsets into account and return only the size of the > data that belong exclusively to this handle. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints
Piotr Nowojski created FLINK-17928: -- Summary: Incorrect state size reported when using unaligned checkpoints Key: FLINK-17928 URL: https://issues.apache.org/jira/browse/FLINK-17928 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.11.0 Reporter: Piotr Nowojski Fix For: 1.11.0 Even when checkpoints on HDFS are between 100-300MBs, the reported state size is in orders of magnitude larger with values like: {noformat} 1GiB 1.5TiB 2.0TiB 2.1TiB 2.1TiB 148GiB 148GiB 148GiB 148GiB 148GiB 148GiB {noformat} it's probably because we have multiple {{Collection}}, and each of the individual handle returns the same value from {{AbstractChannelStateHandle#getStateSize}} - the full size of the spilled data, ignoring that only small portion of those data belong to a single input channel/result subpartition. In other words {{ org.apache.flink.runtime.state.AbstractChannelStateHandle#getStateSize}} should be taking the offsets into account and return only the size of the data that belong exclusively to this handle. -- This message was sent by Atlassian Jira (v8.3.4#803005)
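The magnitude of the overcount described in FLINK-17928 follows from simple arithmetic. The sketch below is illustrative only (the file size and handle count are assumptions, not figures from the report): if every per-channel handle returns the full size of the shared spill file, the job-level sum grows linearly with the number of handles.

```java
public class StateSizeOvercount {

    // Buggy accounting: every handle reports the full size of the shared spilled file.
    static long reportedTotal(long sharedFileSize, int handleCount) {
        long total = 0L;
        for (int i = 0; i < handleCount; i++) {
            total += sharedFileSize; // each handle ignores its offsets
        }
        return total;
    }

    // Offset-aware accounting: each handle reports only its own slice, so the
    // slices sum back to (roughly) the real file size. Equal slices are assumed
    // here purely for the sketch.
    static long offsetAwareTotal(long sharedFileSize, int handleCount) {
        long perHandle = sharedFileSize / handleCount;
        return perHandle * handleCount;
    }

    public static void main(String[] args) {
        long spilled = 200L * 1024 * 1024; // assume ~200 MiB actually written to HDFS
        int handles = 10_000;              // assume this many channel/subpartition handles
        System.out.println(reportedTotal(spilled, handles));    // ~2 * 10^12 bytes, i.e. TiB-scale
        System.out.println(offsetAwareTotal(spilled, handles)); // back near the real 200 MiB
    }
}
```

With these assumed numbers the buggy sum is about 2 * 10^12 bytes, which matches the TiB-scale figures in the report despite only hundreds of MiB on HDFS.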
[jira] [Created] (FLINK-17929) Fix invalid liquid expressions
Dawid Wysakowicz created FLINK-17929: Summary: Fix invalid liquid expressions Key: FLINK-17929 URL: https://issues.apache.org/jira/browse/FLINK-17929 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.11.0 Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz The .ID expression in ops/deployment/docker.md should be escaped, otherwise it is not properly rendered. {code} Generating... Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a valid expression in "{{.ID}}" in ops/deployment/docker.md Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a valid expression in "{{.ID}}" in ops/deployment/docker.md Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a valid expression in "{{.ID}}" in ops/deployment/docker.zh.md Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a valid expression in "{{.ID}}" in ops/deployment/docker.zh.md {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
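One common Jekyll remedy for this class of warning (an assumption here; the ticket does not say how the fix was implemented) is to wrap the Go-template placeholder in Liquid's raw tag so Liquid leaves it untouched. For example, if the offending snippet in docker.md is a docker `--format` string:

```
docker ps --format "{% raw %}{{.ID}}{% endraw %}"
```

Everything between `{% raw %}` and `{% endraw %}` is emitted verbatim, so `{{.ID}}` reaches the rendered page instead of being parsed as a Liquid expression.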
[GitHub] [flink] SteNicholas commented on pull request #12078: [FLINK-17610][state] Align the behavior of result of internal map state to return empty iterator
SteNicholas commented on pull request #12078: URL: https://github.com/apache/flink/pull/12078#issuecomment-633568565 @carp84 I have merged the latest master code and resolved conflicts. Please help to check again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12321: Add document for writing Avro files with StreamingFileSink
flinkbot edited a comment on pull request #12321: URL: https://github.com/apache/flink/pull/12321#issuecomment-633558255 ## CI report: * 56bdc3a61f65cb30b48cf7f932520be02ebed734 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2143) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12314: [FLINK-17756][table-api-java] Drop table/view shouldn't take effect o…
flinkbot edited a comment on pull request #12314: URL: https://github.com/apache/flink/pull/12314#issuecomment-633403476 ## CI report: * b7a68f0f07ab9bb2abc3182d72c0dc80ed59dda1 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2113) * 8180c3272155f8db184013ed79946db1571206e8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2141) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12264: [FLINK-17558][netty] Release partitions asynchronously
flinkbot edited a comment on pull request #12264: URL: https://github.com/apache/flink/pull/12264#issuecomment-631349883 ## CI report: * 19c5f57b94cc56b70002031618c32d9e6f68effb UNKNOWN * bb313e40f5a72dbf20cd0a8b48267063fd4f00af UNKNOWN * eafbd98c812227cb7d9ce7158de1a23309855509 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1948) * 3510bfd56ae6a431783bbade1881dd967b271457 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2139) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector
flinkbot edited a comment on pull request #12283: URL: https://github.com/apache/flink/pull/12283#issuecomment-632089857 ## CI report: * b3fd51b309c78d9dd5056eed35dc2fe388665899 UNKNOWN * 2838b871958d80b1ea22c5ccd550505895e98e60 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2124) * 20ede85b3693bdf20adb0efd1b72dde4ab6c62f5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2140) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory
flinkbot edited a comment on pull request #12320: URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182 ## CI report: * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2142) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese
flinkbot edited a comment on pull request #12313: URL: https://github.com/apache/flink/pull/12313#issuecomment-633389597 ## CI report: * c225fa8b2a774a8579501e136004a9d64db89244 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2108) * 07ab8aa279380cc6584c9602a1c344dfb2294074 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski commented on a change in pull request #12292: [FLINK-17861][task][checkpointing] Split channel state handles sent to JM
pnowojski commented on a change in pull request #12292: URL: https://github.com/apache/flink/pull/12292#discussion_r429886453 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriterTest.java ## @@ -149,8 +152,8 @@ public void testRecordingOffsets() throws Exception { } private void write(ChannelStateCheckpointWriter writer, InputChannelInfo channelInfo, byte[] data) throws Exception { - NetworkBuffer buffer = new NetworkBuffer(HeapMemorySegment.FACTORY.allocateUnpooledSegment(data.length, null), FreeingBufferRecycler.INSTANCE); - buffer.setBytes(0, data); + MemorySegment segment = wrap(data); + NetworkBuffer buffer = new NetworkBuffer(segment, FreeingBufferRecycler.INSTANCE, Buffer.DataType.DATA_BUFFER, segment.size()); Review comment: What was the problem here? Could you explain it in the commit message? ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java ## @@ -25,8 +25,10 @@ import org.apache.flink.runtime.state.InputChannelStateHandle; Review comment: Please copy the PR description to the commit message of this (last one) commit. Could you also extend both those descriptions and JIRA ticket description (copy/paste) with some more elaborate explanation what was the underlying problem? > That the buffered bytes in `StreamStateHandle underlying` (if it's a `ByteStreamStateHandle`) would be referenced many times, one per each input channel and result partition by respective `InputChannelStateHandle` and `ResultSubpartitionStateHandle` handles. Each of those handles would thus duplicate and contain all of the data for every channel, while using only a small portion of it. 
## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java ## @@ -180,17 +177,33 @@ private void doComplete(boolean precondition, RunnableWithException complete, Ru } private <I, H extends AbstractChannelStateHandle<I>> void complete( + StreamStateHandle underlying, CompletableFuture<Collection<H>> future, Map<I, List<Long>> offsets, - BiFunction<I, List<Long>, H> buildHandle) { + TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle) throws IOException { Review comment: Those two functions (`complete` and `createHandle`) are quite difficult to understand/read. For example, by looking at the signature ``` private <I, H extends AbstractChannelStateHandle<I>> H createHandle( StreamStateHandle underlying, TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle, Map.Entry<I, List<Long>> e) ``` I have no idea what's happening here. I would at the very least: 1. either extract `buildHandle` to some factory/builder or just replace it by a boolean flag `boolean buildInputHandles`/enum `INPUT_HANDLE`/`OUTPUT_HANDLE` and just if/switch inside. 2. do not pass `Map.Entry`, but convert it to a POJO or named variables ASAP ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java ## @@ -180,17 +177,33 @@ private void doComplete(boolean precondition, RunnableWithException complete, Ru } private <I, H extends AbstractChannelStateHandle<I>> void complete( + StreamStateHandle underlying, CompletableFuture<Collection<H>> future, Map<I, List<Long>> offsets, - BiFunction<I, List<Long>, H> buildHandle) { + TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle) throws IOException { final Collection<H> handles = new ArrayList<>(); for (Map.Entry<I, List<Long>> e : offsets.entrySet()) { - handles.add(buildHandle.apply(e.getKey(), e.getValue())); + handles.add(createHandle(underlying, buildHandle, e)); } future.complete(handles); LOG.debug("channel state write completed, checkpointId: {}, handles: {}", checkpointId, handles); } + private <I, H extends AbstractChannelStateHandle<I>> H createHandle( + StreamStateHandle underlying, + TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle, + Map.Entry<I, List<Long>> e) throws IOException { + if (underlying instanceof ByteStreamStateHandle) { + ByteStreamStateHandle byteHandle = (ByteStreamStateHandle) underlying; + return 
buildHandle.apply( + e.getKey(), + new ByteStreamStateHandle(randomUUID().toString(), serializer.extractAndMerge(byteHandle.getData(), e.getValue())), + singletonList(serializer.getHeaderLength())); + } else { Review comment: It doesn't look like a generic/error prone solution. It looks like something
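The first suggestion in the review above (replacing the `buildHandle` function argument with an enum plus an internal if/switch) could be sketched roughly as below. Note this is a simplified, hypothetical sketch: `HandleKind`, the `Handle` class, and the method names are stand-ins for Flink's actual `InputChannelStateHandle`/`ResultSubpartitionStateHandle` machinery, not the real API.

```java
import java.util.Arrays;
import java.util.List;

public class HandleFactorySketch {

    // Hypothetical enum standing in for the reviewer's proposed
    // INPUT_HANDLE / OUTPUT_HANDLE flag.
    enum HandleKind { INPUT, OUTPUT }

    // Simplified stand-in for InputChannelStateHandle / ResultSubpartitionStateHandle:
    // just a channel id plus its byte offsets into the underlying stream.
    static final class Handle {
        final HandleKind kind;
        final String channelId;
        final List<Long> offsets;

        Handle(HandleKind kind, String channelId, List<Long> offsets) {
            this.kind = kind;
            this.channelId = channelId;
            this.offsets = offsets;
        }
    }

    // Instead of threading a TriFunction through `complete`, the caller passes
    // the enum and this factory branches internally, keeping signatures readable.
    static Handle createHandle(HandleKind kind, String channelId, List<Long> offsets) {
        switch (kind) {
            case INPUT:
                return new Handle(HandleKind.INPUT, channelId, offsets);
            case OUTPUT:
                return new Handle(HandleKind.OUTPUT, channelId, offsets);
            default:
                throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        Handle h = createHandle(HandleKind.INPUT, "channel-0", Arrays.asList(0L, 128L));
        System.out.println(h.kind + " " + h.channelId);
    }
}
```

The call sites then read `createHandle(INPUT, ...)` instead of a generic `buildHandle.apply(...)`, which is the readability gain the reviewer is after.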
[GitHub] [flink] danny0405 commented on a change in pull request #12284: [FLINK-17689][kafka][table] Add integration tests for changelog sourc…
danny0405 commented on a change in pull request #12284: URL: https://github.com/apache/flink/pull/12284#discussion_r429926472 ## File path: flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java ## @@ -247,6 +265,12 @@ public void invoke(RowData value, Context context) throws Exception { "This is probably an incorrectly implemented test."); } } + receivedNum++; + if (expectedSize != -1 && receivedNum >= expectedSize) { + // some sources are infinite (e.g. kafka), Review comment: `>=` or `==`? ## File path: flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/factories/TestValuesTableFactory.java ## @@ -633,8 +648,11 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context context) { sinkFunction = new KeyedUpsertingSinkFunction( tableName, converter, - keyIndices); + keyIndices, + expectedNum); } else { + checkArgument(expectedNum == -1, + "Retracting Sink doesn't support '" + SINK_EXPECTED_MESSAGES_NUM.key() + "' yet."); Review comment: Should we move the check to the constructor? ## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/plan/stream/sql/TableScanTest.scala ## @@ -186,6 +186,61 @@ class TableScanTest extends TableTestBase { util.verifyPlan("SELECT COUNT(*) FROM src WHERE a > 1", ExplainDetail.CHANGELOG_MODE) } + @Test + def testJoinOnChangelogSource(): Unit = { +util.addTable( + """ +|CREATE TABLE changelog_src ( +| ts TIMESTAMP(3), +| a INT, +| b DOUBLE +|) WITH ( +| 'connector' = 'values', +| 'changelog-mode' = 'I,UA,UB' +|) + """.stripMargin) +util.addTable( + """ +|CREATE TABLE append_src ( +| ts TIMESTAMP(3), +| a INT, +| b DOUBLE +|) WITH ( +| 'connector' = 'values', +| 'changelog-mode' = 'I' +|) + """.stripMargin) Review comment: Can we also add an ITCase for the stream join?
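The `receivedNum` check discussed in the first review comment (letting the test sink stop an otherwise unbounded source once the expected number of records has arrived) can be sketched self-containedly as below. `CountingSink` and its members are hypothetical simplifications of the sink functions in `TestValuesRuntimeFunctions`, not the actual Flink test classes.

```java
public class CountingSinkSketch {

    // Minimal sketch of the logic under review: count received records and
    // signal completion once the expected number has arrived. Using `>=`
    // (rather than `==`) still terminates if more records than expected
    // slip through from an unbounded source such as Kafka.
    static final class CountingSink {
        private final int expectedSize; // -1 means "no limit configured"
        private int receivedNum;
        private boolean finished;

        CountingSink(int expectedSize) {
            this.expectedSize = expectedSize;
        }

        void invoke(Object value) {
            receivedNum++;
            if (expectedSize != -1 && receivedNum >= expectedSize) {
                finished = true; // request termination of the test job
            }
        }

        boolean isFinished() {
            return finished;
        }
    }

    public static void main(String[] args) {
        CountingSink sink = new CountingSink(3);
        sink.invoke("a");
        sink.invoke("b");
        System.out.println(sink.isFinished()); // false: still waiting
        sink.invoke("c");
        System.out.println(sink.isFinished()); // true: expected count reached
    }
}
```

This also illustrates the reviewer's `>=` vs `==` question: with `>=`, a late extra record cannot keep the job from finishing, which is usually what a test against an infinite source wants.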
[GitHub] [flink] JingsongLi commented on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector
JingsongLi commented on pull request #12283: URL: https://github.com/apache/flink/pull/12283#issuecomment-633564817 Will solve https://issues.apache.org/jira/browse/FLINK-17925 first.
[jira] [Assigned] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reassigned FLINK-17925: Assignee: Jingsong Lee > Fix Filesystem options to default values and types > -- > > Key: FLINK-17925 > URL: https://issues.apache.org/jira/browse/FLINK-17925 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Blocker > Fix For: 1.11.0 > > > Fix Filesystem options: > * Throw an unsupported exception when using the metastore commit policy for a > filesystem table: the Filesystem connector has an empty implementation in > {{TableMetaStoreFactory}}. We should avoid users configuring this policy. > * Default value of "sink.partition-commit.trigger" should be "process-time". > It is hard for users to figure out what is wrong when they don't have a watermark. We > can set "sink.partition-commit.trigger" to "process-time" for a better > out-of-the-box experience. > * The type of "sink.rolling-policy.file-size" should be MemoryType. -- This message was sent by Atlassian Jira (v8.3.4#803005)
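On the last bullet: a memory-typed option accepts human-readable values such as "128MB" instead of a bare byte count. A minimal, self-contained sketch of that kind of parsing is below; the class and method names are hypothetical stand-ins, not Flink's actual `MemorySize` API.

```java
import java.util.Locale;

public class MemorySizeSketch {

    // Hypothetical stand-in for a memory-typed option value: parses "128kb",
    // "1gb", "64mb", or a bare byte count into a number of bytes.
    static long parseBytes(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        long multiplier = 1L;
        if (s.endsWith("kb")) {
            multiplier = 1024L;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("mb")) {
            multiplier = 1024L * 1024L;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("gb")) {
            multiplier = 1024L * 1024L * 1024L;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("b")) {
            s = s.substring(0, s.length() - 1);
        }
        return Long.parseLong(s.trim()) * multiplier;
    }

    public static void main(String[] args) {
        System.out.println(parseBytes("128MB")); // 134217728
    }
}
```

Typing the option this way lets validation fail fast on malformed values instead of silently treating "128MB" as an unparseable long.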
[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17925: - Priority: Blocker (was: Critical)
[jira] [Updated] (FLINK-17927) Default value of "sink.partition-commit.trigger" should be "process-time"
[ https://issues.apache.org/jira/browse/FLINK-17927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17927: - Priority: Major (was: Critical) > Default value of "sink.partition-commit.trigger" should be "process-time" > - > > Key: FLINK-17927 > URL: https://issues.apache.org/jira/browse/FLINK-17927 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Priority: Major > > It is hard for users to figure out what is wrong when they don't have a watermark. We > can set "sink.partition-commit.trigger" to "process-time" for a better > out-of-the-box experience.
[jira] [Closed] (FLINK-17927) Default value of "sink.partition-commit.trigger" should be "process-time"
[ https://issues.apache.org/jira/browse/FLINK-17927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-17927. Fix Version/s: (was: 1.11.0) Resolution: Duplicate
[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17925: - Description: Fix Filesystem options: * Throw an unsupported exception when using the metastore commit policy for a filesystem table: the Filesystem connector has an empty implementation in {{TableMetaStoreFactory}}. We should avoid users configuring this policy. * Default value of "sink.partition-commit.trigger" should be "process-time". It is hard for users to figure out what is wrong when they don't have a watermark. We can set "sink.partition-commit.trigger" to "process-time" for a better out-of-the-box experience. * The type of "sink.rolling-policy.file-size" should be MemoryType. was: Fix Filesystem options: * Throws unsupported exception when using metastore commit policy for filesystem table, Filesystem connector has an empty implementation in {{TableMetaStoreFactory}}. We should avoid user configuring this policy.
[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17925: - Summary: Fix Filesystem options to default values and types (was: Throws unsupported exception when using metastore commit policy for filesystem table) > Fix Filesystem options to default values and types > -- > > Key: FLINK-17925 > URL: https://issues.apache.org/jira/browse/FLINK-17925 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Reporter: Jingsong Lee >Priority: Critical > Fix For: 1.11.0 > > > Fix Filesystem options: > * Throw an unsupported exception when using the metastore commit policy for a > filesystem table: the Filesystem connector has an empty implementation in > {{TableMetaStoreFactory}}. We should avoid users configuring this policy.
[GitHub] [flink] yangyichao-mango commented on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese
yangyichao-mango commented on pull request #12313: URL: https://github.com/apache/flink/pull/12313#issuecomment-633562437 Thx. I've rebased that commit.
[jira] [Updated] (FLINK-17925) Throws unsupported exception when using metastore commit policy for filesystem table
[ https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-17925: - Description: Fix Filesystem options: * Throw an unsupported exception when using the metastore commit policy for a filesystem table: the Filesystem connector has an empty implementation in {{TableMetaStoreFactory}}. We should avoid users configuring this policy. was: Filesystem connector has an empty implementation in {{TableMetaStoreFactory}}. We should avoid user configuring this policy.
[GitHub] [flink] wuchong edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory
wuchong edited a comment on pull request #12320: URL: https://github.com/apache/flink/pull/12320#issuecomment-633558842 When I was writing the code, I had a feeling that maybe `DeserializationFormat` is a better name than `DeserializationSchemaProvider`, because it is created from `DeserializationFormatFactory` and is more intuitive in `FactoryUtil#discoverXxxxFormat` than in `FactoryUtil#discoverXxxxProvider`. But I'm not sure what your opinion is here @twalthr @KurtYoung, as we're going back and forth on the names again…
[GitHub] [flink] klion26 commented on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese
klion26 commented on pull request #12313: URL: https://github.com/apache/flink/pull/12313#issuecomment-633558758 @yangyichao-mango thanks for your contribution, could you please get rid of the "merge" commit? You can use "git rebase" instead.
[GitHub] [flink] wuchong commented on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory
wuchong commented on pull request #12320: URL: https://github.com/apache/flink/pull/12320#issuecomment-633558842 When I was writing the code, I had a feeling that maybe `DeserializationFormat` is a better name than `DeserializationSchemaProvider`, because it is created from `DeserializationFormatFactory` and is more intuitive in `FactoryUtil#discoverXxxxFormat`. But I'm not sure what your opinion is here @twalthr @KurtYoung, as we're going back and forth on the names again…
[jira] [Assigned] (FLINK-17926) Can't build flink-web docker image because of EOL of Ubuntu:18.10
[ https://issues.apache.org/jira/browse/FLINK-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-17926: -- Assignee: Congxian Qiu(klion26) > Can't build flink-web docker image because of EOL of Ubuntu:18.10 > - > > Key: FLINK-17926 > URL: https://issues.apache.org/jira/browse/FLINK-17926 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Congxian Qiu(klion26) >Assignee: Congxian Qiu(klion26) >Priority: Major > > Currently, the Dockerfile[1] in the flink-web project is broken because of the > EOL of Ubuntu 18.10[2]; you will encounter errors such as the one below when > executing {{./run.sh}} > {code:java} > Err:3 http://security.ubuntu.com/ubuntu cosmic-security Release > 404 Not Found [IP: 91.189.88.152 80] > Ign:4 http://archive.ubuntu.com/ubuntu cosmic-updates InRelease > Ign:5 http://archive.ubuntu.com/ubuntu cosmic-backports InRelease > Err:6 http://archive.ubuntu.com/ubuntu cosmic Release > 404 Not Found [IP: 91.189.88.142 80] > Err:7 http://archive.ubuntu.com/ubuntu cosmic-updates Release > 404 Not Found [IP: 91.189.88.142 80] > Err:8 http://archive.ubuntu.com/ubuntu cosmic-backports Release > 404 Not Found [IP: 91.189.88.142 80] > Reading package lists... > {code} > The current LTS versions can be found on the releases page[2]. > The Apache Flink docker image uses fedora:28[3], so it is unaffected. 
> As Fedora does not have LTS releases[4], I propose to use Ubuntu for the website > here, and change the version from {{18.10}} to the closest LTS version > {{18.04}}; tried locally, it works successfully. > [1] > [https://github.com/apache/flink-web/blob/bc66f0f0f463ab62a22e81df7d7efd301b76a6b4/docker/Dockerfile#L17] > [2] [https://wiki.ubuntu.com/Releases] > > [3]https://github.com/apache/flink/blob/e92b2bf19bdf03ad3bae906dc5fa3781aeddb3ee/docs/docker/Dockerfile#L17 > [4] > https://fedoraproject.org/wiki/Fedora_Release_Life_Cycle#Maintenance_Schedule
[jira] [Commented] (FLINK-17926) Can't build flink-web docker image because of EOL of Ubuntu:18.10
[ https://issues.apache.org/jira/browse/FLINK-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116007#comment-17116007 ] Robert Metzger commented on FLINK-17926: Thanks, this is a good proposal. I assigned you.
[GitHub] [flink] flinkbot commented on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory
flinkbot commented on pull request #12320: URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182 ## CI report: * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 UNKNOWN