[jira] [Created] (FLINK-17935) Logs could not show up when deploying Flink on Yarn via "--executor"

2020-05-25 Thread Yang Wang (Jira)
Yang Wang created FLINK-17935:
-

 Summary: Logs could not show up when deploying Flink on Yarn via 
"--executor"
 Key: FLINK-17935
 URL: https://issues.apache.org/jira/browse/FLINK-17935
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.11.0, 1.12.0
Reporter: Yang Wang
 Fix For: 1.11.0


{code:java}
./bin/flink run -d -p 5 -e yarn-per-job examples/streaming/WindowJoin.jar{code}
When we use {{-e/--executor}} to specify the deploy target as Yarn per-job, 
the logs do not show up. The root cause is that we only set the log config 
files in {{FlinkYarnSessionCli}}, not in {{ExecutorCLI}}.

If we use {{-m yarn-cluster}}, everything works well.

 

Maybe we should move {{setLogConfigFileInConfig}} into 
{{YarnClusterDescriptor}} to avoid this problem. cc [~kkl0u]
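A self-contained sketch of what the helper does and where the proposal would call it from (the config key and method placement are assumptions, not the final fix):

{code:java}
import java.io.File;
import org.apache.flink.configuration.Configuration;

public class LogConfigSketch {
    // Hypothetical key; the real fix would reuse Flink's internal option for
    // the application log config file.
    private static final String LOG_CONFIG_FILE_KEY = "$internal.yarn.log-config-file";

    /**
     * Resolve log4j.properties (or logback.xml) under the conf directory and
     * record it in the configuration shipped to the YARN containers. Calling
     * this from YarnClusterDescriptor would cover both the FlinkYarnSessionCli
     * (-m yarn-cluster) and the ExecutorCLI (-e yarn-per-job) code paths.
     */
    static void setLogConfigFileInConfig(Configuration config, String confDir) {
        for (String name : new String[] {"log4j.properties", "logback.xml"}) {
            File f = new File(confDir, name);
            if (f.exists()) {
                config.setString(LOG_CONFIG_FILE_KEY, f.getAbsolutePath());
                return;
            }
        }
    }
}
{code}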



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-17889.

Resolution: Fixed

master: ec0288c6df4edf63ef6601cde2a7b45eaa85cda3

release-1.11: 643f4a1283b82bcbd9cc08878bcee298e10d2bcd

> flink-connector-hive jar contains wrong class in its SPI config file 
> org.apache.flink.table.factories.TableFactory
> --
>
> Key: FLINK-17889
> URL: https://issues.apache.org/jira/browse/FLINK-17889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Assignee: Jingsong Lee
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> These 2 entries are in the flink-connector-hive jar's SPI config file:
> {code:java}
> org.apache.flink.orc.OrcFileSystemFormatFactory
> License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
> Due to this issue, I get the following exception on the Zeppelin side.
> {code:java}
> Caused by: java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype
>     at java.util.ServiceLoader.fail(ServiceLoader.java:239)
>     at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
>     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
>     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>     at java.util.Iterator.forEachRemaining(Iterator.java:116)
>     at org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
>     ... 35 more {code}
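For context, {{TableFactoryService}} discovers factories through the standard JDK {{ServiceLoader}}; a minimal standalone sketch of why a bad SPI entry fails exactly this way (illustrative only, not Flink's actual discovery code):

{code:java}
import java.util.ServiceLoader;
import org.apache.flink.table.factories.TableFactory;

public class SpiProbe {
    public static void main(String[] args) {
        // ServiceLoader reads every
        // META-INF/services/org.apache.flink.table.factories.TableFactory
        // file on the classpath; each non-comment line must name a class
        // implementing TableFactory. A line naming a class that exists but is
        // not a TableFactory makes iteration throw
        // java.util.ServiceConfigurationError: "Provider ... not a subtype".
        for (TableFactory factory : ServiceLoader.load(TableFactory.class)) {
            System.out.println(factory.getClass().getName());
        }
    }
}
{code}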



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17189) Table with processing time attribute can not be read from Hive catalog

2020-05-25 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116449#comment-17116449
 ] 

Jingsong Lee commented on FLINK-17189:
--

Feel free to re-open this if you think we need the fix in release-1.10 too.

> Table with processing time attribute can not be read from Hive catalog
> --
>
> Key: FLINK-17189
> URL: https://issues.apache.org/jira/browse/FLINK-17189
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Affects Versions: 1.10.1
>Reporter: Timo Walther
>Assignee: Jingsong Lee
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> DDL:
> {code}
> CREATE TABLE PROD_LINEITEM (
>   L_ORDERKEY       INTEGER,
>   L_PARTKEY        INTEGER,
>   L_SUPPKEY        INTEGER,
>   L_LINENUMBER     INTEGER,
>   L_QUANTITY       DOUBLE,
>   L_EXTENDEDPRICE  DOUBLE,
>   L_DISCOUNT       DOUBLE,
>   L_TAX            DOUBLE,
>   L_CURRENCY       STRING,
>   L_RETURNFLAG     STRING,
>   L_LINESTATUS     STRING,
>   L_ORDERTIME      TIMESTAMP(3),
>   L_SHIPINSTRUCT   STRING,
>   L_SHIPMODE       STRING,
>   L_COMMENT        STRING,
>   WATERMARK FOR L_ORDERTIME AS L_ORDERTIME - INTERVAL '5' MINUTE,
>   L_PROCTIME   AS PROCTIME()
> ) WITH (
>   'connector.type' = 'kafka',
>   'connector.version' = 'universal',
>   'connector.topic' = 'Lineitem',
>   'connector.properties.zookeeper.connect' = 'not-needed',
>   'connector.properties.bootstrap.servers' = 'kafka:9092',
>   'connector.startup-mode' = 'earliest-offset',
>   'format.type' = 'csv',
>   'format.field-delimiter' = '|'
> );
> {code}
> Query:
> {code}
> SELECT * FROM prod_lineitem;
> {code}
> Result:
> {code}
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.AssertionError: Conversion to relational algebra failed to preserve 
> datatypes:
> validated type:
> RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER 
> L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, 
> DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME 
> ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIMESTAMP(3) NOT NULL 
> L_PROCTIME) NOT NULL
> converted type:
> RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER 
> L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, 
> DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME 
> ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIME 
> ATTRIBUTE(PROCTIME) NOT NULL L_PROCTIME) NOT NULL
> rel:
> LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], 
> L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], 
> L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], 
> L_PROCTIME=[$15])
>   LogicalWatermarkAssigner(rowtime=[L_ORDERTIME], watermark=[-($11, 
> 30:INTERVAL MINUTE)])
> LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], 
> L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], 
> L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], 
> L_PROCTIME=[PROCTIME()])
>   LogicalTableScan(table=[[hcat, default, prod_lineitem, source: 
> [KafkaTableSource(L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, 
> L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_CURRENCY, L_RETURNFLAG, L_LINESTATUS, 
> L_ORDERTIME, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT)]]])
> {code}
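(The only difference between the validated and converted row types is the last field: the validated type carries plain {{TIMESTAMP(3) NOT NULL L_PROCTIME}}, while the converted type still carries {{TIME ATTRIBUTE(PROCTIME) NOT NULL L_PROCTIME}}; the processing-time attribute is lost when the schema is round-tripped through the persisted catalog.)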



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17189) Table with processing time attribute can not be read from Hive catalog

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-17189.

Fix Version/s: (was: 1.10.2)
   Resolution: Fixed

master: f08c8309738e519d246aeda163ef1b1f5e7855c7

release-1.11: 86d2777e89b87c842e9fc242757103656e807154

> Table with processing time attribute can not be read from Hive catalog
> --
>
> Key: FLINK-17189
> URL: https://issues.apache.org/jira/browse/FLINK-17189
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Affects Versions: 1.10.1
>Reporter: Timo Walther
>Assignee: Jingsong Lee
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> DDL:
> {code}
> CREATE TABLE PROD_LINEITEM (
>   L_ORDERKEY       INTEGER,
>   L_PARTKEY        INTEGER,
>   L_SUPPKEY        INTEGER,
>   L_LINENUMBER     INTEGER,
>   L_QUANTITY       DOUBLE,
>   L_EXTENDEDPRICE  DOUBLE,
>   L_DISCOUNT       DOUBLE,
>   L_TAX            DOUBLE,
>   L_CURRENCY       STRING,
>   L_RETURNFLAG     STRING,
>   L_LINESTATUS     STRING,
>   L_ORDERTIME      TIMESTAMP(3),
>   L_SHIPINSTRUCT   STRING,
>   L_SHIPMODE       STRING,
>   L_COMMENT        STRING,
>   WATERMARK FOR L_ORDERTIME AS L_ORDERTIME - INTERVAL '5' MINUTE,
>   L_PROCTIME   AS PROCTIME()
> ) WITH (
>   'connector.type' = 'kafka',
>   'connector.version' = 'universal',
>   'connector.topic' = 'Lineitem',
>   'connector.properties.zookeeper.connect' = 'not-needed',
>   'connector.properties.bootstrap.servers' = 'kafka:9092',
>   'connector.startup-mode' = 'earliest-offset',
>   'format.type' = 'csv',
>   'format.field-delimiter' = '|'
> );
> {code}
> Query:
> {code}
> SELECT * FROM prod_lineitem;
> {code}
> Result:
> {code}
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.AssertionError: Conversion to relational algebra failed to preserve 
> datatypes:
> validated type:
> RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER 
> L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, 
> DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME 
> ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIMESTAMP(3) NOT NULL 
> L_PROCTIME) NOT NULL
> converted type:
> RecordType(INTEGER L_ORDERKEY, INTEGER L_PARTKEY, INTEGER L_SUPPKEY, INTEGER 
> L_LINENUMBER, DOUBLE L_QUANTITY, DOUBLE L_EXTENDEDPRICE, DOUBLE L_DISCOUNT, 
> DOUBLE L_TAX, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_CURRENCY, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_RETURNFLAG, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_LINESTATUS, TIME 
> ATTRIBUTE(ROWTIME) L_ORDERTIME, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> L_SHIPINSTRUCT, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_SHIPMODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" L_COMMENT, TIME 
> ATTRIBUTE(PROCTIME) NOT NULL L_PROCTIME) NOT NULL
> rel:
> LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], 
> L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], 
> L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], 
> L_PROCTIME=[$15])
>   LogicalWatermarkAssigner(rowtime=[L_ORDERTIME], watermark=[-($11, 
> 30:INTERVAL MINUTE)])
> LogicalProject(L_ORDERKEY=[$0], L_PARTKEY=[$1], L_SUPPKEY=[$2], 
> L_LINENUMBER=[$3], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5], L_DISCOUNT=[$6], 
> L_TAX=[$7], L_CURRENCY=[$8], L_RETURNFLAG=[$9], L_LINESTATUS=[$10], 
> L_ORDERTIME=[$11], L_SHIPINSTRUCT=[$12], L_SHIPMODE=[$13], L_COMMENT=[$14], 
> L_PROCTIME=[PROCTIME()])
>   LogicalTableScan(table=[[hcat, default, prod_lineitem, source: 
> [KafkaTableSource(L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, 
> L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_CURRENCY, L_RETURNFLAG, L_LINESTATUS, 
> L_ORDERTIME, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT)]]])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17934) StreamingFileWriter should set chainingStrategy

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17934:
-
Description: {{StreamingFileWriter}} should be eagerly chained whenever 
possible.  (was: We should let {{StreamingFileWriter}} should be eagerly 
chained whenever possible.)

> StreamingFileWriter should set chainingStrategy
> ---
>
> Key: FLINK-17934
> URL: https://issues.apache.org/jira/browse/FLINK-17934
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> {{StreamingFileWriter}} should be eagerly chained whenever possible.
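A minimal sketch of the fix the title suggests, assuming {{StreamingFileWriter}} extends {{AbstractStreamOperator}} (the class and constructor shape here are assumptions):

{code:java}
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.ChainingStrategy;

public class StreamingFileWriterSketch<IN> extends AbstractStreamOperator<IN> {

    public StreamingFileWriterSketch() {
        // AbstractStreamOperator defaults to ChainingStrategy.HEAD; without
        // ALWAYS, the writer cannot be chained to its upstream operator,
        // adding serialization and thread hand-over overhead.
        setChainingStrategy(ChainingStrategy.ALWAYS);
    }
}
{code}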



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17934) StreamingFileWriter should set chainingStrategy

2020-05-25 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17934:


 Summary: StreamingFileWriter should set chainingStrategy
 Key: FLINK-17934
 URL: https://issues.apache.org/jira/browse/FLINK-17934
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Reporter: Jingsong Lee
 Fix For: 1.11.0


We should let {{StreamingFileWriter}} be eagerly chained whenever 
possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17303) Return TableResult for Python TableEnvironment

2020-05-25 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116405#comment-17116405
 ] 

Dian Fu commented on FLINK-17303:
-

[~hequn8128] Could we also backport this to 1.11.0? Two reasons:
- It was originally in 1.11.0 and we only reverted it because the added tests 
were not stable, so I think it should be fine to add it back to 1.11.0 after we 
fix the test stability issue. [~zjwang] What's your thought?

- Without this PR it hurts the usability of a few APIs such as "execute_sql", 
as they have to return a Java TableResult object instead of a Python 
TableResult object. Python users are not aware of this and have to look up the 
Java docs to know how to use these Python APIs. Besides, it introduces 
potential backward compatibility issues between 1.11 and 1.12.

What's your thought?

> Return TableResult for Python TableEnvironment
> --
>
> Key: FLINK-17303
> URL: https://issues.apache.org/jira/browse/FLINK-17303
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: godfrey he
>Assignee: Nicholas Jiang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> [FLINK-16366|https://issues.apache.org/jira/browse/FLINK-16366] supports 
> executing a statement and returning a {{TableResult}} object, which can get 
> the {{JobClient}} (associated with the submitted Flink job), collect the 
> execution result, or print the execution result. In Python, we should also 
> introduce a Python TableResult class to stay consistent with Java.
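For reference, the Java-side workflow that the Python {{TableResult}} would mirror looks roughly like this (a sketch against the 1.11 Table API; the table name is hypothetical):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class TableResultExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // executeSql submits the statement and returns a TableResult.
        TableResult result = tEnv.executeSql("SELECT * FROM my_table");

        // The JobClient is associated with the submitted Flink job.
        result.getJobClient().ifPresent(client -> System.out.println(client.getJobID()));

        // Print the execution result (collect() gives an iterator instead).
        result.print();
    }
}
{code}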



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116402#comment-17116402
 ] 

Jingsong Lee commented on FLINK-17925:
--

I am OK with any of "proctime", "processing-time", and "process-time". To 
correspond with "partition-time", I choose "process-time".

> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.
>  * The default value of "sink.partition-commit.trigger" should be 
> "process-time". It is hard for users to figure out what is wrong when they 
> don't have a watermark, and defaulting to "process-time" gives a better 
> out-of-the-box experience.
>  * The type of "sink.rolling-policy.file-size" should be MemoryType.
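A minimal sketch of the corrected option definitions (the option keys come from the issue; the builder calls follow Flink's {{ConfigOptions}} API, and the 128 MB default is only illustrative):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.configuration.MemorySize;

public class FileSystemOptionsSketch {
    // Default the commit trigger to process time, so partitions commit even
    // when the pipeline defines no watermark.
    public static final ConfigOption<String> SINK_PARTITION_COMMIT_TRIGGER =
            ConfigOptions.key("sink.partition-commit.trigger")
                    .stringType()
                    .defaultValue("process-time");

    // Use the memory type so users can write "128MB" instead of a raw byte count.
    public static final ConfigOption<MemorySize> SINK_ROLLING_POLICY_FILE_SIZE =
            ConfigOptions.key("sink.rolling-policy.file-size")
                    .memoryType()
                    .defaultValue(MemorySize.ofMebiBytes(128));
}
{code}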



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17677) FLINK_LOG_PREFIX recommended in docs is not always available

2020-05-25 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116396#comment-17116396
 ] 

Xintong Song commented on FLINK-17677:
--

Hi [~RocMarshal],

Thanks for your response.

I'm not sure whether I have understood your proposal correctly. Are you 
suggesting having separate configuration options for log file paths per 
deployment mode?

If yes, could you explain the benefit of per-deployment configuration options 
compared to setting the common option differently for each deployment?

> FLINK_LOG_PREFIX recommended in docs is not always available
> 
>
> Key: FLINK-17677
> URL: https://issues.apache.org/jira/browse/FLINK-17677
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.9.3, 1.10.1, 1.11.0
>Reporter: Xintong Song
>Priority: Major
> Fix For: 1.11.0, 1.10.2, 1.9.4
>
>
> The [Application Profiling & 
> Debugging|https://ci.apache.org/projects/flink/flink-docs-master/monitoring/application_profiling.html]
>  documentation recommends using the script variable {{FLINK_LOG_PREFIX}} for 
> defining log file paths. However, this variable is only available in 
> standalone mode. This is a bit misleading for users of other deployments (see 
> this 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Memory-analyze-on-AWS-EMR-td35036.html]).
> I propose to replace {{FLINK_LOG_PREFIX}} with a general representation 
> {{}}, and add a separate section to discuss how to set the log 
> path (e.g., use {{FLINK_LOG_PREFIX}} with standalone deployments and 
> {{}} with Yarn deployments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Zili Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116391#comment-17116391
 ] 

Zili Chen edited comment on FLINK-17565 at 5/26/20, 3:23 AM:
-

[~felixzheng] That will be appreciated and you can assign it to [~zjwang] for 
coordinating the release process.


was (Author: tison):
[~felixzheng] That will be appreciated and you can assign [~zjwang] for 
coordinating the release process.

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Zili Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116391#comment-17116391
 ] 

Zili Chen commented on FLINK-17565:
---

[~felixzheng] That will be appreciated and you can assign [~zjwang] for 
coordinating the release process.

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Canbin Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116389#comment-17116389
 ] 

Canbin Zheng commented on FLINK-17565:
--

Thank you [~zjwang].

[~tison]  Should I make another PR for backporting this bugfix to 1.11?

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17572) Remove checkpoint alignment buffered metric from webui

2020-05-25 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116388#comment-17116388
 ] 

Zhijiang commented on FLINK-17572:
--

Already assigned to you, welcome to contribute!

> Remove checkpoint alignment buffered metric from webui
> --
>
> Key: FLINK-17572
> URL: https://issues.apache.org/jira/browse/FLINK-17572
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.11.0
>Reporter: Yingjie Cao
>Assignee: Nicholas Jiang
>Priority: Minor
> Fix For: 1.12.0
>
>
> After FLINK-16404, we never cache buffers during checkpoint barrier 
> alignment, so the checkpoint alignment buffered metric will always be 0; we 
> should remove it directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17572) Remove checkpoint alignment buffered metric from webui

2020-05-25 Thread Zhijiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang reassigned FLINK-17572:


Assignee: Nicholas Jiang

> Remove checkpoint alignment buffered metric from webui
> --
>
> Key: FLINK-17572
> URL: https://issues.apache.org/jira/browse/FLINK-17572
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.11.0
>Reporter: Yingjie Cao
>Assignee: Nicholas Jiang
>Priority: Minor
> Fix For: 1.12.0
>
>
> After FLINK-16404, we never cache buffers during checkpoint barrier 
> alignment, so the checkpoint alignment buffered metric will always be 0; we 
> should remove it directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-17909:

Description: 
Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
HashMaps. However, this leads to many bugs when users switch to persisted 
catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
FLINK-16021. 

That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
important path of serializing and deserializing meta data. We miss some 
important meta information (PK, time attributes) during serialization and 
deserialization, which leads to bugs. 

So I propose to add a {{TestCatalog}} that holds the serialized meta data to 
cover the serialization and deserialization path. The serialized meta data may 
be in the {{Map}} properties format. 


  was:
Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
HashMaps. However, this leads to many bugs when users switch to persisted 
catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
FLINK-16021. 

That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
important path of serializing and deserializing meta data. We miss some 
important meta information (PK, time attributes) during serialization and 
deserialization, which leads to bugs. 

So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} to 
cover the serialization and deserialization path. The serialized meta data may 
be in the {{Map}} properties format. 

We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly 
used in demos/experiments/testing, so I think it's fine. 
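A minimal sketch of what such a {{TestCatalog}} could look like (the class shape and the use of {{CatalogTableImpl.toProperties()}}/{{fromProperties()}} are assumptions about the eventual implementation):

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.flink.table.catalog.CatalogBaseTable;
import org.apache.flink.table.catalog.CatalogTableImpl;
import org.apache.flink.table.catalog.GenericInMemoryCatalog;
import org.apache.flink.table.catalog.ObjectPath;
import org.apache.flink.table.catalog.exceptions.DatabaseNotExistException;
import org.apache.flink.table.catalog.exceptions.TableAlreadyExistException;
import org.apache.flink.table.catalog.exceptions.TableNotExistException;

// Store tables as flat string properties, as a persisted catalog would, so
// any meta data the property round-trip drops (PK, time attributes) is
// dropped in tests too, instead of silently surviving as in-memory objects.
public class TestCatalog extends GenericInMemoryCatalog {

    private final Map<ObjectPath, Map<String, String>> serializedTables = new HashMap<>();

    public TestCatalog(String name) {
        super(name);
    }

    @Override
    public void createTable(ObjectPath path, CatalogBaseTable table, boolean ignoreIfExists)
            throws TableAlreadyExistException, DatabaseNotExistException {
        // Serialize on write, like a metastore-backed catalog.
        serializedTables.put(path, ((CatalogTableImpl) table).toProperties());
    }

    @Override
    public CatalogBaseTable getTable(ObjectPath path) throws TableNotExistException {
        // Deserialize on read, exercising the full round-trip.
        return CatalogTableImpl.fromProperties(serializedTables.get(path));
    }
}
{code}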







> Add a TestCatalog to hold the serialized meta data to uncover more potential 
> bugs
> -
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to add a {{TestCatalog}} that holds the serialized meta data to 
> cover the serialization and deserialization path. The serialized meta data 
> may be in the {{Map}} properties format. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116387#comment-17116387
 ] 

Jark Wu commented on FLINK-17909:
-

OK. I renamed the title and description to introduce a `TestCatalog` which will 
hold the serialized meta data. I prefer to have a better testing coverage in 
table module, rather moving tests into connector modules. 

> Add a TestCatalog to hold the serialized meta data to uncover more potential 
> bugs
> -
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to add a {{TestCatalog}} that holds the serialized meta data to 
> cover the serialization and deserialization path. The serialized meta data 
> may be in the {{Map}} properties format. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116387#comment-17116387
 ] 

Jark Wu edited comment on FLINK-17909 at 5/26/20, 2:57 AM:
---

OK. I renamed the title and description to introduce a `TestCatalog` which will 
hold the serialized meta data. I prefer to have a better testing coverage in 
table module, rather than moving tests into connector modules. 


was (Author: jark):
OK. I renamed the title and description to introduce a `TestCatalog` which will 
hold the serialized meta data. I prefer to have a better testing coverage in 
table module, rather moving tests into connector modules. 

> Add a TestCatalog to hold the serialized meta data to uncover more potential 
> bugs
> -
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to add a {{TestCatalog}} that holds the serialized meta data to 
> cover the serialization and deserialization path. The serialized meta data 
> may be in the {{Map}} properties format. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17909) Add a TestCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-17909:

Summary: Add a TestCatalog to hold the serialized meta data to uncover more 
potential bugs  (was: Make the GenericInMemoryCatalog to hold the serialized 
meta data to uncover more potential bugs)

> Add a TestCatalog to hold the serialized meta data to uncover more potential 
> bugs
> -
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} 
> to cover the serialization and deserialization path. The serialized meta 
> data may be in the {{Map}} properties format. 
> We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly 
> used in demos/experiments/testing, so I think it's fine. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116373#comment-17116373
 ] 

Zhijiang commented on FLINK-17565:
--

Since this fix resolves some known bugs, it complies with the release rule. 
Makes sense to me. :) 

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17677) FLINK_LOG_PREFIX recommended in docs is not always available

2020-05-25 Thread RocMarshal (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116372#comment-17116372
 ] 

RocMarshal commented on FLINK-17677:


Hi, [~xintongsong]

What do you think of this idea:

Configure the log path in each deployment mode through a configuration item 
specific to that deployment mode, so the configuration is isolated between 
deployment modes. If no valid value is configured for a specific deployment 
mode, fall back to reading a common configuration item that all deployment 
modes are aware of.

Thank you.

Best,

Roc

> FLINK_LOG_PREFIX recommended in docs is not always available
> 
>
> Key: FLINK-17677
> URL: https://issues.apache.org/jira/browse/FLINK-17677
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.9.3, 1.10.1, 1.11.0
>Reporter: Xintong Song
>Priority: Major
> Fix For: 1.11.0, 1.10.2, 1.9.4
>
>
> The [Application Profiling & 
> Debugging|https://ci.apache.org/projects/flink/flink-docs-master/monitoring/application_profiling.html]
>  documentation recommends using the script variable {{FLINK_LOG_PREFIX}} for 
> defining log file paths. However, this variable is only available in 
> standalone mode. This is a bit misleading for users of other deployments (see 
> this 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Memory-analyze-on-AWS-EMR-td35036.html]).
> I propose to replace {{FLINK_LOG_PREFIX}} with a general representation 
> {{}}, and add a separate section to discuss how to set the log 
> path (e.g., use {{FLINK_LOG_PREFIX}} with standalone deployments and 
> {{}} with Yarn deployments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service

2020-05-25 Thread Canbin Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116369#comment-17116369
 ] 

Canbin Zheng commented on FLINK-17230:
--

[~zjwang] The scope of this problem is that end-users would get the wrong 
Endpoint in HA setups and thus can't talk to the JobManager from the client 
side via the retrieved Endpoint. Hence, it would be nice to backport this PR 
to 1.11.

> Fix incorrect returned address of Endpoint for the ClusterIP Service
> 
>
> Key: FLINK-17230
> URL: https://issues.apache.org/jira/browse/FLINK-17230
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / Kubernetes
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> At the moment, when the type of the external Service is set to {{ClusterIP}}, 
> we return an incorrect address 
> {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for 
> the Endpoint.
> This ticket aims to fix this bug by returning 
> {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead.
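In code form, the described change amounts to the following (utility method names as given in the issue; a sketch, not the actual patch):

{code:java}
// Before: returned the internal service address, which a client outside the
// cluster cannot reach when the external service type is ClusterIP.
String wrong = KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace;

// After: return the rest service address instead.
String fixed = KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace;
{code}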



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Canbin Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116363#comment-17116363
 ] 

Canbin Zheng commented on FLINK-17565:
--

[~zjwang] 1.11 supports application mode on Kubernetes, and as FLINK-17566 
describes, backporting this PR to 1.11 makes application mode more robust, so 
end-users interested in Kubernetes setups can benefit from it.

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-25 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116362#comment-17116362
 ] 

Dian Fu commented on FLINK-17883:
-

Yes, I think so. We could close it after verifying that it works in 1.11. I'll 
verify it.

> Unable to configure write mode for FileSystem() connector in PyFlink
> 
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.1
>Reporter: Robert Metzger
>Assignee: Nicholas Jiang
>Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116360#comment-17116360
 ] 

Jingsong Lee commented on FLINK-17909:
--

+1 for [~twalthr]'s proposal. We can set a `TestCatalog` in our `BatchTestBase`, 
`StreamingTestBase`, etc. We should not disturb {{GenericInMemoryCatalog}} just 
for testing.

> Make the GenericInMemoryCatalog to hold the serialized meta data to uncover 
> more potential bugs
> ---
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} 
> to cover the serialization and deserialization path. The serialized meta 
> data may be in the {{Map}} properties format. 
> We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly 
> used in demos/experiments/testing, so I think it's fine. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service

2020-05-25 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116359#comment-17116359
 ] 

Yang Wang commented on FLINK-17230:
---

This is a bugfix for wrongly returning a rest endpoint when the service 
exposure type is {{ClusterIP}}. I think it needs to be cherry-picked to branch 
1.11. 

[~zjwang] Do you have concerns about integrating this PR into branch 1.11?

> Fix incorrect returned address of Endpoint for the ClusterIP Service
> 
>
> Key: FLINK-17230
> URL: https://issues.apache.org/jira/browse/FLINK-17230
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / Kubernetes
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> At the moment, when the type of the external Service is set to {{ClusterIP}}, 
> we return an incorrect address 
> {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for 
> the Endpoint.
> This ticket aims to fix this bug by returning 
> {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17395) Add the sign and shasum logic for PyFlink wheel packages

2020-05-25 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116358#comment-17116358
 ] 

Dian Fu commented on FLINK-17395:
-

Merged the follow-up "Don't remove the dist directory during clean" via
- master: 8f992e8e868b846cf7fe8de23923358fc6b50721
- release-1.11: 0285cb7db39b5f9ba4aeb6cd3848cab6db04a333

> Add the sign and shasum logic for PyFlink wheel packages
> 
>
> Key: FLINK-17395
> URL: https://issues.apache.org/jira/browse/FLINK-17395
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Release System
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> Add the sign and sha logic for PyFlink wheel packages



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116355#comment-17116355
 ] 

Yang Wang commented on FLINK-17565:
---

This ticket will fix the following usability bugs and should be included in 
1.11.0. 
 * Make Flink on K8s work on Java 8u252, FLINK-17416
 * Fix the potential resource leak when a job reaches its final state, 
FLINK-17566

[~zjwang] Do you have concerns about merging this into branch 1.11?

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it's better to upgrade to 4.9.2 for 
> the following reasons:
>  # It removed the reapers that manually performed cascade deletion of 
> resources, leaving that to the Kubernetes APIServer, which solves the issue 
> of FLINK-17566; more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It fixed a regression in building Quantity values that was introduced in 
> 4.7.0; see [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provides better support for K8s 1.17; release notes: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17572) Remove checkpoint alignment buffered metric from webui

2020-05-25 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107123#comment-17107123
 ] 

Nicholas Jiang edited comment on FLINK-17572 at 5/26/20, 1:58 AM:
--

[~zjwang]I think this just removes alignmentBuffered in Checkpoint Statistics.  
Therefore, could u please assign this issue to me?


was (Author: nicholasjiang):
[~kevin.cyj]I think this just removes alignmentBuffered in Checkpoint 
Statistics.  Therefore, could u please assign this issue to me?

> Remove checkpoint alignment buffered metric from webui
> --
>
> Key: FLINK-17572
> URL: https://issues.apache.org/jira/browse/FLINK-17572
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.11.0
>Reporter: Yingjie Cao
>Priority: Minor
> Fix For: 1.12.0
>
>
> After FLINK-16404, we never cache buffers during checkpoint barrier 
> alignment, so the checkpoint alignment buffered metric will always be 0; we 
> should remove it directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17924) HBase connector: we can write data to HBase table, but can't read data from HBase

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-17924.
---
Fix Version/s: (was: 1.11.0)
   Resolution: Duplicate

> HBase connector: we can write data to HBase table, but can't read data from 
> HBase
> 
>
> Key: FLINK-17924
> URL: https://issues.apache.org/jira/browse/FLINK-17924
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0
>Reporter: pengnengsong
>Priority: Major
>
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> EnvironmentSettings streamSettings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
> StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, 
> streamSettings);
> CREATE TABLE source(
>  rowkey INT,
>  f1 ROW,
>  f2 ROW
>  ) WITH (
>  'connector.type' = 'hbase',
>  'connector.version' = '1.4.3',
>  'connector.table-name' = 'test_source:flink',
>  'connector.zookeeper.quorum' = ':2181',
>  'connector.zookeeper.znode.parent' = '/test/hbase'
>  )
> SELECT * FROM source
>  
> In HBaseRowInputFormat.java, when the configure method executes, this.conf 
> is always null, so a default HBase configuration is created and the 
> configuration parameters in the WITH clause do not take effect:
> private void connectToTable() {
>     if (this.conf == null) {
>         this.conf = HBaseConfiguration.create();
>     }
>     ...
> }
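One way to make the WITH-clause options take effect would be to ship them as a serializable map and re-apply them when the configuration is rebuilt on the task side; a minimal sketch (names hypothetical, not the actual fix):

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class HBaseConfHolder {
    // Keep the DDL options in a serializable Map so they survive shipping the
    // input format to the task; the Hadoop Configuration itself is not
    // serializable and must be rebuilt on the task side.
    private final Map<String, String> ddlOptions = new HashMap<>();
    private transient org.apache.hadoop.conf.Configuration conf;

    public org.apache.hadoop.conf.Configuration getOrCreateConf() {
        if (conf == null) {
            conf = HBaseConfiguration.create(); // loads hbase-site.xml from the classpath
            ddlOptions.forEach(conf::set);      // then re-apply the options from the DDL
        }
        return conf;
    }
}
{code}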



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-17932:
---

Assignee: Leonard Xu

> Most of hbase config Options can not work in HBaseTableSource
> -
>
> Key: FLINK-17932
> URL: https://issues.apache.org/jira/browse/FLINK-17932
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0, 1.11.0, 1.12.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>
> All HBase config options fail to take effect in HBaseTableSource, because 
> the field `conf` in HBaseRowInputFormat is not serializable, so 
> HBaseTableSource falls back to loading hbase-site.xml from the classpath. 
> I.e., the table config options that users define in the DDL are useless; 
> HBaseTableSource only loads config options from the classpath.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-17932:

Fix Version/s: 1.11.0

> Most of hbase config Options can not work in HBaseTableSource
> -
>
> Key: FLINK-17932
> URL: https://issues.apache.org/jira/browse/FLINK-17932
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0, 1.11.0, 1.12.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
> Fix For: 1.11.0
>
>
> All HBase config options fail to take effect in HBaseTableSource, because 
> the field `conf` in HBaseRowInputFormat is not serializable, so 
> HBaseTableSource falls back to loading hbase-site.xml from the classpath. 
> I.e., the table config options that users define in the DDL are useless; 
> HBaseTableSource only loads config options from the classpath.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17924) HBase connector: we can write data to HBase table, but can't read data from HBase

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-17924:

Fix Version/s: 1.11.0

> HBase connector: we can write data to HBase table, but can't read data from 
> HBase
> 
>
> Key: FLINK-17924
> URL: https://issues.apache.org/jira/browse/FLINK-17924
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0
>Reporter: pengnengsong
>Priority: Major
> Fix For: 1.11.0
>
>
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> EnvironmentSettings streamSettings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
> StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, 
> streamSettings);
> CREATE TABLE source(
>  rowkey INT,
>  f1 ROW,
>  f2 ROW
>  ) WITH (
>  'connector.type' = 'hbase',
>  'connector.version' = '1.4.3',
>  'connector.table-name' = 'test_source:flink',
>  'connector.zookeeper.quorum' = ':2181',
>  'connector.zookeeper.znode.parent' = '/test/hbase'
>  )
> SELECT * FROM source
>  
> In HBaseRowInputFormat.java, when the configure method executes, this.conf 
> is always null, so a default HBase configuration is created and the 
> configuration parameters in the WITH clause do not take effect:
> private void connectToTable() {
>     if (this.conf == null) {
>         this.conf = HBaseConfiguration.create();
>     }
>     ...
> }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17924) HBase connector: we can write data to HBase table, but can't read data from HBase

2020-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116353#comment-17116353
 ] 

Jark Wu commented on FLINK-17924:
-

Yes, I think this is a bug. 

> HBase connector: we can write data to HBase table, but can't read data from 
> HBase
> 
>
> Key: FLINK-17924
> URL: https://issues.apache.org/jira/browse/FLINK-17924
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0
>Reporter: pengnengsong
>Priority: Major
> Fix For: 1.11.0
>
>
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> EnvironmentSettings streamSettings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
> StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, 
> streamSettings);
> CREATE TABLE source(
>  rowkey INT,
>  f1 ROW,
>  f2 ROW
>  ) WITH (
>  'connector.type' = 'hbase',
>  'connector.version' = '1.4.3',
>  'connector.table-name' = 'test_source:flink',
>  'connector.zookeeper.quorum' = ':2181',
>  'connector.zookeeper.znode.parent' = '/test/hbase'
>  )
> SELECT * FROM source
>  
> In HBaseRowInputFormat.java, when the configure method executes, this.conf 
> is always null, so a default HBase configuration is created and the 
> configuration parameters in the WITH clause do not take effect:
> private void connectToTable() {
>     if (this.conf == null) {
>         this.conf = HBaseConfiguration.create();
>     }
>     ...
> }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-17909:
---

Assignee: (was: Jark Wu)

> Make the GenericInMemoryCatalog to hold the serialized meta data to uncover 
> more potential bugs
> ---
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in 
> HashMaps. However, this leads to many bugs when users switch to persisted 
> catalogs, e.g. the Hive Metastore; see for example FLINK-17189, FLINK-17868, 
> FLINK-16021. 
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the 
> important path of serializing and deserializing meta data. We miss some 
> important meta information (PK, time attributes) during serialization and 
> deserialization, which leads to bugs. 
> So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}} 
> to cover the serialization and deserialization path. The serialized meta 
> data may be in the {{Map}} properties format. 
> We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly 
> used in demos/experiments/testing, so I think it's fine. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116352#comment-17116352
 ] 

Jark Wu commented on FLINK-17925:
-

Should it be "processing-time"? 

> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when using the metastore commit policy for a
> filesystem table; the filesystem connector has an empty implementation of
> {{TableMetaStoreFactory}}. We should prevent users from configuring this policy.
>  * Default value of "sink.partition-commit.trigger" should be "process-time".
> It is hard for users to figure out what is wrong when they don't have a
> watermark. We can set "sink.partition-commit.trigger" to "process-time" for a
> better out-of-the-box experience.
>  * The type of "sink.rolling-policy.file-size" should be MemoryType.
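
For illustration, a hedged sketch of a filesystem sink DDL with these options spelled out; the table name, path, and the Flink 1.11 {{TableEnvironment}} calls are assumptions, only the option keys and values come from the list above.

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class FsSinkOptionsSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());
        tEnv.executeSql(
                "CREATE TABLE fs_sink ("
                        + "  user_id STRING,"
                        + "  dt STRING"
                        + ") PARTITIONED BY (dt) WITH ("
                        + "  'connector' = 'filesystem',"
                        + "  'path' = 'file:///tmp/fs_sink',"
                        + "  'format' = 'csv',"
                        // proposed default: works without a watermark
                        + "  'sink.partition-commit.trigger' = 'process-time',"
                        // a MemoryType option: parsed as a size string, not a long
                        + "  'sink.rolling-policy.file-size' = '128MB'"
                        + ")");
    }
}
{code}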



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-25 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116347#comment-17116347
 ] 

Nicholas Jiang commented on FLINK-17883:


[~dian.fu], as you mentioned, this is already supported by using the INSERT
OVERWRITE statement; should this issue therefore be closed?

> Unable to configure write mode for FileSystem() connector in PyFlink
> 
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.1
>Reporter: Robert Metzger
>Assignee: Nicholas Jiang
>Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.
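
A hedged sketch of that workaround, written with the Java Table API for consistency with the rest of this digest (PyFlink's {{table_env.execute_sql}} is analogous); the table names are made up and batch mode is an assumption.

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class OverwriteSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());
        // Assumes sink_table and source_table were registered beforehand;
        // OVERWRITE replaces existing files instead of failing in NO_OVERWRITE mode.
        tEnv.executeSql("INSERT OVERWRITE sink_table SELECT * FROM source_table");
    }
}
{code}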



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16057) Performance regression in ContinuousFileReaderOperator

2020-05-25 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116275#comment-17116275
 ] 

Roman Khachatryan commented on FLINK-16057:
---

I don't know the reason.

My first guess was that the benchmark is wrong, but I couldn't spot any issues.

Can you please take a look: 
[https://github.com/dataArtisans/flink-benchmarks/pull/62] ?

 

I also repeated the non-io-benchmark (that generates numbers without reading 
files):

 
{code:java}
old:
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit" 
"org.apache.flink.benchmark.ContinuousFileReaderOperatorBenchmark.readFileSplit","thrpt",1,30,18193.296713,1288.369575,"ops/ms"
new:
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit" 
"org.apache.flink.benchmark.ContinuousFileReaderOperatorBenchmark.readFileSplit","thrpt",1,30,12491.844935,165.684694,"ops/ms"
{code}
 

> Performance regression in ContinuousFileReaderOperator
> --
>
> Key: FLINK-16057
> URL: https://issues.apache.org/jira/browse/FLINK-16057
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After switching CFRO to a single-threaded execution model, the performance
> regression was expected to be about 15-20% (benchmarked in November).
> But after merging to master it turned out to be about 50%.
> 
> One reason is that the chaining strategy isn't set by default in the CFRO
> factory. Without that, even reading and outputting all records of a split in
> a single mail action doesn't reverse the regression (only about half of it).
> However, with the strategy set AND batching enabled, the regression is fixed
> (starting from batch size 6).
> Though batching can't be used in practice because it can significantly delay
> checkpointing.
> 
> Another approach would be to process one record and then repeat until
> defaultMailboxActionAvailable OR haveNewMail (sketched below).
> This reverses the regression and even improves the performance by about 50%
> compared to the old version.
>  
> The final solution could also be FLIP-27.
>  
> Other things tried (didn't help):
>  * CFRO rework without subsequent commits (removing checkpoint lock)
>  * different batch sizes, including the whole split, without chaining 
> strategy fixed - partial improvement only
>  * disabling close
>  * disabling checkpointing
>  * disabling output (serialization)
>  * using LinkedList instead of PriorityQueue
>  
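
A hedged sketch of the "one record, then repeat" loop mentioned in the description; the method names are taken from the text above but are simplified stand-ins, not Flink's actual mailbox API.

{code:java}
/** Illustrative stand-in for the loop described in the issue. */
abstract class OneRecordPerMailLoop {

    abstract boolean haveNewMail();                   // e.g. a checkpoint mail arrived
    abstract boolean defaultMailboxActionAvailable(); // other default work is pending
    abstract void emitOneRecord();                    // read and emit a single record
    abstract void enqueueSelf();                      // schedule the continuation

    void processSplit() {
        do {
            emitOneRecord(); // amortizes per-mail overhead without fixed batching
        } while (!haveNewMail() && !defaultMailboxActionAvailable());
        enqueueSelf(); // yield to the mailbox so checkpoints are not delayed
    }
}
{code}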



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17870) dependent jars are missing to be shipped to cluster in scala shell

2020-05-25 Thread Kostas Kloudas (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas closed FLINK-17870.
--
Resolution: Fixed

Merged on master with ea54a54be1ea5629ccbd2421aa6f91b19870d40c
on release-1.11 with 714369797781e1a49aabf9e45f6ce8d09cb6336a
and on release-1.10 with 492f9de2b9040a4ca6c45d5060bc776448682f07

> dependent jars are missing to be shipped to cluster in scala shell
> --
>
> Key: FLINK-17870
> URL: https://issues.apache.org/jira/browse/FLINK-17870
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.10.2, 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17870) dependent jars are missing to be shipped to cluster in scala shell

2020-05-25 Thread Kostas Kloudas (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-17870:
---
Fix Version/s: 1.12.0
   1.10.2
   1.11.0

> dependent jars are missing to be shipped to cluster in scala shell
> --
>
> Key: FLINK-17870
> URL: https://issues.apache.org/jira/browse/FLINK-17870
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.10.2, 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15012) Checkpoint directory not cleaned up

2020-05-25 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116169#comment-17116169
 ] 

Stephan Ewen commented on FLINK-15012:
--

I think there is a big difference between the working/temp directories and the 
checkpoint directories.

The working/temp directories can be cleaned up after processes shut down, 
because no data in them will ever be needed.
The checkpoint directories may contain retained checkpoints or savepoints that 
are still relevant. I think we should not ever try to delete these with things 
like "shutdown hooks".

I understand that job cancellation should remove the job's empty parent 
checkpoint directories. That makes sense. And [~yunta] proposed an issue to fix 
this.

I would question whether we should try and do anything about the 
{{stop-cluster.sh}} behavior. This is forceful wiping of the cluster rather 
than proper shutdown, so left-over data is to be expected. And, in my mind, the 
caution to not accidentally delete a still-needed checkpoint is more important 
than making the "hard stop" as nice as possible (cleanup wise).


> Checkpoint directory not cleaned up
> ---
>
> Key: FLINK-15012
> URL: https://issues.apache.org/jira/browse/FLINK-15012
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.1
>Reporter: Nico Kruber
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I started a Flink cluster with 2 TMs using {{start-cluster.sh}} and the 
> following config (in addition to the default {{flink-conf.yaml}})
> {code:java}
> state.checkpoints.dir: file:///path/to/checkpoints/
> state.backend: rocksdb {code}
> After submitting a job with checkpoints enabled (every 5s), checkpoints show 
> up, e.g.
> {code:java}
> bb969f842bbc0ecc3b41b7fbe23b047b/
> ├── chk-2
> │   ├── 238969e1-6949-4b12-98e7-1411c186527c
> │   ├── 2702b226-9cfc-4327-979d-e5508ab2e3d5
> │   ├── 4c51cb24-6f71-4d20-9d4c-65ed6e826949
> │   ├── e706d574-c5b2-467a-8640-1885ca252e80
> │   └── _metadata
> ├── shared
> └── taskowned {code}
> If I shut down the cluster via {{stop-cluster.sh}}, these files will remain 
> on disk and not be cleaned up.
> In contrast, if I cancel the job, at least {{chk-2}} will be deleted, but 
> still leaving the (empty) directories.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17933) TaskManager was terminated on Yarn - investigate

2020-05-25 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-17933:
--
Description: 
When running a job on a Yarn cluster (load testing), some jobs result in failures.

Initial symptoms are no bytes written/transferred in CSV and failures in logs: 
{code:java}
2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map 
-> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING 
to FAILED.{code}
 

It turned out that all such failures were caused by "Connection reset" from a 
single IP, except for one "Leadership lost" error (another IP).

Connection reset was likely caused by TM receiving SIGTERM 
(container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229):
{code:java}
2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - 
RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code}
 

Other TMs received SIGTERM one minute later (all logs were uploaded at the same 
time though).

 

From the JM it looked like this:
{code:java}
2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Trigger heartbeat request.
2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_05.
2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_06.
2020-05-17 10:02:23,592 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_04.
2020-05-17 10:02:23,595 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_03.
2020-05-17 10:02:23,598 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_02.
2020-05-17 10:02:23,725 DEBUG 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received 
acknowledge message for checkpoint 12 from task 
459efd2ad8fe2ffe7fffe28530064fe1 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at 
container_1589453804748_0118_01_02 @ 
ip-172-31-43-69.eu-central-1.compute.internal (dataPort=44625).
2020-05-17 10:02:29,103 DEBUG 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received 
acknowledge message for checkpoint 12 from task 
266a9326be7e3ec669cce2e6a97ae5b0 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at 
container_1589453804748_0118_01_05 @ 
ip-172-31-42-229.eu-central-1.compute.internal (dataPort=37329).
2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - 
Association with remote system 
[akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:3] has 
failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - 
Association with remote system 
[akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:42567] has 
failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-17 10:02:32,900 INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Map -> Flat Map 
(87/160) (cb77c7002503baa74baf73a3a100c2f2) switched from RUNNING to FAILED.
org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: 
readAddress(..) failed: Connection reset by peer (connection to 
'ip-172-31-42-229.eu-central-1.compute.internal/172.31.42.229:37329'){code}
 

There are also JobManager heartbeat timeouts but they don't correlate with the 
issue.

  was:
When running a job on a Yarn cluster (load testing), some jobs result in failures.

Initial symptoms are 0 bytes written/transferred in CSV and failures in logs:

 
{code:java}
2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map 
-> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING 
to FAILED.{code}
 

 

I turned out that all such failures were caused by "Connection reset" from a 
single IP, except for on "Leadership lost" error.

Connection reset was likely caused by TM receiving SIGTERM 
(container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229):

 
{code:java}
2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - 
RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code}
 

Other TMs received SIGTERM one minute later (all logs were uploaded at the same 
time though).

 

From the JM it looked like this:
{code:java}
2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Trigger heartbeat request.
2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_05.
2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_06.

[jira] [Updated] (FLINK-17933) TaskManager was terminated on Yarn - investigate

2020-05-25 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-17933:
--
Description: 
When running a job on a Yarn cluster (load testing), some jobs result in failures.

Initial symptoms are 0 bytes written/transferred in CSV and failures in logs:

 
{code:java}
2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - Map 
-> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from RUNNING 
to FAILED.{code}
 

 

I turned out that all such failures were caused by "Connection reset" from a 
single IP, except for on "Leadership lost" error.

Connection reset was likely caused by TM receiving SIGTERM 
(container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229):

 
{code:java}
2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] - 
RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code}
 

Other TMs received SIGTERM one minute later (all logs were uploaded at the same 
time though).

 

From the JM it looked like this:
{code:java}
2020-05-17 10:02:23,583 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Trigger heartbeat request.
2020-05-17 10:02:23,587 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_05.
2020-05-17 10:02:23,590 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_06.
2020-05-17 10:02:23,592 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_04.
2020-05-17 10:02:23,595 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_03.
2020-05-17 10:02:23,598 DEBUG org.apache.flink.runtime.jobmaster.JobMaster [] - 
Received heartbeat from container_1589453804748_0118_01_02.
2020-05-17 10:02:23,725 DEBUG 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received 
acknowledge message for checkpoint 12 from task 
459efd2ad8fe2ffe7fffe28530064fe1 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at 
container_1589453804748_0118_01_02 @ 
ip-172-31-43-69.eu-central-1.compute.internal (dataPort=44625).
2020-05-17 10:02:29,103 DEBUG 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received 
acknowledge message for checkpoint 12 from task 
266a9326be7e3ec669cce2e6a97ae5b0 of job 5d4d8c88de23b1361fe0dce6ba8443f8 at 
container_1589453804748_0118_01_05 @ 
ip-172-31-42-229.eu-central-1.compute.internal (dataPort=37329).
2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - 
Association with remote system 
[akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:3] has 
failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-17 10:02:32,862 WARN akka.remote.ReliableDeliverySupervisor [] - 
Association with remote system 
[akka.tcp://fl...@ip-172-31-42-229.eu-central-1.compute.internal:42567] has 
failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-17 10:02:32,900 INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Map -> Flat Map 
(87/160) (cb77c7002503baa74baf73a3a100c2f2) switched from RUNNING to FAILED.
org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: 
readAddress(..) failed: Connection reset by peer (connection to 
'ip-172-31-42-229.eu-central-1.compute.internal/172.31.42.229:37329'){code}
 

There are also JobManager heartbeat timeouts but they don't correlate with the 
issue.

> TaskManager was terminated on Yarn - investigate
> 
>
> Key: FLINK-17933
> URL: https://issues.apache.org/jira/browse/FLINK-17933
> Project: Flink
>  Issue Type: Task
>  Components: Deployment / YARN, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>
> When running a job on a Yarn cluster (load testing), some jobs result in
> failures.
> Initial symptoms are 0 bytes written/transferred in CSV and failures in logs:
>  
> {code:java}
> 2020-05-17 10:02:32,858 WARN org.apache.flink.runtime.taskmanager.Task [] - 
> Map -> Flat Map (138/160) (e49f7ea26b633c8035f2a919b1c580c8) switched from 
> RUNNING to FAILED.{code}
>  
>  
> I turned out that all such failures were caused by "Connection reset" from a 
> single IP, except for on "Leadership lost" error.
> Connection reset was likely caused by TM receiving SIGTERM 
> (container_1589453804748_0118_01_04 and 5 both on ip-172-31-42-229):
>  
> {code:java}
> 2020-05-17 10:02:31,362 INFO org.apache.flink.yarn.YarnTaskExecutorRunner [] 
> - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.{code}
>  
> Other TMs received SIGTERM one minute later (all logs were 

[jira] [Updated] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen updated FLINK-17565:
--
Fix Version/s: 1.12.0

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> Currently we are using version 4.5.2; it would be better to upgrade to
> 4.9.2, for the following reasons:
>  # It removed the use of reapers manually doing cascade deletion of 
> resources, leave it up to Kubernetes APIServer, which solves the issue of 
> FLINK-17566, more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It introduced a regression in building Quantity values in 4.7.0, release 
> note [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provided better support for K8s 1.17, release note: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.9.2

2020-05-25 Thread Zili Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116156#comment-17116156
 ] 

Zili Chen commented on FLINK-17565:
---

Just the same as FLINK-17230.

*Quote*

[~zjwang] [~pnowojski] this issue seems like a bugfix which could be included 
in 1.11.0. [~fly_in_gis] & I have reviewed the PR and I merged it into current 
master(1.12).

[~felixzheng] [~fly_in_gis] if you think this should be included in 1.11.0,
please explain to our release managers, i.e., [~zjwang] & [~pnowojski].

> Bump fabric8 version from 4.5.2 to 4.9.2
> 
>
> Key: FLINK-17565
> URL: https://issues.apache.org/jira/browse/FLINK-17565
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> Currently we are using version 4.5.2; it would be better to upgrade to
> 4.9.2, for the following reasons:
>  # It removed the use of reapers manually doing cascade deletion of 
> resources, leave it up to Kubernetes APIServer, which solves the issue of 
> FLINK-17566, more info: 
> [https://github.com/fabric8io/kubernetes-client/issues/1880]
>  # It introduced a regression in building Quantity values in 4.7.0, release 
> note [https://github.com/fabric8io/kubernetes-client/issues/1953].
>  # It provided better support for K8s 1.17, release note: 
> [https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0].
> For more release notes, please refer to [fabric8 
> releases|https://github.com/fabric8io/kubernetes-client/releases].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service

2020-05-25 Thread Zili Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116155#comment-17116155
 ] 

Zili Chen commented on FLINK-17230:
---

[~zjwang] [~pnowojski] this issue seems like a bugfix which could be included 
in 1.11.0. [~fly_in_gis] & I have reviewed the PR and I merged it into current 
master(1.12).

[~felixzheng] [~fly_in_gis] if you think this should be included in 1.11.0,
please explain to our release managers, i.e., [~zjwang] & [~pnowojski].

> Fix incorrect returned address of Endpoint for the ClusterIP Service
> 
>
> Key: FLINK-17230
> URL: https://issues.apache.org/jira/browse/FLINK-17230
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / Kubernetes
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> At the moment, when the type of the external Service is set to {{ClusterIP}}, 
> we return an incorrect address 
> {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for 
> the Endpoint.
> This ticket aims to fix this bug by returning 
> {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead.
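
Spelled out as a sketch (the {{KubernetesUtils}} method names are quoted from the ticket; the import location and wrapper class are assumptions):

{code:java}
import org.apache.flink.kubernetes.utils.KubernetesUtils; // assumed location

class RestEndpointSketch {
    // Pre-fix (wrong): KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace
    // Fixed, per the ticket:
    static String endpointAddress(String clusterId, String nameSpace) {
        return KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace;
    }
}
{code}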



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17230) Fix incorrect returned address of Endpoint for the ClusterIP Service

2020-05-25 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen updated FLINK-17230:
--
Fix Version/s: 1.12.0

> Fix incorrect returned address of Endpoint for the ClusterIP Service
> 
>
> Key: FLINK-17230
> URL: https://issues.apache.org/jira/browse/FLINK-17230
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / Kubernetes
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0, 1.12.0
>
>
> At the moment, when the type of the external Service is set to {{ClusterIP}}, 
> we return an incorrect address 
> {{KubernetesUtils.getInternalServiceName(clusterId) + "." + nameSpace}} for 
> the Endpoint.
> This ticket aims to fix this bug by returning 
> {{KubernetesUtils.getRestServiceName(clusterId) + "." + nameSpace}} instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17933) TaskManager was terminated on Yarn - investigate

2020-05-25 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-17933:
-

 Summary: TaskManager was terminated on Yarn - investigate
 Key: FLINK-17933
 URL: https://issues.apache.org/jira/browse/FLINK-17933
 Project: Flink
  Issue Type: Task
  Components: Deployment / YARN, Runtime / Task
Affects Versions: 1.11.0
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12675) Event time synchronization in Kafka consumer

2020-05-25 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116153#comment-17116153
 ] 

Stephan Ewen commented on FLINK-12675:
--

The existing Kafka Connector is already super complex (partially due to some 
bad architecture) and very hard to keep stable.
We have been burned a bit in the past when adding more non-trivial features.
Given that this is probably the most important connector, I would favor not
adding more features to it at this point.

Can you fork the Kafka Connector and add the event-time alignment to that fork? 
That way we don't affect the connector in the core. Putting the connector onto 
https://flink-packages.org would be a way to share it with more users.

For the new Source Interface, I would be up for starting a discussion about
what a generic event-time alignment mechanism would look like, implemented in
the {{SourceReaderBase}} and {{SplitReader}} classes. Something like an
extended {{SplitReader}} interface that can handle callbacks in the form of
{{suspendSplitReading(splitId)}} and {{resumeSplitReading(splitId)}} or so.
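
A sketch of what such an extension could look like; the method names come from the comment above, and no such interface exists in Flink at this point.

{code:java}
/** Illustrative only: callback-style extension for event-time alignment. */
interface AlignedSplitReading {

    /** Stop fetching from a split whose event time runs too far ahead. */
    void suspendSplitReading(String splitId);

    /** Resume fetching once the other splits have caught up. */
    void resumeSplitReading(String splitId);
}
{code}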

> Event time synchronization in Kafka consumer
> 
>
> Key: FLINK-12675
> URL: https://issues.apache.org/jira/browse/FLINK-12675
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Reporter: Thomas Weise
>Assignee: Akshay Aggarwal
>Priority: Major
>  Labels: pull-request-available
> Attachments: 0001-Kafka-event-time-alignment.patch
>
>
> Integrate the source watermark tracking into the Kafka consumer and implement 
> the sync mechanism (different consumer model, compared to Kinesis).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17820) Memory threshold is ignored for channel state

2020-05-25 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116150#comment-17116150
 ] 

Stephan Ewen commented on FLINK-17820:
--

It is true, the assumption that {{DataOutputStream}} does not buffer is a
fragile point, even if an unlikely one (my feeling is that too much existing
code would break if one SDK made the implementation behave differently for
such a core class). That said, it should be possible to flush the stream
without voiding the usability of the class.

If from the initial options, we don't like (1), then, thinking about it a bit 
more, I would go for option (2) (adjust {{FsCheckpointStateOutputStream}}).

I did a quick search through the code and it looks like we can drop the
assumption that {{FsCheckpointStateOutputStream}} creates the file in
{{flush()}}. It seems unused in production code (though possibly in tests).
How about this?
  - rename the {{flush()}} method to {{flushToFile()}}, including all existing 
calls to that method within the class and in relevant tests.
  - override {{flush()}} as a no-op.
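
A minimal sketch of that suggestion, using a simplified stand-in rather than the real {{FsCheckpointStateOutputStream}}:

{code:java}
import java.io.IOException;
import java.io.OutputStream;

/** Simplified stand-in, not Flink's FsCheckpointStateOutputStream. */
abstract class CheckpointStateOutputStreamSketch extends OutputStream {

    /** Formerly flush(): creates the file if needed and writes the buffer out. */
    abstract void flushToFile() throws IOException;

    @Override
    public void flush() {
        // Intentionally a no-op: small states stay in memory so that
        // state.backend.fs.memory-threshold can take effect for channel state.
    }
}
{code}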


> Memory threshold is ignored for channel state
> -
>
> Key: FLINK-17820
> URL: https://issues.apache.org/jira/browse/FLINK-17820
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> The config parameter state.backend.fs.memory-threshold is ignored for channel
> state, causing each subtask to have a file per checkpoint, regardless of the
> size of the channel state (of this subtask).
> This also causes slow cleanup and delays the next checkpoint.
>  
> The problem is that {{ChannelStateCheckpointWriter.finishWriteAndResult}}
> calls flush(), which actually flushes the data to disk.
>  
> From FSDataOutputStream.flush Javadoc:
> A completed flush does not mean that the data is necessarily persistent. Data
> persistence can only be assumed after calls to close() or sync().
>  
> Possible solutions:
> 1. Not to flush in {{ChannelStateCheckpointWriter.finishWriteAndResult}}
> (which can lead to data loss in a wrapping stream).
> 2. Change the {{FsCheckpointStateOutputStream.flush}} behavior.
> 3. Wrap {{FsCheckpointStateOutputStream}} to prevent flush.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16057) Performance regression in ContinuousFileReaderOperator

2020-05-25 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116143#comment-17116143
 ] 

Piotr Nowojski commented on FLINK-16057:


Why is that the case? Why the new version should be twice as fast? It sounds 
strange, like we are missing something important.

> Performance regression in ContinuousFileReaderOperator
> --
>
> Key: FLINK-16057
> URL: https://issues.apache.org/jira/browse/FLINK-16057
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After switching CFRO to a single-threaded execution model, the performance
> regression was expected to be about 15-20% (benchmarked in November).
> But after merging to master it turned out to be about 50%.
> 
> One reason is that the chaining strategy isn't set by default in the CFRO
> factory. Without that, even reading and outputting all records of a split in
> a single mail action doesn't reverse the regression (only about half of it).
> However, with the strategy set AND batching enabled, the regression is fixed
> (starting from batch size 6).
> Though batching can't be used in practice because it can significantly delay
> checkpointing.
> 
> Another approach would be to process one record and then repeat until
> defaultMailboxActionAvailable OR haveNewMail.
> This reverses the regression and even improves the performance by about 50%
> compared to the old version.
>  
> The final solution could also be FLIP-27.
>  
> Other things tried (didn't help):
>  * CFRO rework without subsequent commits (removing checkpoint lock)
>  * different batch sizes, including the whole split, without chaining 
> strategy fixed - partial improvement only
>  * disabling close
>  * disabling checkpointing
>  * disabling output (serialization)
>  * using LinkedList instead of PriorityQueue
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-17932:
---
Description: 
All HBase config options can not work in HBaseTableSource, because the field
`conf` in HBaseRowInputFormat is not serializable, so HBaseTableSource falls
back to loading hbase-site.xml from the classpath. I.e., many table config
options that the user defined in the DDL are useless, and HBaseTableSource
only loads config options from the classpath.

 

  was:
All hbase config Options can not work in HBaseTableSource, because field `conf` 
in HBaseRowInputFormat it not serializable, and HBaseTableSource will failback 
to load hbase-site.xml from classpath. I.E, many table  config Options that 
user defined in DDL is useless and HbaseTableSource only load config options 
from classpath.

 


> Most of hbase config Options can not work in HBaseTableSource
> -
>
> Key: FLINK-17932
> URL: https://issues.apache.org/jira/browse/FLINK-17932
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0, 1.11.0, 1.12.0
>Reporter: Leonard Xu
>Priority: Major
>
> All HBase config options can not work in HBaseTableSource, because the field
> `conf` in HBaseRowInputFormat is not serializable, so HBaseTableSource falls
> back to loading hbase-site.xml from the classpath. I.e., many table config
> options that the user defined in the DDL are useless, and HBaseTableSource
> only loads config options from the classpath.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-17932:
---
Description: 
All hbase config Options can not work in HBaseTableSource, because field `conf` 
in HBaseRowInputFormat it not serializable, and HBaseTableSource will failback 
to load hbase-site.xml from classpath. I.E, many table  config Options that 
user defined in DDL is useless and HbaseTableSource only load config options 
from classpath.

 

  was:
All hbase config Options can not work in HBaseTableSource, because field `conf` 
in HBaseRowInputFormat it not serializable, and HBaseTableSource will fallback 
to load hbase-site.xml from classpath. I.E, many table  config Options that 
user defined in DDL is useless and HbaseTableSource only load config options 
from classpath.

 


> Most of hbase config Options can not work in HBaseTableSource
> -
>
> Key: FLINK-17932
> URL: https://issues.apache.org/jira/browse/FLINK-17932
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0, 1.11.0, 1.12.0
>Reporter: Leonard Xu
>Priority: Major
>
> All hbase config Options can not work in HBaseTableSource, because field 
> `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will 
> failback to load hbase-site.xml from classpath. I.E, many table  config 
> Options that user defined in DDL is useless and HbaseTableSource only load 
> config options from classpath.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17932) Most of hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-17932:
---
Summary: Most of hbase config Options can not work in HBaseTableSource  
(was: All hbase config Options can not work in HBaseTableSource)

> Most of hbase config Options can not work in HBaseTableSource
> -
>
> Key: FLINK-17932
> URL: https://issues.apache.org/jira/browse/FLINK-17932
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.10.0, 1.11.0, 1.12.0
>Reporter: Leonard Xu
>Priority: Major
>
> All hbase config Options can not work in HBaseTableSource, because field 
> `conf` in HBaseRowInputFormat it not serializable, and HBaseTableSource will 
> fallback to load hbase-site.xml from classpath. I.E, many table  config 
> Options that user defined in DDL is useless and HbaseTableSource only load 
> config options from classpath.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17932) All hbase config Options can not work in HBaseTableSource

2020-05-25 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-17932:
--

 Summary: All hbase config Options can not work in HBaseTableSource
 Key: FLINK-17932
 URL: https://issues.apache.org/jira/browse/FLINK-17932
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase
Affects Versions: 1.10.0, 1.11.0, 1.12.0
Reporter: Leonard Xu


All hbase config Options can not work in HBaseTableSource, because field `conf` 
in HBaseRowInputFormat it not serializable, and HBaseTableSource will fallback 
to load hbase-site.xml from classpath. I.E, many table  config Options that 
user defined in DDL is useless and HbaseTableSource only load config options 
from classpath.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17909) Make the GenericInMemoryCatalog to hold the serialized meta data to uncover more potential bugs

2020-05-25 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116124#comment-17116124
 ] 

Timo Walther commented on FLINK-17909:
--

I don't know if this really solves the root cause. It just adds unnecessary
computation. I think we should instead test catalogs such as the HiveCatalog
and the CatalogTable implementations better. Doing this serialization for a
`TestCatalog` sounds better to me than doing it for the general default
catalog of the table environment.

> Make the GenericInMemoryCatalog to hold the serialized meta data to uncover 
> more potential bugs
> ---
>
> Key: FLINK-17909
> URL: https://issues.apache.org/jira/browse/FLINK-17909
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, the builtin {{GenericInMemoryCatalog}} holds the meta objects in
> a HashMap. However, this leads to many bugs when users switch to some persisted
> catalogs, e.g. Hive Metastore. For example, FLINK-17189, FLINK-17868,
> FLINK-16021.
> That is because the builtin {{GenericInMemoryCatalog}} doesn't cover the
> important path of serialization and deserialization of meta data. We missed
> some important meta information (PK, time attributes) during serialization and
> deserialization, which led to bugs.
> So I propose to hold the serialized meta data in {{GenericInMemoryCatalog}}
> to cover the serialization and deserialization path. The serialized meta data
> may be in the {{Map}} properties format.
> We may lose some performance here, but {{GenericInMemoryCatalog}} is mostly
> used in demo/experiment/testing, so I think it's fine.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (FLINK-17842) Performance regression on 19.05.2020

2020-05-25 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski reopened FLINK-17842:


> Performance regression on 19.05.2020
> 
>
> Key: FLINK-17842
> URL: https://issues.apache.org/jira/browse/FLINK-17842
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> There is a noticeable performance regression in many benchmarks:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> that happened on May 19th, probably between 260ef2c and 2f18138



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-25 Thread Kostas Kloudas (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17115951#comment-17115951
 ] 

Kostas Kloudas edited comment on FLINK-17891 at 5/25/20, 3:21 PM:
--

Hi [~tangshangwen], as [~fly_in_gis] said, could you describe what exactly is 
your scenario?


was (Author: kkl0u):
Hi [~tangshangwen], as [~fly_in_gis] said, could you describe what exactly is 
your scenario? I am asking because the fix does not seem to fixing something. 
You are just removing an option that has to be set to something for correct 
execution.

>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a Flink session job to the local YARN cluster, and I found that
> the *execution.target* has the wrong value; it should be the yarn-session
> type.
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17842) Performance regression on 19.05.2020

2020-05-25 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116033#comment-17116033
 ] 

Piotr Nowojski commented on FLINK-17842:


Performance regression was only partially solved:
http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 
seems fine, but those are still showing some slow down:
http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2
http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2

> Performance regression on 19.05.2020
> 
>
> Key: FLINK-17842
> URL: https://issues.apache.org/jira/browse/FLINK-17842
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> There is a noticeable performance regression in many benchmarks:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> that happened on May 19th, probably between 260ef2c and 2f18138



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17842) Performance regression on 19.05.2020

2020-05-25 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski reassigned FLINK-17842:
--

Assignee: (was: Piotr Nowojski)

> Performance regression on 19.05.2020
> 
>
> Key: FLINK-17842
> URL: https://issues.apache.org/jira/browse/FLINK-17842
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> There is a noticeable performance regression in many benchmarks:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> that happened on May 19th, probably between 260ef2c and 2f18138



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17842) Performance regression on 19.05.2020

2020-05-25 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116033#comment-17116033
 ] 

Piotr Nowojski edited comment on FLINK-17842 at 5/25/20, 1:48 PM:
--

Performance regression was only partially solved. Some benchmarks seem fine:
http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2
but some are still showing some slowdown:
http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2
http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2


was (Author: pnowojski):
Performance regression was only partially solved:
http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2 
seems fine, but those are still showing some slow down:
http://codespeed.dak8s.net:8000/timeline/?ben=networkBroadcastThroughput=2
http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,100ms=2

> Performance regression on 19.05.2020
> 
>
> Key: FLINK-17842
> URL: https://issues.apache.org/jira/browse/FLINK-17842
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> There is a noticeable performance regression in many benchmarks:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.1000,1ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=networkThroughput.100,100ms=2
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> that happened on May 19th, probably between 260ef2c and 2f18138



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints

2020-05-25 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116021#comment-17116021
 ] 

Roman Khachatryan edited comment on FLINK-17928 at 5/25/20, 2:15 PM:
-

The numbers were obtained using:

 
{code:java}
writtenBytes() {
rest_url="http://`hostname`:20888/proxy/`get_aid`;
job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
writtenBytes=0
>&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
for vertex in "${vertex_ids[@]}";
do
stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
stats2=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
>&2 echo "vertex $vertex $stats $stats2"
writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s 
'[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0' `))
done
echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}


transferredBytes() {
 rest_url="http://`hostname`:20888/proxy/`get_aid`;
 job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
 vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
 numBytesOut=0
 >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
 for vertex in "${vertex_ids[@]}";
 do
 stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
 >&2 echo "vertex $vertex $stats"
 numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 
0' `))
 done
 echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
} 
{code}
 

This is how it's called:

 
{code:java}
function experiment() {
 name=$1
 nodes=$2
 slots=$3
 dur=$4
 start_job ${nodes} ${slots}
 duration ${dur}
# writtenBytes="`writtenBytes` `writtenBytes` `writtenBytes` `writtenBytes` 
`writtenBytes` `writtenBytes` `writtenBytes`"
 transferredBytes="`transferredBytes` `transferredBytes` `transferredBytes` 
`transferredBytes` `transferredBytes` `transferredBytes` `transferredBytes`"
 APPID=`kill_on_yarn`
getLogsFor ${name} ${APPID}
 analyzeLogs ${name} ${APPID}
 truncate -s -1 ${LOG}
 echo ";$transferredBytes" >> ${LOG}
}
{code}
That is, those are transferred bytes, not written bytes (see commented line in 
experiment()).

 


was (Author: roman_khachatryan):
The numbers were obtained using:

 
{code:java}
writtenBytes() {
rest_url="http://`hostname`:20888/proxy/`get_aid`;
job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
writtenBytes=0
>&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
for vertex in "${vertex_ids[@]}";
do
stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
stats2=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
>&2 echo "vertex $vertex $stats $stats2"
writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s 
'[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0' `))
done
echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}


transferredBytes() {
 rest_url="http://`hostname`:20888/proxy/`get_aid`;
 job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
 vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
 numBytesOut=0
 >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
 for vertex in "${vertex_ids[@]}";
 do
 stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
 >&2 echo "vertex $vertex $stats"
 numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 
0' `))
 done
 echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
} 
{code}
 

> Incorrect state size reported when using unaligned checkpoints 
> ---
>
> Key: FLINK-17928
> URL: https://issues.apache.org/jira/browse/FLINK-17928
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Priority: Critical
> Fix For: 1.11.0
>
>
> Even when checkpoints on HDFS are between 100-300MBs, the reported state size 
> is in orders of magnitude larger with values like:
> {noformat}
> 1GiB  1.5TiB  2.0TiB  2.1TiB  2.1TiB
> 148GiB  148GiB  148GiB  148GiB  148GiB  148GiB
> {noformat}
> it's probably because we have multiple 
> {{Collection}}, and each of the individual handle 
> returns the same value from 

[jira] [Comment Edited] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints

2020-05-25 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116021#comment-17116021
 ] 

Roman Khachatryan edited comment on FLINK-17928 at 5/25/20, 2:11 PM:
-

The numbers were obtained using:

 
{code:java}
writtenBytes() {
rest_url="http://`hostname`:20888/proxy/`get_aid`;
job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
writtenBytes=0
>&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
for vertex in "${vertex_ids[@]}";
do
stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes_input,writtenBytes_output\=sum`
stats2=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=writtenBytes\=sum`
>&2 echo "vertex $vertex $stats $stats2"
writtenBytes=$(($writtenBytes + `echo $stats $stats2 | jq -s 
'[add | .[] | select(.id | test("writtenBytes.*")).sum] | add // 0' `))
done
echo "$writtenBytes" | numfmt --to=iec-i --suffix=B --padding=7
}


transferredBytes() {
 rest_url="http://`hostname`:20888/proxy/`get_aid`;
 job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
 vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
 numBytesOut=0
 >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
 for vertex in "${vertex_ids[@]}";
 do
 stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
 >&2 echo "vertex $vertex $stats"
 numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 
0' `))
 done
 echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
} 
{code}
 


was (Author: roman_khachatryan):
The numbers were obtained using:

 
{code:java}
transferredBytes() {
 rest_url="http://`hostname`:20888/proxy/`get_aid`;
 job_id=`curl $rest_url/jobs/ | jq -r '.jobs[0].id'`
 vertex_ids=( `curl $rest_url/jobs/$job_id/ | jq -r '.vertices[].id'` )
 numBytesOut=0
 >&2 echo "job_id $job_id; vertex_ids: ${vertex_ids[@]}"
 for vertex in "${vertex_ids[@]}";
 do
 stats=`curl 
$rest_url/jobs/$job_id/vertices/$vertex/subtasks/metrics\?get=numBytesOut\=sum`
 >&2 echo "vertex $vertex $stats"
 numBytesOut=$(($numBytesOut + `echo $stats | jq -s '[add | .[].sum] | add // 
0' `))
 done
 echo "$numBytesOut" | numfmt --to=iec-i --suffix=B --padding=7
} 
{code}
 

> Incorrect state size reported when using unaligned checkpoints 
> ---
>
> Key: FLINK-17928
> URL: https://issues.apache.org/jira/browse/FLINK-17928
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0
>Reporter: Piotr Nowojski
>Priority: Critical
> Fix For: 1.11.0
>
>
> Even when checkpoints on HDFS are between 100-300MBs, the reported state size 
> is in orders of magnitude larger with values like:
> {noformat}
> 1GiB  1.5TiB  2.0TiB  2.1TiB  2.1TiB
> 148GiB  148GiB  148GiB  148GiB  148GiB  148GiB
> {noformat}
> It's probably because we have multiple {{Collection}}, and each of the
> individual handles returns the same value from
> {{AbstractChannelStateHandle#getStateSize}} - the full size of the spilled
> data, ignoring that only a small portion of those data belongs to a single
> input channel/result subpartition. In other words,
> {{org.apache.flink.runtime.state.AbstractChannelStateHandle#getStateSize}}
> should take the offsets into account and return only the size of the data
> that belongs exclusively to this handle.
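
A toy arithmetic illustration of the over-count (the numbers are made up):

{code:java}
public class StateSizeOverCount {
    public static void main(String[] args) {
        long sharedFileSize = 100L << 20; // one 100 MiB spilled file
        int handles = 4;                  // channel state handles pointing into it
        // Today: each handle reports the whole file, a 4x over-count.
        System.out.println("reported: " + handles * sharedFileSize + " bytes");
        // With offsets taken into account, the per-handle sizes sum to the file size.
        System.out.println("actual:   " + sharedFileSize + " bytes");
    }
}
{code}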



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17931) Document fromValues

2020-05-25 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-17931:


 Summary: Document fromValues
 Key: FLINK-17931
 URL: https://issues.apache.org/jira/browse/FLINK-17931
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Table SQL / API
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17917) ResourceInformationReflector#getExternalResources should ignore the external resource with a value of 0

2020-05-25 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-17917:
-

Assignee: Yangze Guo

> ResourceInformationReflector#getExternalResources should ignore the external 
> resource with a value of 0
> ---
>
> Key: FLINK-17917
> URL: https://issues.apache.org/jira/browse/FLINK-17917
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.0
>Reporter: Yangze Guo
>Assignee: Yangze Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> *Background*: In FLINK-17390, we leverage 
> {{WorkerSpecContainerResourceAdapter.InternalContainerResource}} to handle 
> the container matching logic. In FLINK-17407, we introduced external 
> resources in {{WorkerSpecContainerResourceAdapter.InternalContainerResource}}.
>  For containers returned by Yarn, we try to get the corresponding worker 
> specs by:
>  - Converting the container to an {{InternalContainerResource}}
>  - Getting the WorkerResourceSpec from the 
> {{containerResourceToWorkerSpecs}} map.
> *Problem*: A container mismatch can happen in the following scenario:
>  - Flink does not allocate any external resources, so the 
> {{externalResources}} map of the {{InternalContainerResource}} is empty.
>  - The returned container lists all the resources (with a value of 0) defined 
> in Yarn's {{resource-types.xml}}, so the {{externalResources}} map of its 
> {{InternalContainerResource}} has one or more entries with a value of 0.
>  - These two {{InternalContainerResource}} instances do not match.
> To solve this problem, we could ignore all external resources with a value of 
> 0 in {{ResourceInformationReflector#getExternalResources}}, as sketched below.
> cc [~trohrmann] Could you assign this to me?
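
A minimal sketch of the proposed filtering, assuming the reflective accessor ultimately produces a name-to-value map (class and method names below are illustrative, not Flink's actual internals):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: drop zero-valued external resources before
// they take part in container/worker-spec matching, so a container that
// reports e.g. {gpu=0} can match a worker spec that requested no GPUs.
public final class ExternalResourceFilterSketch {

    static Map<String, Long> ignoreZeroValues(Map<String, Long> externalResources) {
        Map<String, Long> filtered = new HashMap<>();
        for (Map.Entry<String, Long> entry : externalResources.entrySet()) {
            if (entry.getValue() != 0L) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }
}
{code}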



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17931) Document fromValues clause

2020-05-25 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-17931:
-
Summary: Document fromValues clause  (was: Document fromValues)

> Document fromValues clause
> --
>
> Key: FLINK-17931
> URL: https://issues.apache.org/jira/browse/FLINK-17931
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table SQL / API
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12320:
URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182


   
   ## CI report:
   
   * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2142)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12313:
URL: https://github.com/apache/flink/pull/12313#issuecomment-633389597


   
   ## CI report:
   
   * c225fa8b2a774a8579501e136004a9d64db89244 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2108)
 
   * 07ab8aa279380cc6584c9602a1c344dfb2294074 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2144)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12318: [FLINK-17867][hive][test] Add hdfs dependency to hive-3.1.1 test

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12318:
URL: https://github.com/apache/flink/pull/12318#issuecomment-633479850


   
   ## CI report:
   
   * c395d57a1a2a57b8f9970b64d8241e033ecdd295 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2130)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12078: [FLINK-17610][state] Align the behavior of result of internal map state to return empty iterator

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12078:
URL: https://github.com/apache/flink/pull/12078#issuecomment-626611802


   
   ## CI report:
   
   * e3ffb15cc38bdbbf1f5a11014782d081edaecea6 UNKNOWN
   * 55be45331331ec6f336d3872b8616b899d936730 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2120)
 
   * c87fc3764b52f86b03ee7f0381271c4240671ca1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions

2020-05-25 Thread GitBox


flinkbot commented on pull request #12322:
URL: https://github.com/apache/flink/pull/12322#issuecomment-633576098


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit a53dd539ae9d381671c5b0503a7bd002358fd065 (Mon May 25 
13:35:35 UTC 2020)
   
✅ no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] dawidwys commented on pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions

2020-05-25 Thread GitBox


dawidwys commented on pull request #12322:
URL: https://github.com/apache/flink/pull/12322#issuecomment-633574947


   Do you mind having a look @azagrebin ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17929) Fix invalid liquid expressions

2020-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-17929:
---
Labels: pull-request-available  (was: )

> Fix invalid liquid expressions
> --
>
> Key: FLINK-17929
> URL: https://issues.apache.org/jira/browse/FLINK-17929
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.11.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
>
> The "{{.ID}}" expression in ops/deployment/docker.md should be escaped; 
> otherwise it is not rendered properly.
> {code}
>   Generating... 
> Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a 
> valid expression in "{{.ID}}" in ops/deployment/docker.md
> Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a 
> valid expression in "{{.ID}}" in ops/deployment/docker.md
> Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a 
> valid expression in "{{.ID}}" in ops/deployment/docker.zh.md
> Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a 
> valid expression in "{{.ID}}" in ops/deployment/docker.zh.md
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] dawidwys opened a new pull request #12322: [FLINK-17929][docs] Fix invalid liquid expressions

2020-05-25 Thread GitBox


dawidwys opened a new pull request #12322:
URL: https://github.com/apache/flink/pull/12322


   It is a trivial bugfix.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-17928) Incorrect state size reported when using unaligned checkpoints

2020-05-25 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-17928:
--

 Summary: Incorrect state size reported when using unaligned 
checkpoints 
 Key: FLINK-17928
 URL: https://issues.apache.org/jira/browse/FLINK-17928
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.11.0
Reporter: Piotr Nowojski
 Fix For: 1.11.0


Even when the checkpoints on HDFS are between 100-300MB, the reported state 
size is orders of magnitude larger, with values like:
{noformat}
1GiB  1.5TiB  2.0TiB  2.1TiB  2.1TiB
148GiB  148GiB  148GiB  148GiB  148GiB  148GiB
{noformat}
This is probably because we have multiple 
{{Collection<AbstractChannelStateHandle>}} instances, and each individual 
handle returns the same value from {{AbstractChannelStateHandle#getStateSize}}: 
the full size of the spilled data, ignoring that only a small portion of that 
data belongs to a single input channel/result subpartition. In other words, 
{{org.apache.flink.runtime.state.AbstractChannelStateHandle#getStateSize}} 
should take the offsets into account and return only the size of the data that 
belongs exclusively to this handle.
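
As a hedged sketch of the proposed accounting, assuming, purely for this example, that each handle records the length of each of its buffers alongside the offsets (these names are not Flink's actual fields):

{code:java}
import java.util.List;

// Illustrative sketch, not the actual Flink implementation: report only
// the bytes a handle exclusively owns instead of the full size of the
// shared spilled stream.
public final class ChannelStateSizeSketch {

    static long exclusiveStateSize(List<Long> ownedBufferLengths) {
        long size = 0L;
        for (long length : ownedBufferLengths) {
            size += length; // only this handle's buffers
        }
        return size;
    }
}
{code}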



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17929) Fix invalid liquid expressions

2020-05-25 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-17929:


 Summary: Fix invalid liquid expressions
 Key: FLINK-17929
 URL: https://issues.apache.org/jira/browse/FLINK-17929
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.11.0
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz


The "{{.ID}}" expression in ops/deployment/docker.md should be escaped; 
otherwise it is not rendered properly.

{code}
  Generating... 
Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a valid 
expression in "{{.ID}}" in ops/deployment/docker.md
Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a valid 
expression in "{{.ID}}" in ops/deployment/docker.md
Liquid Warning: Liquid syntax error (line 331): [:dot, "."] is not a valid 
expression in "{{.ID}}" in ops/deployment/docker.zh.md
Liquid Warning: Liquid syntax error (line 357): [:dot, "."] is not a valid 
expression in "{{.ID}}" in ops/deployment/docker.zh.md
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] SteNicholas commented on pull request #12078: [FLINK-17610][state] Align the behavior of result of internal map state to return empty iterator

2020-05-25 Thread GitBox


SteNicholas commented on pull request #12078:
URL: https://github.com/apache/flink/pull/12078#issuecomment-633568565


   @carp84 I have merged the latest master and resolved the conflicts. Please 
take another look.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12321: Add document for writing Avro files with StreamingFileSink

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12321:
URL: https://github.com/apache/flink/pull/12321#issuecomment-633558255


   
   ## CI report:
   
   * 56bdc3a61f65cb30b48cf7f932520be02ebed734 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2143)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12314: [FLINK-17756][table-api-java] Drop table/view shouldn't take effect o…

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12314:
URL: https://github.com/apache/flink/pull/12314#issuecomment-633403476


   
   ## CI report:
   
   * b7a68f0f07ab9bb2abc3182d72c0dc80ed59dda1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2113)
 
   * 8180c3272155f8db184013ed79946db1571206e8 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2141)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12264: [FLINK-17558][netty] Release partitions asynchronously

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12264:
URL: https://github.com/apache/flink/pull/12264#issuecomment-631349883


   
   ## CI report:
   
   * 19c5f57b94cc56b70002031618c32d9e6f68effb UNKNOWN
   * bb313e40f5a72dbf20cd0a8b48267063fd4f00af UNKNOWN
   * eafbd98c812227cb7d9ce7158de1a23309855509 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1948)
 
   * 3510bfd56ae6a431783bbade1881dd967b271457 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2139)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12283:
URL: https://github.com/apache/flink/pull/12283#issuecomment-632089857


   
   ## CI report:
   
   * b3fd51b309c78d9dd5056eed35dc2fe388665899 UNKNOWN
   * 2838b871958d80b1ea22c5ccd550505895e98e60 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2124)
 
   * 20ede85b3693bdf20adb0efd1b72dde4ab6c62f5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2140)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12320:
URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182


   
   ## CI report:
   
   * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2142)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese

2020-05-25 Thread GitBox


flinkbot edited a comment on pull request #12313:
URL: https://github.com/apache/flink/pull/12313#issuecomment-633389597


   
   ## CI report:
   
   * c225fa8b2a774a8579501e136004a9d64db89244 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2108)
 
   * 07ab8aa279380cc6584c9602a1c344dfb2294074 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] pnowojski commented on a change in pull request #12292: [FLINK-17861][task][checkpointing] Split channel state handles sent to JM

2020-05-25 Thread GitBox


pnowojski commented on a change in pull request #12292:
URL: https://github.com/apache/flink/pull/12292#discussion_r429886453



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriterTest.java
##
@@ -149,8 +152,8 @@ public void testRecordingOffsets() throws Exception {
}
 
private void write(ChannelStateCheckpointWriter writer, 
InputChannelInfo channelInfo, byte[] data) throws Exception {
-   NetworkBuffer buffer = new NetworkBuffer(HeapMemorySegment.FACTORY.allocateUnpooledSegment(data.length, null), FreeingBufferRecycler.INSTANCE);
-   buffer.setBytes(0, data);
+   MemorySegment segment = wrap(data);
+   NetworkBuffer buffer = new NetworkBuffer(segment, FreeingBufferRecycler.INSTANCE, Buffer.DataType.DATA_BUFFER, segment.size());

Review comment:
   What was the problem here? Could you explain it in the commit message?

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java
##
@@ -25,8 +25,10 @@
 import org.apache.flink.runtime.state.InputChannelStateHandle;

Review comment:
   Please copy the PR description to the commit message of this (last) commit.
   
   Could you also extend both of those descriptions and the JIRA ticket 
description (copy/paste) with a more elaborate explanation of the underlying 
problem? 
   
   > That the buffered bytes in `StreamStateHandle underlying` (if it's a 
`ByteStreamStateHandle`) would be referenced many times, one per each input 
channel and result partition by respective `InputChannelStateHandle` and 
`ResultSubpartitionStateHandle` handles. Each of those handles would thus 
duplicate and contain all of the data for every channel, while using only a 
small portion of it.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java
##
@@ -180,17 +177,33 @@ private void doComplete(boolean precondition, 
RunnableWithException complete, Ru
}
 
private <I, H extends AbstractChannelStateHandle<I>> void complete(
+   StreamStateHandle underlying,
CompletableFuture<Collection<H>> future,
Map<I, List<Long>> offsets,
-   BiFunction<I, List<Long>, H> buildHandle) {
+   TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle) throws IOException {

Review comment:
   Those two functions (`complete` and `createHandle`) are quite difficult 
to understand/read. For example, by looking at the signature 
   
   ```
private <I, H extends AbstractChannelStateHandle<I>> H createHandle(
StreamStateHandle underlying,
TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle,
Map.Entry<I, List<Long>> e)
   ```
   
   I have no idea what's happening here.
   
   I would at the very least:
   1. either extract `buildHandle` to some factory/builder, or just replace it 
with a boolean flag `boolean buildInputHandles`/enum 
`INPUT_HANDLE`/`OUTPUT_HANDLE` and an if/switch inside (see the sketch below).
   2. not pass `Map.Entry`, but convert it to a POJO or named variables ASAP.
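
   For illustration only, a rough sketch of option 1 (the enum-plus-switch 
variant), using simplified stand-in types rather than Flink's real handle 
classes:

```java
import java.util.List;

// Sketch of the suggested refactoring: an enum plus a switch replaces the
// hard-to-read TriFunction parameter. The returned strings stand in for
// the real InputChannelStateHandle/ResultSubpartitionStateHandle objects.
final class HandleBuildingSketch {

    enum HandleType { INPUT, OUTPUT }

    static String buildHandle(HandleType type, String channelInfo, List<Long> offsets) {
        switch (type) {
            case INPUT:
                return "InputChannelStateHandle(" + channelInfo + ", " + offsets + ")";
            case OUTPUT:
                return "ResultSubpartitionStateHandle(" + channelInfo + ", " + offsets + ")";
            default:
                throw new IllegalArgumentException("Unknown handle type: " + type);
        }
    }
}
```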

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java
##
@@ -180,17 +177,33 @@ private void doComplete(boolean precondition, 
RunnableWithException complete, Ru
}
 
private <I, H extends AbstractChannelStateHandle<I>> void complete(
+   StreamStateHandle underlying,
CompletableFuture<Collection<H>> future,
Map<I, List<Long>> offsets,
-   BiFunction<I, List<Long>, H> buildHandle) {
+   TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle) throws IOException {
final Collection<H> handles = new ArrayList<>();
for (Map.Entry<I, List<Long>> e : offsets.entrySet()) {
-   handles.add(buildHandle.apply(e.getKey(), e.getValue()));
+   handles.add(createHandle(underlying, buildHandle, e));
}
future.complete(handles);
LOG.debug("channel state write completed, checkpointId: {}, 
handles: {}", checkpointId, handles);
}
 
+   private <I, H extends AbstractChannelStateHandle<I>> H createHandle(
+   StreamStateHandle underlying,
+   TriFunction<I, StreamStateHandle, List<Long>, H> buildHandle,
+   Map.Entry<I, List<Long>> e) throws IOException {
+   if (underlying instanceof ByteStreamStateHandle) {
+   ByteStreamStateHandle byteHandle = (ByteStreamStateHandle) underlying;
+   return buildHandle.apply(
+   e.getKey(),
+   new ByteStreamStateHandle(randomUUID().toString(), serializer.extractAndMerge(byteHandle.getData(), e.getValue())),
+   singletonList(serializer.getHeaderLength()));
+   } else {

Review comment:
   It doesn't look like a generic/error prone solution. It looks like 
something 

[GitHub] [flink] danny0405 commented on a change in pull request #12284: [FLINK-17689][kafka][table] Add integration tests for changelog sourc…

2020-05-25 Thread GitBox


danny0405 commented on a change in pull request #12284:
URL: https://github.com/apache/flink/pull/12284#discussion_r429926472



##
File path: 
flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java
##
@@ -247,6 +265,12 @@ public void invoke(RowData value, Context context) throws 
Exception {
"This is probably an 
incorrectly implemented test.");
}
}
+   receivedNum++;
+   if (expectedSize != -1 && receivedNum >= expectedSize) {
+   // some sources are infinite (e.g. kafka),

Review comment:
   `>=` or `=` ?

##
File path: 
flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/factories/TestValuesTableFactory.java
##
@@ -633,8 +648,11 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context 
context) {
sinkFunction = new KeyedUpsertingSinkFunction(
tableName,
converter,
-   keyIndices);
+   keyIndices,
+   expectedNum);
} else {
+   checkArgument(expectedNum == -1,
+   "Retracting Sink doesn't support '" + SINK_EXPECTED_MESSAGES_NUM.key() + "' yet.");

Review comment:
   Should we move the check to the constructor?

##
File path: 
flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/plan/stream/sql/TableScanTest.scala
##
@@ -186,6 +186,61 @@ class TableScanTest extends TableTestBase {
 util.verifyPlan("SELECT COUNT(*) FROM src WHERE a > 1", 
ExplainDetail.CHANGELOG_MODE)
   }
 
+  @Test
+  def testJoinOnChangelogSource(): Unit = {
+util.addTable(
+  """
+|CREATE TABLE changelog_src (
+|  ts TIMESTAMP(3),
+|  a INT,
+|  b DOUBLE
+|) WITH (
+|  'connector' = 'values',
+|  'changelog-mode' = 'I,UA,UB'
+|)
+  """.stripMargin)
+util.addTable(
+  """
+|CREATE TABLE append_src (
+|  ts TIMESTAMP(3),
+|  a INT,
+|  b DOUBLE
+|) WITH (
+|  'connector' = 'values',
+|  'changelog-mode' = 'I'
+|)
+  """.stripMargin)

Review comment:
   Can we also add an ITCase for the stream join?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingsongLi commented on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-25 Thread GitBox


JingsongLi commented on pull request #12283:
URL: https://github.com/apache/flink/pull/12283#issuecomment-633564817


   Will solve https://issues.apache.org/jira/browse/FLINK-17925 first.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-17925:


Assignee: Jingsong Lee

> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.
>  * The default value of "sink.partition-commit.trigger" should be 
> "process-time". It is hard for users to figure out what is wrong when they 
> don't have a watermark; defaulting to "process-time" gives a better 
> out-of-the-box experience.
>  * The type of "sink.rolling-policy.file-size" should be MemoryType.
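
For illustration, the defaults and types proposed above could be declared with Flink's {{ConfigOptions}} API roughly as follows (a sketch; the 128MB default below is an assumption for the example, not necessarily the connector's actual value):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.configuration.MemorySize;

// Sketch of the corrected option declarations described in this ticket.
public final class FileSystemOptionsSketch {

    public static final ConfigOption<String> SINK_PARTITION_COMMIT_TRIGGER =
            ConfigOptions.key("sink.partition-commit.trigger")
                    .stringType()
                    .defaultValue("process-time") // better out-of-the-box experience
                    .withDescription("Trigger type for partition commit.");

    public static final ConfigOption<MemorySize> SINK_ROLLING_POLICY_FILE_SIZE =
            ConfigOptions.key("sink.rolling-policy.file-size")
                    .memoryType() // parses values like "128MB"
                    .defaultValue(MemorySize.ofMebiBytes(128)) // assumed default for illustration
                    .withDescription("Maximum part file size before rolling.");
}
{code}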



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17925:
-
Priority: Blocker  (was: Critical)

> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.
>  * The default value of "sink.partition-commit.trigger" should be 
> "process-time". It is hard for users to figure out what is wrong when they 
> don't have a watermark; defaulting to "process-time" gives a better 
> out-of-the-box experience.
>  * The type of "sink.rolling-policy.file-size" should be MemoryType.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17927) Default value of "sink.partition-commit.trigger" should be "process-time"

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17927:
-
Priority: Major  (was: Critical)

> Default value of "sink.partition-commit.trigger" should be "process-time"
> -
>
> Key: FLINK-17927
> URL: https://issues.apache.org/jira/browse/FLINK-17927
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Major
>
> It is hard for users to figure out what is wrong when they don't have a 
> watermark. We can set "sink.partition-commit.trigger" to "process-time" for a 
> better out-of-the-box experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17927) Default value of "sink.partition-commit.trigger" should be "process-time"

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-17927.

Fix Version/s: (was: 1.11.0)
   Resolution: Duplicate

> Default value of "sink.partition-commit.trigger" should be "process-time"
> -
>
> Key: FLINK-17927
> URL: https://issues.apache.org/jira/browse/FLINK-17927
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Critical
>
> It is hard for users to figure out what is wrong when they don't have a 
> watermark. We can set "sink.partition-commit.trigger" to "process-time" for a 
> better out-of-the-box experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17925:
-
Description: 
Fix Filesystem options:
 * Throw an unsupported exception when the metastore commit policy is used for 
a filesystem table; the filesystem connector has an empty implementation of 
{{TableMetaStoreFactory}}, so we should keep users from configuring this policy.
 * The default value of "sink.partition-commit.trigger" should be 
"process-time". It is hard for users to figure out what is wrong when they 
don't have a watermark; defaulting to "process-time" gives a better 
out-of-the-box experience.
 * The type of "sink.rolling-policy.file-size" should be MemoryType.

  was:
Fix Filesystem options:
 * Throw an unsupported exception when the metastore commit policy is used for 
a filesystem table; the filesystem connector has an empty implementation of 
{{TableMetaStoreFactory}}, so we should keep users from configuring this policy.


> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Critical
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.
>  * The default value of "sink.partition-commit.trigger" should be 
> "process-time". It is hard for users to figure out what is wrong when they 
> don't have a watermark; defaulting to "process-time" gives a better 
> out-of-the-box experience.
>  * The type of "sink.rolling-policy.file-size" should be MemoryType.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17925) Fix Filesystem options to default values and types

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17925:
-
Summary: Fix Filesystem options to default values and types  (was: Throws 
unsupported exception when using metastore commit policy for filesystem table)

> Fix Filesystem options to default values and types
> --
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Critical
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] yangyichao-mango commented on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese

2020-05-25 Thread GitBox


yangyichao-mango commented on pull request #12313:
URL: https://github.com/apache/flink/pull/12313#issuecomment-633562437


   Thx. I've rebased that commit.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17925) Throws unsupported exception when using metastore commit policy for filesystem table

2020-05-25 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17925:
-
Description: 
Fix Filesystem options:
 * Throw an unsupported exception when the metastore commit policy is used for 
a filesystem table; the filesystem connector has an empty implementation of 
{{TableMetaStoreFactory}}, so we should keep users from configuring this policy.

  was: The Filesystem connector has an empty implementation of 
{{TableMetaStoreFactory}}. We should keep users from configuring this policy.


> Throws unsupported exception when using metastore commit policy for 
> filesystem table
> 
>
> Key: FLINK-17925
> URL: https://issues.apache.org/jira/browse/FLINK-17925
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Critical
> Fix For: 1.11.0
>
>
> Fix Filesystem options:
>  * Throw an unsupported exception when the metastore commit policy is used 
> for a filesystem table; the filesystem connector has an empty implementation 
> of {{TableMetaStoreFactory}}, so we should keep users from configuring this 
> policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong edited a comment on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory

2020-05-25 Thread GitBox


wuchong edited a comment on pull request #12320:
URL: https://github.com/apache/flink/pull/12320#issuecomment-633558842


   When I was writing the code, I had the feeling that maybe 
`DeserializationFormat` is a better name than `DeserializationSchemaProvider`, 
because it is created from `DeserializationFormatFactory`, and 
`FactoryUtil#discoverXxxxFormat` reads more intuitively than 
`FactoryUtil#discoverXxxxProvider`. But I'm not sure what your opinion is here 
@twalthr @KurtYoung, as we're going back and forth on the new names again…



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] klion26 commented on pull request #12313: [FLINK-17005][docs] Translate the CREATE TABLE ... LIKE syntax documentation to Chinese

2020-05-25 Thread GitBox


klion26 commented on pull request #12313:
URL: https://github.com/apache/flink/pull/12313#issuecomment-633558758


   @yangyichao-mango thanks for your contribution, could you please get rid of 
the "merge" commit? You can use "git rebase" instead.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory

2020-05-25 Thread GitBox


wuchong commented on pull request #12320:
URL: https://github.com/apache/flink/pull/12320#issuecomment-633558842


   When I was writing the code, I had the feeling that maybe 
`DeserializationFormat` is a better name than `DeserializationSchemaProvider`, 
because it is created from `DeserializationFormatFactory` and is more intuitive 
in `FactoryUtil#discoverXxxxFormat`. But I'm not sure what your opinion is here 
@twalthr @KurtYoung, as we're going back and forth on the new names again…



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-17926) Can't build flink-web docker image because of EOL of Ubuntu:18.10

2020-05-25 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-17926:
--

Assignee: Congxian Qiu(klion26)

> Can't build flink-web docker image because of EOL of Ubuntu:18.10
> -
>
> Key: FLINK-17926
> URL: https://issues.apache.org/jira/browse/FLINK-17926
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Congxian Qiu(klion26)
>Assignee: Congxian Qiu(klion26)
>Priority: Major
>
> Currently, the Dockerfile[1] in the flink-web project is broken because of 
> the EOL of Ubuntu 18.10[2]; executing {{./run.sh}} fails with errors such as 
> the ones below:
> {code:java}
> Err:3 http://security.ubuntu.com/ubuntu cosmic-security Release
>   404  Not Found [IP: 91.189.88.152 80]
> Ign:4 http://archive.ubuntu.com/ubuntu cosmic-updates InRelease
> Ign:5 http://archive.ubuntu.com/ubuntu cosmic-backports InRelease
> Err:6 http://archive.ubuntu.com/ubuntu cosmic Release
>   404  Not Found [IP: 91.189.88.142 80]
> Err:7 http://archive.ubuntu.com/ubuntu cosmic-updates Release
>   404  Not Found [IP: 91.189.88.142 80]
> Err:8 http://archive.ubuntu.com/ubuntu cosmic-backports Release
>   404  Not Found [IP: 91.189.88.142 80]
> Reading package lists...
> {code}
> The current LTS versions can be found on the Ubuntu releases page[2].
> The Apache Flink docker image uses fedora:28[3], so it is unaffected.
> As Fedora does not have LTS releases[4], I propose to keep Ubuntu for the 
> website and change the version from {{18.10}} to the closest LTS version 
> {{18.04}}; tried locally, it works successfully.
>  [1] 
> [https://github.com/apache/flink-web/blob/bc66f0f0f463ab62a22e81df7d7efd301b76a6b4/docker/Dockerfile#L17]
> [2] [https://wiki.ubuntu.com/Releases]
>  
> [3]https://github.com/apache/flink/blob/e92b2bf19bdf03ad3bae906dc5fa3781aeddb3ee/docs/docker/Dockerfile#L17
>  [4] 
> https://fedoraproject.org/wiki/Fedora_Release_Life_Cycle#Maintenance_Schedule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17926) Can't build flink-web docker image because of EOL of Ubuntu:18.10

2020-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116007#comment-17116007
 ] 

Robert Metzger commented on FLINK-17926:


Thanks, this is a good proposal. I've assigned it to you.

> Can't build flink-web docker image because of EOL of Ubuntu:18.10
> -
>
> Key: FLINK-17926
> URL: https://issues.apache.org/jira/browse/FLINK-17926
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Congxian Qiu(klion26)
>Assignee: Congxian Qiu(klion26)
>Priority: Major
>
> Currently, the Dockerfile[1] in the flink-web project is broken because of 
> the EOL of Ubuntu 18.10[2]; executing {{./run.sh}} fails with errors such as 
> the ones below:
> {code:java}
> Err:3 http://security.ubuntu.com/ubuntu cosmic-security Release
>   404  Not Found [IP: 91.189.88.152 80]
> Ign:4 http://archive.ubuntu.com/ubuntu cosmic-updates InRelease
> Ign:5 http://archive.ubuntu.com/ubuntu cosmic-backports InRelease
> Err:6 http://archive.ubuntu.com/ubuntu cosmic Release
>   404  Not Found [IP: 91.189.88.142 80]
> Err:7 http://archive.ubuntu.com/ubuntu cosmic-updates Release
>   404  Not Found [IP: 91.189.88.142 80]
> Err:8 http://archive.ubuntu.com/ubuntu cosmic-backports Release
>   404  Not Found [IP: 91.189.88.142 80]
> Reading package lists...
> {code}
> The current LTS versions can be found on the Ubuntu releases page[2].
> The Apache Flink docker image uses fedora:28[3], so it is unaffected.
> As Fedora does not have LTS releases[4], I propose to keep Ubuntu for the 
> website and change the version from {{18.10}} to the closest LTS version 
> {{18.04}}; tried locally, it works successfully.
>  [1] 
> [https://github.com/apache/flink-web/blob/bc66f0f0f463ab62a22e81df7d7efd301b76a6b4/docker/Dockerfile#L17]
> [2] [https://wiki.ubuntu.com/Releases]
>  
> [3]https://github.com/apache/flink/blob/e92b2bf19bdf03ad3bae906dc5fa3781aeddb3ee/docs/docker/Dockerfile#L17
>  [4] 
> https://fedoraproject.org/wiki/Fedora_Release_Life_Cycle#Maintenance_Schedule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #12320: [FLINK-17887][table][connector] Improve interface of ScanFormatFactory and SinkFormatFactory

2020-05-25 Thread GitBox


flinkbot commented on pull request #12320:
URL: https://github.com/apache/flink/pull/12320#issuecomment-633558182


   
   ## CI report:
   
   * ea7a259ecdae60ff7b30932ac3821fdb91e5b674 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



