Re: [PR] [FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1410249663


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/config/SqlClientOptions.java:
##
@@ -70,6 +70,14 @@ private SqlClientOptions() {}
 + "This only applies to columns with 
variable-length types (e.g. CHAR, VARCHAR, STRING) in streaming mode. "
 + "Fixed-length types and all types in 
batch mode are printed using a deterministic column width.");
 
+@Documentation.TableOption(execMode = Documentation.ExecMode.BATCH)
+public static final ConfigOption DISPLAY_QUERY_TIME_COST =

Review Comment:
   After talking with @fsk119 offline, this PR will focus on batch mode, and a 
follow-up PR will be created for stream mode.
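
   For context, since the diff above truncates the declaration, here is a minimal 
sketch of how such a boolean option is typically declared with Flink's `ConfigOptions` 
builder. The key name and default value below are assumptions for illustration, not 
the PR's actual values:
   ```java
   // Hypothetical sketch -- the key, type, and default may differ from the PR.
   @Documentation.TableOption(execMode = Documentation.ExecMode.BATCH)
   public static final ConfigOption<Boolean> DISPLAY_QUERY_TIME_COST =
           ConfigOptions.key("sql-client.display.query-time-cost") // assumed key name
                   .booleanType()
                   .defaultValue(true)
                   .withDescription(
                           "Determines whether the time cost of a batch query is printed "
                                   + "after the result in the SQL Client CLI.");
   ```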



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1410249663


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/config/SqlClientOptions.java:
##
@@ -70,6 +70,14 @@ private SqlClientOptions() {}
 + "This only applies to columns with 
variable-length types (e.g. CHAR, VARCHAR, STRING) in streaming mode. "
 + "Fixed-length types and all types in 
batch mode are printed using a deterministic column width.");
 
+@Documentation.TableOption(execMode = Documentation.ExecMode.BATCH)
+public static final ConfigOption DISPLAY_QUERY_TIME_COST =

Review Comment:
   I'd like to focus on batch mode in this PR and will create a follow-up PR for 
stream mode.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1410249002


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliTableauResultView.java:
##
@@ -128,8 +146,9 @@ private void printBatchResults(AtomicInteger 
receivedRowCount) {
 resultDescriptor.getRowDataStringConverter(),
 resultDescriptor.maxColumnWidth(),
 false,
-false);
-style.print(resultRows.iterator(), terminal.writer());
+false,

Review Comment:
   updated accordingly. Please check again. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33699) Verify the snapshot migration on Java21

2023-11-29 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791494#comment-17791494
 ] 

Sergey Nuyanzin commented on FLINK-33699:
-

I wonder whether this should be a release-testing-related task.

> Verify the snapshot migration on Java21
> ---
>
> Key: FLINK-33699
> URL: https://issues.apache.org/jira/browse/FLINK-33699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Yun Tang
>Priority: Major
>
> In Java 21 builds, Scala is being bumped to 2.12.18, which causes 
> incompatibilities within Flink.
> This could affect loading savepoints from a Java 8/11/17 build. We already 
> have tests extending {{SnapshotMigrationTestBase}} to verify the logic of 
> migrating snapshots generated by the older Flink version. I think we can also 
> introduce similar tests to verify the logic across different Java versions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1410248169


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/print/TableauStyle.java:
##
@@ -117,6 +122,10 @@ public final class TableauStyle implements PrintStyle {
 
 @Override
 public void print(Iterator it, PrintWriter printWriter) {
+print(it, printWriter, -1);
+}
+
+public void print(Iterator it, PrintWriter printWriter, long 
queryBeginTime) {

Review Comment:
   Updated to control the print style for the batch result in `CliTableauResultView` 
as well.
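
   For readers of the archive, a minimal sketch (assumptions only, not the PR's final 
code) of how the new overload could use `queryBeginTime` while the existing signature 
keeps its behavior by passing a `-1` sentinel:
   ```java
   // Illustrative sketch; the actual body and sentinel handling may differ in the PR.
   public void print(Iterator<RowData> it, PrintWriter printWriter, long queryBeginTime) {
       // ... print the tableau-formatted rows exactly as before ...
       if (queryBeginTime > 0) {
           long costMillis = System.currentTimeMillis() - queryBeginTime;
           printWriter.println(String.format("Query took %.3f seconds", costMillis / 1000.0));
       }
       printWriter.flush();
   }
   ```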



##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/print/TableauStyle.java:
##
@@ -117,6 +122,10 @@ public final class TableauStyle implements PrintStyle {
 
 @Override
 public void print(Iterator it, PrintWriter printWriter) {
+print(it, printWriter, -1);
+}
+
+public void print(Iterator it, PrintWriter printWriter, long 
queryBeginTime) {

Review Comment:
   Updated to control the print style for the batch result in 
`CliTableauResultView` as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1410247314


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliTableauResultView.java:
##
@@ -42,27 +42,45 @@
 /** Print result in tableau mode. */
 public class CliTableauResultView implements AutoCloseable {
 
+public static final long DEFAULT_QUERY_BEGIN_TIME = -1;
+
 private final Terminal terminal;
 private final ResultDescriptor resultDescriptor;
 
 private final ChangelogResult collectResult;
 private final ExecutorService displayResultExecutorService;
 
+private final long queryBeginTime;
+
 public CliTableauResultView(final Terminal terminal, final 
ResultDescriptor resultDescriptor) {

Review Comment:
   removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

duke updated FLINK-33656:
-
Affects Version/s: (was: 1.16.0)
   (was: 1.17.0)
   (was: 1.16.1)
   (was: 1.16.2)
   (was: 1.17.1)

> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.18.0
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> Create Table DDL
> create table default_catalog.default_database.test (
> id                   STRING
> ,sheetNo              STRING
> ,attentionList        ARRAY
> )
> WITH ('connector' = 'kafka',
> 'topic' = 'test',
> 'properties.bootstrap.servers' = 'xxx',
> 'scan.startup.mode' = 'latest-offset',
> 'format' = 'json',
> 'json.ignore-parse-errors' = 'true',
> 'json.map-null-key.mode' = 'LITERAL'
> );
> 1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!
> !image-2023-11-27-14-00-04-672.png|width=631,height=85!
> 2.
> !image-2023-11-27-14-00-41-176.png|width=506,height=67!
> !image-2023-11-27-14-01-12-187.png|width=493,height=118!
> 3.
> !image-2023-11-27-14-02-52-065.png|width=493,height=181!
> !image-2023-11-27-14-03-10-885.png|width=546,height=144!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791482#comment-17791482
 ] 

duke edited comment on FLINK-33656 at 11/30/23 7:05 AM:


Hi [~martijnvisser],

I found that in version 1.18, JSON parsing uses a new class, and the behavior is 
controlled by the decode.json-parser.enabled parameter, but this parameter does not 
take effect according to the value configured in the SQL. Array parsing now goes 
through the refactored JsonParserRowDataDeserializationSchema class, which leads to 
abnormal array parsing and eventually to this problem.

!image-2023-11-30-14-47-02-300.png|width=353,height=321!


was (Author: JIRAUSER303286):
Hi,[~martijnvisser]

I found that in version 1.18 json parsing uses a new class class, and the 
method is controlled by the decode. Json-parser. enabled parameter, but this 
parameter does not take effect according to the value configured in the sql. 
And array resolution through JsonParserRowDataDeserializationSchema class 
should be using the reconstruction after the class, resulting in abnormal array 
resolution after eventually lead to the problem.

!image-2023-11-30-14-47-02-300.png|width=353,height=321!

> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.18.0, 1.17.1
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> Create Table DDL
> create table default_catalog.default_database.test (
> id                   STRING
> ,sheetNo              STRING
> ,attentionList        ARRAY
> )
> WITH ('connector' = 'kafka',
> 'topic' = 'test',
> 'properties.bootstrap.servers' = 'xxx',
> 'scan.startup.mode' = 'latest-offset',
> 'format' = 'json',
> 'json.ignore-parse-errors' = 'true',
> 'json.map-null-key.mode' = 'LITERAL'
> );
> 1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!
> !image-2023-11-27-14-00-04-672.png|width=631,height=85!
> 2.
> !image-2023-11-27-14-00-41-176.png|width=506,height=67!
> !image-2023-11-27-14-01-12-187.png|width=493,height=118!
> 3.
> !image-2023-11-27-14-02-52-065.png|width=493,height=181!
> !image-2023-11-27-14-03-10-885.png|width=546,height=144!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

duke updated FLINK-33656:
-
Description: 
If json.ignore-parse-errors is set to true and Array parsing errors occur, the 
fields following array are resolved as empty in the complete json message

Create Table DDL

create table default_catalog.default_database.test (
id                   STRING
,sheetNo              STRING
,attentionList        ARRAY
)
WITH ('connector' = 'kafka',
'topic' = 'test',
'properties.bootstrap.servers' = 'xxx',
'scan.startup.mode' = 'latest-offset',
'format' = 'json',
'json.ignore-parse-errors' = 'true',
'json.map-null-key.mode' = 'LITERAL'
);

1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!

!image-2023-11-27-14-00-04-672.png|width=631,height=85!

2.

!image-2023-11-27-14-00-41-176.png|width=506,height=67!

!image-2023-11-27-14-01-12-187.png|width=493,height=118!

3.

!image-2023-11-27-14-02-52-065.png|width=493,height=181!

!image-2023-11-27-14-03-10-885.png|width=546,height=144!

  was:
If json.ignore-parse-errors is set to true and Array parsing errors occur, the 
fields following array are resolved as empty in the complete json message

Create Table DDL

 

1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!

!image-2023-11-27-14-00-04-672.png|width=631,height=85!

2.

!image-2023-11-27-14-00-41-176.png|width=506,height=67!

!image-2023-11-27-14-01-12-187.png|width=493,height=118!

3.

!image-2023-11-27-14-02-52-065.png|width=493,height=181!

!image-2023-11-27-14-03-10-885.png|width=546,height=144!


> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.18.0, 1.17.1
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> Create Table DDL
> create table default_catalog.default_database.test (
> id                   STRING
> ,sheetNo              STRING
> ,attentionList        ARRAY
> )
> WITH ('connector' = 'kafka',
> 'topic' = 'test',
> 'properties.bootstrap.servers' = 'xxx',
> 'scan.startup.mode' = 'latest-offset',
> 'format' = 'json',
> 'json.ignore-parse-errors' = 'true',
> 'json.map-null-key.mode' = 'LITERAL'
> );
> 1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!
> !image-2023-11-27-14-00-04-672.png|width=631,height=85!
> 2.
> !image-2023-11-27-14-00-41-176.png|width=506,height=67!
> !image-2023-11-27-14-01-12-187.png|width=493,height=118!
> 3.
> !image-2023-11-27-14-02-52-065.png|width=493,height=181!
> !image-2023-11-27-14-03-10-885.png|width=546,height=144!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

duke updated FLINK-33656:
-
Description: 
If json.ignore-parse-errors is set to true and Array parsing errors occur, the 
fields following array are resolved as empty in the complete json message

Create Table DDL

 

1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!

!image-2023-11-27-14-00-04-672.png|width=631,height=85!

2.

!image-2023-11-27-14-00-41-176.png|width=506,height=67!

!image-2023-11-27-14-01-12-187.png|width=493,height=118!

3.

!image-2023-11-27-14-02-52-065.png|width=493,height=181!

!image-2023-11-27-14-03-10-885.png|width=546,height=144!

  was:
If json.ignore-parse-errors is set to true and Array parsing errors occur, the 
fields following array are resolved as empty in the complete json message

1. !image-2023-11-27-13-59-42-066.png!

!image-2023-11-27-14-00-04-672.png!

 

2. !image-2023-11-27-14-00-41-176.png!

!image-2023-11-27-14-01-12-187.png!

3. !image-2023-11-27-14-02-52-065.png!

!image-2023-11-27-14-03-10-885.png!


> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.18.0, 1.17.1
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> Create Table DDL
>  
> 1.、 !image-2023-11-27-13-59-42-066.png|width=501,height=128!
> !image-2023-11-27-14-00-04-672.png|width=631,height=85!
> 2.
> !image-2023-11-27-14-00-41-176.png|width=506,height=67!
> !image-2023-11-27-14-01-12-187.png|width=493,height=118!
> 3.
> !image-2023-11-27-14-02-52-065.png|width=493,height=181!
> !image-2023-11-27-14-03-10-885.png|width=546,height=144!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32181) Drop support for Maven 3.2.5

2023-11-29 Thread Dan Zou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791489#comment-17791489
 ] 

Dan Zou commented on FLINK-32181:
-

Hi [~chesnay], I see we enforce Maven 3.8.6 as the required version in this ticket, 
but I do not understand why we only support 3.8.6 rather than all versions newer 
than 3.8.6.

> Drop support for Maven 3.2.5
> 
>
> Key: FLINK-32181
> URL: https://issues.apache.org/jira/browse/FLINK-32181
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Build System
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Collection of improvements we can make when dropping support for Maven 3.2.5.
> Targeting 1.19 so we have 3.2.5 as a fallback for the 1.18.0 release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791486#comment-17791486
 ] 

Matthias Pohl edited comment on FLINK-33680 at 11/30/23 6:56 AM:
-

What git version are you using? I tried to run the command from within the 
{{./docs}} subfolder and didn't have any issues.
{code:java}
$ git --version
git version 2.34.1 {code}


was (Author: mapohl):
What git version are you using? I tried to run the command from within the 
{{./docs}} subfolder and didn't have any issues.

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement, pull-request-available
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791486#comment-17791486
 ] 

Matthias Pohl commented on FLINK-33680:
---

What git version are you using? I tried to run the command from within the 
{{./docs}} subfolder and didn't have any issues.

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement, pull-request-available
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27681) Improve the availability of Flink when the RocksDB file is corrupted.

2023-11-29 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791485#comment-17791485
 ] 

Rui Fan commented on FLINK-27681:
-

Hey [~mayuehappy] [~masteryhx], thanks for your feedback. :)
{quote}The downside is that the job has to rollback to the older checkpoint. 
But there should be some policies for high-quality job just as [~mayuehappy] 
mentioned.
{quote}
My concern is that if we find the file is corrupted and only fail the checkpoint, 
the job will continue to run (if tolerable-failed-checkpoints > 0), but no checkpoint 
can be completed from then on.

However, the job must fail eventually (when the corrupted block is read or compacted, 
or when the number of failed checkpoints reaches tolerable-failed-checkpoints), and 
it will then roll back to an older checkpoint.

That older checkpoint must be from before we discovered the corruption, so running 
the job between the time the corruption is discovered and the time the job actually 
fails is useless.

In brief, tolerable-failed-checkpoints can work, but the extra cost isn't necessary.
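
As an editorial illustration of the knob being discussed (not code from the thread), 
this is how a job configures how many checkpoint failures it tolerates before it 
fails; the value is arbitrary:
{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TolerableFailuresExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);
        // Default is 0: the job fails on the first checkpoint failure.
        // With 3, the job keeps running through three failed checkpoints before failing.
        env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
    }
}
{code}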

BTW, if we fail the job directly, this 
[comment|https://github.com/apache/flink/pull/23765#discussion_r1404136470] 
will be resolved directly.
{quote}The check at runtime is block level, whose overhead should be little 
(rocksdb always need to read the block from the disk at runtime, so the 
checksum could be calculated easily).
{quote}
Thanks [~masteryhx] for the clarification.

 
{quote}Wouldn't the much more reliable and faster solution be to enable CRC on 
the local filesystem/disk that Flink's using? Benefits of this approach:
 * no changes to Flink/no increased complexity of our code base
 * would protect from not only errors that happen to occur between writing the 
file and uploading to the DFS, but also from any errors that happen at any 
point of time
 * would amortise the performance hit. Instead of amplifying reads by 100%, 
error correction bits/bytes are a small fraction of the payload, so the 
performance penalty would be at every read/write access but ultimately a very 
small fraction of the total cost of reading{quote}
[~pnowojski]'s suggestion would also cause the job to fail directly, right? I'm not 
familiar with how to enable CRC for the filesystem/disk. Would you mind describing it 
in detail?

> Improve the availability of Flink when the RocksDB file is corrupted.
> -
>
> Key: FLINK-27681
> URL: https://issues.apache.org/jira/browse/FLINK-27681
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Ming Li
>Assignee: Yue Ma
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-08-23-15-06-16-717.png
>
>
> We have encountered several times when the RocksDB checksum does not match or 
> the block verification fails when the job is restored. The reason for this 
> situation is generally that there are some problems with the machine where 
> the task is located, which causes the files uploaded to HDFS to be incorrect, 
> but it has been a long time (a dozen minutes to half an hour) when we found 
> this problem. I'm not sure if anyone else has had a similar problem.
> Since this file is referenced by incremental checkpoints for a long time, 
> when the maximum number of checkpoints reserved is exceeded, we can only use 
> this file until it is no longer referenced. When the job failed, it cannot be 
> recovered.
> Therefore we consider:
> 1. Can RocksDB periodically check whether all files are correct and find the 
> problem in time?
> 2. Can Flink automatically roll back to the previous checkpoint when there is 
> a problem with the checkpoint data, because even with manual intervention, it 
> just tries to recover from the existing checkpoint or discard the entire 
> state.
> 3. Can we increase the maximum number of references to a file based on the 
> maximum number of checkpoints reserved? When the number of references exceeds 
> the maximum number of checkpoints -1, the Task side is required to upload a 
> new file for this reference. Not sure if this way will ensure that the new 
> file we upload will be correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791482#comment-17791482
 ] 

duke commented on FLINK-33656:
--

Hi [~martijnvisser],

I found that in version 1.18, JSON parsing uses a new class, and the behavior is 
controlled by the decode.json-parser.enabled parameter, but this parameter does not 
take effect according to the value configured in the SQL. Array parsing now goes 
through the refactored JsonParserRowDataDeserializationSchema class, which leads to 
abnormal array parsing and eventually to this problem.

!image-2023-11-30-14-47-02-300.png|width=353,height=321!

> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.18.0, 1.17.1
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> 1. !image-2023-11-27-13-59-42-066.png!
> !image-2023-11-27-14-00-04-672.png!
>  
> 2. !image-2023-11-27-14-00-41-176.png!
> !image-2023-11-27-14-01-12-187.png!
> 3. !image-2023-11-27-14-02-52-065.png!
> !image-2023-11-27-14-03-10-885.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791480#comment-17791480
 ] 

Matthias Pohl commented on FLINK-33623:
---

Thanks for the pointers. I'm gonna have a look.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
>  Labels: metaspace-leak
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33656) If json.ignose-parse-errors =true is configured and Array parsing errors occur, other columns will be empty

2023-11-29 Thread duke (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

duke updated FLINK-33656:
-
Attachment: image-2023-11-30-14-47-02-300.png

> If json.ignose-parse-errors =true is configured and Array parsing errors 
> occur, other columns will be empty
> ---
>
> Key: FLINK-33656
> URL: https://issues.apache.org/jira/browse/FLINK-33656
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.18.0, 1.17.1
>Reporter: duke
>Priority: Critical
> Attachments: image-2023-11-27-13-58-22-513.png, 
> image-2023-11-27-13-59-42-066.png, image-2023-11-27-14-00-04-672.png, 
> image-2023-11-27-14-00-41-176.png, image-2023-11-27-14-01-12-187.png, 
> image-2023-11-27-14-02-01-252.png, image-2023-11-27-14-02-30-666.png, 
> image-2023-11-27-14-02-52-065.png, image-2023-11-27-14-03-10-885.png, 
> image-2023-11-30-14-47-02-300.png
>
>
> If json.ignore-parse-errors is set to true and Array parsing errors occur, 
> the fields following array are resolved as empty in the complete json message
> 1. !image-2023-11-27-13-59-42-066.png!
> !image-2023-11-27-14-00-04-672.png!
>  
> 2. !image-2023-11-27-14-00-41-176.png!
> !image-2023-11-27-14-01-12-187.png!
> 3. !image-2023-11-27-14-02-52-065.png!
> !image-2023-11-27-14-03-10-885.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-29 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1833181042

   Hello @masteryhx, I have rebased the code onto master, PTAL~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33612][table-planner] Hybrid shuffle mode avoids unnecessary blocking edges in the plan [flink]

2023-11-29 Thread via GitHub


TanYuxin-tyx commented on PR #23771:
URL: https://github.com/apache/flink/pull/23771#issuecomment-1833175583

   @lsyldliu Could you please take a look at this PR?  Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add flink-shaded 18.0 release [flink-web]

2023-11-29 Thread via GitHub


1996fanrui commented on code in PR #701:
URL: https://github.com/apache/flink-web/pull/701#discussion_r1410193976


##
docs/data/additional_components.yml:
##
@@ -21,11 +21,11 @@ flink-connector-parent:
   source_release_asc_url: "https://downloads.apache.org/flink/flink-connector-parent-1.0.0/flink-connector-parent-1.0.0-src.tgz.asc"
   source_release_sha512_url: "https://downloads.apache.org/flink/flink-connector-parent-1.0.0/flink-connector-parent-1.0.0-src.tgz.sha512"
 
-flink-shaded-17.0:
-  name: "Apache Flink-shaded 17.0 Source Release"
-  source_release_url: "https://www.apache.org/dyn/closer.lua/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz"
-  source_release_asc_url: "https://downloads.apache.org/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz.asc"
-  source_release_sha512_url: "https://downloads.apache.org/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz.sha512"
+flink-shaded-18.0:
+  name: "Apache Flink-shaded 18.0 Source Release"
+  source_release_url: "https://www.apache.org/dyn/closer.lua/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz"
+  source_release_asc_url: "https://downloads.apache.org/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz.asc"
+  source_release_sha512_url: "https://downloads.apache.org/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz.sha512"
 
 flink-shaded-16.2:

Review Comment:
   I see that `pre-bundled-hadoop-` has 4 versions in this yml, so should we keep 
16, 17, and 18?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33509] Fix flaky test testNodeAffinity() in InitTaskManagerDecoratorTest.java [flink]

2023-11-29 Thread via GitHub


yijut2 commented on PR #23694:
URL: https://github.com/apache/flink/pull/23694#issuecomment-1833145868

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791466#comment-17791466
 ] 

lincoln lee commented on FLINK-33698:
-

[~xiangyu0xf] Assigned to you. I can help review when the PR is ready :)

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}
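
As an editorial illustration (not part of the ticket): with the fixed formula the 
backoff grows geometrically per attempt and is capped at maxRetryDelay. Assuming 
initialDelay = 1000 ms, multiplier = 2.0, and maxRetryDelay = 10000 ms:
{code:java}
long initialDelay = 1000L;
double multiplier = 2.0;
long maxRetryDelay = 10_000L;
for (int attempt = 1; attempt <= 6; attempt++) {
    long backoff =
            attempt <= 1
                    ? initialDelay
                    : Math.min((long) (initialDelay * Math.pow(multiplier, attempt - 1)), maxRetryDelay);
    // Prints 1000, 2000, 4000, 8000, 10000, 10000 for attempts 1..6.
    System.out.println("attempt " + attempt + " -> " + backoff + " ms");
}
{code}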



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-33698:
---

Assignee: xiangyu feng

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33364) Introduce standard YAML for flink configuration

2023-11-29 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-33364.
---
Fix Version/s: 1.19.0
   Resolution: Fixed

Done via d5b1afb7f67fe7da166193deb7711b8a8b163bcf

> Introduce standard YAML for flink configuration
> ---
>
> Key: FLINK-33364
> URL: https://issues.apache.org/jira/browse/FLINK-33364
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration
>Affects Versions: 1.19.0
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33364][core] Introduce standard YAML for flink configuration. [flink]

2023-11-29 Thread via GitHub


zhuzhurk closed pull request #23606: [FLINK-33364][core] Introduce standard 
YAML for flink configuration.
URL: https://github.com/apache/flink/pull/23606


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-33679) RestoreMode uses NO_CLAIM as default instead of LEGACY

2023-11-29 Thread junzhong qin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791458#comment-17791458
 ] 

junzhong qin edited comment on FLINK-33679 at 11/30/23 5:06 AM:


Hi [~masteryhx], I'm confused about the term 'LEGACY', which is commented as 'This is 
the mode in which Flink worked so far.' However, the default restore mode is 
'NO_CLAIM'. Perhaps the issue title is misleading.


was (Author: easonqin):
Hi [~masteryhx] , I confused with LEGACY which commented as "This is the mode 
in which Flink worked so far." But the default restore mode is "NO_CLAIM".

> RestoreMode uses NO_CLAIM as default instead of LEGACY
> --
>
> Key: FLINK-33679
> URL: https://issues.apache.org/jira/browse/FLINK-33679
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / State Backends
>Reporter: junzhong qin
>Priority: Minor
>
> RestoreMode uses NO_CLAIM as default instead of LEGACY.
> {code:java}
> public enum RestoreMode implements DescribedEnum {
> CLAIM(
> "Flink will take ownership of the given snapshot. It will clean 
> the"
> + " snapshot once it is subsumed by newer ones."),
> NO_CLAIM(
> "Flink will not claim ownership of the snapshot files. However it 
> will make sure it"
> + " does not depend on any artefacts from the restored 
> snapshot. In order to do that,"
> + " Flink will take the first checkpoint as a full one, 
> which means it might"
> + " reupload/duplicate files that are part of the 
> restored checkpoint."),
> LEGACY(
> "This is the mode in which Flink worked so far. It will not claim 
> ownership of the"
> + " snapshot and will not delete the files. However, it 
> can directly depend on"
> + " the existence of the files of the restored 
> checkpoint. It might not be safe"
> + " to delete checkpoints that were restored in legacy 
> mode ");
> private final String description;
> RestoreMode(String description) {
> this.description = description;
> }
> @Override
> @Internal
> public InlineElement getDescription() {
> return text(description);
> }
> public static final RestoreMode DEFAULT = NO_CLAIM;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33638][table] Support variable-length data generation for variable-length data types [flink]

2023-11-29 Thread via GitHub


liyubin117 commented on code in PR #23810:
URL: https://github.com/apache/flink/pull/23810#discussion_r1410159149


##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenConnectorOptionsUtil.java:
##
@@ -34,6 +34,8 @@ public class DataGenConnectorOptionsUtil {
 public static final String MAX = "max";
 public static final String MAX_PAST = "max-past";
 public static final String LENGTH = "length";
+

Review Comment:
   Thanks for the reminder, done :)



##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java:
##
@@ -503,10 +524,14 @@ private static RandomGenerator 
getRandomBytesGenerator(int length) {
 return new RandomGenerator() {
 @Override
 public byte[] next() {
-byte[] arr = new byte[length];
+byte[] arr = new byte[getVariableLengthFieldRealLen(length, 
varLen)];
 random.getRandomGenerator().nextBytes(arr);
 return arr;
 }
 };
 }
+
+private static int getVariableLengthFieldRealLen(int length, boolean 
varLen) {

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33679) RestoreMode uses NO_CLAIM as default instead of LEGACY

2023-11-29 Thread junzhong qin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791458#comment-17791458
 ] 

junzhong qin commented on FLINK-33679:
--

Hi [~masteryhx], I'm confused about LEGACY, which is commented as "This is the mode 
in which Flink worked so far." But the default restore mode is "NO_CLAIM".

> RestoreMode uses NO_CLAIM as default instead of LEGACY
> --
>
> Key: FLINK-33679
> URL: https://issues.apache.org/jira/browse/FLINK-33679
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / State Backends
>Reporter: junzhong qin
>Priority: Minor
>
> RestoreMode uses NO_CLAIM as default instead of LEGACY.
> {code:java}
> public enum RestoreMode implements DescribedEnum {
> CLAIM(
> "Flink will take ownership of the given snapshot. It will clean 
> the"
> + " snapshot once it is subsumed by newer ones."),
> NO_CLAIM(
> "Flink will not claim ownership of the snapshot files. However it 
> will make sure it"
> + " does not depend on any artefacts from the restored 
> snapshot. In order to do that,"
> + " Flink will take the first checkpoint as a full one, 
> which means it might"
> + " reupload/duplicate files that are part of the 
> restored checkpoint."),
> LEGACY(
> "This is the mode in which Flink worked so far. It will not claim 
> ownership of the"
> + " snapshot and will not delete the files. However, it 
> can directly depend on"
> + " the existence of the files of the restored 
> checkpoint. It might not be safe"
> + " to delete checkpoints that were restored in legacy 
> mode ");
> private final String description;
> RestoreMode(String description) {
> this.description = description;
> }
> @Override
> @Internal
> public InlineElement getDescription() {
> return text(description);
> }
> public static final RestoreMode DEFAULT = NO_CLAIM;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33638][table] Support variable-length data generation for variable-length data types [flink]

2023-11-29 Thread via GitHub


liyubin117 commented on code in PR #23810:
URL: https://github.com/apache/flink/pull/23810#discussion_r1410156944


##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java:
##
@@ -503,10 +524,14 @@ private static RandomGenerator 
getRandomBytesGenerator(int length) {
 return new RandomGenerator() {
 @Override
 public byte[] next() {
-byte[] arr = new byte[length];
+byte[] arr = new byte[getVariableLengthFieldRealLen(length, 
varLen)];
 random.getRandomGenerator().nextBytes(arr);
 return arr;
 }
 };
 }
+
+private static int getVariableLengthFieldRealLen(int length, boolean 
varLen) {
+return varLen ? (new Random().nextInt(length)) + 1 : length;

Review Comment:
   What about using the static ThreadLocalRandom.current() provided by Java instead 
of creating a new instance per call? I found that DecimalDataRandomGenerator#next 
also uses this method to generate random data.
   ```
   public DecimalData next() {
   if (nullRate == 0f || ThreadLocalRandom.current().nextFloat() > 
nullRate) {
   BigDecimal decimal =
   new BigDecimal(
   ThreadLocalRandom.current().nextDouble(min, max),
   new MathContext(precision, RoundingMode.DOWN));
   return DecimalData.fromBigDecimal(decimal, precision, scale);
   }
   return null;
   }
   ```
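   
   A minimal sketch of that suggestion applied to the helper above (an assumption 
only, not necessarily the PR's final code; requires 
`import java.util.concurrent.ThreadLocalRandom;`):
   ```java
   private static int getVariableLengthFieldRealLen(int length, boolean varLen) {
       // Variable-length types get a random length in [1, length];
       // fixed-length types keep the configured length.
       return varLen ? ThreadLocalRandom.current().nextInt(length) + 1 : length;
   }
   ```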



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gabrywu updated FLINK-33623:

Component/s: API / Core

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
>  Labels: metaspace-leak
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gabrywu updated FLINK-33623:

Labels: metaspace-leak  (was: )

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
>  Labels: metaspace-leak
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [hotfix] typo in the pom files of flink-format modules for archunit [flink]

2023-11-29 Thread via GitHub


JingGe merged PR #23825:
URL: https://github.com/apache/flink/pull/23825


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] typo in the pom files of flink-format modules for archunit [flink]

2023-11-29 Thread via GitHub


JingGe commented on PR #23825:
URL: https://github.com/apache/flink/pull/23825#issuecomment-1833101654

   Thanks @PatrickRen for the review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33680][docs]Failed to build document with docker [flink]

2023-11-29 Thread via GitHub


flinkbot commented on PR #23831:
URL: https://github.com/apache/flink/pull/23831#issuecomment-1833098101

   
   ## CI report:
   
   * 68ee5a12dfa9773dfd4e8e81d9b40ccec3c032b5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33700] CustomSSLEngineProvider supports SHA-256 [flink]

2023-11-29 Thread via GitHub


flinkbot commented on PR #23833:
URL: https://github.com/apache/flink/pull/23833#issuecomment-1833094492

   
   ## CI report:
   
   * 102d1f1487eacab667917098b5fbfdd28a701d17 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33700) CustomSSLEngineProvider supports SHA-256

2023-11-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33700:
---
Labels: pull-request-available  (was: )

> CustomSSLEngineProvider supports SHA-256
> 
>
> Key: FLINK-33700
> URL: https://issues.apache.org/jira/browse/FLINK-33700
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / RPC
>Affects Versions: 1.19.0
>Reporter: Bo Cui
>Priority: Major
>  Labels: pull-request-available
>
> The algorithm of CustomSSLEngineProvider supports only SHA1.
> https://github.com/apache/flink/blob/72654384686d127172b48b0071ea7656b16e9134/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java#L58C1-L59C1
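
As an editorial note, "supporting SHA-256" here presumably means computing the 
certificate fingerprint with a configurable digest algorithm. A minimal, standalone 
sketch using only the standard JDK API (not the PR's code):
{code:java}
import java.security.MessageDigest;
import java.security.cert.X509Certificate;

public final class FingerprintUtil {
    /** Returns the hex-encoded fingerprint of the certificate, e.g. for "SHA-256" or "SHA-1". */
    public static String fingerprint(X509Certificate cert, String algorithm) throws Exception {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(cert.getEncoded());
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
{code}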



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33700] CustomSSLEngineProvider supports SHA-256 [flink]

2023-11-29 Thread via GitHub


cuibo01 opened a new pull request, #23833:
URL: https://github.com/apache/flink/pull/23833

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-33680][docs]Failed to build document with docker [flink]

2023-11-29 Thread via GitHub


section9-lab opened a new pull request, #23832:
URL: https://github.com/apache/flink/pull/23832

   
   
   ## What is the purpose of the change
   
   Improve the command description for building the documentation.
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33680][docs]Failed to build document with docker [flink]

2023-11-29 Thread via GitHub


section9-lab closed pull request #23832: [FLINK-33680][docs]Failed to build 
document with docker
URL: https://github.com/apache/flink/pull/23832


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


whhe commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1410134563


##
flink-connector-jdbc/src/test/java/org/apache/flink/connector/jdbc/databases/oceanbase/table/OceanBaseMySqlDynamicTableSinkITCase.java:
##
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.jdbc.databases.oceanbase.table;
+
+import org.apache.flink.connector.jdbc.databases.oceanbase.OceanBaseTestBase;
+import org.apache.flink.connector.jdbc.table.JdbcDynamicTableSinkITCase;
+import org.apache.flink.connector.jdbc.testutils.tables.TableRow;
+import org.apache.flink.table.api.DataTypes;
+
+import java.util.Map;
+
+import static 
org.apache.flink.connector.jdbc.testutils.tables.TableBuilder.dbType;
+import static 
org.apache.flink.connector.jdbc.testutils.tables.TableBuilder.field;
+import static 
org.apache.flink.connector.jdbc.testutils.tables.TableBuilder.pkField;
+
+/** The Table Sink ITCase for OceanBase MySql mode. */
+public class OceanBaseMySqlDynamicTableSinkITCase extends 
JdbcDynamicTableSinkITCase

Review Comment:
   There is no Docker image of OceanBase Enterprise Edition (the edition that 
supports Oracle compatibility), so it is hard to test it with the existing test 
framework. I can add some test cases for offline testing, but they will not be 
executed on CI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


whhe commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1410130574


##
flink-connector-jdbc/pom.xml:
##
@@ -110,6 +110,13 @@ under the License.
provided

 
+
+com.oceanbase
+oceanbase-client

Review Comment:
   It indeed uses the LGPL license. 
   
   I'm not familiar with the legal issues. Is it forbidden to include an LGPL 
dependency with 'provided' scope?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33680][docs]Failed to build document with docker [flink]

2023-11-29 Thread via GitHub


section9-lab closed pull request #23831: [FLINK-33680][docs]Failed to build 
document with docker
URL: https://github.com/apache/flink/pull/23831


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33680:
---
Labels: doc-site docement pull-request-available  (was: doc-site docement)

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement, pull-request-available
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33680][docs]Failed to build document with docker [flink]

2023-11-29 Thread via GitHub


section9-lab opened a new pull request, #23831:
URL: https://github.com/apache/flink/pull/23831

   
   
   ## What is the purpose of the change
   
   Improve the command description for building the documentation.
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread wangshiheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791414#comment-17791414
 ] 

wangshiheng edited comment on FLINK-33680 at 11/30/23 4:00 AM:
---

Thank you for your reply. I checked my execution record and found that I had 
executed the `git submodule update --init --recursive` command, but I ignored 
the warning it printed; the command only succeeds when run from the repository 
root directory.
 
I plan to improve the documentation description to make it more detailed, so 
as to avoid the same problem in the future.
{panel}
[root@bigdatadev docs]#  git clone 
[https://github.com/apache/flink.git|https://github.com/section9-lab/flink.git]
[root@bigdatadev docs]#  cd flink/docs/
[root@bigdatadev docs]# git submodule update --init --recursive
You need to run this command from the toplevel of the working tree.
[root@bigdatadev docs]#  ./setup_docs.sh
...
{panel}
 


was (Author: JIRAUSER303015):
Yes, I executed it and it still failed,
You can also try to rebuild it in a new local environment.
 
Here is my command execution record:
{panel}
  575  git clone [https://github.com/section9-lab/flink.git]
  576  cd flink/docs/
  577  vim README.md 
  578  git submodule update --init --recursive
  579  vim README.md 
  580  ./setup_docs.sh
  581  vim README.md 
  582  docker pull jakejarvis/hugo-extended:latest
  583  vim README.md 
  584   docker run -v $(pwd):/src -p 1313:1313 jakejarvis/hugo-extended:latest 
server --buildDrafts --buildFuture --bind 0.0.0.0
  585  cat content/_index.md 
{panel}
[~mapohl] 

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33698][datastream] Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy [flink]

2023-11-29 Thread via GitHub


flinkbot commented on PR #23830:
URL: https://github.com/apache/flink/pull/23830#issuecomment-1833070399

   
   ## CI report:
   
   * a46a6199800b5dfbff93f2a3d2a186b64edbddfe UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33698][datastream] Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy [flink]

2023-11-29 Thread via GitHub


xiangyuf commented on PR #23830:
URL: https://github.com/apache/flink/pull/23830#issuecomment-1833068864

   @lincoln-lil hi, could you help review this when you have time?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


whhe commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1410125605


##
flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/dialect/JdbcDialectFactory.java:
##
@@ -46,4 +46,14 @@ public interface JdbcDialectFactory {
 
 /** @return Creates a new instance of the {@link JdbcDialect}. */
 JdbcDialect create();
+
+/**
+ * Creates a new instance of the {@link JdbcDialect} based on compatible 
mode.
+ *
+ * @param compatibleMode the compatible mode of database
+ * @return a new instance of {@link JdbcDialect}
+ */
+default JdbcDialect create(String compatibleMode) {
+return create();

Review Comment:
   Maybe add a check to fail immediately?
   
   ```java
   if (StringUtils.isNullOrWhitespaceOnly(compatibleMode)) {
   return create();
   }
   throw new UnsupportedOperationException("Option 'compatible-mode' is not 
supported for current dialect factory.");
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33698:
---
Labels: pull-request-available  (was: )

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33698][datastream] Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy [flink]

2023-11-29 Thread via GitHub


xiangyuf opened a new pull request, #23830:
URL: https://github.com/apache/flink/pull/23830

   
   ## What is the purpose of the change
   
   Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy.
   
   ## Brief change log
   
   The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
consider `currentAttempts`, so that it generates a consistent backoff time for a 
given attempt count.
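   
   For illustration only (the class below is a standalone sketch, not code from 
this PR, and the delay values are made-up example numbers), the corrected 
attempt-based calculation and the delay sequence it produces:
   
   ```java
   // Sketch of the fixed calculation: the delay is derived from currentAttempts,
   // so a given attempt count always produces the same backoff.
   public class BackoffSketch {
       static long backoff(
               long initialDelay, double multiplier, long maxRetryDelay, int currentAttempts) {
           if (currentAttempts <= 1) {
               return initialDelay; // equivalent to the initial delay
           }
           return Math.min(
                   (long) (initialDelay * Math.pow(multiplier, currentAttempts - 1)),
                   maxRetryDelay);
       }
   
       public static void main(String[] args) {
           // Assumed values: initialDelay = 100 ms, multiplier = 2.0, maxRetryDelay = 1000 ms.
           for (int attempt = 1; attempt <= 6; attempt++) {
               System.out.println(attempt + " -> " + backoff(100L, 2.0, 1000L, attempt) + " ms");
           }
           // Prints 100, 200, 400, 800, 1000, 1000: repeated calls for the same attempt stay consistent.
       }
   }
   ```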
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   This change added tests and can be verified as follows:
   
 - Added unit test in  `AsyncRetryStrategiesTest`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33266) Support plan cache for DQL in SQL Gateway

2023-11-29 Thread Dan Zou (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Zou updated FLINK-33266:

Description: 
In OLAP scenarios, running a single query typically cost hundreds of 
milliseconds or a few seconds, of which it takes about tens of milliseconds to 
parse, validate, optimize, and translate it to Flink transformations. Adding 
cache to cache the transformations corresponding to queries is meaningful for 
scenarios where certain queries are often executed repeatedly.

We focus on submitting DQL through SQL gateway in this ticket.

  was:In OLAP scenarios, running a single query typically cost hundreds of 
milliseconds or a few seconds, of which it takes about tens of milliseconds to 
parse, validate, optimize, and translate it to Flink transformations. Adding 
cache to cache the transformations corresponding to queries is meaningful for 
scenarios where certain queries are often executed repeatedly.


> Support plan cache for DQL in SQL Gateway
> -
>
> Key: FLINK-33266
> URL: https://issues.apache.org/jira/browse/FLINK-33266
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>
> In OLAP scenarios, running a single query typically cost hundreds of 
> milliseconds or a few seconds, of which it takes about tens of milliseconds 
> to parse, validate, optimize, and translate it to Flink transformations. 
> Adding cache to cache the transformations corresponding to queries is 
> meaningful for scenarios where certain queries are often executed repeatedly.
> We focus on submitting DQL through SQL gateway in this ticket.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33266) Support plan cache for DQL in SQL Gateway

2023-11-29 Thread Dan Zou (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Zou updated FLINK-33266:

Summary: Support plan cache for DQL in SQL Gateway  (was: Support plan 
cache for SQL jobs)

> Support plan cache for DQL in SQL Gateway
> -
>
> Key: FLINK-33266
> URL: https://issues.apache.org/jira/browse/FLINK-33266
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>
> In OLAP scenarios, running a single query typically cost hundreds of 
> milliseconds or a few seconds, of which it takes about tens of milliseconds 
> to parse, validate, optimize, and translate it to Flink transformations. 
> Adding cache to cache the transformations corresponding to queries is 
> meaningful for scenarios where certain queries are often executed repeatedly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


whhe commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1410120366


##
flink-connector-jdbc/src/test/java/org/apache/flink/connector/jdbc/databases/oceanbase/dialect/OceanBaseMysqlDialectTypeTest.java:
##
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.jdbc.databases.oceanbase.dialect;
+
+import org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest;
+
+import java.util.Arrays;
+import java.util.List;
+
+/** The OceanBase MySql mode params for {@link JdbcDialectTypeTest}. */
+public class OceanBaseMysqlDialectTypeTest extends JdbcDialectTypeTest {

Review Comment:
   They use the same DDL template for creating tables, but the test cases are 
not exactly the same.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33670) Public operators cannot be reused in multi sinks

2023-11-29 Thread Lyn Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791426#comment-17791426
 ] 

Lyn Zhang commented on FLINK-33670:
---

Treating a "temporary view" as a real "common table expression" throughout the 
whole optimization process would be a perfect solution to this issue, and it 
can be controlled by SQL developers.

I would like to know which stage the CTE feature is currently in.

> Public operators cannot be reused in multi sinks
> 
>
> Key: FLINK-33670
> URL: https://issues.apache.org/jira/browse/FLINK-33670
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lyn Zhang
>Priority: Major
> Attachments: image-2023-11-28-14-31-30-153.png
>
>
> Dear all:
>    I find that some public operators cannot be reused when submit a job with 
> multi sinks. I have an example as follows:
> {code:java}
> CREATE TABLE source (
>     id              STRING,
>     ts              TIMESTAMP(3),
>     v              BIGINT,
>     WATERMARK FOR ts AS ts - INTERVAL '3' SECOND
> ) WITH (...);
> CREATE VIEW source_distinct AS
> SELECT * FROM (
>     SELECT *, ROW_NUMBER() OVER w AS row_nu
>     FROM source
>     WINDOW w AS (PARTITION BY id ORDER BY proctime() ASC)
> ) WHERE row_nu = 1;
> CREATE TABLE print1 (
>      id             STRING,
>      ts             TIMESTAMP(3)
> ) WITH('connector' = 'blackhole');
> INSERT INTO print1 SELECT id, ts FROM source_distinct;
> CREATE TABLE print2 (
>      id              STRING,
>      ts              TIMESTAMP(3),
>      v               BIGINT
> ) WITH('connector' = 'blackhole');
> INSERT INTO print2
> SELECT id, TUMBLE_START(ts, INTERVAL '20' SECOND), SUM(v)
> FROM source_distinct
> GROUP BY TUMBLE(ts, INTERVAL '20' SECOND), id; {code}
> !image-2023-11-28-14-31-30-153.png|width=384,height=145!
> I try to check the code, Flink add the rule of CoreRules.PROJECT_MERGE by 
> default, This will create different digests of  the deduplicate operator and 
> finally fail to match same sub plan.
> In real production environment, Reuse same sub plan like deduplicate is more 
> worthy than project merge. A good solution is to interrupt the project merge 
> crossing shuffle operators in multi sinks cases. 
> How did you consider it? Looking forward to your reply.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


whhe commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1410105087


##
flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/databases/oceanbase/dialect/OceanBaseRowConverter.java:
##
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.jdbc.databases.oceanbase.dialect;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.RowType;
+
+import com.oceanbase.jdbc.Blob;
+import com.oceanbase.jdbc.extend.datatype.BINARY_DOUBLE;
+import com.oceanbase.jdbc.extend.datatype.BINARY_FLOAT;
+import com.oceanbase.jdbc.extend.datatype.NUMBER;
+import com.oceanbase.jdbc.extend.datatype.RAW;
+import com.oceanbase.jdbc.extend.datatype.TIMESTAMP;
+import com.oceanbase.jdbc.extend.datatype.TIMESTAMPTZ;
+
+import java.math.BigDecimal;
+import java.sql.Date;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.time.LocalDateTime;
+
+/**
+ * Runtime converter that responsible to convert between JDBC object and Flink 
internal object for
+ * OceanBase.
+ */
+@Internal
+public class OceanBaseRowConverter extends AbstractJdbcRowConverter {
+
+private static final long serialVersionUID = 1L;
+
+private final String compatibleMode;
+
+@Override
+public String converterName() {
+return "OceanBase";
+}
+
+public OceanBaseRowConverter(RowType rowType, String compatibleMode) {
+super(rowType);
+this.compatibleMode = compatibleMode;
+}
+
+public JdbcDeserializationConverter createInternalConverter(LogicalType 
type) {
+if ("mysql".equalsIgnoreCase(compatibleMode)) {

Review Comment:
   It's OK to use MySqlRowConverter for MySQL-mode connections, but not for 
Oracle mode. The reason is that column data types are mapped to internal 
classes of the `oceanbase-client` driver, and the original Oracle driver can't 
work with the OceanBase database.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33638][table] Support variable-length data generation for variable-length data types [flink]

2023-11-29 Thread via GitHub


lincoln-lil commented on code in PR #23810:
URL: https://github.com/apache/flink/pull/23810#discussion_r1410076944


##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenConnectorOptionsUtil.java:
##
@@ -34,6 +34,8 @@ public class DataGenConnectorOptionsUtil {
 public static final String MAX = "max";
 public static final String MAX_PAST = "max-past";
 public static final String LENGTH = "length";
+

Review Comment:
   nit: unnecessary new line here?



##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java:
##
@@ -503,10 +524,14 @@ private static RandomGenerator 
getRandomBytesGenerator(int length) {
 return new RandomGenerator() {
 @Override
 public byte[] next() {
-byte[] arr = new byte[length];
+byte[] arr = new byte[getVariableLengthFieldRealLen(length, 
varLen)];
 random.getRandomGenerator().nextBytes(arr);
 return arr;
 }
 };
 }
+
+private static int getVariableLengthFieldRealLen(int length, boolean 
varLen) {

Review Comment:
   nit:  getVariableLengthFieldRealLen -> generateLength(int maxLen, boolean 
varLen) ?



##
flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java:
##
@@ -503,10 +524,14 @@ private static RandomGenerator 
getRandomBytesGenerator(int length) {
 return new RandomGenerator() {
 @Override
 public byte[] next() {
-byte[] arr = new byte[length];
+byte[] arr = new byte[getVariableLengthFieldRealLen(length, 
varLen)];
 random.getRandomGenerator().nextBytes(arr);
 return arr;
 }
 };
 }
+
+private static int getVariableLengthFieldRealLen(int length, boolean 
varLen) {
+return varLen ? (new Random().nextInt(length)) + 1 : length;

Review Comment:
   we can avoid creating a new instance per call by using a lazily initialized 
`RandomGenerator` member variable (it also supports a lower bound, so the `+ 1` 
can be omitted)
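   
   A rough sketch of that idea (names are placeholders and plain 
`java.util.Random` stands in for the reusable `RandomGenerator` member; 
illustrative only):
   
   ```java
   // Reuse one lazily created generator instead of allocating a new Random on every call.
   private transient java.util.Random lengthRandom;
   
   private int generateLength(int maxLen, boolean varLen) {
       if (!varLen) {
           return maxLen;
       }
       if (lengthRandom == null) {
           lengthRandom = new java.util.Random(); // lazy initialization on first use
       }
       return lengthRandom.nextInt(maxLen) + 1; // length in [1, maxLen]
   }
   ```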



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33700) CustomSSLEngineProvider supports SHA-256

2023-11-29 Thread Bo Cui (Jira)
Bo Cui created FLINK-33700:
--

 Summary: CustomSSLEngineProvider supports SHA-256
 Key: FLINK-33700
 URL: https://issues.apache.org/jira/browse/FLINK-33700
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / RPC
Affects Versions: 1.19.0
Reporter: Bo Cui


The algorithm of CustomSSLEngineProvider supports only SHA1.
https://github.com/apache/flink/blob/72654384686d127172b48b0071ea7656b16e9134/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java#L58C1-L59C1
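
For illustration only (an assumption about the shape of the change, not the 
actual patch): a certificate fingerprint helper whose digest algorithm is 
configurable, so SHA-256 can be used in addition to SHA-1.

{code:java}
import java.security.MessageDigest;
import java.security.cert.X509Certificate;

final class FingerprintSketch {
    /** Hex-encoded digest of the certificate, e.g. algorithm = "SHA-1" or "SHA-256". */
    static String fingerprint(X509Certificate cert, String algorithm) throws Exception {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(cert.getEncoded());
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
{code}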



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread wangshiheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangshiheng reopened FLINK-33680:
-

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread wangshiheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791414#comment-17791414
 ] 

wangshiheng edited comment on FLINK-33680 at 11/30/23 2:43 AM:
---

Yes, I executed it and it still failed,
You can also try to rebuild it in a new local environment.
 
Here is my command execution record:
{panel}
  575  git clone [https://github.com/section9-lab/flink.git]
  576  cd flink/docs/
  577  vim README.md 
  578  git submodule update --init --recursive
  579  vim README.md 
  580  ./setup_docs.sh
  581  vim README.md 
  582  docker pull jakejarvis/hugo-extended:latest
  583  vim README.md 
  584   docker run -v $(pwd):/src -p 1313:1313 jakejarvis/hugo-extended:latest 
server --buildDrafts --buildFuture --bind 0.0.0.0
  585  cat content/_index.md 
{panel}
[~mapohl] 


was (Author: JIRAUSER303015):
{panel}
 {panel}
Yes, I executed it and it still failed,
You can also try to rebuild it in a new local environment.
 
Here is my command execution record:
{panel}
  575  git clone https://github.com/section9-lab/flink.git
  576  cd flink/docs/
  577  vim README.md 
  578  git submodule update --init --recursive
  579  vim README.md 
  580  ./setup_docs.sh
  581  vim README.md 
  582  docker pull jakejarvis/hugo-extended:latest
  583  vim README.md 
  584   docker run -v $(pwd):/src -p 1313:1313 jakejarvis/hugo-extended:latest 
server --buildDrafts --buildFuture --bind 0.0.0.0
  585  cat content/_index.md 
{panel}
[~mapohl] 

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33680) Failed to build document with docker

2023-11-29 Thread wangshiheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791414#comment-17791414
 ] 

wangshiheng commented on FLINK-33680:
-

{panel}
 {panel}
Yes, I executed it and it still failed,
You can also try to rebuild it in a new local environment.
 
Here is my command execution record:
{panel}
  575  git clone https://github.com/section9-lab/flink.git
  576  cd flink/docs/
  577  vim README.md 
  578  git submodule update --init --recursive
  579  vim README.md 
  580  ./setup_docs.sh
  581  vim README.md 
  582  docker pull jakejarvis/hugo-extended:latest
  583  vim README.md 
  584   docker run -v $(pwd):/src -p 1313:1313 jakejarvis/hugo-extended:latest 
server --buildDrafts --buildFuture --bind 0.0.0.0
  585  cat content/_index.md 
{panel}
[~mapohl] 

> Failed to build document with docker
> 
>
> Key: FLINK-33680
> URL: https://issues.apache.org/jira/browse/FLINK-33680
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: wangshiheng
>Priority: Major
>  Labels: doc-site, docement
>
> Follow the documentation, the documentation comes from 
> [https://github.com/apache/flink/blob/master/docs/README.md]
>  
> The implementation results are as follows:
> {code:java}
> [root@bigdatadev workpace]# git clone https://github.com/apache/flink.git
> ...
> [root@bigdatadev workpace]# cd flink/docs/
> [root@bigdatadev docs]# ./setup_docs.sh
> [root@bigdatadev docs]# docker pull jakejarvis/hugo-extended:latest
> latest: Pulling from jakejarvis/hugo-extended
> Digest: 
> sha256:f659daa3b52693d8f6fc380e4fc5d0d3faf5b9c25ef260244ff67625c59c45a7
> Status: Image is up to date for jakejarvis/hugo-extended:latest
> docker.io/jakejarvis/hugo-extended:latest
> [root@bigdatadev docs]#  docker run -v $(pwd):/src -p 1313:1313 
> jakejarvis/hugo-extended:latest server --buildDrafts --buildFuture --bind 
> 0.0.0.0
> Start building sites … 
> hugo v0.113.0-085c1b3d614e23d218ebf9daad909deaa2390c9a+extended linux/amd64 
> BuildDate=2023-06-05T15:04:51Z VendorInfo=docker
> Built in 515 ms
> Error: error building site: assemble: "/src/content/_index.md:36:1": failed 
> to extract shortcode: template for shortcode "columns" not found
>  {code}
>  
> [root@bigdatadev docs]# vim content/_index.md
> {panel}
>  30 # Apache Flink Documentation
>  31 
>  32 {{< center >}}
>  33 *{*}Apache Flink{*}* is a framework and distributed processing engine for 
> stateful computations over *unbounded* and *bounded* data streams. Flink has 
> been designed to run in {*}all common cluster en    vironments{*}, perform 
> computations at *in-memory* speed and at {*}any scale{*}.
>  34 {{< /center >}}
>  35 
> {color:#de350b} 36 \{{< columns >}}{color}
>  37 
>  38 ### Try Flink
>  39 
>  40 If you’re interested in playing around with Flink, try one of our 
> tutorials:
>  41 
>  42 * [Fraud Detection with the DataStream API]({{{}< ref 
> "docs/try-flink/datastream" >{}}})
> {panel}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33699) Verify the snapshot migration on Java21

2023-11-29 Thread Yun Tang (Jira)
Yun Tang created FLINK-33699:


 Summary: Verify the snapshot migration on Java21
 Key: FLINK-33699
 URL: https://issues.apache.org/jira/browse/FLINK-33699
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Yun Tang


In Java 21 builds, Scala is being bumped to 2.12.18, which causes 
incompatibilities within Flink.

This could affect loading savepoints from a Java 8/11/17 build. We already have 
tests extending {{SnapshotMigrationTestBase}} to verify the logic of migrating 
snapshots generated by the older Flink version. I think we can also introduce 
similar tests to verify the logic across different Java versions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791413#comment-17791413
 ] 

gabrywu commented on FLINK-33623:
-

`classloader.check-leaked-classloader = false` causes `ChildFirstClassLoader` 
leaks, and if it's `true`, `SafetyNetWrapperClassLoader` leaks.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791412#comment-17791412
 ] 

gabrywu commented on FLINK-33623:
-

[~mapohl] it's not related to FLINK-25023.

You can start a minimal Flink cluster on your desktop and submit the example 
`$FLINK_HOME/examples/batch/WordCount.jar`; the leak will be there.

The Flink Netty Client uses 
`org.apache.flink.runtime.io.network.netty.NettyServer.THREAD_FACTORY_BUILDER` 
to create a thread when the Flink job starts, and this factory creates the 
thread with the parent's context class loader as its `contextClassLoader`.

Here is a clue.

!image.png!
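
A minimal standalone demo of that mechanism (illustrative only, not Flink 
code): a thread created through a factory inherits the creating thread's 
context class loader, so a long-lived client thread can keep a per-job class 
loader reachable.

{code:java}
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.ThreadFactory;

public class ContextClassLoaderDemo {
    public static void main(String[] args) {
        // Stands in for the per-job ChildFirstClassLoader.
        ClassLoader jobClassLoader = new URLClassLoader(new URL[0]);
        Thread.currentThread().setContextClassLoader(jobClassLoader);

        // Stands in for the Netty client thread factory.
        ThreadFactory factory = Thread::new;
        Thread ioThread = factory.newThread(() -> {});

        // The new thread captured the job class loader from its creating thread.
        System.out.println(ioThread.getContextClassLoader() == jobClassLoader); // true
    }
}
{code}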

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gabrywu updated FLINK-33623:

Attachment: image.png

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27681) Improve the availability of Flink when the RocksDB file is corrupted.

2023-11-29 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791409#comment-17791409
 ] 

Hangxiang Yu commented on FLINK-27681:
--

{quote}Fail job directly is fine for me, but I guess the PR doesn't fail the 
job, it just fails the current checkpoint, right?
{quote}
Yeah, I think failing the checkpoint may also be fine for now. It will not 
affect the correctness of the running job.

The downside is that the job has to roll back to an older checkpoint. But there 
should be some policies for high-quality jobs, just as [~mayuehappy] mentioned.
{quote}If the checksum is called for each reading, can we think the check is 
very quick? If so, could we enable it directly without any option? Hey 
[~mayuehappy], could you provide some simple benchmark here?
{quote}
The check at runtime is at the block level, and its overhead should be small 
(RocksDB always needs to read the block from disk at runtime anyway, so the 
checksum can be calculated cheaply).

But a file-level checksum always incurs extra overhead, and the overhead grows 
as the state gets larger, which is why I'd like to suggest it as an option. I 
also appreciate and look forward to the benchmark results from [~mayuehappy].
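
A small sketch of the two verification levels discussed above, using the plain 
RocksDB JNI API (an assumption made for illustration; Flink accesses RocksDB 
through its own wrappers):

{code:java}
import org.rocksdb.Options;
import org.rocksdb.ReadOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class ChecksumSketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options opts = new Options().setCreateIfMissing(true);
                RocksDB db = RocksDB.open(opts, "/tmp/checksum-demo")) {
            db.put("key".getBytes(), "value".getBytes());

            // Block-level verification: each block's checksum is checked as it is read,
            // which is cheap because the block has to be fetched from disk anyway.
            try (ReadOptions readOptions = new ReadOptions().setVerifyChecksums(true)) {
                db.get(readOptions, "key".getBytes());
            }
            // A file-level check, by contrast, has to scan whole SST files, so its cost
            // grows with the state size; that is why it is proposed as an opt-in option.
        }
    }
}
{code}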

> Improve the availability of Flink when the RocksDB file is corrupted.
> -
>
> Key: FLINK-27681
> URL: https://issues.apache.org/jira/browse/FLINK-27681
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Ming Li
>Assignee: Yue Ma
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-08-23-15-06-16-717.png
>
>
> We have encountered several times when the RocksDB checksum does not match or 
> the block verification fails when the job is restored. The reason for this 
> situation is generally that there are some problems with the machine where 
> the task is located, which causes the files uploaded to HDFS to be incorrect, 
> but it has been a long time (a dozen minutes to half an hour) when we found 
> this problem. I'm not sure if anyone else has had a similar problem.
> Since this file is referenced by incremental checkpoints for a long time, 
> when the maximum number of checkpoints reserved is exceeded, we can only use 
> this file until it is no longer referenced. When the job failed, it cannot be 
> recovered.
> Therefore we consider:
> 1. Can RocksDB periodically check whether all files are correct and find the 
> problem in time?
> 2. Can Flink automatically roll back to the previous checkpoint when there is 
> a problem with the checkpoint data, because even with manual intervention, it 
> just tries to recover from the existing checkpoint or discard the entire 
> state.
> 3. Can we increase the maximum number of references to a file based on the 
> maximum number of checkpoints reserved? When the number of references exceeds 
> the maximum number of checkpoints -1, the Task side is required to upload a 
> new file for this reference. Not sure if this way will ensure that the new 
> file we upload will be correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33364][core] Introduce standard YAML for flink configuration. [flink]

2023-11-29 Thread via GitHub


JunRuiLee commented on code in PR #23606:
URL: https://github.com/apache/flink/pull/23606#discussion_r1410077969


##
flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java:
##
@@ -1082,6 +1082,9 @@ public static class GlobalJobParameters implements 
Serializable {
  * Convert UserConfig into a {@code Map} 
representation. This can be used by
  * the runtime, for example for presenting the user config in the web 
frontend.
  *
+ * NOTE: While persisting configuration settings to files, it's 
advisable to avoid using
+ * this method and instead use the {@link 
Configuration#toFileWritableMap()} method.
+ *

Review Comment:
   fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread xiangyu feng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791405#comment-17791405
 ] 

xiangyu feng commented on FLINK-33698:
--

[~lincoln.86xy] Sure, would you kindly assign this to me?

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Priority: Major
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791404#comment-17791404
 ] 

lincoln lee commented on FLINK-33698:
-

[~xiangyu0xf] thanks for reporting this issue! Are you interested in submitting a 
PR?

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Priority: Major
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] Release Flink 1.16.3 [flink-web]

2023-11-29 Thread via GitHub


1996fanrui merged PR #698:
URL: https://github.com/apache/flink-web/pull/698


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33679) RestoreMode uses NO_CLAIM as default instead of LEGACY

2023-11-29 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791403#comment-17791403
 ] 

Hangxiang Yu commented on FLINK-33679:
--

Hi, NO_CLAIM should be more production-friendly than LEGACY mode, so it's the 
better default option.

Do you have any other concerns about this?

> RestoreMode uses NO_CLAIM as default instead of LEGACY
> --
>
> Key: FLINK-33679
> URL: https://issues.apache.org/jira/browse/FLINK-33679
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / State Backends
>Reporter: junzhong qin
>Priority: Minor
>
> RestoreMode uses NO_CLAIM as default instead of LEGACY.
> {code:java}
> public enum RestoreMode implements DescribedEnum {
> CLAIM(
> "Flink will take ownership of the given snapshot. It will clean 
> the"
> + " snapshot once it is subsumed by newer ones."),
> NO_CLAIM(
> "Flink will not claim ownership of the snapshot files. However it 
> will make sure it"
> + " does not depend on any artefacts from the restored 
> snapshot. In order to do that,"
> + " Flink will take the first checkpoint as a full one, 
> which means it might"
> + " reupload/duplicate files that are part of the 
> restored checkpoint."),
> LEGACY(
> "This is the mode in which Flink worked so far. It will not claim 
> ownership of the"
> + " snapshot and will not delete the files. However, it 
> can directly depend on"
> + " the existence of the files of the restored 
> checkpoint. It might not be safe"
> + " to delete checkpoints that were restored in legacy 
> mode ");
> private final String description;
> RestoreMode(String description) {
> this.description = description;
> }
> @Override
> @Internal
> public InlineElement getDescription() {
> return text(description);
> }
> public static final RestoreMode DEFAULT = NO_CLAIM;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28051) Promote ExternallyInducedSourceReader to non-experimental @Public

2023-11-29 Thread Brian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791402#comment-17791402
 ] 

Brian Zhou commented on FLINK-28051:


Hi [~afedulov], we can confirm that the usage of ExternallyInducedSourceReader 
will be removed in the coming FLIP-27 improvement.

Since the Pravega connector seems to be the only user of this API, and it also 
causes some special-case code divergence in Flink, such as 
[https://github.com/apache/flink/blob/72654384686d127172b48b0071ea7656b16e9134/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/SourceOperatorStreamTask.java#L101]
 , would it be fine to deprecate or remove this API for Flink 2.0 instead of 
promoting it?

> Promote ExternallyInducedSourceReader to non-experimental @Public
> -
>
> Key: FLINK-28051
> URL: https://issues.apache.org/jira/browse/FLINK-28051
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Tests
>Reporter: Alexander Fedulov
>Priority: Major
>
> It needs to be evaluated if ExternallyInducedSourceReader can be promoted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-29 Thread via GitHub


masteryhx commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1832964678

   > @masteryhx kindly remind, hope to get your reply, many thanks~
   
   There seem to be some conflicts.
   Could you rebase onto master and resolve them first?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add flink-shaded 18.0 release [flink-web]

2023-11-29 Thread via GitHub


snuyanzin commented on code in PR #701:
URL: https://github.com/apache/flink-web/pull/701#discussion_r1409991523


##
docs/data/additional_components.yml:
##
@@ -21,11 +21,11 @@ flink-connector-parent:
   source_release_asc_url: "https://downloads.apache.org/flink/flink-connector-parent-1.0.0/flink-connector-parent-1.0.0-src.tgz.asc"
   source_release_sha512_url: "https://downloads.apache.org/flink/flink-connector-parent-1.0.0/flink-connector-parent-1.0.0-src.tgz.sha512"
 
-flink-shaded-17.0:
-  name: "Apache Flink-shaded 17.0 Source Release"
-  source_release_url: "https://www.apache.org/dyn/closer.lua/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz"
-  source_release_asc_url: "https://downloads.apache.org/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz.asc"
-  source_release_sha512_url: "https://downloads.apache.org/flink/flink-shaded-17.0/flink-shaded-17.0-src.tgz.sha512"
+flink-shaded-18.0:
+  name: "Apache Flink-shaded 18.0 Source Release"
+  source_release_url: "https://www.apache.org/dyn/closer.lua/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz"
+  source_release_asc_url: "https://downloads.apache.org/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz.asc"
+  source_release_sha512_url: "https://downloads.apache.org/flink/flink-shaded-18.0/flink-shaded-18.0-src.tgz.sha512"
 
 flink-shaded-16.2:

Review Comment:
   That's actually a good question.
   The only thing stopping me from removing it is that it was released just a 
couple of weeks ago...
   However, I think you are right and we probably should not stick to it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33584][Filesystems] Update Hadoop Filesystem dependencies to 3.3.6 [flink]

2023-11-29 Thread via GitHub


snuyanzin commented on PR #23740:
URL: https://github.com/apache/flink/pull/23740#issuecomment-1832862274

   @MartijnVisser do you mind if I continue here or are you still working on 
this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30662) Add support for Delete

2023-11-29 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-30662:
-
Fix Version/s: 1.17.0

> Add support for Delete
> --
>
> Key: FLINK-30662
> URL: https://issues.apache.org/jira/browse/FLINK-30662
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread xiangyu feng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791252#comment-17791252
 ] 

xiangyu feng commented on FLINK-33698:
--

Hi [~lincoln.86xy] , what do you think about this?

> Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy
> 
>
> Key: FLINK-33698
> URL: https://issues.apache.org/jira/browse/FLINK-33698
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: xiangyu feng
>Priority: Major
>
> The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
> consider currentAttempts. 
>  
> Current Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return lastRetryDelay;
> }
> long backoff = Math.min((long) (lastRetryDelay * multiplier), 
> maxRetryDelay);
> this.lastRetryDelay = backoff;
> return backoff;
> } {code}
> Fixed Version:
> {code:java}
> @Override
> public long getBackoffTimeMillis(int currentAttempts) {
> if (currentAttempts <= 1) {
> // equivalent to initial delay
> return initialDelay;
> }
> long backoff =
> Math.min(
> (long) (initialDelay * Math.pow(multiplier, 
> currentAttempts - 1)),
> maxRetryDelay);
> return backoff;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33698) Fix the backoff time calculation in ExponentialBackoffDelayRetryStrategy

2023-11-29 Thread xiangyu feng (Jira)
xiangyu feng created FLINK-33698:


 Summary: Fix the backoff time calculation in 
ExponentialBackoffDelayRetryStrategy
 Key: FLINK-33698
 URL: https://issues.apache.org/jira/browse/FLINK-33698
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: xiangyu feng


The backoff time calculation in `ExponentialBackoffDelayRetryStrategy` should 
consider currentAttempts. 

 

Current Version:
{code:java}
@Override
public long getBackoffTimeMillis(int currentAttempts) {
    if (currentAttempts <= 1) {
        // equivalent to initial delay
        return lastRetryDelay;
    }
    long backoff = Math.min((long) (lastRetryDelay * multiplier), maxRetryDelay);
    this.lastRetryDelay = backoff;
    return backoff;
} {code}
Fixed Version:
{code:java}
@Override
public long getBackoffTimeMillis(int currentAttempts) {
    if (currentAttempts <= 1) {
        // equivalent to initial delay
        return initialDelay;
    }
    long backoff =
            Math.min(
                    (long) (initialDelay * Math.pow(multiplier, currentAttempts - 1)),
                    maxRetryDelay);
    return backoff;
} {code}
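For illustration, a throwaway comparison of the two formulas (a standalone sketch, not part of the actual Flink classes; the parameter values are arbitrary). It shows why deriving the delay from mutable state is fragile: asking for the delay of the same attempt twice keeps growing it, while the fixed formula is a pure function of currentAttempts:
{code:java}
/** Standalone sketch comparing the current (stateful) and fixed (stateless) formulas. */
public class BackoffComparison {

    static long currentVersion(long[] lastRetryDelay, double multiplier, long maxRetryDelay, int attempts) {
        if (attempts <= 1) {
            return lastRetryDelay[0];
        }
        long backoff = Math.min((long) (lastRetryDelay[0] * multiplier), maxRetryDelay);
        lastRetryDelay[0] = backoff;
        return backoff;
    }

    static long fixedVersion(long initialDelay, double multiplier, long maxRetryDelay, int attempts) {
        if (attempts <= 1) {
            return initialDelay;
        }
        return Math.min((long) (initialDelay * Math.pow(multiplier, attempts - 1)), maxRetryDelay);
    }

    public static void main(String[] args) {
        long[] lastRetryDelay = {1_000L};
        System.out.println(currentVersion(lastRetryDelay, 2.0, 10_000L, 2)); // 2000
        System.out.println(currentVersion(lastRetryDelay, 2.0, 10_000L, 2)); // 4000 - same attempt, different delay
        System.out.println(fixedVersion(1_000L, 2.0, 10_000L, 2));           // 2000
        System.out.println(fixedVersion(1_000L, 2.0, 10_000L, 2));           // 2000 - deterministic
    }
}
{code}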



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33652]: Fixing hyperlink on first steps page. concepts to concepts/overview, as concepts page is empty. [flink]

2023-11-29 Thread via GitHub


JingGe commented on PR #23815:
URL: https://github.com/apache/flink/pull/23815#issuecomment-1832480538

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33683) Improve the performance of submitting jobs and fetching results to a running flink cluster

2023-11-29 Thread xiangyu feng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangyu feng updated FLINK-33683:
-
Description: 
Now there is a lot of unnecessary overhead involved in submitting jobs and 
fetching results to a long-running flink cluster. This works well for streaming 
and batch jobs, because in these scenarios users will not frequently submit jobs 
and fetch results to a running cluster.

 

But in the OLAP scenario, users will continuously submit lots of short-lived jobs 
to the running cluster. In this situation, this overhead will have a huge 
impact on the E2E performance. Here are some examples of unnecessary overhead:
 * Each `RemoteExecutor` will create a new `StandaloneClusterDescriptor` when 
executing a job on the same remote cluster
 * `StandaloneClusterDescriptor` will always create a new `RestClusterClient` 
when retrieving an existing Flink cluster
 * Each `RestClusterClient` will create a new `ClientHighAvailabilityServices`, 
which might contain a resource-consuming HA client (ZKClient or KubeClient) and 
a time-consuming leader retrieval operation
 * `RestClient` will create a new connection for every request, which costs 
extra connection establishment time

 

Therefore, I suggest creating this ticket and the following subtasks to improve 
this performance. This ticket also relates to FLINK-25318.

  was:
There is now a lot of unnecessary overhead involved in submitting jobs and 
fetching results to a long-running flink cluster. This works well for streaming 
and batch job, because in these scenarios users will not frequently submit jobs 
and fetch result to a running cluster.

 

But in OLAP scenario, users will continuously submit lots of short-lived jobs 
to the running cluster. In this situation, these overhead will have a huge 
impact on the E2E performance.  Here are some examples of unnecessary overhead:
 * Each `RemoteExecutor` will create a new `StandaloneClusterDescriptor` when 
executing a job on the same remote cluster
 * `StandaloneClusterDescriptor` will always create a new `RestClusterClient` 
when retrieving an existing Flink Cluster
 * Each `RestClusterClient` will create a new `ClientHighAvailabilityServices` 
which might contains a resource-consuming ha client(ZKClient or KubeClient) and 
a time-consuming leader retrieval operation
 * `RestClient` will create a new connection for every request which costs 
extra connection establishment time

 

Therefore, I suggest creating this ticket and following subtasks to improve 
this performance. This ticket is also relates to  FLINK-25318.


> Improve the performance of submitting jobs and fetching results to a running 
> flink cluster
> --
>
> Key: FLINK-33683
> URL: https://issues.apache.org/jira/browse/FLINK-33683
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Table SQL / Client
>Reporter: xiangyu feng
>Priority: Major
>
> Now there are lots of unnecessary overhead involved in submitting jobs and 
> fetching results to a long-running flink cluster. This works well for 
> streaming and batch job, because in these scenarios users will not frequently 
> submit jobs and fetch result to a running cluster.
>  
> But in OLAP scenario, users will continuously submit lots of short-lived jobs 
> to the running cluster. In this situation, these overhead will have a huge 
> impact on the E2E performance.  Here are some examples of unnecessary 
> overhead:
>  * Each `RemoteExecutor` will create a new `StandaloneClusterDescriptor` when 
> executing a job on the same remote cluster
>  * `StandaloneClusterDescriptor` will always create a new `RestClusterClient` 
> when retrieving an existing Flink Cluster
>  * Each `RestClusterClient` will create a new 
> `ClientHighAvailabilityServices` which might contains a resource-consuming ha 
> client(ZKClient or KubeClient) and a time-consuming leader retrieval operation
>  * `RestClient` will create a new connection for every request which costs 
> extra connection establishment time
>  
> Therefore, I suggest creating this ticket and following subtasks to improve 
> this performance. This ticket is also relates to  FLINK-25318.
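One direction the client-side part of this could take, sketched with deliberately generic types (the concrete RestClusterClient / ClientHighAvailabilityServices wiring is left behind a factory function on purpose, so this is not the proposed implementation):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative sketch only: reuse one client per target cluster instead of re-creating it per job. */
final class ClusterClientCache<C extends AutoCloseable> {

    private final Map<String, C> clientsByClusterAddress = new ConcurrentHashMap<>();
    private final Function<String, C> clientFactory;

    ClusterClientCache(Function<String, C> clientFactory) {
        this.clientFactory = clientFactory;
    }

    /** Returns a cached client for the given cluster, creating it on first use. */
    C getOrCreate(String clusterAddress) {
        return clientsByClusterAddress.computeIfAbsent(clusterAddress, clientFactory);
    }

    void close() throws Exception {
        for (C client : clientsByClusterAddress.values()) {
            client.close();
        }
        clientsByClusterAddress.clear();
    }
}
{code}

With something along these lines, HA client creation, leader retrieval, and connection establishment would be paid once per cluster rather than once per short-lived job.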



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] FLINK-33600][table] print the query time cost for batch query in cli [flink]

2023-11-29 Thread via GitHub


JingGe commented on code in PR #23809:
URL: https://github.com/apache/flink/pull/23809#discussion_r1409698883


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliTableauResultView.java:
##
@@ -42,27 +42,45 @@
 /** Print result in tableau mode. */
 public class CliTableauResultView implements AutoCloseable {
 
+public static final long DEFAULT_QUERY_BEGIN_TIME = -1;
+
 private final Terminal terminal;
 private final ResultDescriptor resultDescriptor;
 
 private final ChangelogResult collectResult;
 private final ExecutorService displayResultExecutorService;
 
+private final long queryBeginTime;
+
 public CliTableauResultView(final Terminal terminal, final 
ResultDescriptor resultDescriptor) {

Review Comment:
   removed
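   For context, the printed time cost boils down to something like the following (method and variable names are made up for illustration, this is not the PR's exact code):
   
   ```java
   /** Illustrative only: derive a "time cost" suffix for a finished batch query. */
   static String timeCostSuffix(long queryBeginTimeMillis) {
       long costMillis = System.currentTimeMillis() - queryBeginTimeMillis;
       return String.format("(%.2f seconds)", costMillis / 1000.0);
   }
   ```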



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33647] Implement restore tests for LookupJoin node [flink]

2023-11-29 Thread via GitHub


bvarghese1 commented on code in PR #23814:
URL: https://github.com/apache/flink/pull/23814#discussion_r1409648619


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/LookupJoinTestPrograms.java:
##
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.stream;
+
+import org.apache.flink.table.test.program.SinkTestStep;
+import org.apache.flink.table.test.program.SourceTestStep;
+import org.apache.flink.table.test.program.TableTestProgram;
+import org.apache.flink.types.Row;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/** {@link TableTestProgram} definitions for testing {@link 
StreamExecGroupWindowAggregate}. */
+public class LookupJoinTestPrograms {
+
+static final SourceTestStep CUSTOMERS =
+SourceTestStep.newBuilder("customers_t") // static table
+.addOption("disable-lookup", "false")
+.addOption("filterable-fields", "age")
+.addSchema(
+"id INT PRIMARY KEY NOT ENFORCED",
+"name STRING",
+"age INT",
+"city STRING",
+"state STRING",
+"zipcode INT")
+.producedBeforeRestore(
+Row.of(1, "Bob", 28, "Mountain View", 
"California", 94043),
+Row.of(2, "Alice", 32, "San Francisco", 
"California", 95016),
+Row.of(3, "Claire", 37, "Austin", "Texas", 73301),
+Row.of(4, "Shannon", 29, "Boise", "Idaho", 83701),
+Row.of(5, "Jake", 42, "New York City", "New York", 
10001))
+// Note: Before data state is not persisted for static 
tables during savepoint
+.producedAfterRestore(
+Row.of(1, "Bob", 28, "San Jose", "California", 
94089),
+Row.of(6, "Joana", 54, "Atlanta", "Georgia", 
30033))
+.build();
+
+static final SourceTestStep ORDERS =
+SourceTestStep.newBuilder("orders_t")
+.addOption("filterable-fields", "customer_id")
+.addSchema(
+"order_id INT",
+"customer_id INT",
+"total DOUBLE",
+"order_time STRING",
+"proc_time AS PROCTIME()")
+.producedBeforeRestore(
+Row.of(1, 3, 44.44, "2020-10-10 00:00:01"),
+Row.of(2, 5, 100.02, "2020-10-10 00:00:02"),
+Row.of(4, 2, 92.61, "2020-10-10 00:00:04"),
+Row.of(3, 1, 23.89, "2020-10-10 00:00:03"),
+Row.of(6, 4, 7.65, "2020-10-10 00:00:06"),
+Row.of(5, 2, 12.78, "2020-10-10 00:00:05"))
+.producedAfterRestore(
+Row.of(7, 6, 17.58, "2020-10-10 00:00:07"), // new 
customer
+Row.of(9, 1, 143.21, "2020-10-10 00:00:08") // 
updated zip code
+)
+.build();
+
+static final List SINK_SCHEMA =
+Arrays.asList(
+"order_id INT",
+"total DOUBLE",
+"id INT",
+"name STRING",
+"age INT",
+"city STRING",
+"state STRING",
+"zipcode INT");
+
+static final TableTestProgram LOOKUP_JOIN_PROJECT_PUSHDOWN =
+TableTestProgram.of(
+"lookup-join-project-pushdown",
+"validates lookup join with project pushdown")
+.setupTableSource(CUSTOMERS)
+.setupTableSource(ORDERS)
+.setupTableSink(
+SinkTestStep.newBuilder("sink_t")
+  

Re: [PR] [FLINK-33647] Implement restore tests for LookupJoin node [flink]

2023-11-29 Thread via GitHub


bvarghese1 commented on code in PR #23814:
URL: https://github.com/apache/flink/pull/23814#discussion_r1409647014


##
flink-table/flink-table-api-java/src/test/java/org/apache/flink/table/test/program/TableTestStep.java:
##
@@ -56,7 +56,7 @@ public TableResult apply(TableEnvironment env) {
 
 public TableResult apply(TableEnvironment env, Map 
extraOptions) {
 final Map allOptions = new HashMap<>(options);
-allOptions.putAll(extraOptions);

Review Comment:
   Alright. Will do that



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33697) FLIP-386: Support adding custom metrics in Recovery Spans

2023-11-29 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-33697:
--

 Summary: FLIP-386: Support adding custom metrics in Recovery Spans
 Key: FLINK-33697
 URL: https://issues.apache.org/jira/browse/FLINK-33697
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Metrics, Runtime / State Backends
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.19.0


h1. Motivation

FLIP-386 is building on top of 
[FLIP-384|https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces].
 The intention here is to add a capability for state backends to attach custom 
attributes to recovery spans during recovery. For example, 
RocksDBIncrementalRestoreOperation could report both the remote download time and 
the time to actually clip/ingest the RocksDB instances after rescaling.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33675] Serialize ValueLiteralExpressions into SQL [flink]

2023-11-29 Thread via GitHub


dawidwys commented on code in PR #23829:
URL: https://github.com/apache/flink/pull/23829#discussion_r1409609082


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/expressions/ValueLiteralExpression.java:
##
@@ -219,6 +222,83 @@ public String asSummaryString() {
 return stringifyValue(value);
 }
 
+@Override
+public String asSerializableString() {
+if (value == null && 
!dataType.getLogicalType().is(LogicalTypeRoot.NULL)) {
+return String.format(
+"CAST(NULL AS %s)",
+// casting does not support nullability
+
dataType.getLogicalType().copy(true).asSerializableString());
+}
+final LogicalType logicalType = dataType.getLogicalType();
+switch (logicalType.getTypeRoot()) {
+case TINYINT:
+return String.format("CAST(%s AS TINYINT)", value);
+case SMALLINT:
+return String.format("CAST(%s AS SMALLINT)", value);
+case BIGINT:
+return String.format("CAST(%s AS BIGINT)", value);
+case FLOAT:
+return String.format("CAST(%s AS FLOAT)", value);
+case DOUBLE:
+return String.format("CAST(%s AS DOUBLE)", value);
+case CHAR:
+case VARCHAR:
+case DECIMAL:
+case INTEGER:
+return stringifyValue(value);
+case BOOLEAN:
+case SYMBOL:

Review Comment:
   The problem with symbol is that it cannot be a top-level literal. It can 
only ever be an argument of a function, but in this PR we don't have the 
`CallExpression#asSerializableString` yet.
   
   I can extend the test with `lit("abc").isJson(JsonType.SCALAR)` in 
https://github.com/apache/flink/pull/23811



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33696) FLIP-385: Add OpenTelemetryTraceReporter and OpenTelemetryMetricReporter

2023-11-29 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-33696:
--

 Summary: FLIP-385: Add OpenTelemetryTraceReporter and 
OpenTelemetryMetricReporter
 Key: FLINK-33696
 URL: https://issues.apache.org/jira/browse/FLINK-33696
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Metrics
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.19.0


h1. Motivation

[FLIP-384|https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces]
 is adding TraceReporter interface. However with 
[FLIP-384|https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces]
 alone, Log4jTraceReporter would be the only available implementation of 
TraceReporter interface, which is not very helpful.

In this FLIP I’m proposing to contribute both a MetricExporter and a TraceReporter 
implementation using OpenTelemetry.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-7894) Improve metrics around fine-grained recovery and associated checkpointing behaviors

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-7894.
-
Resolution: Duplicate

> Improve metrics around fine-grained recovery and associated checkpointing 
> behaviors
> ---
>
> Key: FLINK-7894
> URL: https://issues.apache.org/jira/browse/FLINK-7894
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Affects Versions: 1.3.2, 1.4.0
>Reporter: Zhenzhong Xu
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Currently, the only metric around fine-grained recovery is "task_failures". 
> It's a very high level metric, it would be nice to have the following 
> improvements:
> * Allows slice and dice into which tasks were restarted. 
> * Recovery duration.
> * Recovery associated checkpoint behaviors: cancels, failures, etc



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33695) FLIP-384: Introduce TraceReporter and use it to create checkpointing and recovery traces

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-33695:
---
Description: 
https://cwiki.apache.org/confluence/x/TguZE

*Motivation*
Currently Flink has a limited observability of checkpoint and recovery 
processes.

For checkpointing Flink has a very detailed overview in the Flink WebUI, which 
works great in many use cases, however it’s problematic if one is operating 
multiple Flink clusters, or if cluster/JM dies. Additionally there are a couple 
of metrics (like lastCheckpointDuration or lastCheckpointSize), however those 
metrics have a couple of issues:
* They are reported and refreshed periodically, depending on the MetricReporter 
settings, which doesn’t take into account checkpointing frequency.
** If checkpointing interval > metric reporting interval, we would be reporting 
the same values multiple times.
** If checkpointing interval < metric reporting interval, we would be randomly 
dropping metrics for some of the checkpoints.

For recovery we are missing even the most basic of the metrics and Flink WebUI 
support. Also given the fact that recovery is even less frequent compared to 
checkpoints, adding recovery metrics would have even bigger problems with 
unnecessary reporting the same values.

In this FLIP I’m proposing to add support for reporting traces/spans (example: 
Traces) and use this mechanism to report checkpointing and recovery traces. I 
hope in the future traces will also prove useful in other areas of Flink like 
job submission, job state changes, ... . Moreover as the API to report traces 
will be added to the MetricGroup , users will be also able to access this API. 

  was:
https://cwiki.apache.org/confluence/x/TguZE

*Motivation*
Currently Flink has a limited observability of checkpoint and recovery 
processes.

For checkpointing Flink has a very detailed overview in the Flink WebUI, which 
works great in many use cases, however it’s problematic if one is operating 
multiple Flink clusters, or if cluster/JM dies. Additionally there are a couple 
of metrics (like lastCheckpointDuration or lastCheckpointSize), however those 
metrics have a couple of issues:

They are reported and refreshed periodically, depending on the MetricReporter 
settings, which doesn’t take into account checkpointing frequency.

If checkpointing interval > metric reporting interval, we would be reporting 
the same values multiple times.

If checkpointing interval < metric reporting interval, we would be randomly 
dropping metrics for some of the checkpoints.

For recovery we are missing even the most basic of the metrics and Flink WebUI 
support. Also given the fact that recovery is even less frequent compared to 
checkpoints, adding recovery metrics would have even bigger problems with 
unnecessary reporting the same values.

In this FLIP I’m proposing to add support for reporting traces/spans (example: 
Traces) and use this mechanism to report checkpointing and recovery traces. I 
hope in the future traces will also prove useful in other areas of Flink like 
job submission, job state changes, ... . Moreover as the API to report traces 
will be added to the MetricGroup , users will be also able to access this API. 


> FLIP-384: Introduce TraceReporter and use it to create checkpointing and 
> recovery traces
> 
>
> Key: FLINK-33695
> URL: https://issues.apache.org/jira/browse/FLINK-33695
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing, Runtime / Metrics
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Major
> Fix For: 1.19.0
>
>
> https://cwiki.apache.org/confluence/x/TguZE
> *Motivation*
> Currently Flink has a limited observability of checkpoint and recovery 
> processes.
> For checkpointing Flink has a very detailed overview in the Flink WebUI, 
> which works great in many use cases, however it’s problematic if one is 
> operating multiple Flink clusters, or if cluster/JM dies. Additionally there 
> are a couple of metrics (like lastCheckpointDuration or lastCheckpointSize), 
> however those metrics have a couple of issues:
> * They are reported and refreshed periodically, depending on the 
> MetricReporter settings, which doesn’t take into account checkpointing 
> frequency.
> ** If checkpointing interval > metric reporting interval, we would be 
> reporting the same values multiple times.
> ** If checkpointing interval < metric reporting interval, we would be 
> randomly dropping metrics for some of the checkpoints.
> For recovery we are missing even the most basic of the metrics and Flink 
> WebUI support. Also given the fact that recovery is even less frequent 
> compared to checkpoints, adding recovery 

[jira] [Updated] (FLINK-33695) FLIP-384: Introduce TraceReporter and use it to create checkpointing and recovery traces

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-33695:
---
Description: 
https://cwiki.apache.org/confluence/x/TguZE

*Motivation*
Currently Flink has a limited observability of checkpoint and recovery 
processes.

For checkpointing Flink has a very detailed overview in the Flink WebUI, which 
works great in many use cases, however it’s problematic if one is operating 
multiple Flink clusters, or if cluster/JM dies. Additionally there are a couple 
of metrics (like lastCheckpointDuration or lastCheckpointSize), however those 
metrics have a couple of issues:

They are reported and refreshed periodically, depending on the MetricReporter 
settings, which doesn’t take into account checkpointing frequency.

If checkpointing interval > metric reporting interval, we would be reporting 
the same values multiple times.

If checkpointing interval < metric reporting interval, we would be randomly 
dropping metrics for some of the checkpoints.

For recovery we are missing even the most basic of the metrics and Flink WebUI 
support. Also given the fact that recovery is even less frequent compared to 
checkpoints, adding recovery metrics would have even bigger problems with 
unnecessary reporting the same values.

In this FLIP I’m proposing to add support for reporting traces/spans (example: 
Traces) and use this mechanism to report checkpointing and recovery traces. I 
hope in the future traces will also prove useful in other areas of Flink like 
job submission, job state changes, ... . Moreover as the API to report traces 
will be added to the MetricGroup , users will be also able to access this API. 

> FLIP-384: Introduce TraceReporter and use it to create checkpointing and 
> recovery traces
> 
>
> Key: FLINK-33695
> URL: https://issues.apache.org/jira/browse/FLINK-33695
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing, Runtime / Metrics
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Major
> Fix For: 1.19.0
>
>
> https://cwiki.apache.org/confluence/x/TguZE
> *Motivation*
> Currently Flink has a limited observability of checkpoint and recovery 
> processes.
> For checkpointing Flink has a very detailed overview in the Flink WebUI, 
> which works great in many use cases, however it’s problematic if one is 
> operating multiple Flink clusters, or if cluster/JM dies. Additionally there 
> are a couple of metrics (like lastCheckpointDuration or lastCheckpointSize), 
> however those metrics have a couple of issues:
> They are reported and refreshed periodically, depending on the MetricReporter 
> settings, which doesn’t take into account checkpointing frequency.
> If checkpointing interval > metric reporting interval, we would be reporting 
> the same values multiple times.
> If checkpointing interval < metric reporting interval, we would be randomly 
> dropping metrics for some of the checkpoints.
> For recovery we are missing even the most basic of the metrics and Flink 
> WebUI support. Also given the fact that recovery is even less frequent 
> compared to checkpoints, adding recovery metrics would have even bigger 
> problems with unnecessary reporting the same values.
> In this FLIP I’m proposing to add support for reporting traces/spans 
> (example: Traces) and use this mechanism to report checkpointing and recovery 
> traces. I hope in the future traces will also prove useful in other areas of 
> Flink like job submission, job state changes, ... . Moreover as the API to 
> report traces will be added to the MetricGroup , users will be also able to 
> access this API. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-22390) Support for OpenTelemetry

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-22390.
--
Resolution: Won't Do

This ticket is most likely superseded by FLINK-33695. 

> Support for OpenTelemetry
> -
>
> Key: FLINK-22390
> URL: https://issues.apache.org/jira/browse/FLINK-22390
> Project: Flink
>  Issue Type: New Feature
>  Components: Stateful Functions
>Reporter: Konstantin Knauf
>Priority: Minor
>
> OpenTracing for Stateful Functions could help particularly with topology 
> analysis and performance analysis. 
> A few questions come to mind: 
> * would we instrument both the functions as well as the runtime, or only the 
> runtime? 
> * what tracing implementation to use?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-23411) Expose Flink checkpoint details metrics

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-23411.
--
Resolution: Duplicate

Closing as duplicate in favour of FLINK-33695

> Expose Flink checkpoint details metrics
> ---
>
> Key: FLINK-23411
> URL: https://issues.apache.org/jira/browse/FLINK-23411
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Metrics
>Affects Versions: 1.13.1, 1.12.4
>Reporter: Jun Qin
>Assignee: Hangxiang Yu
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.19.0
>
>
> The checkpoint metrics shown in the Flink Web UI, like the 
> sync/async/alignment/start delay, are not exposed to the metrics system. This 
> makes problem investigation harder when the Web UI is not enabled: those numbers 
> cannot be found in the DEBUG logs either. I think we should see how we can 
> expose these metrics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33695) FLIP-384: Introduce TraceReporter and use it to create checkpointing and recovery traces

2023-11-29 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-33695:
--

 Summary: FLIP-384: Introduce TraceReporter and use it to create 
checkpointing and recovery traces
 Key: FLINK-33695
 URL: https://issues.apache.org/jira/browse/FLINK-33695
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Checkpointing, Runtime / Metrics
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-6755) Allow triggering Checkpoints through command line client

2023-11-29 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-6755.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

merged commit 7265438 into apache:master

> Allow triggering Checkpoints through command line client
> 
>
> Key: FLINK-6755
> URL: https://issues.apache.org/jira/browse/FLINK-6755
> Project: Flink
>  Issue Type: New Feature
>  Components: Command Line Client, Runtime / Checkpointing
>Affects Versions: 1.3.0
>Reporter: Gyula Fora
>Assignee: Zakelly Lan
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.19.0
>
>
> The command line client currently only allows triggering (and canceling with) 
> Savepoints. 
> While this is good if we want to fork or modify the pipelines in a 
> non-checkpoint compatible way, now with incremental checkpoints this becomes 
> wasteful for simple job restarts/pipeline updates. 
> I suggest we add a new command: 
> ./bin/flink checkpoint <jobId> [checkpointDirectory]
> and a new flag -c for the cancel command to indicate we want to trigger a 
> checkpoint:
> ./bin/flink cancel -c [targetDirectory] <jobId>
> Otherwise this can work similarly to the current savepoint-taking logic; we 
> could probably even piggyback on the current messages by adding a boolean flag 
> indicating whether it should be a savepoint or a checkpoint.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-6755][CLI] Support manual checkpoints triggering [flink]

2023-11-29 Thread via GitHub


pnowojski merged PR #23679:
URL: https://github.com/apache/flink/pull/23679


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


davidradl commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1409587288


##
flink-connector-jdbc/pom.xml:
##
@@ -110,6 +110,13 @@ under the License.
provided

 
+
+com.oceanbase
+oceanbase-client

Review Comment:
   The Java client here https://github.com/oceanbase/obconnector-j seems to 
have an LGPL license. This is not a permissive license, so it seems we should 
not ship this until there is an acceptable license, such as Apache 2.0. Let me 
know if I have misunderstood.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-6755][CLI] Support manual checkpoints triggering [flink]

2023-11-29 Thread via GitHub


pnowojski commented on PR #23679:
URL: https://github.com/apache/flink/pull/23679#issuecomment-1832322834

   Thanks! Merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32714] Add dialect for OceanBase database [flink-connector-jdbc]

2023-11-29 Thread via GitHub


davidradl commented on code in PR #72:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/72#discussion_r1409587288


##
flink-connector-jdbc/pom.xml:
##
@@ -110,6 +110,13 @@ under the License.
provided

 
+
+com.oceanbase
+oceanbase-client

Review Comment:
   The Java client here https://github.com/oceanbase/obconnector-j seems to 
have an LGPL license. This is not a permissive license, so it seems we should 
not ship this until there is an acceptable license, such as Apache 2.0. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33675] Serialize ValueLiteralExpressions into SQL [flink]

2023-11-29 Thread via GitHub


dawidwys commented on code in PR #23829:
URL: https://github.com/apache/flink/pull/23829#discussion_r1409580486


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/expressions/ValueLiteralExpression.java:
##
@@ -219,6 +222,83 @@ public String asSummaryString() {
 return stringifyValue(value);
 }
 
+@Override
+public String asSerializableString() {
+if (value == null && 
!dataType.getLogicalType().is(LogicalTypeRoot.NULL)) {
+return String.format(
+"CAST(NULL AS %s)",
+// casting does not support nullability
+
dataType.getLogicalType().copy(true).asSerializableString());
+}
+final LogicalType logicalType = dataType.getLogicalType();
+switch (logicalType.getTypeRoot()) {
+case TINYINT:
+return String.format("CAST(%s AS TINYINT)", value);
+case SMALLINT:
+return String.format("CAST(%s AS SMALLINT)", value);
+case BIGINT:
+return String.format("CAST(%s AS BIGINT)", value);
+case FLOAT:
+return String.format("CAST(%s AS FLOAT)", value);
+case DOUBLE:
+return String.format("CAST(%s AS DOUBLE)", value);
+case CHAR:
+case VARCHAR:
+case DECIMAL:
+case INTEGER:
+return stringifyValue(value);
+case BOOLEAN:
+case SYMBOL:
+case NULL:
+return stringifyValue(value).toUpperCase(Locale.ROOT);
+case BINARY:
+case VARBINARY:
+return String.format("X'%s'", 
StringUtils.byteToHexString((byte[]) value));
+case DATE:
+return String.format("DATE '%s'", 
getValueAs(LocalDate.class).get());
+case TIME_WITHOUT_TIME_ZONE:
+return String.format("TIME '%s'", 
getValueAs(LocalTime.class).get());
+case TIMESTAMP_WITHOUT_TIME_ZONE:
+final LocalDateTime localDateTime = 
getValueAs(LocalDateTime.class).get();
+return String.format(
+"TIMESTAMP '%s %s'",
+localDateTime.toLocalDate(), 
localDateTime.toLocalTime());
+case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
+final Instant instant = getValueAs(Instant.class).get();
+return String.format("TO_TIMESTAMP_LTZ(%d, %d)", 
instant.toEpochMilli(), 3);

Review Comment:
   I added a check to throw an exception if a greater precision is provided.
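   Roughly along these lines, as a self-contained sketch (not the PR's actual code; the exception type and message are made up, and it assumes TO_TIMESTAMP_LTZ only accepts precision 0 or 3):
   
   ```java
   import java.time.Instant;
   
   /** Sketch only: serialize a TIMESTAMP_LTZ value, allowing at most millisecond precision. */
   final class TimestampLtzSerialization {
   
       static String toSqlString(Instant value, int precision) {
           if (precision > 3) {
               // TO_TIMESTAMP_LTZ only accepts precision 0 or 3, so higher precisions cannot round-trip.
               throw new IllegalArgumentException(
                       "TIMESTAMP_LTZ literals with precision > 3 cannot be serialized into SQL.");
           }
           return String.format("TO_TIMESTAMP_LTZ(%d, %d)", value.toEpochMilli(), 3);
       }
   }
   ```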



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33675] Serialize ValueLiteralExpressions into SQL [flink]

2023-11-29 Thread via GitHub


dawidwys commented on code in PR #23829:
URL: https://github.com/apache/flink/pull/23829#discussion_r1409579746


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/expressions/ValueLiteralExpression.java:
##
@@ -219,6 +222,83 @@ public String asSummaryString() {
 return stringifyValue(value);
 }
 
+@Override
+public String asSerializableString() {
+if (value == null && 
!dataType.getLogicalType().is(LogicalTypeRoot.NULL)) {
+return String.format(
+"CAST(NULL AS %s)",
+// casting does not support nullability
+
dataType.getLogicalType().copy(true).asSerializableString());
+}
+final LogicalType logicalType = dataType.getLogicalType();
+switch (logicalType.getTypeRoot()) {
+case TINYINT:
+return String.format("CAST(%s AS TINYINT)", value);
+case SMALLINT:
+return String.format("CAST(%s AS SMALLINT)", value);
+case BIGINT:
+return String.format("CAST(%s AS BIGINT)", value);
+case FLOAT:
+return String.format("CAST(%s AS FLOAT)", value);
+case DOUBLE:
+return String.format("CAST(%s AS DOUBLE)", value);
+case CHAR:
+case VARCHAR:
+case DECIMAL:
+case INTEGER:
+return stringifyValue(value);
+case BOOLEAN:
+case SYMBOL:
+case NULL:
+return stringifyValue(value).toUpperCase(Locale.ROOT);
+case BINARY:
+case VARBINARY:
+return String.format("X'%s'", 
StringUtils.byteToHexString((byte[]) value));
+case DATE:
+return String.format("DATE '%s'", 
getValueAs(LocalDate.class).get());
+case TIME_WITHOUT_TIME_ZONE:
+return String.format("TIME '%s'", 
getValueAs(LocalTime.class).get());
+case TIMESTAMP_WITHOUT_TIME_ZONE:
+final LocalDateTime localDateTime = 
getValueAs(LocalDateTime.class).get();
+return String.format(
+"TIMESTAMP '%s %s'",
+localDateTime.toLocalDate(), 
localDateTime.toLocalTime());
+case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
+final Instant instant = getValueAs(Instant.class).get();
+return String.format("TO_TIMESTAMP_LTZ(%d, %d)", 
instant.toEpochMilli(), 3);

Review Comment:
   Actually we only support a precision of 3: 
https://github.com/apache/flink/blob/53ece12c25579497338ed59a7aebe70f2b3d9ed6/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/functions/sql/FlinkSqlOperatorTable.java#L816
   
   
https://github.com/apache/flink/blob/84044f45830542d13d6c9baeb2bfe30de3eac4ac/flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/DateTimeUtils.java#L328



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33675] Serialize ValueLiteralExpressions into SQL [flink]

2023-11-29 Thread via GitHub


dawidwys commented on code in PR #23829:
URL: https://github.com/apache/flink/pull/23829#discussion_r1409557615


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/expressions/ValueLiteralExpression.java:
##
@@ -219,6 +222,83 @@ public String asSummaryString() {
 return stringifyValue(value);
 }
 
+@Override
+public String asSerializableString() {
+if (value == null && 
!dataType.getLogicalType().is(LogicalTypeRoot.NULL)) {
+return String.format(
+"CAST(NULL AS %s)",
+// casting does not support nullability
+
dataType.getLogicalType().copy(true).asSerializableString());
+}
+final LogicalType logicalType = dataType.getLogicalType();
+switch (logicalType.getTypeRoot()) {
+case TINYINT:
+return String.format("CAST(%s AS TINYINT)", value);
+case SMALLINT:
+return String.format("CAST(%s AS SMALLINT)", value);
+case BIGINT:
+return String.format("CAST(%s AS BIGINT)", value);
+case FLOAT:
+return String.format("CAST(%s AS FLOAT)", value);
+case DOUBLE:
+return String.format("CAST(%s AS DOUBLE)", value);
+case CHAR:
+case VARCHAR:
+case DECIMAL:
+case INTEGER:
+return stringifyValue(value);
+case BOOLEAN:
+case SYMBOL:
+case NULL:
+return stringifyValue(value).toUpperCase(Locale.ROOT);
+case BINARY:
+case VARBINARY:
+return String.format("X'%s'", 
StringUtils.byteToHexString((byte[]) value));
+case DATE:
+return String.format("DATE '%s'", 
getValueAs(LocalDate.class).get());
+case TIME_WITHOUT_TIME_ZONE:
+return String.format("TIME '%s'", 
getValueAs(LocalTime.class).get());
+case TIMESTAMP_WITHOUT_TIME_ZONE:
+final LocalDateTime localDateTime = 
getValueAs(LocalDateTime.class).get();
+return String.format(
+"TIMESTAMP '%s %s'",
+localDateTime.toLocalDate(), 
localDateTime.toLocalTime());
+case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
+final Instant instant = getValueAs(Instant.class).get();
+return String.format("TO_TIMESTAMP_LTZ(%d, %d)", 
instant.toEpochMilli(), 3);

Review Comment:
    
   We should update our docs:
   > Converts a epoch seconds or epoch milliseconds to a TIMESTAMP_LTZ, the 
valid precision is 0 or 3, the 0 represents TO_TIMESTAMP_LTZ(epochSeconds, 0), 
the 3 represents TO_TIMESTAMP_LTZ(epochMilliseconds, 3).



##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/expressions/ValueLiteralExpression.java:
##
@@ -219,6 +222,83 @@ public String asSummaryString() {
 return stringifyValue(value);
 }
 
+@Override
+public String asSerializableString() {
+if (value == null && 
!dataType.getLogicalType().is(LogicalTypeRoot.NULL)) {
+return String.format(
+"CAST(NULL AS %s)",
+// casting does not support nullability
+
dataType.getLogicalType().copy(true).asSerializableString());
+}
+final LogicalType logicalType = dataType.getLogicalType();
+switch (logicalType.getTypeRoot()) {
+case TINYINT:
+return String.format("CAST(%s AS TINYINT)", value);
+case SMALLINT:
+return String.format("CAST(%s AS SMALLINT)", value);
+case BIGINT:
+return String.format("CAST(%s AS BIGINT)", value);
+case FLOAT:
+return String.format("CAST(%s AS FLOAT)", value);
+case DOUBLE:
+return String.format("CAST(%s AS DOUBLE)", value);
+case CHAR:
+case VARCHAR:
+case DECIMAL:
+case INTEGER:
+return stringifyValue(value);
+case BOOLEAN:
+case SYMBOL:
+case NULL:
+return stringifyValue(value).toUpperCase(Locale.ROOT);
+case BINARY:
+case VARBINARY:
+return String.format("X'%s'", 
StringUtils.byteToHexString((byte[]) value));
+case DATE:
+return String.format("DATE '%s'", 
getValueAs(LocalDate.class).get());
+case TIME_WITHOUT_TIME_ZONE:
+return String.format("TIME '%s'", 
getValueAs(LocalTime.class).get());
+case TIMESTAMP_WITHOUT_TIME_ZONE:
+final LocalDateTime localDateTime = 
getValueAs(LocalDateTime.class).get();
+return String.format(
+"TIMESTAMP '%s %s'",
+localDateTime.toLocalDate(), 

[jira] [Commented] (FLINK-33694) GCS filesystem does not respect gs.storage.root.url config option

2023-11-29 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791171#comment-17791171
 ] 

Martijn Visser commented on FLINK-33694:


[~plucas] Do you want to take a stab at fixing it?

> GCS filesystem does not respect gs.storage.root.url config option
> -
>
> Key: FLINK-33694
> URL: https://issues.apache.org/jira/browse/FLINK-33694
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Patrick Lucas
>Priority: Major
>  Labels: gcs
>
> The GCS FileSystem's RecoverableWriter implementation uses the GCS SDK 
> directly rather than going through Hadoop. While support has been added to 
> configure credentials correctly based on the standard Hadoop implementation 
> configuration, no other options are passed through to the underlying client.
> Because this only affects the RecoverableWriter-related codepaths, it can 
> result in surprisingly different behavior depending on whether the FileSystem is being 
> used as a source or a sink—while a {{{}gs://{}}}-URI FileSource may work 
> fine, a {{{}gs://{}}}-URI FileSink may not work at all.
> We use [fake-gcs-server|https://github.com/fsouza/fake-gcs-server] in 
> testing, and so we override the Hadoop GCS FileSystem config option 
> {{{}gs.storage.root.url{}}}. However, because this option is not considered 
> when creating the GCS client for the RecoverableWriter codepath, in a 
> FileSink the GCS FileSystem attempts to write to the real GCS service rather 
> than fake-gcs-server. At the same time, a FileSource works as expected, 
> reading from fake-gcs-server.
> The fix should be fairly straightforward, reading the {{gs.storage.root.url}} 
> config option from the Hadoop FileSystem config in 
> [{{GSFileSystemOptions}}|https://github.com/apache/flink/blob/release-1.18.0/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/GSFileSystemOptions.java#L30]
>  and, if set, passing it to {{storageOptionsBuilder}} in 
> [{{GSFileSystemFactory}}|https://github.com/apache/flink/blob/release-1.18.0/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/GSFileSystemFactory.java].
> The only workaround for this is to build a custom flink-gs-fs-hadoop JAR with 
> a patch and use it as a plugin.
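A rough sketch of the suggested fix (names approximate; the exact Hadoop key lookup and the GCS SDK builder call are assumptions, not the final patch):

{code:java}
import java.util.Optional;

import com.google.cloud.storage.StorageOptions;
import org.apache.hadoop.conf.Configuration;

/** Illustrative only: read the root-URL override from the Hadoop config and apply it to the SDK client. */
final class GsRootUrlSketch {

    static StorageOptions.Builder applyRootUrl(Configuration hadoopConfig, StorageOptions.Builder builder) {
        Optional.ofNullable(hadoopConfig.get("gs.storage.root.url")).ifPresent(builder::setHost);
        return builder;
    }
}
{code}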



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

