[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-06 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685114#comment-17685114
 ] 

Matthias Pohl commented on FLINK-18356:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45807&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=34463

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png
>
>
> {noformat}
> ============================= test session starts ==============================
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685113#comment-17685113
 ] 

Gabor Somogyi commented on FLINK-30757:
---

Happy to see contributors :)

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Assignee: Shipeng Xie
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21878: [FLINK-30817] Fix ClassCastException in TestValuesTableFactory

2023-02-06 Thread via GitHub


flinkbot commented on PR #21878:
URL: https://github.com/apache/flink/pull/21878#issuecomment-1420315623

   
   ## CI report:
   
   * 79a8d0a6e81453cc5b8a19f47d8fb1601e760e34 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on a diff in pull request #21801: [FLINK-30838][doc] Update documentation about the AdaptiveBatchScheduler

2023-02-06 Thread via GitHub


zhuzhurk commented on code in PR #21801:
URL: https://github.com/apache/flink/pull/21801#discussion_r1098265457


##
docs/content/docs/deployment/speculative_execution.md:
##
@@ -52,15 +52,14 @@ in near future. DataStream API is now the recommended low 
level API to develop F
 {{< /hint >}}
 
 ### Enable Speculative Execution
-To enable speculative execution, you need to set the following configuration 
options:
-- `jobmanager.scheduler: AdaptiveBatch`
-- Because only [Adaptive Batch Scheduler]({{< ref 
"docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) supports 
speculative execution.
-- `jobmanager.adaptive-batch-scheduler.speculative.enabled: true`
-
+You can enable speculative execution through the following configuration item:
+- `execution.batch.speculative.enabled: true`
+
+Note that currently only the [Adaptive Batch Scheduler]({{< ref 
"docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) supports 
speculative execution, and Flink batch jobs use this scheduler by default 
unless another scheduler is explicitly configured.
 ### Tuning Configuration

Review Comment:
   An empty line should be added above.



##
docs/content.zh/docs/deployment/speculative_execution.md:
##
@@ -45,15 +45,15 @@ under the License.
 {{< /hint >}}
 
 ### Enable Speculative Execution
-To enable speculative execution, you need to set the following configuration items:
-- `jobmanager.scheduler: AdaptiveBatch`
-- Because currently only the [Adaptive Batch Scheduler]({{< ref 
"docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) supports speculative execution.
-- `jobmanager.adaptive-batch-scheduler.speculative.enabled: true`
+You can enable speculative execution through the following configuration item:
+- `execution.batch.speculative.enabled: true`
+

Review Comment:
   This line should be replaced with an empty new line.
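
   As a side note for readers following this change: besides flink-conf.yaml, the 
option introduced in this diff can also be set programmatically. A minimal sketch, 
assuming a locally created environment and using only Flink's public 
`Configuration` API (the option key is taken from the diff above):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EnableSpeculativeExecutionSketch {
    public static void main(String[] args) {
        // Option key taken from the doc diff above; it only takes effect for
        // batch jobs running with the Adaptive Batch Scheduler.
        Configuration conf = new Configuration();
        conf.setString("execution.batch.speculative.enabled", "true");

        // Passing the Configuration when creating the environment applies the
        // option to jobs submitted from this environment.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        System.out.println("speculative execution enabled for " + env);
    }
}
```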



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21877: [FLINK-30925][docs] Add docs about SQL Client remote mode

2023-02-06 Thread via GitHub


flinkbot commented on PR #21877:
URL: https://github.com/apache/flink/pull/21877#issuecomment-1420310015

   
   ## CI report:
   
   * 3c5f0d0e9ff27ddeb952bc505d3f5cf9f328de17 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on a diff in pull request #21801: [FLINK-30838][doc] Update documentation about the AdaptiveBatchScheduler

2023-02-06 Thread via GitHub


zhuzhurk commented on code in PR #21801:
URL: https://github.com/apache/flink/pull/21801#discussion_r1098263375


##
docs/content/docs/deployment/elastic_scaling.md:
##
@@ -161,30 +161,28 @@ The Adaptive Batch Scheduler can automatically decide 
parallelisms of operators
 
 To automatically decide parallelisms for operators with Adaptive Batch 
Scheduler, you need to:
 - Configure to use Adaptive Batch Scheduler.
-- Set the parallelism of operators to `-1`.
+- Do not specify the parallelism of operators.
   
  Configure to use Adaptive Batch Scheduler
-To use Adaptive Batch Scheduler, you need to:
-- Set `jobmanager.scheduler: AdaptiveBatch`.
-- Leave the [`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) unset or explicitly set it to 
`ALL-EXCHANGES-BLOCKING` (default value) due to ["ALL-EXCHANGES-BLOCKING jobs 
only"](#limitations-2).
+At present, the Adaptive Batch Scheduler is the default scheduler for Flink 
batch jobs, and no additional configuration is required unless you explicitly 
configure another scheduler, such as `jobmanager.scheduler: default`. Note that 
you should 
+leave the [`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) unset or explicitly set it to 
`ALL-EXCHANGES-BLOCKING` (the default value) due to ["ALL-EXCHANGES-BLOCKING jobs 
only"](#limitations-2).
 
 In addition, there are several related configuration options that may need 
adjustment when using Adaptive Batch Scheduler:
-- [`jobmanager.adaptive-batch-scheduler.min-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-min-parallelism): The lower bound of 
allowed parallelism to set adaptively.
-- [`jobmanager.adaptive-batch-scheduler.max-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-max-parallelism): The upper bound of 
allowed parallelism to set adaptively.
-- [`jobmanager.adaptive-batch-scheduler.avg-data-volume-per-task`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-avg-data-volume-per-task): The average 
size of data volume to expect each task instance to process. Note that when 
data skew occurs, or the decided parallelism reaches the max parallelism (due 
to too much data), the data actually processed by some tasks may far exceed 
this value.
-- [`jobmanager.adaptive-batch-scheduler.default-source-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-default-source-parallelism): The 
default parallelism of data source.
-
- Set the parallelism of operators to `-1`
-Adaptive Batch Scheduler will only decide parallelism for operators whose 
parallelism is not specified by users (parallelism is `-1`). So if you want the 
parallelism of operators to be decided automatically, you should configure as 
follows:
-- Set `parallelism.default: -1`
-- Set `table.exec.resource.default-parallelism: -1` in SQL jobs.
-- Don't call `setParallelism()` for operators in DataStream/DataSet jobs.
-- Don't call `setParallelism()` on 
`StreamExecutionEnvironment/ExecutionEnvironment` in DataStream/DataSet jobs.
+- [`execution.batch.adaptive.auto-parallelism.min-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-min-parallelism): The lower bound 
of allowed parallelism to set adaptively.
+- [`execution.batch.adaptive.auto-parallelism.max-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-max-parallelism): The upper bound 
of allowed parallelism to set adaptively. Default parallelism will be used as 
upper bound of allowed parallelism if this configuration is not configured.

Review Comment:
   The zh doc should be updated accordingly.
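
   For readers following this change, a minimal sketch of what the updated docs 
describe: the Adaptive Batch Scheduler is the default for batch jobs, so it is 
enough to run in BATCH mode and tune the renamed options. The option keys come 
from the diff above; the values are purely illustrative:

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AdaptiveBatchSchedulerSketch {
    public static void main(String[] args) throws Exception {
        // Option keys are taken from the doc diff above; values are illustrative.
        Configuration conf = new Configuration();
        conf.setString("execution.batch.adaptive.auto-parallelism.min-parallelism", "1");
        conf.setString("execution.batch.adaptive.auto-parallelism.max-parallelism", "128");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // The Adaptive Batch Scheduler is the default for batch jobs, so no
        // scheduler option needs to be set explicitly.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // Deliberately no setParallelism() calls, so the scheduler is free to
        // derive the parallelism of each operator automatically.
        env.fromElements(1, 2, 3).map(x -> x * 2).print();
        env.execute("adaptive-batch-scheduler-sketch");
    }
}
```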



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30934) Refactor ComputedColumnAndWatermarkTableITCase to get rid of managed table

2023-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30934:
---
Labels: pull-request-available  (was: )

> Refactor ComputedColumnAndWatermarkTableITCase to get rid of managed table
> --
>
> Key: FLINK-30934
> URL: https://issues.apache.org/jira/browse/FLINK-30934
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.4.0
>Reporter: yuzelin
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] yuzelin opened a new pull request, #506: [FLINK-30934] Refactor ComputedColumnAndWatermarkTableITCase to get rid of managed table

2023-02-06 Thread via GitHub


yuzelin opened a new pull request, #506:
URL: https://github.com/apache/flink-table-store/pull/506

   Main changes:
   1. Refactor `ComputedColumnAndWatermarkTableITCase`.
   2. Clean test util classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] shuiqiangchen commented on pull request #21878: [FLINK-30817] Fix ClassCastException in TestValuesTableFactory

2023-02-06 Thread via GitHub


shuiqiangchen commented on PR #21878:
URL: https://github.com/apache/flink/pull/21878#issuecomment-1420308108

   Hi, @luoyuxia Could you please help review this PR? It is a simple bug 
fix, so currently there is no test case for it. If it's necessary, I will add one. 
What do you think? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] zhipeng93 commented on a diff in pull request #205: [FLINK-30730] Fix null values handling in StringIndexer and StringIndexerModel

2023-02-06 Thread via GitHub


zhipeng93 commented on code in PR #205:
URL: https://github.com/apache/flink-ml/pull/205#discussion_r1098259151


##
flink-ml-lib/src/test/java/org/apache/flink/ml/feature/stringindexer/StringIndexerTest.java:
##
@@ -53,6 +55,9 @@ public class StringIndexerTest extends AbstractTestBase {
 private Table trainTable;
 private Table predictTable;
 
+private Table trainTableWithNull;

Review Comment:
   Thanks for the PR. The fix looks good to me. 
   
   One minor question: Can you simplify the test by adding null values in 
`trainTable` and `predictTable`? And it would be great if we also update the 
python test.
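   
   For illustration, a minimal sketch of what such a combined table could look 
like, building a table with a null entry via `fromValues`. The column names 
`input` and `weight` are hypothetical and would need to match the actual test 
schema:
   
```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.types.Row;

public class NullValueTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // A train table whose "input" column contains a null value, so the
        // existing trainTable could cover the null-handling path directly.
        Table trainTableWithNull =
                tEnv.fromValues(
                        DataTypes.ROW(
                                DataTypes.FIELD("input", DataTypes.STRING()),
                                DataTypes.FIELD("weight", DataTypes.DOUBLE())),
                        Row.of("a", 1.0),
                        Row.of("b", 2.0),
                        Row.of(null, 3.0));

        trainTableWithNull.execute().print();
    }
}
```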



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] zhipeng93 commented on a diff in pull request #205: [FLINK-30730] Fix null values handling in StringIndexer and StringIndexerModel

2023-02-06 Thread via GitHub


zhipeng93 commented on code in PR #205:
URL: https://github.com/apache/flink-ml/pull/205#discussion_r1098259151


##
flink-ml-lib/src/test/java/org/apache/flink/ml/feature/stringindexer/StringIndexerTest.java:
##
@@ -53,6 +55,9 @@ public class StringIndexerTest extends AbstractTestBase {
 private Table trainTable;
 private Table predictTable;
 
+private Table trainTableWithNull;

Review Comment:
   Thanks for the PR. The fix looks good to me. One minor question: Can you 
simplify the test by adding null values in `trainTable` and `predictTable`? In 
this case, the test case would be simpler. And it would be great if we also 
update the python test.






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30817) ClassCastException in TestValuesTableFactory.TestValuesScanTableSourceWithoutProjectionPushDown

2023-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30817:
---
Labels: pull-request-available  (was: )

> ClassCastException in 
> TestValuesTableFactory.TestValuesScanTableSourceWithoutProjectionPushDown
> ---
>
> Key: FLINK-30817
> URL: https://issues.apache.org/jira/browse/FLINK-30817
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Shuiqiang Chen
>Priority: Minor
>  Labels: pull-request-available
>
> When applying partitions in 
> TestValuesScanTableSourceWithoutProjectionPushDown with no partition 
> provided, the following code will cause ClassCastException
> {code:java}
>  remainingPartitions = (List<Map<String, String>>) Collections.emptyMap();
>  this.data.put(Collections.emptyMap(), Collections.emptyList());
> {code}
> {code:java}
> java.lang.ClassCastException: java.util.Collections$EmptyMap cannot be cast 
> to java.util.List
>   at 
> org.apache.flink.table.planner.factories.TestValuesTableFactory$TestValuesScanTableSourceWithoutProjectionPushDown.applyPartitions(TestValuesTableFactory.java:1222)
>   at 
> org.apache.flink.table.planner.plan.abilities.source.PartitionPushDownSpec.apply(PartitionPushDownSpec.java:57)
>   at 
> org.apache.flink.table.planner.plan.rules.logical.PushPartitionIntoTableSourceScanRule.onMatch(PushPartitionIntoTableSourceScanRule.java:183)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:343)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-ml] zhipeng93 commented on a diff in pull request #205: [FLINK-30730] Fix null values handling in StringIndexer and StringIndexerModel

2023-02-06 Thread via GitHub


zhipeng93 commented on code in PR #205:
URL: https://github.com/apache/flink-ml/pull/205#discussion_r1098259151


##
flink-ml-lib/src/test/java/org/apache/flink/ml/feature/stringindexer/StringIndexerTest.java:
##
@@ -53,6 +55,9 @@ public class StringIndexerTest extends AbstractTestBase {
 private Table trainTable;
 private Table predictTable;
 
+private Table trainTableWithNull;

Review Comment:
   Can you simplify the test by adding null values in `trainTable` and 
`predictTable`? In this case, the test case would be simpler.
   
   Moreover, it would be great if we also update the python test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] shuiqiangchen opened a new pull request, #21878: [FLINK-30817] Fix ClassCastException in TestValuesTableFactory

2023-02-06 Thread via GitHub


shuiqiangchen opened a new pull request, #21878:
URL: https://github.com/apache/flink/pull/21878

   
   
   ## What is the purpose of the change
   
   * Fix the wrong class cast in 
TestValuesTableFactory.TestValuesScanTableSourceWithoutProjectionPushDown*
   
   
   ## Brief change log
   
 - Changed the value of remainingPartitions to Collections.emptyList()
   
   
   ## Verifying this change
   
   This change is a simple bug fix that is not yet covered by a test case. 
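   
   For context, a minimal self-contained sketch of the cast problem and of the 
fix from the change log. The declared type of `remainingPartitions` is an 
assumption inferred from the cast in the issue description:
   
```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class EmptyPartitionsCastSketch {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        List<Map<String, String>> remainingPartitions;

        // The buggy pattern from the issue: Collections.emptyMap() is a Map,
        // not a List, so the cast fails at runtime with ClassCastException.
        try {
            remainingPartitions = (List<Map<String, String>>) Collections.emptyMap();
            System.out.println(remainingPartitions); // never reached
        } catch (ClassCastException e) {
            System.out.println("Caught: " + e);
        }

        // The fix described in the change log: an empty List is what the
        // field expects, so the assignment is valid.
        remainingPartitions = Collections.emptyList();
        System.out.println("remainingPartitions = " + remainingPartitions);
    }
}
```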
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: ( no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? ( no )
 - If yes, how is the feature documented? (not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lincoln-lil commented on pull request #21507: [FLINK-30386] Fix the issue that column constraint lacks primary key not enforced check

2023-02-06 Thread via GitHub


lincoln-lil commented on PR #21507:
URL: https://github.com/apache/flink/pull/21507#issuecomment-1420303452

   @LadyForest could you also create a backport pr for 1.16?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30386) Column constraint lacks primary key not enforced check

2023-02-06 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-30386:

Fix Version/s: 1.17.0

> Column constraint lacks primary key not enforced check
> --
>
> Key: FLINK-30386
> URL: https://issues.apache.org/jira/browse/FLINK-30386
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.16.0, 1.15.2, 1.15.3
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Currently, only the table-level constraint performs the ENFORCED check. Not sure if it 
> is by design or a bug.
> The following case can be reproduced on Flink 1.16.0, 1.15.3, and 1.15.2. I 
> think earlier versions might also reveal it.
> {code:sql}
> Flink SQL> create table T (f0 int not null primary key, f1 string) with 
> ('connector' = 'datagen');
> [INFO] Execute statement succeed.
> Flink SQL> explain select * from T;
> == Abstract Syntax Tree ==
> LogicalProject(f0=[$0], f1=[$1])
> +- LogicalTableScan(table=[[default_catalog, default_database, T]])
> == Optimized Physical Plan ==
> TableSourceScan(table=[[default_catalog, default_database, T]], fields=[f0, 
> f1])
> == Optimized Execution Plan ==
> TableSourceScan(table=[[default_catalog, default_database, T]], fields=[f0, 
> f1])
> Flink SQL> create table S (f0 int not null, f1 string, primary key(f0)) with 
> ('connector' = 'datagen');
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.ValidationException: Flink doesn't support 
> ENFORCED mode for PRIMARY KEY constraint. ENFORCED/NOT ENFORCED  controls if 
> the constraint checks are performed on the incoming/outgoing data. Flink does 
> not own the data therefore the only supported mode is the NOT ENFORCED mode
> {code}
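
For reference, a minimal sketch of the form that passes validation, declaring the 
table-level constraint NOT ENFORCED explicitly (assuming a standard 
TableEnvironment and the bundled datagen connector):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class NotEnforcedPkSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Declaring the table-level primary key with NOT ENFORCED passes the
        // validation that rejects the plain (implicitly ENFORCED) form above.
        tEnv.executeSql(
                "CREATE TABLE S (f0 INT NOT NULL, f1 STRING, PRIMARY KEY (f0) NOT ENFORCED) "
                        + "WITH ('connector' = 'datagen')");
        tEnv.executeSql("EXPLAIN SELECT * FROM S").print();
    }
}
{code}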



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30386) Column constraint lacks primary key not enforced check

2023-02-06 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-30386.
---
Resolution: Fixed

fixed in master: e69e6514d0b901eb03f1a8bfc499680d076248c9

> Column constraint lacks primary key not enforced check
> --
>
> Key: FLINK-30386
> URL: https://issues.apache.org/jira/browse/FLINK-30386
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.16.0, 1.15.2, 1.15.3
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
>
> Currently, only the table-level constraint performs the ENFORCED check. Not sure if it 
> is by design or a bug.
> The following case can be reproduced on Flink 1.16.0, 1.15.3, and 1.15.2. I 
> think earlier versions might also reveal it.
> {code:sql}
> Flink SQL> create table T (f0 int not null primary key, f1 string) with 
> ('connector' = 'datagen');
> [INFO] Execute statement succeed.
> Flink SQL> explain select * from T;
> == Abstract Syntax Tree ==
> LogicalProject(f0=[$0], f1=[$1])
> +- LogicalTableScan(table=[[default_catalog, default_database, T]])
> == Optimized Physical Plan ==
> TableSourceScan(table=[[default_catalog, default_database, T]], fields=[f0, 
> f1])
> == Optimized Execution Plan ==
> TableSourceScan(table=[[default_catalog, default_database, T]], fields=[f0, 
> f1])
> Flink SQL> create table S (f0 int not null, f1 string, primary key(f0)) with 
> ('connector' = 'datagen');
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.ValidationException: Flink doesn't support 
> ENFORCED mode for PRIMARY KEY constraint. ENFORCED/NOT ENFORCED  controls if 
> the constraint checks are performed on the incoming/outgoing data. Flink does 
> not own the data therefore the only supported mode is the NOT ENFORCED mode
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30925) Add docs for the SQL Client gateway mode

2023-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30925:
---
Labels: pull-request-available  (was: )

> Add docs for the SQL Client gateway mode
> 
>
> Key: FLINK-30925
> URL: https://issues.apache.org/jira/browse/FLINK-30925
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table SQL / Client
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Assignee: Shengkai Fang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lincoln-lil merged pull request #21507: [FLINK-30386] Fix the issue that column constraint lacks primary key not enforced check

2023-02-06 Thread via GitHub


lincoln-lil merged PR #21507:
URL: https://github.com/apache/flink/pull/21507


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fsk119 opened a new pull request, #21877: [FLINK-30925][docs] Add docs about SQL Client remote mode

2023-02-06 Thread via GitHub


fsk119 opened a new pull request, #21877:
URL: https://github.com/apache/flink/pull/21877

   
   
   ## What is the purpose of the change
   
   *Add docs about the SQL Client remote mode and update the sql-gateway OpenAPI spec.*
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-30757:
--

Assignee: Shipeng Xie

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Assignee: Shipeng Xie
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-30757:
--

Assignee: (was: Shipeng Xie)

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-30757:
--

Assignee: Shipeng Xie

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Assignee: Shipeng Xie
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685081#comment-17685081
 ] 

Gyula Fora commented on FLINK-30757:


Of course, go ahead :)

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30934) Refactor ComputedColumnAndWatermarkTableITCase to get rid of managed table

2023-02-06 Thread yuzelin (Jira)
yuzelin created FLINK-30934:
---

 Summary: Refactor ComputedColumnAndWatermarkTableITCase to get rid 
of managed table
 Key: FLINK-30934
 URL: https://issues.apache.org/jira/browse/FLINK-30934
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: yuzelin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30933) Result of join inside iterationBody loses max watermark

2023-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30933:
---
Labels: pull-request-available  (was: )

> Result of join inside iterationBody loses max watermark
> ---
>
> Key: FLINK-30933
> URL: https://issues.apache.org/jira/browse/FLINK-30933
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Machine Learning
>Affects Versions: ml-2.0.0, ml-2.1.0, ml-2.2.0
>Reporter: Zhipeng Zhang
>Assignee: Zhipeng Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.2.0
>
>
> Currently if we execute a join inside an iteration body, the following 
> program produces empty output, while the correct result should be a list 
> containing \{2}.
> {code:java}
> public class Test {
>     public static void main(String[] args) throws Exception {
>         StreamExecutionEnvironment env =
>                 StreamExecutionEnvironment.getExecutionEnvironment();
>         env.setParallelism(1);
>         DataStream<Tuple2<Long, Integer>> input1 =
>                 env.fromElements(Tuple2.of(1L, 1), Tuple2.of(2L, 2));
>         DataStream<Tuple2<Long, Long>> input2 =
>                 env.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));
>         DataStream<Tuple2<Long, Long>> iterationJoin =
>                 Iterations.iterateBoundedStreamsUntilTermination(
>                                 DataStreamList.of(input1),
>                                 ReplayableDataStreamList.replay(input2),
>                                 IterationConfig.newBuilder()
>                                         .setOperatorLifeCycle(
>                                                 IterationConfig.OperatorLifeCycle.PER_ROUND)
>                                         .build(),
>                                 new MyIterationBody())
>                         .get(0);
>         DataStream<Long> left = iterationJoin.map(x -> x.f0);
>         DataStream<Long> right = iterationJoin.map(x -> x.f1);
>         DataStream<Long> result =
>                 left.join(right)
>                         .where(x -> x)
>                         .equalTo(x -> x)
>                         .window(EndOfStreamWindows.get())
>                         .apply((JoinFunction<Long, Long, Long>) (l1, l2) -> l1);
>         List<Long> collectedResult =
>                 IteratorUtils.toList(result.executeAndCollect());
>         List<Long> expectedResult = Arrays.asList(2L);
>         compareResultCollections(expectedResult, collectedResult, Long::compareTo);
>     }
>     private static class MyIterationBody implements IterationBody {
>         @Override
>         public IterationBodyResult process(
>                 DataStreamList variableStreams, DataStreamList dataStreams) {
>             DataStream<Tuple2<Long, Integer>> input1 = variableStreams.get(0);
>             DataStream<Tuple2<Long, Long>> input2 = dataStreams.get(0);
>             DataStream<Integer> terminationCriteria =
>                     input1.flatMap(new TerminateOnMaxIter<>(1));
>             DataStream<Tuple2<Long, Long>> res =
>                     input1.join(input2)
>                             .where(x -> x.f0)
>                             .equalTo(x -> x.f0)
>                             .window(EndOfStreamWindows.get())
>                             .apply(
>                                     new JoinFunction<
>                                             Tuple2<Long, Integer>,
>                                             Tuple2<Long, Long>,
>                                             Tuple2<Long, Long>>() {
>                                         @Override
>                                         public Tuple2<Long, Long> join(
>                                                 Tuple2<Long, Integer> longIntegerTuple2,
>                                                 Tuple2<Long, Long> longLongTuple2)
>                                                 throws Exception {
>                                             return longLongTuple2;
>                                         }
>                                     });
>             return new IterationBodyResult(
>                     DataStreamList.of(input1), DataStreamList.of(res), terminationCriteria);
>         }
>     }
> }
> {code}
>  
> There are two possible reasons:
>  * The timer in `HeadOperator` is not a daemon thread, so it does not exit 
> even after the Flink job finishes.
>  * The max watermark from the iteration body is lost.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-ml] zhipeng93 opened a new pull request, #206: [FLINK-30933] Fix missing max watermark when executing join in iteration body

2023-02-06 Thread via GitHub


zhipeng93 opened a new pull request, #206:
URL: https://github.com/apache/flink-ml/pull/206

   
   
   
   ## What is the purpose of the change
   
   Currently, when executing a join for bounded input streams in a per-round 
iteration body, there are two problems:
   - The program may not exit even after the result is collected using 
`datastream.executeAndCollect()`.
   - The max watermark of the output stream is lost.
   
   This PR attempts to fix the above problems.
   
   ## Brief change log
   - Set the `Timer` thread as daemon in `HeadOperator`.
   - Output the max watermark when encountering the max epoch watermark in 
`OutputOperator`.
   - Added a unit test to verify the change.
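   
   As a side note on the first change-log item, a minimal self-contained sketch of 
the daemon-timer behavior; the class and thread names are hypothetical, not the 
actual `HeadOperator` code:
   
```java
import java.util.Timer;
import java.util.TimerTask;

public class DaemonTimerSketch {
    public static void main(String[] args) {
        // A daemon timer thread does not prevent JVM shutdown once all
        // non-daemon threads (e.g. the job's main thread) have finished.
        Timer timer = new Timer("head-operator-timer", /* isDaemon = */ true);
        timer.schedule(
                new TimerTask() {
                    @Override
                    public void run() {
                        System.out.println("tick");
                    }
                },
                100L,
                100L);
        // main returns here; because isDaemon=true, the JVM can exit even
        // though the timer is still scheduled.
    }
}
```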
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-22793) HybridSource Table Implementation

2023-02-06 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685078#comment-17685078
 ] 

Ran Tao commented on FLINK-22793:
-

[~martijnvisser] [~thw] Hi Martijn, Thomas. This feature has been lagging 
for a long time. I contacted Jiang before but there was no reply, and I don't know 
if he is still pushing this. Can you help to confirm? In addition, this 
feature should require a FLIP; if possible, please review the previously 
created FLIP [1]. The remaining problems are almost solved, thank you.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235836225

> HybridSource Table Implementation
> -
>
> Key: FLINK-22793
> URL: https://issues.apache.org/jira/browse/FLINK-22793
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common
>Reporter: Nicholas Jiang
>Assignee: Nicholas Jiang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30493) FLIP 273: Improve the Catalog API to Support ALTER TABLE syntax

2023-02-06 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang closed FLINK-30493.
-
Resolution: Implemented

> FLIP 273: Improve the Catalog API to Support ALTER TABLE syntax
> ---
>
> Key: FLINK-30493
> URL: https://issues.apache.org/jira/browse/FLINK-30493
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Introduce the TableChange to support ALTER TABLE better
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-273%3A+Improve+the+Catalog+API+to+Support+ALTER+TABLE+syntax



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30757) Upgrade busybox version to a pinned version for operator

2023-02-06 Thread Shipeng Xie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685076#comment-17685076
 ] 

Shipeng Xie commented on FLINK-30757:
-

Hi [~gaborgsomogyi] [~gyfora], I am using the Flink k8s operator in a production 
environment at work and want to learn and contribute to the Flink k8s 
operator. Can I take this task as a starter?

> Upgrade busybox version to a pinned version for operator
> ---
>
> Key: FLINK-30757
> URL: https://issues.apache.org/jira/browse/FLINK-30757
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gabor Somogyi
>Priority: Minor
>  Labels: starter
>
> It has been seen that the operator e2e tests were flaky when using the latest 
> version of busybox, so we've pinned it to a relatively old version. 
> https://github.com/apache/flink-kubernetes-operator/commit/8e41ed5ec1adda9ae06bc0b6203abf42939fbf2b
> It would be good to do two things:
> * Upgrade the busybox version to the latest in a pinned way
> * Add debug logs of the busybox pod in case of failure (to see why it failed)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] JunRuiLee commented on pull request #21801: [FLINK-30838][doc] Update documentation about the AdaptiveBatchScheduler

2023-02-06 Thread via GitHub


JunRuiLee commented on PR #21801:
URL: https://github.com/apache/flink/pull/21801#issuecomment-1420275735

   @zhuzhurk Thanks for CR and I've addressed all comments, PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30933) Result of join inside iterationBody loses max watermark

2023-02-06 Thread Zhipeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhipeng Zhang updated FLINK-30933:
--
Description: 
Currently if we execute a join inside an iteration body, the following program 
produces empty output, while the correct result should be a list containing \{2}.
{code:java}
public class Test {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        DataStream<Tuple2<Long, Integer>> input1 =
                env.fromElements(Tuple2.of(1L, 1), Tuple2.of(2L, 2));

        DataStream<Tuple2<Long, Long>> input2 =
                env.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));

        DataStream<Tuple2<Long, Long>> iterationJoin =
                Iterations.iterateBoundedStreamsUntilTermination(
                                DataStreamList.of(input1),
                                ReplayableDataStreamList.replay(input2),
                                IterationConfig.newBuilder()
                                        .setOperatorLifeCycle(
                                                IterationConfig.OperatorLifeCycle.PER_ROUND)
                                        .build(),
                                new MyIterationBody())
                        .get(0);

        DataStream<Long> left = iterationJoin.map(x -> x.f0);
        DataStream<Long> right = iterationJoin.map(x -> x.f1);
        DataStream<Long> result =
                left.join(right)
                        .where(x -> x)
                        .equalTo(x -> x)
                        .window(EndOfStreamWindows.get())
                        .apply((JoinFunction<Long, Long, Long>) (l1, l2) -> l1);

        List<Long> collectedResult =
                IteratorUtils.toList(result.executeAndCollect());
        List<Long> expectedResult = Arrays.asList(2L);
        compareResultCollections(expectedResult, collectedResult, Long::compareTo);
    }

    private static class MyIterationBody implements IterationBody {
        @Override
        public IterationBodyResult process(
                DataStreamList variableStreams, DataStreamList dataStreams) {
            DataStream<Tuple2<Long, Integer>> input1 = variableStreams.get(0);
            DataStream<Tuple2<Long, Long>> input2 = dataStreams.get(0);

            DataStream<Integer> terminationCriteria =
                    input1.flatMap(new TerminateOnMaxIter<>(1));

            DataStream<Tuple2<Long, Long>> res =
                    input1.join(input2)
                            .where(x -> x.f0)
                            .equalTo(x -> x.f0)
                            .window(EndOfStreamWindows.get())
                            .apply(
                                    new JoinFunction<
                                            Tuple2<Long, Integer>,
                                            Tuple2<Long, Long>,
                                            Tuple2<Long, Long>>() {
                                        @Override
                                        public Tuple2<Long, Long> join(
                                                Tuple2<Long, Integer> longIntegerTuple2,
                                                Tuple2<Long, Long> longLongTuple2)
                                                throws Exception {
                                            return longLongTuple2;
                                        }
                                    });

            return new IterationBodyResult(
                    DataStreamList.of(input1), DataStreamList.of(res), terminationCriteria);
        }
    }
}
{code}

There are two possible reasons:
 * The timer in `HeadOperator` is not a daemon thread, so it does not exit 
even after the Flink job finishes.
 * The max watermark from the iteration body is lost.

 

 

  was:
Currently if we execute a join inside an iteration body, the following program 
produces empty output, while the correct result should be a list containing \{1, 2}.
{code:java}
public class Test {

    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.set(HeartbeatManagerOptions.HEARTBEAT_TIMEOUT, 1000L);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(config);
        env.setParallelism(1);

        DataStream<Tuple2<Long, Integer>> input1 =
                env.fromElements(Tuple2.of(1L, 1), Tuple2.of(2L, 2));

        DataStream<Tuple2<Long, Long>> input2 =
                env.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));

        DataStream<Tuple2<Long, Long>> iterationJoin =
                Iterations.iterateBoundedStreamsUntilTermination(
                                DataStreamList.of(input1),
                                ReplayableDataStreamList.replay(input2),
                                IterationConfig.newBuilder()
                                        .setOperatorLifeCycle(
                                                IterationConfig.OperatorLifeCycle.PER_ROUND)
                                        .build(),
                                new MyIterationBody())
[jira] [Commented] (FLINK-25813) TableITCase.testCollectWithClose failed on azure

2023-02-06 Thread Caizhi Weng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685066#comment-17685066
 ] 

Caizhi Weng commented on FLINK-25813:
-

Hi [~mapohl]. Sorry for the late reply; I had almost forgotten this ticket. 
I submitted a pull request and zentol reviewed it. I asked him a question but 
he didn't reply, so I guess things just stopped there.

This ticket is just about an unstable test and shouldn't have any impact on 
users, so it is not a blocker for releases. If we want to resolve this 
ticket, I'd like someone to help with the review. Thanks.

> TableITCase.testCollectWithClose failed on azure
> 
>
> Key: FLINK-25813
> URL: https://issues.apache.org/jira/browse/FLINK-25813
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.15.0, 1.16.0, 1.17.0
>Reporter: Yun Gao
>Assignee: Caizhi Weng
>Priority: Critical
>  Labels: pull-request-available, stale-assigned, test-stability
>
> {code:java}
> 2022-01-25T08:35:25.3735884Z Jan 25 08:35:25 [ERROR] 
> TableITCase.testCollectWithClose  Time elapsed: 0.377 s  <<< FAILURE!
> 2022-01-25T08:35:25.3737127Z Jan 25 08:35:25 java.lang.AssertionError: Values 
> should be different. Actual: RUNNING
> 2022-01-25T08:35:25.3738167Z Jan 25 08:35:25  at 
> org.junit.Assert.fail(Assert.java:89)
> 2022-01-25T08:35:25.3739085Z Jan 25 08:35:25  at 
> org.junit.Assert.failEquals(Assert.java:187)
> 2022-01-25T08:35:25.3739922Z Jan 25 08:35:25  at 
> org.junit.Assert.assertNotEquals(Assert.java:163)
> 2022-01-25T08:35:25.3740846Z Jan 25 08:35:25  at 
> org.junit.Assert.assertNotEquals(Assert.java:177)
> 2022-01-25T08:35:25.3742302Z Jan 25 08:35:25  at 
> org.apache.flink.table.api.TableITCase.testCollectWithClose(TableITCase.scala:135)
> 2022-01-25T08:35:25.3743327Z Jan 25 08:35:25  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-01-25T08:35:25.3744343Z Jan 25 08:35:25  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-01-25T08:35:25.3745575Z Jan 25 08:35:25  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-01-25T08:35:25.3746840Z Jan 25 08:35:25  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-01-25T08:35:25.3747922Z Jan 25 08:35:25  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2022-01-25T08:35:25.3749151Z Jan 25 08:35:25  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2022-01-25T08:35:25.3750422Z Jan 25 08:35:25  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2022-01-25T08:35:25.3751820Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2022-01-25T08:35:25.3753196Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2022-01-25T08:35:25.3754253Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2022-01-25T08:35:25.3755441Z Jan 25 08:35:25  at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
> 2022-01-25T08:35:25.3756656Z Jan 25 08:35:25  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 2022-01-25T08:35:25.3757778Z Jan 25 08:35:25  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2022-01-25T08:35:25.3758821Z Jan 25 08:35:25  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2022-01-25T08:35:25.3759840Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2022-01-25T08:35:25.3760919Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 2022-01-25T08:35:25.3762249Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 2022-01-25T08:35:25.3763322Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 2022-01-25T08:35:25.3764436Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 2022-01-25T08:35:25.3765907Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 2022-01-25T08:35:25.3766957Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 2022-01-25T08:35:25.3768104Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 2022-01-25T08:35:25.3769128Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> 2022-01-25T08:35:25.3770125Z Jan 25 08:35:25  at 
> 

[GitHub] [flink] zhuzhurk commented on a diff in pull request #21801: [FLINK-30838][doc] Update documentation about the AdaptiveBatchScheduler

2023-02-06 Thread via GitHub


zhuzhurk commented on code in PR #21801:
URL: https://github.com/apache/flink/pull/21801#discussion_r1098203733


##
docs/content.zh/docs/deployment/elastic_scaling.md:
##
@@ -159,30 +159,27 @@ The Adaptive Batch Scheduler can automatically derive the parallelism of each operator in batch
 
 To let the Adaptive Batch Scheduler automatically derive operator parallelism, you need to:
 - Enable the Adaptive Batch Scheduler
-- Set the parallelism of operators to `-1`
+- Do not explicitly specify the parallelism of operators
 
  Enable the Adaptive Batch Scheduler
-To enable the Adaptive Batch Scheduler, you need to:
-- Set `jobmanager.scheduler: AdaptiveBatch`
-- Due to ["only jobs whose data exchanges are all BLOCKING are supported"](#局限性-2), set 
[`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL-EXCHANGES-BLOCKING` (the default value).
+At present, the Adaptive Batch Scheduler is the default scheduler for Flink batch jobs and requires no extra 
configuration, unless the user explicitly configures another scheduler, such as `jobmanager.scheduler: default`. 
Note that, due to ["only jobs whose data exchanges are all BLOCKING are supported"](#局限性-2), 
[`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) needs to be 
`ALL-EXCHANGES-BLOCKING` (the default value).
 
 Besides that, the following related options can also be adjusted when using the Adaptive Batch Scheduler:
-- [`jobmanager.adaptive-batch-scheduler.min-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-min-parallelism): The lower bound of parallelism allowed to be set automatically.
-- [`jobmanager.adaptive-batch-scheduler.max-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-max-parallelism): The upper bound of parallelism allowed to be set automatically.
-- [`jobmanager.adaptive-batch-scheduler.avg-data-volume-per-task`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-avg-data-volume-per-task): 
The average amount of data each task is expected to process. Note that when data skew occurs, or when the derived parallelism reaches the max parallelism (due to too much data), some tasks may actually process far more data than this value.
-- [`jobmanager.adaptive-batch-scheduler.default-source-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-adaptive-batch-scheduler-default-source-parallelism): The default 
parallelism of source operators
-
- Set the parallelism of operators to `-1`
-The Adaptive Batch Scheduler only derives parallelism for operators whose parallelism is not specified by the user (i.e. parallelism is `-1`). 
So if you want operator parallelism to be derived automatically, you need to configure as follows:
+- [`execution.batch.adaptive.auto-parallelism.min-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-min-parallelism): The lower bound of parallelism allowed to be set automatically.
+- [`execution.batch.adaptive.auto-parallelism.max-parallelism`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-max-parallelism): 
The upper bound of parallelism allowed to be set automatically; if this option is not configured, the default parallelism will be used as the upper bound.
+- [`execution.batch.adaptive.auto-parallelism.avg-data-volume-per-task`]({{< 
ref "docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-avg-data-volume-per-task): 
The average amount of data each task is expected to process. Note that when data skew occurs, or when the derived parallelism reaches the max parallelism (due to too much data), some tasks may actually process far more data than this value.
+- [`execution.batch.adaptive.auto-parallelism.default-source-parallelism`]({{< 
ref "docs/deployment/config" 
>}}#execution-batch-adaptive-auto-parallelism-default-source-parallelism): 
The default parallelism of source operators
+
+ Do not explicitly specify the parallelism of operators

Review Comment:
   -> "Do not specify the parallelism of operators"




[jira] [Assigned] (FLINK-30933) Result of join inside iterationBody loses max watermark

2023-02-06 Thread Zhipeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhipeng Zhang reassigned FLINK-30933:
-

Assignee: Zhipeng Zhang

> Result of join inside iterationBody loses max watermark
> ---
>
> Key: FLINK-30933
> URL: https://issues.apache.org/jira/browse/FLINK-30933
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Machine Learning
>Affects Versions: ml-2.0.0, ml-2.1.0, ml-2.2.0
>Reporter: Zhipeng Zhang
>Assignee: Zhipeng Zhang
>Priority: Major
> Fix For: ml-2.2.0
>
>
> Currently if we execute a join inside an iteration body, the following 
> program produces empty output. (In which the right result should be a list 
> with \{1, 2}.
> {code:java}
> public class Test {
>     public static void main(String[] args) throws Exception {
>         Configuration config = new Configuration();
>         config.set(HeartbeatManagerOptions.HEARTBEAT_TIMEOUT, 1000L);
>         StreamExecutionEnvironment env =
>                 StreamExecutionEnvironment.getExecutionEnvironment(config);
>         env.setParallelism(1);
>         DataStream<Tuple2<Long, Integer>> input1 =
>                 env.fromElements(Tuple2.of(1L, 1), Tuple2.of(2L, 2));
>         DataStream<Tuple2<Long, Long>> input2 =
>                 env.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));
>         DataStream<Tuple2<Long, Long>> iterationJoin =
>                 Iterations.iterateBoundedStreamsUntilTermination(
>                                 DataStreamList.of(input1),
>                                 ReplayableDataStreamList.replay(input2),
>                                 IterationConfig.newBuilder()
>                                         .setOperatorLifeCycle(
>                                                 IterationConfig.OperatorLifeCycle.PER_ROUND)
>                                         .build(),
>                                 new MyIterationBody())
>                         .get(0);
>         DataStream<Long> left = iterationJoin.map(x -> x.f0);
>         DataStream<Long> right = iterationJoin.map(x -> x.f0);
>         DataStream<Long> result =
>                 left.join(right)
>                         .where(x -> x)
>                         .equalTo(x -> x)
>                         .window(EndOfStreamWindows.get())
>                         .apply((JoinFunction<Long, Long, Long>) (l1, l2) -> l1);
>         List<Long> collectedResult =
>                 IteratorUtils.toList(result.executeAndCollect());
>         List<Long> expectedResult = Arrays.asList(1L, 2L);
>         compareResultCollections(expectedResult, collectedResult, Long::compareTo);
>     }
>     private static class MyIterationBody implements IterationBody {
>         @Override
>         public IterationBodyResult process(
>                 DataStreamList variableStreams, DataStreamList dataStreams) {
>             DataStream<Tuple2<Long, Integer>> input1 = variableStreams.get(0);
>             DataStream<Tuple2<Long, Long>> input2 = dataStreams.get(0);
>             DataStream<Integer> terminationCriteria =
>                     input1.flatMap(new TerminateOnMaxIter<>(1));
>             DataStream<Tuple2<Long, Long>> res =
>                     input1.join(input2)
>                             .where(x -> x.f0)
>                             .equalTo(x -> x.f0)
>                             .window(EndOfStreamWindows.get())
>                             .apply(
>                                     (JoinFunction<
>                                                     Tuple2<Long, Integer>,
>                                                     Tuple2<Long, Long>,
>                                                     Tuple2<Long, Long>>)
>                                             (t1, t2) -> t2);
>             return new IterationBodyResult(
>                     DataStreamList.of(input1), DataStreamList.of(res),
>                     terminationCriteria);
>         }
>     }
> }
> {code}
>  
> There are two possible reasons:
>  * The timer in `HeadOperator` is not a daemon thread, so it does not exit 
> even after the Flink job finishes.
>  * The max watermark from the iteration body is lost.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30933) Result of join inside iterationBody loses max watermark

2023-02-06 Thread Zhipeng Zhang (Jira)
Zhipeng Zhang created FLINK-30933:
-

 Summary: Result of join inside iterationBody loses max watermark
 Key: FLINK-30933
 URL: https://issues.apache.org/jira/browse/FLINK-30933
 Project: Flink
  Issue Type: Bug
  Components: Library / Machine Learning
Affects Versions: ml-2.1.0, ml-2.0.0, ml-2.2.0
Reporter: Zhipeng Zhang
 Fix For: ml-2.2.0


Currently, if we execute a join inside an iteration body, the following program 
produces empty output (the correct result should be a list with {1, 2}).
{code:java}
public class Test {

    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.set(HeartbeatManagerOptions.HEARTBEAT_TIMEOUT, 1000L);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(config);
        env.setParallelism(1);

        DataStream<Tuple2<Long, Integer>> input1 =
                env.fromElements(Tuple2.of(1L, 1), Tuple2.of(2L, 2));

        DataStream<Tuple2<Long, Long>> input2 =
                env.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));

        DataStream<Tuple2<Long, Long>> iterationJoin =
                Iterations.iterateBoundedStreamsUntilTermination(
                                DataStreamList.of(input1),
                                ReplayableDataStreamList.replay(input2),
                                IterationConfig.newBuilder()
                                        .setOperatorLifeCycle(
                                                IterationConfig.OperatorLifeCycle.PER_ROUND)
                                        .build(),
                                new MyIterationBody())
                        .get(0);

        DataStream<Long> left = iterationJoin.map(x -> x.f0);
        DataStream<Long> right = iterationJoin.map(x -> x.f0);
        DataStream<Long> result =
                left.join(right)
                        .where(x -> x)
                        .equalTo(x -> x)
                        .window(EndOfStreamWindows.get())
                        .apply((JoinFunction<Long, Long, Long>) (l1, l2) -> l1);

        List<Long> collectedResult =
                IteratorUtils.toList(result.executeAndCollect());
        List<Long> expectedResult = Arrays.asList(1L, 2L);
        compareResultCollections(expectedResult, collectedResult, Long::compareTo);
    }

    private static class MyIterationBody implements IterationBody {
        @Override
        public IterationBodyResult process(
                DataStreamList variableStreams, DataStreamList dataStreams) {
            DataStream<Tuple2<Long, Integer>> input1 = variableStreams.get(0);
            DataStream<Tuple2<Long, Long>> input2 = dataStreams.get(0);

            DataStream<Integer> terminationCriteria =
                    input1.flatMap(new TerminateOnMaxIter<>(1));

            DataStream<Tuple2<Long, Long>> res =
                    input1.join(input2)
                            .where(x -> x.f0)
                            .equalTo(x -> x.f0)
                            .window(EndOfStreamWindows.get())
                            .apply(
                                    (JoinFunction<
                                                    Tuple2<Long, Integer>,
                                                    Tuple2<Long, Long>,
                                                    Tuple2<Long, Long>>)
                                            (t1, t2) -> t2);

            return new IterationBodyResult(
                    DataStreamList.of(input1), DataStreamList.of(res),
                    terminationCriteria);
        }
    }
}
 {code}
 

There are two possible reasons:
 * The timer in `HeadOperator` is not a daemon thread, so it does not exit 
even after the Flink job finishes (see the sketch below).
 * The max watermark from the iteration body is lost.
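
(A generic illustration of the first point, in plain Java; this is not the 
actual `HeadOperator` code.)
{code:java}
// A non-daemon thread keeps the JVM alive after the job is done; marking it
// as a daemon lets the process exit once all user threads have finished.
Thread timer = new Thread(() -> {
    // periodic work, e.g. checking for new iteration events
});
timer.setDaemon(true); // without this, the JVM waits for the thread forever
timer.start();
{code}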

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lsyldliu commented on a diff in pull request #21789: [FLINK-30824][hive] Add document for option table.exec.hive.native-agg-function.enabled

2023-02-06 Thread via GitHub


lsyldliu commented on code in PR #21789:
URL: https://github.com/apache/flink/pull/21789#discussion_r1098208433


##
docs/content.zh/docs/connectors/table/hive/hive_functions.md:
##
@@ -73,6 +73,34 @@ Some Hive built-in functions in older versions have [thread 
safety issues](https
 We recommend users patch their own Hive to fix them.
 {{< /hint >}}
 
+## Use Native Hive Aggregate Functions
+
+If [HiveModule]({{< ref "docs/dev/table/modules" >}}#hivemodule) is loaded 
with a higher priority than CoreModule, Flink tries to use the Hive built-in 
function first. For Hive built-in aggregation functions,
+Flink currently uses a sort-based aggregation strategy, which is one to two 
times slower than the hash-based strategy. Therefore, starting from Flink 
1.17, some of Hive's aggregation functions have been implemented natively in 
Flink.
+These functions use the hash-agg strategy to improve performance. 
Currently, only five functions are supported, namely sum/count/avg/min/max; 
more aggregation functions will be supported in the future.
+Users can enable the native aggregation functions by turning on the option 
`table.exec.hive.native-agg-function.enabled`, which brings significant 
performance improvements to such jobs.
+
+
+  
+
+Key: table.exec.hive.native-agg-function.enabled
+Default: false
+Type: Boolean
+Description: Enables the native aggregate functions, which use the hash-agg 
strategy to improve aggregation performance after loading HiveModule. 
This is a job-level option; users can enable it per job.
+
+Attention: The native aggregate functions do not yet fully align 
with Hive's built-in aggregation functions; for example, some data types are 
not supported. If performance is not a bottleneck, you don't need to turn on 
this option.

Review Comment:
   Fixed.
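
   A hedged usage sketch (not from this PR; the Hive version and table name 
are assumptions):
   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;
   import org.apache.flink.table.module.hive.HiveModule;

   TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
   // Load HiveModule with a higher priority than CoreModule, then enable the
   // native (hash-based) aggregate functions for this job.
   tEnv.loadModule("hive", new HiveModule("3.1.2"));
   tEnv.useModules("hive", "core");
   tEnv.getConfig().set("table.exec.hive.native-agg-function.enabled", "true");
   tEnv.executeSql("SELECT sum(price), count(1), avg(price) FROM orders").print();
   ```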



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on pull request #21149: [FLINK-29527][formats/parquet] Make unknownFieldsIndices work for single ParquetReader

2023-02-06 Thread via GitHub


luoyuxia commented on PR #21149:
URL: https://github.com/apache/flink/pull/21149#issuecomment-1420190347

   @saLeox Thanks for your investigation.
   Doesn't that mean it would be more reasonable for us to also introduce an 
option for that?
   @Tartarus0zm I'm a little curious about why your company has strict usage 
rules. Is there any downside to having multiple schema files in the same 
partition directory?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30843) Add Notifier For Autoscaler

2023-02-06 Thread Gaurav Miglani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685055#comment-17685055
 ] 

Gaurav Miglani commented on FLINK-30843:


Ok, will do it and raise a PR.
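
(A minimal sketch of what such an abstraction might look like; all names here 
are hypothetical, not an existing operator API.)
{code:java}
import java.util.Map;

/** Hypothetical notifier abstraction; Slack would be the first implementation. */
public interface ScalingEventNotifier {
    /** Called after the autoscaler applies new parallelism overrides. */
    void onScalingApplied(String jobName, Map<String, Integer> vertexParallelism);
}

/** Sketch of a Slack-backed implementation. */
class SlackNotifier implements ScalingEventNotifier {
    private final String webhookUrl;

    SlackNotifier(String webhookUrl) {
        this.webhookUrl = webhookUrl;
    }

    @Override
    public void onScalingApplied(String jobName, Map<String, Integer> vertexParallelism) {
        // Post a message to the configured Slack webhook (HTTP client omitted).
    }
}
{code}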

> Add Notifier For Autoscaler
> ---
>
> Key: FLINK-30843
> URL: https://issues.apache.org/jira/browse/FLINK-30843
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
>Reporter: Gaurav Miglani
>Priority: Major
>  Labels: feature
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> We can add a Notifier abstraction, with Slack as the first implementation, 
> for notifications when the autoscaler performs a scaling activity. Let me 
> know if this is feasible from your side, and I will start working on it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30648) [Umbrella]FLIP-282 Introduce Delete/Update API

2023-02-06 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-30648.
---
Resolution: Fixed

> [Umbrella]FLIP-282 Introduce Delete/Update API
> --
>
> Key: FLINK-30648
> URL: https://issues.apache.org/jira/browse/FLINK-30648
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
>
> For more details, please refer to 
> [FLIP-282|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30666) Add document for Delete/Update API

2023-02-06 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-30666.
---
Resolution: Fixed

fixed in master: f3608f4277e47b76d59a5ec99c21947aea8cde5b

> Add document for Delete/Update API
> --
>
> Key: FLINK-30666
> URL: https://issues.apache.org/jira/browse/FLINK-30666
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30920) K8 cluster autoscaler | exclude operator ids from scaler

2023-02-06 Thread Gaurav Miglani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685052#comment-17685052
 ] 

Gaurav Miglani commented on FLINK-30920:


Yes, you are right, but in our case the load came suddenly, which is why we 
need the sink to have a defined parallelism. I have found the property 
`kubernetes.operator.job.autoscaler.vertex.min-parallelism`, which can be set 
so that the autoscaler will not downscale below this threshold. But since we 
are scaling vertices and already have the list of vertices, why not let the 
user control which vertices the autoscaler should not act on? Your thoughts?

> K8 cluster autoscaler | exclude operator ids from scaler
> 
>
> Key: FLINK-30920
> URL: https://issues.apache.org/jira/browse/FLINK-30920
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gaurav Miglani
>Priority: Major
> Attachments: sample-logs
>
>
> Sometimes, for sink operator ids where the logic is heavy group aggregation 
> and the scan mode is earliest, the Flink k8s operator tries to 
> scale/downscale the sink operators as well. There should be a way for the 
> user to provide a configurable list of operator ids/vertices on which the 
> autoscaler does not perform any scaling action.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lincoln-lil merged pull request #21863: [FLINK-30666][doc] Add document for Delete/Update API

2023-02-06 Thread via GitHub


lincoln-lil merged PR #21863:
URL: https://github.com/apache/flink/pull/21863


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-benchmarks] fredia commented on pull request #63: [FLINK-27571][benchmark] Recognize 'less is better' benchmarks in regression detection script

2023-02-06 Thread via GitHub


fredia commented on PR #63:
URL: https://github.com/apache/flink-benchmarks/pull/63#issuecomment-1420181992

   Right, the reports that have been generated before correcting the 
`units_title` are still treated as `less is better`.
   Thanks for the information; regenerating past reports via the codespeed 
admin is a more convenient approach.
   In the end I changed `regression_report.py` and `save_jmh_result.py`; could 
you please take a look when you have time?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30908) Fatal error in ResourceManager caused YARNSessionFIFOSecuredITCase.testDetachedMode to fail

2023-02-06 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685045#comment-17685045
 ] 

Yang Wang commented on FLINK-30908:
---

+1 for Xintong's analysis and proposal.

YARN-5999 introduced a side effect: {{CallbackHandler#onError}} may be 
executed when stopping the AMRMClientAsync.

> Fatal error in ResourceManager caused 
> YARNSessionFIFOSecuredITCase.testDetachedMode to fail
> ---
>
> Key: FLINK-30908
> URL: https://issues.apache.org/jira/browse/FLINK-30908
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Xintong Song
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: mvn-1.FLINK-30908.log
>
>
> There's a build failure in {{YARNSessionFIFOSecuredITCase.testDetachedMode}} 
> which is caused by a fatal error in the ResourceManager:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45720=logs=245e1f2e-ba5b-5570-d689-25ae21e5302f=d04c9862-880c-52f5-574b-a7a79fef8e0f=29869
> {code}
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1422) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy31.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>  ~[hadoop-yarn-common-3.2.3.jar:?]
> Feb 05 02:41:58   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_292]
> Feb 05 02:41:58   at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy32.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:325)
>  ~[hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:311)
>  [hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 Caused by: java.lang.InterruptedException
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1180) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1475) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   ... 17 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] luoyuxia commented on a diff in pull request #21663: [FLINK-30559][Planner] Fix unexpected multiple type inference issue

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21663:
URL: https://github.com/apache/flink/pull/21663#discussion_r1098148881


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List<RelDataType> types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   It's just my personal opinion, but the reason behind that is that the class 
`NumericOrDefaultReturnTypeInference` is designed to:
   - if the operands are numeric, consider each numeric operand and find the 
leastRestrictive type
   - otherwise, return the second (default) operand's type as the return type.
   
   The above is what I infer from the class name and javadoc.
   
   If we skip all checks, the usage of the class 
`NumericOrDefaultReturnTypeInference` changes.
   Then we'll need to modify the class name and javadoc.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21663: [FLINK-30559][Planner] Fix unexpected multiple type inference issue

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21663:
URL: https://github.com/apache/flink/pull/21663#discussion_r1098148881


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List<RelDataType> types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   It's just my personal opinion, but the reason behind that is that the class 
`NumericOrDefaultReturnTypeInference` is designed to:
   - if the operands are numeric, consider each numeric operand and find the 
leastRestrictive type
   - otherwise, return the second (default) operand's type as the return type.
   That's what I infer from the class name and javadoc.
   
   If we skip all checks, the usage of the class 
`NumericOrDefaultReturnTypeInference` changes.
   Then we'll need to modify the class name and javadoc.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21663: [FLINK-30559][Planner] Fix unexpected multiple type inference issue

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21663:
URL: https://github.com/apache/flink/pull/21663#discussion_r1098146682


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   Do we really need to skip the check for all types?
   
   In the failing case, if(b, 'xx', s), where the second operand is `char(2)` 
and the third operand is `string`, it will mistakenly infer the return type 
as `char(2)`, and then the IF code generation will cast all operands to 
`char(2)`, which causes the failure.
   Maybe we could check 
   `if (SqlTypeUtil.isNumeric(type) || SqlTypeUtil.isCharacter(type))` here?
   
   WDYT?
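
   For clarity, the loop with the suggested check would look roughly like this 
(a sketch of the suggestion above, not code from the PR):
   ```java
   List<RelDataType> types = new ArrayList<>();
   for (int i = startTypeIdx; i < nOperands; i++) {
       RelDataType type = opBinding.getOperandType(i);
       // Let character types participate in finding the least restrictive type,
       // instead of dropping the check entirely.
       if (SqlTypeUtil.isNumeric(type) || SqlTypeUtil.isCharacter(type)) {
           types.add(type);
       } else {
           return opBinding.getOperandType(defaultTypeIdx);
       }
   }
   ```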



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21663: [FLINK-30559][Planner] Fix unexpected multiple type inference issue

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21663:
URL: https://github.com/apache/flink/pull/21663#discussion_r1098146682


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   Do we really need to skip the check for all types?
   
   In the failing case, if(b, 'xx', s), where the second operand is `char(2)` 
and the third operand is `string`, it will mistakenly infer the return type 
as `char(2)`, and then the IF code generation will cast all operands to 
`char(2)`.
   Maybe we could check 
   `if (SqlTypeUtil.isNumeric(type) || SqlTypeUtil.isCharacter(type))` here?
   
   WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21663: [FLINK-30559][Planner] Fix unexpected multiple type inference issue

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21663:
URL: https://github.com/apache/flink/pull/21663#discussion_r1098146682


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List<RelDataType> types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   Do we really need to skip the check for all types?
   
   In the failing case, if(b, 'xx', s), where the second operand is `char(2)` 
and the third operand is `string`, it will mistakenly infer the return type 
as `char(2)`, and then the IF code generation will cast all operands to 
`char(2)`.
   Maybe we could check 
   `if (SqlTypeUtil.isNumeric(type) || SqlTypeUtil.isCharacter(type))` here?
   
   WDYT?



##
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/runtime/batch/sql/CalcITCase.scala:
##
@@ -1837,6 +1837,11 @@ class CalcITCase extends BatchTestBase {
   Seq(row('a'), row('b'), row('a')))
   }
 
+  @Test
+  def testIfWithFixCharAndVarchar(): Unit = {
+checkResult("SELECT IF(b > 10, 'ua', c) FROM Table3", data3.map(r => 
row(r.getField(2

Review Comment:
   Change to 
   ```suggestion
   checkResult("SELECT IF(b > 4, 'ua', c) FROM Table3", data3.map(r => 
row(r.getField(2
   ```
   to cover the negative case.



##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/type/NumericOrDefaultReturnTypeInference.java:
##
@@ -56,11 +55,7 @@ public RelDataType inferReturnType(SqlOperatorBinding 
opBinding) {
 List<RelDataType> types = new ArrayList<>();
 for (int i = startTypeIdx; i < nOperands; i++) {
 RelDataType type = opBinding.getOperandType(i);
-if (SqlTypeUtil.isNumeric(type)) {
-types.add(type);
-} else {
-return opBinding.getOperandType(defaultTypeIdx);
-}
+types.add(type);

Review Comment:
   It's just my personal opinion, but the reason behind that is that the class 
`NumericOrDefaultReturnTypeInference` is designed to:
   - if the operands are numeric, consider each numeric operand and find the 
leastRestrictive type
   - otherwise, return the second (default) operand's type as the return type.
   
   If we skip all checks, the usage of the class 
`NumericOrDefaultReturnTypeInference` changes.
   Then we'll need to modify the class name and javadoc.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30908) Fatal error in ResourceManager caused YARNSessionFIFOSecuredITCase.testDetachedMode to fail

2023-02-06 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-30908:
-
Priority: Critical  (was: Blocker)

> Fatal error in ResourceManager caused 
> YARNSessionFIFOSecuredITCase.testDetachedMode to fail
> ---
>
> Key: FLINK-30908
> URL: https://issues.apache.org/jira/browse/FLINK-30908
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Xintong Song
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: mvn-1.FLINK-30908.log
>
>
> There's a build failure in {{YARNSessionFIFOSecuredITCase.testDetachedMode}} 
> which is caused by a fatal error in the ResourceManager:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45720=logs=245e1f2e-ba5b-5570-d689-25ae21e5302f=d04c9862-880c-52f5-574b-a7a79fef8e0f=29869
> {code}
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1422) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy31.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>  ~[hadoop-yarn-common-3.2.3.jar:?]
> Feb 05 02:41:58   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_292]
> Feb 05 02:41:58   at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy32.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:325)
>  ~[hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:311)
>  [hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 Caused by: java.lang.InterruptedException
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1180) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1475) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   ... 17 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30908) Fatal error in ResourceManager caused YARNSessionFIFOSecuredITCase.testDetachedMode to fail

2023-02-06 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685040#comment-17685040
 ] 

Xintong Song commented on FLINK-30908:
--

After looking more into the logs and the Hadoop code, we believe FLINK-20988 
is not the cause of this failure.

The test failure is caused by:
1. {{AMRMClientAsync}} sends an {{InterruptedIOException}} to the callback 
handler ({{YarnContainerEventHandler}}) after being stopped.
2. All errors sent to {{YarnContainerEventHandler}} are treated as fatal 
errors in Flink.

This is not a newly introduced issue. 1) exists in Hadoop 2.9+ versions 
(https://issues.apache.org/jira/browse/YARN-5999), and 2) has been the 
behavior since yarn deployment was first supported. FLINK-20988 did introduce 
another chance for exceptions during shutdown to be handled as fatal errors, 
but that is not the cause of this test failure. Given that this issue already 
exists in previous releases, I'm downgrading this ticket to Critical priority.

The proper fix might be to ignore the exceptions in 
{{YarnContainerEventHandler}} after it has been terminated. I'll update the PR 
and fix this.
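
(A sketch of the described fix; the flag and method names below are 
assumptions, not the actual patch.)
{code:java}
// Inside the YARN callback handler: once the client has been stopped, errors
// from the heartbeat thread are expected and must not be escalated as fatal.
private volatile boolean terminated = false;

void terminate() {
    terminated = true;
}

@Override
public void onError(Throwable error) {
    if (terminated) {
        log.debug("Ignoring error reported after the handler was terminated.", error);
        return;
    }
    resourceManager.onFatalError(error);
}
{code}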

> Fatal error in ResourceManager caused 
> YARNSessionFIFOSecuredITCase.testDetachedMode to fail
> ---
>
> Key: FLINK-30908
> URL: https://issues.apache.org/jira/browse/FLINK-30908
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Xintong Song
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: mvn-1.FLINK-30908.log
>
>
> There's a build failure in {{YARNSessionFIFOSecuredITCase.testDetachedMode}} 
> which is caused by a fatal error in the ResourceManager:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45720=logs=245e1f2e-ba5b-5570-d689-25ae21e5302f=d04c9862-880c-52f5-574b-a7a79fef8e0f=29869
> {code}
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at org.apache.hadoop.ipc.Client.call(Client.java:1422) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy31.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>  ~[hadoop-yarn-common-3.2.3.jar:?]
> Feb 05 02:41:58   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_292]
> Feb 05 02:41:58   at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58   at com.sun.proxy.$Proxy32.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:325)
>  ~[hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:311)
>  [hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 Caused by: java.lang.InterruptedException
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
> Feb 05 02:41:58   at 
> java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
> Feb 05 02:41:58  

[jira] [Comment Edited] (FLINK-30929) Add helpful message when ElasticsearchSink.Builder.build() throws an IllegalArgumentException.

2023-02-06 Thread Kenny Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685029#comment-17685029
 ] 

Kenny Wu edited comment on FLINK-30929 at 2/7/23 3:36 AM:
--

[~martijnvisser] I have checked the latest version. The case, as well as other 
property check statements, has been improved in the new implementation 
already, thanks. I will close this ticket.
{code:java}
public B setHosts(HttpHost... hosts) {
Preconditions.checkNotNull(hosts);
Preconditions.checkState(hosts.length > 0, "Hosts cannot be empty.");
this.hosts = Arrays.asList(hosts);
return this.self();
}{code}
 


was (Author: JIRAUSER29):
[~martijnvisser] I have checked the latest version. The case has been improved 
already, as well as other property check statements.
{code:java}
public B setHosts(HttpHost... hosts) {
Preconditions.checkNotNull(hosts);
Preconditions.checkState(hosts.length > 0, "Hosts cannot be empty.");
this.hosts = Arrays.asList(hosts);
return this.self();
}{code}
 

> Add helpful message when ElasticsearchSink.Builder.build() throws an 
> IllegalArgumentException.
> --
>
> Key: FLINK-30929
> URL: https://issues.apache.org/jira/browse/FLINK-30929
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.13.6
>Reporter: Kenny Wu
>Priority: Major
> Fix For: 1.15.0
>
> Attachments: image-2023-02-06-21-28-37-835.png
>
>
> When I tested flink-connector-elasticsearch in my project, *I got an 
> IllegalArgumentException with no helpful message.* 
> *Here's the exception message:* 
>  
> {code:java}
> Exception in thread "main" java.lang.IllegalArgumentException
>     at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:122)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge.<init>(Elasticsearch7ApiCallBridge.java:61)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink.<init>(ElasticsearchSink.java:74)
>     ... {code}
>  
> *I could not see exactly which argument was illegal.*
> After I read the code at (Elasticsearch7ApiCallBridge.java:61), I realized 
> that probably the ES hosts list was empty, and finally fixed it.
> !image-2023-02-06-21-28-37-835.png!
> I think a helpful message should be printed when such an important argument 
> is illegal and the connector fails to build.
> And I'd love to improve it. Thanks.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30734) KBinsDiscretizer handles Double.NaN incorrectly

2023-02-06 Thread Fan Hong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685034#comment-17685034
 ] 

Fan Hong commented on FLINK-30734:
--

Sklearn has a discussion about this feature: [1] 

SparkML already supports this feature in a similar algorithm named 
QuantileDiscretizer: [2]

 

[1][https://github.com/scikit-learn/scikit-learn/issues/9341]

[2]https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.feature.QuantileDiscretizer.html

> KBinsDiscretizer handles Double.NaN incorrectly
> ---
>
> Key: FLINK-30734
> URL: https://issues.apache.org/jira/browse/FLINK-30734
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Machine Learning
>Affects Versions: ml-2.1.0
>Reporter: Fan Hong
>Priority: Major
>
> When the training data contains Double.NaN values and the strategy is set to 
> "quantile", the generated model data has Double.NaN as the right edge of the 
> largest bin.
> My expected behavior is to ignore Double.NaN values during training, and to 
> support a skip/error/keep strategy when transforming with the generated 
> KBinsDiscretizerModel.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30929) Add helpful message when ElasticsearchSink.Builder.build() throws an IllegalArgumentException.

2023-02-06 Thread Kenny Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenny Wu closed FLINK-30929.

Fix Version/s: 1.15.0
   Resolution: Fixed

> Add helpful message when ElasticsearchSink.Builder.build() throws an 
> IllegalArgumentException.
> --
>
> Key: FLINK-30929
> URL: https://issues.apache.org/jira/browse/FLINK-30929
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.13.6
>Reporter: Kenny Wu
>Priority: Major
> Fix For: 1.15.0
>
> Attachments: image-2023-02-06-21-28-37-835.png
>
>
> When I tested flink-connector-elasticsearch in my project, *I got an 
> IllegalArgumentException with no helpful message.* 
> *Here's the exception message:* 
>  
> {code:java}
> Exception in thread "main" java.lang.IllegalArgumentException
>     at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:122)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge.<init>(Elasticsearch7ApiCallBridge.java:61)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink.<init>(ElasticsearchSink.java:74)
>     ... {code}
>  
> *I could not see exactly which argument was illegal.*
> After I read the code at (Elasticsearch7ApiCallBridge.java:61), I realized 
> that probably the ES hosts list was empty, and finally fixed it.
> !image-2023-02-06-21-28-37-835.png!
> I think a helpful message should be printed when such an important argument 
> is illegal and the connector fails to build.
> And I'd love to improve it. Thanks.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-30916) RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp failed with assertion

2023-02-06 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei resolved FLINK-30916.
--
Resolution: Duplicate

> RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp failed with assertion
> 
>
> Key: FLINK-30916
> URL: https://issues.apache.org/jira/browse/FLINK-30916
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Rui Fan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> {{RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp}} failed due to an 
> assertion:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45730=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=50bf7a25-bdc4-5e56-5478-c7b4511dde53=10328
> {code}
> Feb 06 02:54:01 [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 0.427 s <<< FAILURE! - in 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest
> Feb 06 02:54:01 [ERROR] 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp
>   Time elapsed: 0.115 s  <<< FAILURE!
> Feb 06 02:54:01 java.lang.AssertionError: 
> Feb 06 02:54:01 
> Feb 06 02:54:01 Expecting empty but was: 
> ["379065d4-dd29-4455-b30f-4dbc53336ea2"]
> Feb 06 02:54:01   at 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp(RocksDBStateUploaderTest.java:166)
> Feb 06 02:54:01   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30916) RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp failed with assertion

2023-02-06 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685033#comment-17685033
 ] 

Yuan Mei commented on FLINK-30916:
--

Resolved in FLINK-30461

> RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp failed with assertion
> 
>
> Key: FLINK-30916
> URL: https://issues.apache.org/jira/browse/FLINK-30916
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Rui Fan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> {{RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp}} failed due to an 
> assertion:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45730=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=50bf7a25-bdc4-5e56-5478-c7b4511dde53=10328
> {code}
> Feb 06 02:54:01 [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 0.427 s <<< FAILURE! - in 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest
> Feb 06 02:54:01 [ERROR] 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp
>   Time elapsed: 0.115 s  <<< FAILURE!
> Feb 06 02:54:01 java.lang.AssertionError: 
> Feb 06 02:54:01 
> Feb 06 02:54:01 Expecting empty but was: 
> ["379065d4-dd29-4455-b30f-4dbc53336ea2"]
> Feb 06 02:54:01   at 
> org.apache.flink.contrib.streaming.state.RocksDBStateUploaderTest.testUploadedSstCanBeCleanedUp(RocksDBStateUploaderTest.java:166)
> Feb 06 02:54:01   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-30461) Some rocksdb sst files will remain forever

2023-02-06 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei resolved FLINK-30461.
--
Release Note: Clean up left-over RocksDB sst files in the shared scope; the 
leak was introduced in FLINK-11008 (Flink version 1.8).
  Resolution: Fixed

> Some rocksdb sst files will remain forever
> --
>
> Key: FLINK-30461
> URL: https://issues.apache.org/jira/browse/FLINK-30461
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.16.0, 1.17.0, 1.15.3
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.4, 1.16.2
>
> Attachments: image-2022-12-20-18-45-32-948.png, 
> image-2022-12-20-18-47-42-385.png, screenshot-1.png
>
>
> In RocksDB incremental checkpoint mode, if during file upload some files 
> have been uploaded while others have not, and the checkpoint is canceled at 
> that moment due to checkpoint timeout, the uploaded files will remain.
>  
> h2. Impact: 
> The shared directory of a flink job has more than 1 million files. It 
> exceeded the hdfs upper limit, causing new files not to be written.
> However, only 50k files are actually in use; the other 950k files should be 
> cleaned up.
> !https://user-images.githubusercontent.com/38427477/207588272-dda7ba69-c84c-4372-aeb4-c54657b9b956.png|width=1962,height=364!
> h2. Root cause:
> If an exception is thrown during the checkpoint async phase, flink will clean 
> up metaStateHandle, miscFiles and sstFiles.
> However, sst files are only added to sstFiles together after all of them 
> have been uploaded. If some sst files have been uploaded while others are 
> still being uploaded, and the checkpoint is canceled due to checkpoint 
> timeout at this time, none of the sst files are added to sstFiles, and the 
> uploaded ssts will remain on hdfs.
> [code 
> link|https://github.com/apache/flink/blob/49146cdec41467445de5fc81f100585142728bdf/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L328]
> h2. Solution:
> Use the CloseableRegistry as the tmpResourcesRegistry. If the async phase 
> fails, the tmpResourcesRegistry will clean up these temporary resources.
>  
> POC code:
> [https://github.com/1996fanrui/flink/commit/86a456b2bbdad6c032bf8e0bff71c4824abb3ce1]
>  
>  
> !image-2022-12-20-18-45-32-948.png|width=1114,height=442!
> !image-2022-12-20-18-47-42-385.png|width=1332,height=552!
>  
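
(A sketch of the solution described above, with assumed helper names; each 
uploaded handle is registered with a registry that discards it if the async 
phase fails. See the POC commit for the real change.)
{code:java}
CloseableRegistry tmpResourcesRegistry = new CloseableRegistry();
try {
    for (Path file : sstFilesToUpload) { // assumed input collection
        StreamStateHandle handle = uploadFile(file); // assumed upload helper
        // If the snapshot fails before ownership is handed over, closing the
        // registry discards the already-uploaded file instead of leaking it.
        tmpResourcesRegistry.registerCloseable(
                () -> {
                    try {
                        handle.discardState();
                    } catch (Exception ex) {
                        // best-effort cleanup
                    }
                });
    }
    // Success: ownership passes to the checkpoint; do not close the registry.
} catch (Exception e) {
    IOUtils.closeQuietly(tmpResourcesRegistry); // discards all uploaded files
    throw e;
}
{code}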



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30929) Add helpful message when ElasticsearchSink.Builder.build() throws an IllegalArgumentException.

2023-02-06 Thread Kenny Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685029#comment-17685029
 ] 

Kenny Wu commented on FLINK-30929:
--

[~martijnvisser] I have checked the latest version. The case has been improved 
already, as well as other property check statements.
{code:java}
public B setHosts(HttpHost... hosts) {
Preconditions.checkNotNull(hosts);
Preconditions.checkState(hosts.length > 0, "Hosts cannot be empty.");
this.hosts = Arrays.asList(hosts);
return this.self();
}{code}
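
(An illustration of the kind of message this ticket asked for; a hedged 
sketch, not the connector's actual code.)
{code:java}
// Fail fast with a descriptive message instead of a bare IllegalArgumentException.
Preconditions.checkArgument(
        httpHosts != null && !httpHosts.isEmpty(),
        "Elasticsearch hosts must not be null or empty.");
{code}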
 

> Add helpful message when ElasticsearchSink.Builder.build() throws an 
> IllegalArgumentException.
> --
>
> Key: FLINK-30929
> URL: https://issues.apache.org/jira/browse/FLINK-30929
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.13.6
>Reporter: Kenny Wu
>Priority: Major
> Attachments: image-2023-02-06-21-28-37-835.png
>
>
> When I tested flink-connector-elasticsearch in my project, *I got an 
> IllegalArgumentException with no helpful message.* 
> *Here's the exception message:* 
>  
> {code:java}
> Exception in thread "main" java.lang.IllegalArgumentException
>     at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:122)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge.<init>(Elasticsearch7ApiCallBridge.java:61)
>     at 
> org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink.<init>(ElasticsearchSink.java:74)
>     ... {code}
>  
> *I could not see exactly which argument was illegal.*
> After I read the code at (Elasticsearch7ApiCallBridge.java:61), I realized 
> that probably the ES hosts list was empty, and finally fixed it.
> !image-2023-02-06-21-28-37-835.png!
> I think a helpful message should be printed when such an important argument 
> is illegal and the connector fails to build.
> And I'd love to improve it. Thanks.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-30461) Some rocksdb sst files will remain forever

2023-02-06 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685023#comment-17685023
 ] 

Yuan Mei edited comment on FLINK-30461 at 2/7/23 3:19 AM:
--

Thanks [~fanrui] for the fix and [~yanfei] and [~Zakelly]  for discussion and 
review!

 

merged commit 
[{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657]
 into apache:master (for cleaning-up left-over rocksdb shared ssts)

merged commit 
[{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5]
 into apache:master (for the concurrent thread race-condition in unit tests)

 

merged commit 
[{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1]
 into apache:release-1.16 (back port the above two to 1.16 branch)

merged commit 
[{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646]
 into apache:release-1.15 (back port the above two to 1.15 branch)

 


was (Author: ym):
Thanks [~fanrui] for the fix and [~yanfei] for discussion and review!

 

merged commit 
[{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657]
 into apache:master (for cleaning-up left-over rocksdb shared ssts)

merged commit 
[{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5]
 into apache:master (for the concurrent thread race-condition in unit tests)

 

merged commit 
[{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1]
 into apache:release-1.16 (back port the above two to 1.16 branch)

merged commit 
[{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646]
 into apache:release-1.15 (back port the above two to 1.15 branch)

 

> Some rocksdb sst files will remain forever
> --
>
> Key: FLINK-30461
> URL: https://issues.apache.org/jira/browse/FLINK-30461
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.16.0, 1.17.0, 1.15.3
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.4, 1.16.2
>
> Attachments: image-2022-12-20-18-45-32-948.png, 
> image-2022-12-20-18-47-42-385.png, screenshot-1.png
>
>
> In RocksDB incremental checkpoint mode, if during file upload some files 
> have been uploaded while others have not, and the checkpoint is canceled at 
> that moment due to checkpoint timeout, the uploaded files will remain.
>  
> h2. Impact: 
> The shared directory of a flink job has more than 1 million files. It 
> exceeded the hdfs upper limit, causing new files not to be written.
> However, only 50k files are actually in use; the other 950k files should be 
> cleaned up.
> !https://user-images.githubusercontent.com/38427477/207588272-dda7ba69-c84c-4372-aeb4-c54657b9b956.png|width=1962,height=364!
> h2. Root cause:
> If an exception is thrown during the checkpoint async phase, flink will clean 
> up metaStateHandle, miscFiles and sstFiles.
> However, sst files are only added to sstFiles together after all of them 
> have been uploaded. If some sst files have been uploaded while others are 
> still being uploaded, and the checkpoint is canceled due to checkpoint 
> timeout at this time, none of the sst files are added to sstFiles, and the 
> uploaded ssts will remain on hdfs.
> [code 
> link|https://github.com/apache/flink/blob/49146cdec41467445de5fc81f100585142728bdf/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L328]
> h2. Solution:
> Use the CloseableRegistry as the tmpResourcesRegistry. If the async phase 
> fails, the tmpResourcesRegistry will clean up these temporary resources.
>  
> POC code:
> [https://github.com/1996fanrui/flink/commit/86a456b2bbdad6c032bf8e0bff71c4824abb3ce1]
>  
>  
> !image-2022-12-20-18-45-32-948.png|width=1114,height=442!
> !image-2022-12-20-18-47-42-385.png|width=1332,height=552!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-30461) Some rocksdb sst files will remain forever

2023-02-06 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685023#comment-17685023
 ] 

Yuan Mei edited comment on FLINK-30461 at 2/7/23 3:18 AM:
--

Thanks [~fanrui] for the fix and [~yanfei] for discussion and review!

 

merged commit 
[{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657]
 into apache:master (for cleaning-up left-over rocksdb shared ssts)

merged commit 
[{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5]
 into apache:master (for the concurrent thread race-condition in unit tests)

 

merged commit 
[{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1]
 into apache:release-1.16 (back port the above two to 1.16 branch)

merged commit 
[{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646]
 into apache:release-1.15 (back port the above two to 1.15 branch)

 


was (Author: ym):
Thanks [~fanrui] for the fix and [~yanfei] for discussion and review!

 

merged commit 
[{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657]
 into apache:master (for the left-over rocksdb shared ssts)

merged commit 
[{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5]
 into apache:master (for the concurrent thread race-condition in the unit test)

 

merged commit 
[{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1]
 into apache:release-1.16 (back port the above two to 1.16 branch)

merged commit 
[{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646]
 into apache:release-1.15 (back port the above two to 1.15 branch)

 

> Some rocksdb sst files will remain forever
> --
>
> Key: FLINK-30461
> URL: https://issues.apache.org/jira/browse/FLINK-30461
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.16.0, 1.17.0, 1.15.3
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.4, 1.16.2
>
> Attachments: image-2022-12-20-18-45-32-948.png, 
> image-2022-12-20-18-47-42-385.png, screenshot-1.png
>
>
> In rocksdb incremental checkpoint mode, during file upload, if some files 
> have been uploaded and some files have not been uploaded, the checkpoint is 
> canceled due to checkpoint timeout at this time, and the uploaded files will 
> remain.
>  
> h2. Impact: 
> The shared directory of a flink job has more than 1 million files. It 
> exceeded the hdfs upper limit, causing new files not to be written.
> However only 50k files are available, the other 950k files should be cleaned 
> up.
> !https://user-images.githubusercontent.com/38427477/207588272-dda7ba69-c84c-4372-aeb4-c54657b9b956.png|width=1962,height=364!
> h2. Root cause:
> If an exception is thrown during the checkpoint async phase, flink will clean 
> up metaStateHandle, miscFiles and sstFiles.
> However, sst files are only added to sstFiles together, after all of them have
> been uploaded. If some sst files have been uploaded and others are still being
> uploaded when the checkpoint is canceled due to a checkpoint timeout, none of
> the sst files will be added to sstFiles, and the already-uploaded ssts will
> remain on hdfs.
> [code 
> link|https://github.com/apache/flink/blob/49146cdec41467445de5fc81f100585142728bdf/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L328]
> h2. Solution:
> Use the CloseableRegistry as the tmpResourcesRegistry. If the async phase
> fails, the tmpResourcesRegistry will clean up these temporary resources.
>  
> POC code:
> [https://github.com/1996fanrui/flink/commit/86a456b2bbdad6c032bf8e0bff71c4824abb3ce1]
>  
>  
> !image-2022-12-20-18-45-32-948.png|width=1114,height=442!
> !image-2022-12-20-18-47-42-385.png|width=1332,height=552!
>  
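To make the fix concrete, here is a minimal sketch of the tmp-resources pattern described in the solution above, assuming a hypothetical uploadFile() stand-in for the real RocksDB uploader; it is not the actual RocksIncrementalSnapshotStrategy code:

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.core.fs.CloseableRegistry;
import org.apache.flink.runtime.state.StreamStateHandle;
import org.apache.flink.util.IOUtils;

class TmpResourcesSketch {

    List<StreamStateHandle> uploadAll(List<byte[]> files) throws IOException {
        CloseableRegistry tmpResourcesRegistry = new CloseableRegistry();
        List<StreamStateHandle> handles = new ArrayList<>();
        try {
            for (byte[] file : files) {
                StreamStateHandle handle = uploadFile(file); // hypothetical uploader
                // Register the discard action *before* the handle joins the
                // checkpoint result, so an uploaded file is never orphaned.
                tmpResourcesRegistry.registerCloseable(
                        () -> {
                            try {
                                handle.discardState();
                            } catch (Exception e) {
                                // best-effort cleanup
                            }
                        });
                handles.add(handle);
            }
            // On success the real code would stop tracking these handles,
            // since they now belong to the completed checkpoint.
            return handles;
        } catch (Exception e) {
            // Async phase failed or was canceled: closing the registry
            // discards every temporary resource registered so far.
            IOUtils.closeQuietly(tmpResourcesRegistry);
            throw new IOException("Checkpoint async phase failed", e);
        }
    }

    private StreamStateHandle uploadFile(byte[] file) {
        throw new UnsupportedOperationException("hypothetical");
    }
}
{code}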



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30461) Some rocksdb sst files will remain forever

2023-02-06 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685023#comment-17685023
 ] 

Yuan Mei commented on FLINK-30461:
--

Thanks [~fanrui] for the fix and [~yanfei] for discussion and review!

 

merged commit 
[{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657]
 into apache:master (for the left-over rocksdb shared ssts)

merged commit 
[{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5]
 into apache:master (for the concurrent thread race-condition in the unit test)

 

merged commit 
[{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1]
 into apache:release-1.16 (back port the above two to 1.16 branch)

merged commit 
[{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646]
 into apache:release-1.15 (back port the above two to 1.15 branch)

 

> Some rocksdb sst files will remain forever
> --
>
> Key: FLINK-30461
> URL: https://issues.apache.org/jira/browse/FLINK-30461
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.16.0, 1.17.0, 1.15.3
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.4, 1.16.2
>
> Attachments: image-2022-12-20-18-45-32-948.png, 
> image-2022-12-20-18-47-42-385.png, screenshot-1.png
>
>
> In rocksdb incremental checkpoint mode, during file upload, if some files 
> have been uploaded and some files have not been uploaded, the checkpoint is 
> canceled due to checkpoint timeout at this time, and the uploaded files will 
> remain.
>  
> h2. Impact: 
> The shared directory of a flink job has more than 1 million files. It 
> exceeded the hdfs upper limit, causing new files not to be written.
> However, only 50k files are available; the other 950k files should be cleaned
> up.
> !https://user-images.githubusercontent.com/38427477/207588272-dda7ba69-c84c-4372-aeb4-c54657b9b956.png|width=1962,height=364!
> h2. Root cause:
> If an exception is thrown during the checkpoint async phase, flink will clean 
> up metaStateHandle, miscFiles and sstFiles.
> However, sst files are only added to sstFiles together, after all of them have
> been uploaded. If some sst files have been uploaded and others are still being
> uploaded when the checkpoint is canceled due to a checkpoint timeout, none of
> the sst files will be added to sstFiles, and the already-uploaded ssts will
> remain on hdfs.
> [code 
> link|https://github.com/apache/flink/blob/49146cdec41467445de5fc81f100585142728bdf/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L328]
> h2. Solution:
> Use the CloseableRegistry as the tmpResourcesRegistry. If the async phase
> fails, the tmpResourcesRegistry will clean up these temporary resources.
>  
> POC code:
> [https://github.com/1996fanrui/flink/commit/86a456b2bbdad6c032bf8e0bff71c4824abb3ce1]
>  
>  
> !image-2022-12-20-18-45-32-948.png|width=1114,height=442!
> !image-2022-12-20-18-47-42-385.png|width=1332,height=552!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] curcur merged pull request #21855: [bp-1.15][FLINK-30461][checkpoint] Fix the bug that the shared files will remain forever after the rocksdb incremental checkpoint exception

2023-02-06 Thread via GitHub


curcur merged PR #21855:
URL: https://github.com/apache/flink/pull/21855


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] curcur merged pull request #21853: [bp-1.16][FLINK-30461][checkpoint] Fix the bug that the shared files will remain forever after the rocksdb incremental checkpoint exception

2023-02-06 Thread via GitHub


curcur merged PR #21853:
URL: https://github.com/apache/flink/pull/21853


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] curcur commented on pull request #21853: [bp-1.16][FLINK-30461][checkpoint] Fix the bug that the shared files will remain forever after the rocksdb incremental checkpoint exception

2023-02-06 Thread via GitHub


curcur commented on PR #21853:
URL: https://github.com/apache/flink/pull/21853#issuecomment-1420134053

   Thanks for the fix! @1996fanrui 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] curcur merged pull request #21859: [FLINK-30916][checkpoint] Set the number of rocksdb uploader threads to 1 to solve the unstable bug of unit test

2023-02-06 Thread via GitHub


curcur merged PR #21859:
URL: https://github.com/apache/flink/pull/21859


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30730) StringIndexer cannot handle null values correctly

2023-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30730:
---
Labels: pull-request-available  (was: )

> StringIndexer cannot handle null values correctly
> -
>
> Key: FLINK-30730
> URL: https://issues.apache.org/jira/browse/FLINK-30730
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Machine Learning
>Affects Versions: ml-2.1.0
>Reporter: Fan Hong
>Priority: Major
>  Labels: pull-request-available
>
> When training data contains null values, StringIndexer throws a exception. 
> The reason is this method [1]: null values are neither String type nor Number 
> type.
> In StringIndexerModel, null values are also not handled correctly when 
> performing transformation.
>  
> [1] 
> [https://github.com/apache/flink-ml/blob/966cedd7bbab4e12d8d8b37dbd582146714e68a6/flink-ml-lib/src/main/java/org/apache/flink/ml/feature/stringindexer/StringIndexer.java#L164]
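A minimal sketch of null-tolerant key extraction, using a hypothetical toStringKey helper; this is not the actual StringIndexer code, and how null is treated (skipped in training, mapped to a reserved index in transform) is the fix's design choice:

{code:java}
// Sketch only: extract a string key from an input column value without
// assuming the value is non-null.
private static String toStringKey(Object value) {
    if (value == null) {
        // Training can skip nulls; transform can map them to a reserved index.
        return null;
    } else if (value instanceof String) {
        return (String) value;
    } else if (value instanceof Number) {
        return String.valueOf(((Number) value).doubleValue());
    } else {
        throw new IllegalArgumentException(
                "Only String and Number types are supported, got " + value.getClass());
    }
}
{code}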



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] curcur commented on pull request #21859: [FLINK-30916][checkpoint] Set the number of rocksdb uploader threads to 1 to solve the unstable bug of unit test

2023-02-06 Thread via GitHub


curcur commented on PR #21859:
URL: https://github.com/apache/flink/pull/21859#issuecomment-1420132426

   Thanks @1996fanrui!
   
   I think you are right, 1.17 branch has not been cut yet!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] Fanoid opened a new pull request, #205: [FLINK-30730] Fix null values handling in StringIndexer and StringIndexerModel

2023-02-06 Thread via GitHub


Fanoid opened a new pull request, #205:
URL: https://github.com/apache/flink-ml/pull/205

   ## What is the purpose of the change
   
   This PR fixes null values handling in StringIndexer and StringIndexerModel.
   
   ## Brief change log
   
- Skip null values in training, and handle null values in transforming.
- Add tests for null values handling in corresponding test class.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on pull request #21854: [FLINK-30901][runtime] Fix the jobVertex's parallelismConfigured is incorrect when chaining with source operators

2023-02-06 Thread via GitHub


zhuzhurk commented on PR #21854:
URL: https://github.com/apache/flink/pull/21854#issuecomment-1420124649

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] leonardBang commented on a diff in pull request #21875: [FLINK-30921][ci] Adds sed command for updating the sources

2023-02-06 Thread via GitHub


leonardBang commented on code in PR #21875:
URL: https://github.com/apache/flink/pull/21875#discussion_r1098124530


##
tools/azure-pipelines/uploading_watchdog.sh:
##
@@ -27,6 +27,9 @@ if [ -z "$HERE" ] ; then
   exit 1
 fi
 
+sudo sed -i 's/azure.archive.ubuntu.com/archive.ubuntu.com/g' /etc/apt/sources.list
+sudo apt-get update
+

Review Comment:
   I'm not sure this works, as the Azure service had already recovered by the
time this CI run executed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on pull request #21857: [FLINK-30903][runtime] Fix the max parallelism used in adaptive batch scheduler doesn't fallbacks to default parallelism

2023-02-06 Thread via GitHub


zhuzhurk commented on PR #21857:
URL: https://github.com/apache/flink/pull/21857#issuecomment-1420124614

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30921) Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"

2023-02-06 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685020#comment-17685020
 ] 

Leonard Xu commented on FLINK-30921:


I downgraded the issue priority to Critical as it's an external (Azure infra) 
service crash that has since recovered; we have used this service for a long 
time, so it should not block our release.

Of course, we still need to fix this in our community.

> Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"
> -
>
> Key: FLINK-30921
> URL: https://issues.apache.org/jira/browse/FLINK-30921
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Reporter: Rui Fan
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-02-06-17-59-20-019.png
>
>
> !image-2023-02-06-17-59-20-019.png!
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45762=logs=bea52777-eaf8-5663-8482-18fbc3630e81=b2642e3a-5b86-574d-4c8a-f7e2842bfb14]
>  
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45766=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30921) Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"

2023-02-06 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-30921:
---
Priority: Critical  (was: Blocker)

> Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"
> -
>
> Key: FLINK-30921
> URL: https://issues.apache.org/jira/browse/FLINK-30921
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Reporter: Rui Fan
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-02-06-17-59-20-019.png
>
>
> !image-2023-02-06-17-59-20-019.png!
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45762=logs=bea52777-eaf8-5663-8482-18fbc3630e81=b2642e3a-5b86-574d-4c8a-f7e2842bfb14]
>  
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45766=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] luoyuxia commented on a diff in pull request #21789: [FLINK-30824][hive] Add document for option table.exec.hive.native-agg-function.enabled

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21789:
URL: https://github.com/apache/flink/pull/21789#discussion_r1098118684


##
docs/content.zh/docs/connectors/table/hive/hive_functions.md:
##
@@ -73,6 +73,34 @@ Some Hive built-in functions in older versions have [thread 
safety issues](https
 We recommend users patch their own Hive to fix them.
 {{< /hint >}}
 
+## Use Native Hive Aggregate Functions
+
+If [HiveModule]({{< ref "docs/dev/table/modules" >}}#hivemodule) is loaded 
with a higher priority than CoreModule, Flink will try to use the Hive built-in 
function first. And then for Hive built-in aggregation function,
+Flink currently uses sort-based aggregation strategy. Compared to hash-based 
aggregation strategy, the performance is one to two times worse, so from Flink 
1.17, we have implemented some of Hive's aggregation functions natively in 
Flink.
+These functions will use the hash-agg strategy to improve performance. 
Currently, only five functions are supported, namely sum/count/avg/min/max, and 
more aggregation functions will be supported in the future.
+Users can use the native aggregation function by turning on the option 
`table.exec.hive.native-agg-function.enabled`, which brings significant 
performance improvement to the job.
+
+| Key | Default | Type | Description |
+| --- | --- | --- | --- |
+| table.exec.hive.native-agg-function.enabled | false | Boolean | Enables the use of native aggregate functions, which use the hash-agg strategy and can improve aggregation performance after loading HiveModule. This is a job-level option; users can enable it per-job. |
+
+Attention The ability of the native 
aggregate functions don't fully align with Hive built-in aggregation functions 
now, for example, some data types are not supported. If performance is not a 
bottleneck, you don't need to turn on this option.

Review Comment:
   Sorry for the confusion. Maybe replace `don't` with `does not`.
   ```suggestion
   Attention The ability of the native aggregate functions does not fully align with Hive built-in aggregation functions now; for example, some data types are not supported. If performance is not a bottleneck, you don't need to turn on this option.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21789: [FLINK-30824][hive] Add document for option table.exec.hive.native-agg-function.enabled

2023-02-06 Thread via GitHub


luoyuxia commented on code in PR #21789:
URL: https://github.com/apache/flink/pull/21789#discussion_r1098118684


##
docs/content.zh/docs/connectors/table/hive/hive_functions.md:
##
@@ -73,6 +73,34 @@ Some Hive built-in functions in older versions have [thread 
safety issues](https
 We recommend users patch their own Hive to fix them.
 {{< /hint >}}
 
+## Use Native Hive Aggregate Functions
+
+If [HiveModule]({{< ref "docs/dev/table/modules" >}}#hivemodule) is loaded 
with a higher priority than CoreModule, Flink will try to use the Hive built-in 
function first. And then for Hive built-in aggregation function,
+Flink currently uses sort-based aggregation strategy. Compared to hash-based 
aggregation strategy, the performance is one to two times worse, so from Flink 
1.17, we have implemented some of Hive's aggregation functions natively in 
Flink.
+These functions will use the hash-agg strategy to improve performance. 
Currently, only five functions are supported, namely sum/count/avg/min/max, and 
more aggregation functions will be supported in the future.
+Users can use the native aggregation function by turning on the option 
`table.exec.hive.native-agg-function.enabled`, which brings significant 
performance improvement to the job.
+
+| Key | Default | Type | Description |
+| --- | --- | --- | --- |
+| table.exec.hive.native-agg-function.enabled | false | Boolean | Enables the use of native aggregate functions, which use the hash-agg strategy and can improve aggregation performance after loading HiveModule. This is a job-level option; users can enable it per-job. |
+
+Attention The ability of the native 
aggregate functions don't fully align with Hive built-in aggregation functions 
now, for example, some data types are not supported. If performance is not a 
bottleneck, you don't need to turn on this option.

Review Comment:
   Sorry for the confusion. Maybe replace `don't` with `doesn't`.
   ```suggestion
   Attention The ability of the native 
aggregate functions doesn't fully align with Hive built-in aggregation 
functions now, for example, some data types are not supported. If performance 
is not a bottleneck, you don't need to turn on this option.
   ```
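   As a usage note for the per-job wording above, a sketch of how a user could enable the option programmatically (the option key is taken from the doc; the surrounding setup is illustrative):
   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;

   TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
   // Job-level switch documented above; false by default.
   tEnv.getConfig().getConfiguration()
           .setString("table.exec.hive.native-agg-function.enabled", "true");
   ```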



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] leonardBang merged pull request #21845: [hotfix][flink-core][doc] add javadoc to improve the code readability

2023-02-06 Thread via GitHub


leonardBang merged PR #21845:
URL: https://github.com/apache/flink/pull/21845


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30811) Fix sql gateway can not stop job correctly

2023-02-06 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang closed FLINK-30811.
-
Resolution: Fixed

Merged into master: bbd5a7876eb1542ff89f05f5f5d82bb8bd41b7bd

> Fix sql gateway can not stop job correctly
> --
>
> Key: FLINK-30811
> URL: https://issues.apache.org/jira/browse/FLINK-30811
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30833) TestMetricGroup supports being closed

2023-02-06 Thread Mason Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mason Chen updated FLINK-30833:
---
Description: For testing purposes, closing a metric group is required to 
verify logic that maintains metrics. The TestMetricGroup needs to extend 
AbstractMetricGroup<> to support this.  (was: The internal 
TestingMetricRegistry does not register an action when a metric group is 
closed. This would allow the use of the MetricListener to test logic that 
closes metric groups.

[https://github.com/apache/flink/blob/44dbb8ec84792540095b826616d0f21b745aa995/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/metrics/testutils/MetricListener.java#L53]

This change doesn't require exposing any public API and is transparent.)

> TestMetricGroup supports being closed
> -
>
> Key: FLINK-30833
> URL: https://issues.apache.org/jira/browse/FLINK-30833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Tests
>Affects Versions: 1.16.1
>Reporter: Mason Chen
>Priority: Major
>
> For testing purposes, closing a metric group is required to verify logic that 
> maintains metrics. The TestMetricGroup needs to extend AbstractMetricGroup<> 
> to support this.
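A sketch of the kind of test double this would enable (illustrative only; UnregisteredMetricsGroup is just a convenient public base here, while the actual change targets TestMetricGroup/AbstractMetricGroup):

{code:java}
import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;

// Sketch: a metric group test double that records whether it was closed,
// so tests can assert that code under test releases its metrics.
public class ClosableTestMetricGroup extends UnregisteredMetricsGroup {
    private boolean closed;

    public void close() {
        closed = true;
    }

    public boolean isClosed() {
        return closed;
    }
}
{code}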



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30833) TestMetricGroup supports being closed

2023-02-06 Thread Mason Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mason Chen updated FLINK-30833:
---
Summary: TestMetricGroup supports being closed  (was: MetricListener 
supports closing metric groups)

> TestMetricGroup supports being closed
> -
>
> Key: FLINK-30833
> URL: https://issues.apache.org/jira/browse/FLINK-30833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Tests
>Affects Versions: 1.16.1
>Reporter: Mason Chen
>Priority: Major
>
> The internal TestingMetricRegistry does not register an action when a metric 
> group is closed. This would allow the use of the MetricListener to test logic 
> that closes metric groups.
> [https://github.com/apache/flink/blob/44dbb8ec84792540095b826616d0f21b745aa995/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/metrics/testutils/MetricListener.java#L53]
> This change doesn't require exposing any public API and is transparent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30833) MetricListener supports closing metric groups

2023-02-06 Thread Mason Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685012#comment-17685012
 ] 

Mason Chen commented on FLINK-30833:


For context, I was writing a connector that depends on closing some metric 
groups. The closed metric group can be created independently of the 
MetricListener, although other tests can continue to use it. The approach 
[~chesnay] described makes more sense, and I will update the Jira ticket to 
reflect that. Thanks!

> MetricListener supports closing metric groups
> -
>
> Key: FLINK-30833
> URL: https://issues.apache.org/jira/browse/FLINK-30833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Tests
>Affects Versions: 1.16.1
>Reporter: Mason Chen
>Priority: Major
>
> The internal TestingMetricRegistry does not register an action when a metric 
> group is closed. This would allow the use of the MetricListener to test logic 
> that closes metric groups.
> [https://github.com/apache/flink/blob/44dbb8ec84792540095b826616d0f21b745aa995/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/metrics/testutils/MetricListener.java#L53]
> This change doesn't require exposing any public API and is transparent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] fsk119 closed pull request #21802: [FLINK-30811][sql-gateway] Fix SqlGateway cannot stop job correctly

2023-02-06 Thread via GitHub


fsk119 closed pull request #21802: [FLINK-30811][sql-gateway] Fix SqlGateway 
cannot stop job correctly
URL: https://github.com/apache/flink/pull/21802


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fsk119 commented on pull request #21802: [FLINK-30811][sql-gateway] Fix SqlGateway cannot stop job correctly

2023-02-06 Thread via GitHub


fsk119 commented on PR #21802:
URL: https://github.com/apache/flink/pull/21802#issuecomment-1420104729

   All tests pass in my private pipeline: 
https://dev.azure.com/1059623455/Flink/_build/results?buildId=544=results
   
   Merging...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30932) Enabling producer metrics for KafkaSink is not documented

2023-02-06 Thread Mason Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mason Chen updated FLINK-30932:
---
Description: Users can enable producer metrics by setting 
`register.producer.metrics` to True. We should expose this as a ConfigOption to 
automate it with Flink's documentation process. Kafka Table connector config 
options are already auto-generated  (was: Users can enable producer metrics by 
setting `register.producer.metrics` to True. We should expose this as a 
ConfigOption to automate it with Flink's documentation process.)

> Enabling producer metrics for KafkaSink is not documented
> -
>
> Key: FLINK-30932
> URL: https://issues.apache.org/jira/browse/FLINK-30932
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.17.0
>Reporter: Mason Chen
>Priority: Major
> Fix For: 1.17.0
>
>
> Users can enable producer metrics by setting `register.producer.metrics` to 
> True. We should expose this as a ConfigOption to automate it with Flink's 
> documentation process. Kafka Table connector config options are already 
> auto-generated
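A sketch of what such a ConfigOption could look like; the key matches the existing property name, while the default value and description here are illustrative assumptions rather than the connector's confirmed behavior:

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public class KafkaSinkOptionsSketch {
    // Illustrative only: defining the property as a ConfigOption would let
    // Flink's documentation generator pick it up automatically.
    public static final ConfigOption<Boolean> REGISTER_PRODUCER_METRICS =
            ConfigOptions.key("register.producer.metrics")
                    .booleanType()
                    .defaultValue(false) // assumed default, check the connector
                    .withDescription(
                            "Whether to register the Kafka producer's metrics "
                                    + "in Flink's metric group.");
}
{code}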



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30921) Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"

2023-02-06 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685006#comment-17685006
 ] 

Rui Fan commented on FLINK-30921:
-

I see it's recovered, the new CI is green.

> Too many CI failed due to "Could not connect to azure.archive.ubuntu.com"
> -
>
> Key: FLINK-30921
> URL: https://issues.apache.org/jira/browse/FLINK-30921
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Reporter: Rui Fan
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-02-06-17-59-20-019.png
>
>
> !image-2023-02-06-17-59-20-019.png!
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45762=logs=bea52777-eaf8-5663-8482-18fbc3630e81=b2642e3a-5b86-574d-4c8a-f7e2842bfb14]
>  
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45766=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30864) Optional pattern at the start of a group pattern not working

2023-02-06 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu reassigned FLINK-30864:
---

Assignee: Juntao Hu

> Optional pattern at the start of a group pattern not working
> 
>
> Key: FLINK-30864
> URL: https://issues.apache.org/jira/browse/FLINK-30864
> Project: Flink
>  Issue Type: Bug
>  Components: Library / CEP
>Affects Versions: 1.16.1
>Reporter: Juntao Hu
>Assignee: Juntao Hu
>Priority: Major
>  Labels: pull-request-available
>
> The optional pattern at the start of a group pattern turns out be "not 
> optional", e.g.
> {code:java}
> Pattern.begin("A").next(Pattern.begin("B").optional().next("C")).next("D")
> {code}
> cannot match sequence "a1 c1 d1".
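For illustration, a sketch of the expected behavior, assuming a hypothetical Event POJO with getName() and a name-prefix condition (neither is from the ticket):

{code:java}
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;

// With B optional at the start of the group (B? C), the sequence
// "a1 c1 d1" should still match as A=a1, C=c1, D=d1.
Pattern<Event, Event> group =
        Pattern.<Event>begin("B").where(startsWith("b"))
                .optional()
                .next("C").where(startsWith("c"));

Pattern<Event, Event> pattern =
        Pattern.<Event>begin("A").where(startsWith("a"))
                .next(group)
                .next("D").where(startsWith("d"));

// Hypothetical helper: a SimpleCondition matching events by name prefix.
static SimpleCondition<Event> startsWith(String prefix) {
    return new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event e) {
            return e.getName().startsWith(prefix);
        }
    };
}
{code}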



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30932) Enabling producer metrics for KafkaSink is not documented

2023-02-06 Thread Mason Chen (Jira)
Mason Chen created FLINK-30932:
--

 Summary: Enabling producer metrics for KafkaSink is not documented
 Key: FLINK-30932
 URL: https://issues.apache.org/jira/browse/FLINK-30932
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.17.0
Reporter: Mason Chen
 Fix For: 1.17.0


Users can enable producer metrics by setting `register.producer.metrics` to 
True. We should expose this as a ConfigOption to automate it with Flink's 
documentation process.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29825) Improve benchmark stability

2023-02-06 Thread Dong Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684999#comment-17684999
 ] 

Dong Lin edited comment on FLINK-29825 at 2/7/23 2:15 AM:
--

[~pnowojski] I think one drawback with your proposal is that it compares two 
distributions and depends on having a large enough number of samples in both. 
It means that after a regression has happened, you need to run enough 
commit-points so that the recent distribution starts to be considerably 
different from the previous distribution according to the Kolmogorov-Smirnov 
test. This would considerably delay the time-to-regression-detection. It seems 
that my proposal would not suffer from this issue, since it allows users to 
specify how many commit-points need to repeat the regression before sending an 
alert, and this number can be 1-3 commit points.

Regarding the drawback of not detecting "there is visible performance 
regression within benchmark noise", my proposal is to either exclude noisy 
benchmarks completely, or require the regression to be 2X the noise (the ratio 
is also tunable). These sound like reasonable practical solutions, right?

I don't think we will be able to have perfect regression detection without any 
drawback (e.g. 0 false positives and 0 false negatives). The question is 
whether the proposed solution can be useful enough (i.e. low false-positive 
and false-negative rates) and whether it is the best solution across all 
available choices. So it can be OK if some regression is not detected, like 
the one mentioned above.


BTW, regarding the noisy benchmark mentioned above, I am curious how the 
Kolmogorov-Smirnov test can address this issue. Maybe I can update my proposal 
to re-use the idea. Can you help explain it?

I will take a look at the tooling mentioned above later to see if we can learn 
from it or re-use it.
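(For concreteness, a sketch of the two-sample Kolmogorov-Smirnov check being discussed, using Apache Commons Math; the sample scores and thresholds below are made up:)

{code:java}
import java.util.Arrays;

import org.apache.commons.math3.stat.inference.KolmogorovSmirnovTest;

public class KsRegressionCheckSketch {
    public static void main(String[] args) {
        // Made-up benchmark scores (records/ms) before and after a commit range.
        double[] baseline = {1250.3, 1248.9, 1252.1, 1249.4, 1251.0, 1250.2};
        double[] recent = {1210.7, 1208.2, 1212.9, 1209.5, 1211.3, 1210.1};

        double pValue =
                new KolmogorovSmirnovTest().kolmogorovSmirnovTest(baseline, recent);
        double baselineMean = Arrays.stream(baseline).average().orElse(0);
        double recentMean = Arrays.stream(recent).average().orElse(0);

        // Flag a regression only if the distributions differ significantly
        // AND the recent scores are lower (higher is better here).
        boolean regression = pValue < 0.01 && recentMean < baselineMean;
        System.out.printf("p=%.4f, regression=%b%n", pValue, regression);
    }
}
{code}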


was (Author: lindong):
[~pnowojski] I think one drawback with your proposal is that it compares two 
distributions and depends on having a large enough number of samples in both. 
It means that if a regression happens, you need to run enough commit-points so 
that the recent distribution starts to be considerably different from the 
previous distribution according to the Kolmogorov-Smirnov test. This would 
considerably delay the time-to-regression-detection. It seems that my proposal 
would not suffer from this issue, since it allows users to specify how many 
commit-points need to repeat the regression before sending an alert, and this 
number can be 1-3 commit points.

Regarding the drawback of not detecting "there is visible performance 
regression within benchmark noise", my proposal is to either exclude noisy 
benchmarks completely, or require the regression to be 2X the noise (the ratio 
is also tunable). These sound like reasonable practical solutions, right?

BTW, I don't think we will be able to have perfect regression detection 
without any drawback (e.g. 0 false positives and 0 false negatives). The 
question is whether the proposed solution can be useful enough (i.e. low 
false-positive and false-negative rates) and whether it is the best solution 
across all available choices. So it can be OK if some regression is not 
detected, like the one mentioned above.


BTW, regarding the noisy benchmark mentioned above, I am curious how the 
Kolmogorov-Smirnov test can address this issue. Maybe I can update my proposal 
to re-use the idea. Can you help explain it?

I will take a look at the tooling mentioned above later to see if we can learn 
from it or re-use it.

> Improve benchmark stability
> ---
>
> Key: FLINK-29825
> URL: https://issues.apache.org/jira/browse/FLINK-29825
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks
>Affects Versions: 1.17.0
>Reporter: Yanfei Lei
>Assignee: Yanfei Lei
>Priority: Minor
>
> Currently, regressions are detected by a simple script which may have false 
> positives and false negatives, especially for benchmarks with small absolute 
> values, small value changes would cause large percentage changes. see 
> [here|https://github.com/apache/flink-benchmarks/blob/master/regression_report.py#L132-L136]
>  for details.
> And all benchmarks are executed on one physical machine, it might happen that 
> hardware issues affect performance, like "[FLINK-18614] Performance 
> regression 2020.07.13".
>  
> This ticket aims to improve the precision and recall of the regression-check 
> script.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30886) Content area is too narrow to show all content

2023-02-06 Thread Xingbo Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685000#comment-17685000
 ] 

Xingbo Huang commented on FLINK-30886:
--

The content that is too long needs to be viewed by scrolling; we can improve 
this style.

!image-2023-02-07-10-12-09-237.png!

> Content area is too narrow to show all content
> --
>
> Key: FLINK-30886
> URL: https://issues.apache.org/jira/browse/FLINK-30886
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Ari Huttunen
>Priority: Minor
> Attachments: Screenshot 2023-02-02 at 15.13.57.png, 
> image-2023-02-07-10-12-09-237.png
>
>
> If you open a page like this, you notice that the main content is not fully 
> visible.
> [https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/table_environment.html]
> Here's a screenshot. You can see that the right-most characters are cut off. 
> The screenshot is of Vivaldi, but it looks like that on Safari as well.
> !Screenshot 2023-02-02 at 15.13.57.png|width=1498,height=833!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30886) Content area is too narrow to show all content

2023-02-06 Thread Xingbo Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingbo Huang updated FLINK-30886:
-
Attachment: image-2023-02-07-10-12-09-237.png

> Content area is too narrow to show all content
> --
>
> Key: FLINK-30886
> URL: https://issues.apache.org/jira/browse/FLINK-30886
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Ari Huttunen
>Priority: Minor
> Attachments: Screenshot 2023-02-02 at 15.13.57.png, 
> image-2023-02-07-10-12-09-237.png
>
>
> If you open a page like this, you notice that the main content is not fully 
> visible.
> [https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/table_environment.html]
> Here's a screenshot. You can see that the right-most characters are cut off. 
> The screenshot is of Vivaldi, but it looks like that on Safari as well.
> !Screenshot 2023-02-02 at 15.13.57.png|width=1498,height=833!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29825) Improve benchmark stability

2023-02-06 Thread Dong Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684999#comment-17684999
 ] 

Dong Lin commented on FLINK-29825:
--

[~pnowojski] I think one drawback with your proposal is that it compares two 
distributions and depends on having a large enough number of samples in both. 
It means that if a regression happens, you need to run enough commit-points so 
that the recent distribution starts to be considerably different from the 
previous distribution according to the Kolmogorov-Smirnov test. This would 
considerably delay the time-to-regression-detection. It seems that my proposal 
would not suffer from this issue, since it allows users to specify how many 
commit-points need to repeat the regression before sending an alert, and this 
number can be 1-3 commit points.

Regarding the drawback of not detecting "there is visible performance 
regression within benchmark noise", my proposal is to either exclude noisy 
benchmarks completely, or require the regression to be 2X the noise (the ratio 
is also tunable). These sound like reasonable practical solutions, right?

BTW, I don't think we will be able to have perfect regression detection 
without any drawback (e.g. 0 false positives and 0 false negatives). The 
question is whether the proposed solution can be useful enough (i.e. low 
false-positive and false-negative rates) and whether it is the best solution 
across all available choices. So it can be OK if some regression is not 
detected, like the one mentioned above.


BTW, regarding the noisy benchmark mentioned above, I am curious how the 
Kolmogorov-Smirnov test can address this issue. Maybe I can update my proposal 
to re-use the idea. Can you help explain it?

I will take a look at the tooling mentioned above later to see if we can learn 
from it or re-use it.

> Improve benchmark stability
> ---
>
> Key: FLINK-29825
> URL: https://issues.apache.org/jira/browse/FLINK-29825
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks
>Affects Versions: 1.17.0
>Reporter: Yanfei Lei
>Assignee: Yanfei Lei
>Priority: Minor
>
> Currently, regressions are detected by a simple script which may have false 
> positives and false negatives, especially for benchmarks with small absolute 
> values, small value changes would cause large percentage changes. see 
> [here|https://github.com/apache/flink-benchmarks/blob/master/regression_report.py#L132-L136]
>  for details.
> And all benchmarks are executed on one physical machine, it might happen that 
> hardware issues affect performance, like "[FLINK-18614] Performance 
> regression 2020.07.13".
>  
> This ticket aims to improve the precision and recall of the regression-check 
> script.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lsyldliu commented on pull request #21789: [FLINK-30824][hive] Add document for option table.exec.hive.native-agg-function.enabled

2023-02-06 Thread via GitHub


lsyldliu commented on PR #21789:
URL: https://github.com/apache/flink/pull/21789#issuecomment-1420065801

   @luoyuxia Thanks for reviewing, I've addressed the comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] LadyForest commented on pull request #21507: [FLINK-30386] Fix the issue that column constraint lacks primary key not enforced check

2023-02-06 Thread via GitHub


LadyForest commented on PR #21507:
URL: https://github.com/apache/flink/pull/21507#issuecomment-1420061832

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] LadyForest commented on pull request #21802: [FLINK-30811][sql-gateway] Fix SqlGateway cannot stop job correctly

2023-02-06 Thread via GitHub


LadyForest commented on PR #21802:
URL: https://github.com/apache/flink/pull/21802#issuecomment-1420061162

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on pull request #21863: [FLINK-30666][doc] Add document for Delete/Update API

2023-02-06 Thread via GitHub


luoyuxia commented on PR #21863:
URL: https://github.com/apache/flink/pull/21863#issuecomment-1420059618

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] 1996fanrui commented on pull request #21859: [FLINK-30916][checkpoint] Set the number of rocksdb uploader threads to 1 to solve the unstable bug of unit test

2023-02-06 Thread via GitHub


1996fanrui commented on PR #21859:
URL: https://github.com/apache/flink/pull/21859#issuecomment-1420052321

   Hi @curcur , all PRs of FLINK-30461 and FLINK-30916 are green. Please help 
take a look, thanks~
   
   https://github.com/apache/flink/pull/21853
   https://github.com/apache/flink/pull/21855


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] jiangxin369 commented on a diff in pull request #203: [FLINK-30688][followup] Disable Kryo fallback in Flink ML tests

2023-02-06 Thread via GitHub


jiangxin369 commented on code in PR #203:
URL: https://github.com/apache/flink-ml/pull/203#discussion_r1097419512


##
flink-ml-lib/src/main/java/org/apache/flink/ml/common/util/quantile/QuantileSummary.java:
##
@@ -395,12 +400,14 @@ public QueryResult(int index, long minRankAtIndex, double 
percentile) {
  *   delta: the maximum span of the rank.
  * 
  */
-private static class StatsTuple implements Serializable {
+public static class StatsTuple implements Serializable {
 private static final long serialVersionUID = 1L;
-private final double value;
+private double value;
 private long g;
 private long delta;
 
+public StatsTuple() {}

Review Comment:
   No, it is not. It's needed so that the class can be used by `Types.POJO`. In 
this way, we provide the field information of the class so that Flink can 
access these fields by reflection, without requiring all member variables to 
be public.
   
   But you remind me that `StatsTuple` is actually a standard POJO, so I don't 
need to define the TypeInfo via `Types.POJO`; making its member variables 
public is a simpler way.
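   For reference, a minimal sketch of the POJO shape being discussed (public class, public no-arg constructor, public fields), which Flink's type extractor recognizes without a `Types.POJO` hint:
   ```java
   import java.io.Serializable;

   // Sketch: with public fields and a public no-arg constructor, Flink
   // treats this as a POJO type automatically.
   public static class StatsTuple implements Serializable {
       private static final long serialVersionUID = 1L;

       public double value;
       public long g;
       public long delta;

       public StatsTuple() {}
   }
   ```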



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] mbalassi commented on pull request #524: [FLINK-30405] Add ResourceLifecycleStatus to CommonStatus and printer column

2023-02-06 Thread via GitHub


mbalassi commented on PR #524:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/524#issuecomment-1420013054

   I will take a look during CET work hours to help wrap this up for 1.4.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21876: [FLINK-30088][doc] Flink Doc(zh version) has misspelled words

2023-02-06 Thread via GitHub


flinkbot commented on PR #21876:
URL: https://github.com/apache/flink/pull/21876#issuecomment-1420010916

   
   ## CI report:
   
   * 438895abd0c11a4fce6d451ad01977cf91ab726d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] paul8263 opened a new pull request, #21876: [FLINK-30088][doc] Flink Doc(zh version) has misspelled words

2023-02-06 Thread via GitHub


paul8263 opened a new pull request, #21876:
URL: https://github.com/apache/flink/pull/21876

   ## What is the purpose of the change
   
   Fix typos in Flink doc (Chinese version).
   
   
   ## Brief change log
   
   - docs/content.zh/docs/dev/datastream/overview.md
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follow the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


