Re: [VOTE] FLIP-217: Support watermark alignment of source splits

2022-08-01 Thread Mason Chen
Hi all,

+1 (non-binding). Can't wait to use this feature in Flink!

Best,
Mason

On Sun, Jul 31, 2022 at 10:57 PM Sebastian Mattheis 
wrote:

> Hi everyone,
>
> I would like to start the vote for FLIP-217 [1]. Thanks for your feedback
> and the discussion in [2].
>
> FLIP-217 is a follow-up on FLIP-182 [3] and adds support for watermark
> alignment of source splits.
>
> The poll will be open until August 4th, 8.00AM GMT (72h) unless there is a
> binding veto or an insufficient number of votes.
>
> Best regards,
> Sebastian
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-217+Support+watermark+alignment+of+source+splits
> [2] https://lists.apache.org/thread/4qwkcr3y1hrnlm2h9d69ofb4vo1lprvr
> [3]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-182%3A+Support+watermark+alignment+of+FLIP-27+Sources
>
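For readers new to the FLIP, the core idea of split-level watermark alignment can be sketched in a few lines of plain Java: splits whose local watermark runs too far ahead of the slowest split get paused until the others catch up. The class and method names below are illustrative only, not the FLIP-217 API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only, not the FLIP-217 API. Splits report local
// watermarks; any split more than maxDrift ahead of the slowest is paused.
public class SplitAlignmentSketch {

    // Returns the splits that should be paused until the slowest catches up.
    public static Set<String> splitsToPause(Map<String, Long> splitWatermarks, long maxDrift) {
        long min = splitWatermarks.values().stream()
                .mapToLong(Long::longValue).min().orElse(Long.MIN_VALUE);
        Set<String> paused = new HashSet<>();
        for (Map.Entry<String, Long> e : splitWatermarks.entrySet()) {
            if (e.getValue() > min + maxDrift) {
                paused.add(e.getKey());
            }
        }
        return paused;
    }

    public static void main(String[] args) {
        Map<String, Long> watermarks = new HashMap<>();
        watermarks.put("split-0", 100L);   // the slowest split
        watermarks.put("split-1", 1000L);  // 900 ms ahead
        System.out.println(splitsToPause(watermarks, 500L)); // prints [split-1]
    }
}
```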


Re: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce pre-aggregated merge to Table Store

2022-08-01 Thread Jingsong Li
Thanks Nathan for starting this discussion.

This [1] is a very good requirement for building materialized views on Flink
Table Store.

## Aggregate Functions
Regarding 'aggregate-function' = '{sum_field1:sum,max_field2:max}':
did you refer to any other systems for this syntax? For Flink, a viable
approach is per-field options like those of the Datagen connector [2],
e.g. `'fields.sum_field1.function'='sum'`.

## Default function
>> Tips: Columns which do not have designated aggregate functions use the
newest value to overwrite the old value.
Do you mean `replace`?
Are there other systems we can refer to for the default-function behavior?

## Supported functions
I'm not sure the names of these functions are standard enough:
`replace_if_not_null/replace/concatenate`. Could you check how other
systems name them? It would also help to specify whether each function
supports retraction messages.
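For concreteness, the merge semantics of the proposed functions can be sketched in plain Java; the method names mirror the FLIP's list, but these signatures are illustrative assumptions, not the Table Store API.

```java
// Illustrative merge semantics for the functions listed in FLIP-255; the
// method names mirror the FLIP's list, but the signatures are assumptions.
public class PreAggMergeSketch {

    public static long sum(long oldV, long newV) { return oldV + newV; }

    public static long max(long oldV, long newV) { return Math.max(oldV, newV); }

    // Keep the old value when the incoming one is null.
    public static Long replaceIfNotNull(Long oldV, Long newV) {
        return newV != null ? newV : oldV;
    }

    // The proposed default: the newest value overwrites the old one.
    public static Long replace(Long oldV, Long newV) { return newV; }

    public static void main(String[] args) {
        System.out.println(sum(1, 2));                  // 3
        System.out.println(max(5, 3));                  // 5
        System.out.println(replaceIfNotNull(7L, null)); // 7
        System.out.println(replace(7L, 9L));            // 9
    }
}
```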

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+Table+Store
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/datagen/#connector-options

Best,
Jingsong

On Tue, Aug 2, 2022 at 11:09 AM 李国君  wrote:

> Hi Nathan,
>
> Seems a great proposal for table store aggregation.
> In the example, I think the 'max_field1' should be 1 instead of 2 after
> the max aggregation in the output result.
> And there may be a minor typo in the WITH clause, 'max_field2' ->
> 'max_field1'.
>
> Best,
> Guojun
>
> From: "Hannan Kan"
> Date: Mon, Aug 1, 2022, 11:35 PM
> Subject: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce
> pre-aggregated merge to Table Store
> To: "dev"
> Cc: "lzljs3620320"
> Hi everyone, I would like to open a discussion on FLIP-255 Introduce
> pre-aggregated merge to table store [1]. Pre-aggregation mechanism has
> been adopted by many big data systems (such as Apache Doris,
> Apache Kylin, Druid, etc.) to save storage and accelerate
> aggregate queries. FLIP-255 proposes to introduce pre-aggregated merge into
> Flink Table Store to acquire the same benefit. Supported aggregate
> functions include sum, max/min, count, replace_if_not_null,
> replace, concatenate, and or/and. Looking forward to your feedback.
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+table+store
> Best, Nathan Kan (Hongnan Gan)
>


[jira] [Created] (FLINK-28771) Assign speculative execution attempt with correct CREATED timestamp

2022-08-01 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28771:
---

 Summary: Assign speculative execution attempt with correct CREATED 
timestamp
 Key: FLINK-28771
 URL: https://issues.apache.org/jira/browse/FLINK-28771
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.16.0
Reporter: Zhu Zhu
Assignee: Zhu Zhu
 Fix For: 1.16.0


Currently, a newly created speculative execution attempt is assigned a wrong
CREATED timestamp in SpeculativeScheduler. We need to fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce pre-aggregated merge to Table Store

2022-08-01 Thread 李国君
Hi Nathan,

Seems a great proposal for table store aggregation.
In the example, I think the 'max_field1' should be 1 instead of 2 after the
max aggregation in the output result.
And there may be a minor typo in the WITH clause, 'max_field2' ->
'max_field1'.

Best,
Guojun

From: "Hannan Kan"
Date: Mon, Aug 1, 2022, 11:35 PM
Subject: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce
pre-aggregated merge to Table Store
To: "dev"
Cc: "lzljs3620320"
Hi everyone, I would like to open a discussion on FLIP-255 Introduce
pre-aggregated merge to table store [1]. Pre-aggregation mechanism has
been adopted by many big data systems (such as Apache Doris,
Apache Kylin, Druid, etc.) to save storage and accelerate
aggregate queries. FLIP-255 proposes to introduce pre-aggregated merge into
Flink Table Store to acquire the same benefit. Supported aggregate
functions include sum, max/min, count, replace_if_not_null,
replace, concatenate, and or/and. Looking forward to your feedback.
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+table+store
Best, Nathan Kan (Hongnan Gan)


[jira] [Created] (FLINK-28770) CREATE TABLE AS SELECT supports explain

2022-08-01 Thread tartarus (Jira)
tartarus created FLINK-28770:


 Summary: CREATE TABLE AS SELECT supports explain
 Key: FLINK-28770
 URL: https://issues.apache.org/jira/browse/FLINK-28770
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: tartarus


Unsupported operation: 
org.apache.flink.table.operations.ddl.CreateTableASOperation



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT][VOTE] FLIP-250: Support Customized Kubernetes Schedulers Proposal

2022-08-01 Thread bo zhaobo
Hi everyone,

FLIP-250: Support Customized Kubernetes Schedulers Proposal [1] has been
accepted.

From the vote thread [2], there were 7 votes (+1), 4 of them binding:
- Yang Wang (binding)
- Martijn Visser (binding)
- Kelu Tao (non-binding)
- William Wang (non-binding)
- MrL (non-binding)
- Gyula Fóra (binding)
- Márton Balassi (binding)

No disapproving votes were cast.

I'm very happy that FLIP-250 has been accepted. Many thanks to the Flink
community and the nice people here.

Throughout the discussion and vote threads we received a lot of help and
advice. Thanks everyone for joining the discussion, giving feedback, and
voting!

BR

Bo Zhao

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-250%3A+Support+Customized+Kubernetes+Schedulers+Proposal
[2] https://lists.apache.org/thread/w1tpl5cwdrj7bc8qrlp96q25y9sv5gjk


[jira] [Created] (FLINK-28769) Flink History Server show wrong name of batch jobs

2022-08-01 Thread Biao Geng (Jira)
Biao Geng created FLINK-28769:
-

 Summary: Flink History Server show wrong name of batch jobs
 Key: FLINK-28769
 URL: https://issues.apache.org/jira/browse/FLINK-28769
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Reporter: Biao Geng
 Attachments: image-2022-08-02-00-41-51-815.png

When running {{examples/batch/WordCount.jar}} on Flink 1.15 and 1.16 with the
history server started, the history server shows the default name (e.g. "Flink
Java Job at Tue Aug 02...") of the batch job instead of the name ("WordCount
Example") specified in the Java code.
For {{examples/streaming/WordCount.jar}}, however, the job name in the history
server is correct.

It looks like
{{org.apache.flink.api.java.ExecutionEnvironment#executeAsync(java.lang.String)}}
does not set the job name the way
{{org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#execute(java.lang.String)}}
does (e.g. streamGraph.setJobName(jobName);).


!image-2022-08-02-00-41-51-815.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28768) Update JUnit5 to v5.9.0

2022-08-01 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-28768:
---

 Summary: Update JUnit5 to v5.9.0
 Key: FLINK-28768
 URL: https://issues.apache.org/jira/browse/FLINK-28768
 Project: Flink
  Issue Type: Technical Debt
  Components: Test Infrastructure, Tests
Reporter: Sergey Nuyanzin


JUnit 5.9.0 has been released. Release notes:
https://junit.org/junit5/docs/current/release-notes/#release-notes-5.9.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] FLIP-255 Introduce pre-aggregated merge to Table Store

2022-08-01 Thread Hannan Kan
Hi everyone,


I would like to open a discussion on FLIP-255 Introduce pre-aggregated
merge to table store [1].
Pre-aggregation mechanism has been adopted by many big data systems
(such as Apache Doris, Apache Kylin, Druid, etc.) to save storage and
accelerate aggregate queries.
FLIP-255 proposes to introduce pre-aggregated merge into Flink Table Store to
acquire the same benefit.
Supported aggregate functions include sum, max/min, count,
replace_if_not_null, replace, concatenate, and or/and.

Looking forward to your feedback.


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+table+store


Best,
Nathan Kan (Hongnan Gan)

Re: Re: Re: Re: [DISCUSS] FLIP-239: Port JDBC Connector Source to FLIP-27

2022-08-01 Thread Martijn Visser
Hi,

There is already a PR open to port the JDBC connector to the new
interfaces. Can we make sure that this FLIP is finalized, so that you and
other maintainers can work on getting the PRs correct and eventually
merged?

Best regards,

Martijn

On Mon, Jul 4, 2022 at 16:38, Martijn Visser wrote:

> Hi Roc,
>
> Thanks for the FLIP and opening the discussion. I have a couple of initial
> questions/remarks:
>
> * The FLIP contains information for both Source and Sink, but nothing
> explicitly on the Lookup functionality. I'm assuming we also want to have
> that implementation covered while porting this to the new interfaces.
> * The FLIP mentions porting to both the new Source and the new Sink API,
> but the FLIP only contains detailed information on the Source. Are you
> planning to add that to the FLIP before casting a vote? Because the
> discussion should definitely be resolved for both the Source and the Sink.
>
> Best regards,
>
> Martijn
>
> On Sat, Jul 2, 2022 at 06:35, Roc Marshal wrote:
>
>> Hi, Weike.
>>
>> Thank you for your reply.
>> As you said, too many splits stored in the SourceEnumerator will increase
>> the load on the JM.
>> What do you think about introducing a capacity limit on the number of
>> splits in the SourceEnumerator, together with a reject or callback
>> mechanism for when the split-generation strategy produces too many splits?
>> Looking forward to a better solution.
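A minimal sketch of the capacity-plus-callback idea proposed above, in plain Java; the class name and callback shape are hypothetical, not part of any Flink API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical sketch of a capacity-bounded split buffer with a reject
// callback; the class and callback names are not part of any Flink API.
public class BoundedSplitBuffer<S> {

    private final Deque<S> pending = new ArrayDeque<>();
    private final int capacity;
    private final Consumer<S> onReject; // invoked when the buffer is full

    public BoundedSplitBuffer(int capacity, Consumer<S> onReject) {
        this.capacity = capacity;
        this.onReject = onReject;
    }

    // Returns true if the split was buffered, false if it was rejected.
    public boolean offer(S split) {
        if (pending.size() >= capacity) {
            onReject.accept(split); // e.g. ask the generator to retry later
            return false;
        }
        pending.addLast(split);
        return true;
    }

    public S poll() { return pending.pollFirst(); }

    public int size() { return pending.size(); }

    public static void main(String[] args) {
        BoundedSplitBuffer<String> buf =
                new BoundedSplitBuffer<>(2, s -> System.out.println("rejected " + s));
        buf.offer("split-1");
        buf.offer("split-2");
        buf.offer("split-3"); // prints: rejected split-3
        System.out.println("buffered: " + buf.size()); // buffered: 2
    }
}
```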
>>
>> Best regards,
>> Roc Marshal
>>
>> On 2022/07/01 07:58:22 Dong Weike wrote:
>> > Hi,
>> >
>> > Thank you for bringing this up, and I am +1 for this feature.
>> >
>> > IMO, one important thing that I would like to mention is that an
>> improperly-designed FLIP-27 connector could impose very severe memory
>> pressure on the JobManager, especially when there is an enormous number of
>> splits for the source tables, e.g. when there are billions of records to read.
>> Frankly speaking, we have been haunted by this problem for a long time when
>> using the Flink CDC Connectors to read large tables.
>> >
>> > Therefore, in order to prevent JobManager from experiencing frequent
>> OOM faults, JdbcSourceEnumerator should avoid saving too many
>> JdbcSourceSplits in the unassigned list. And it would be better if all the
>> splits would be computed on the fly.
>> >
>> > Best,
>> > Weike
>> >
>> > -----Original Message-----
>> > From: Lijie Wang 
>> > Sent: July 1, 2022, 10:25 AM
>> > To: dev@flink.apache.org
>> > Subject: Re: Re: [DISCUSS] FLIP-239: Port JDBC Connector Source to FLIP-27
>> >
>> > Hi Roc,
>> >
>> > Thanks for driving the discussion.
>> >
>> > Could you describe in detail what the JdbcSourceSplit represents? It
>> looks like something is wrong with the comment on JdbcSourceSplit in the
>> FLIP (it is described as "A {@link SourceSplit} that represents a file, or a
>> region of a file").
>> >
>> > Best,
>> > Lijie
>> >
>> >
>> > Roc Marshal  于2022年6月30日周四 21:41写道:
>> >
>> > > Hi, Boto.
>> > > Thanks for your reply.
>> > >
>> > >+1 from me on the watermark strategy definition in 'streaming' & table
>> > > source. I'm not sure if FLIP-202 [1] is suitable for a separate
>> > > discussion, but I think your proposal is very helpful to the new
>> > > source. It would be great if the new source could be compatible with
>> this abstraction.
>> > >
>> > >    In addition, do we need to support the following special bounded
>> > > scenario abstraction?
>> > >    The number of JdbcSourceSplits is fixed, but the time needed to
>> > > generate all of them is not fixed in the user-defined implementation.
>> > > Once the end condition of the split-generation process is met, no
>> > > further JdbcSourceSplits are generated.
>> > > After all JdbcSourceSplits have been processed, the reader is notified
>> > > by the JdbcSourceSplitEnumerator that there are no more splits.
>> > >
>> > > - [1]
>> > >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-202%3A+Introduc
>> > > e+ClickHouse+Connector
>> > >
>> > > Best regards,
>> > > Roc Marshal
>> > >
>> > > On 2022/06/30 09:02:23 João Boto wrote:
>> > > > Hi,
>> > > >
>> > > > On the source side we could improve the JdbcParameterValuesProvider
>> > > to be defined as a query (or something more dynamic).
>> > > > Most of the time, if your job is dynamic or has some condition to be
>> > > met (based on data in a table), you have to create a connection and get
>> > > that info from the database.
>> > > >
>> > > > If we are going to create/allow a "streaming" JDBC source, we should
>> > > be able to define a watermark and get new data from the table using
>> > > that watermark.
>> > > >
>> > > > For the sink (though it could also apply to the source) it would be
>> > > great to be able to set your own implementation of the connection
>> > > type. For example, if you are connecting to ClickHouse, be able to set
>> > > an implementation based on "BalancedClickhouseDataSource" (in this[1]
>> > > implementation we have a 

[DISCUSS] FLIP-254 Redis Streams connector

2022-08-01 Thread Martijn Visser
Hi everyone,

Together with Mohammad, I would like to start a discussion thread on
FLIP-254 for a Redis Streams connector. This would be a new connector in
its own external connector repository,
https://www.github.com/apache/flink-connector-redis. This repository
already exists because there was an open PR to add the connector to the
Flink main repository, and it was planned to be one of the first movers to
the external connector repo. However, we never formally agreed that we
want this, so we're now creating this FLIP.

Looking forward to any comments or feedback. If there are none, I'll open
a VOTE thread in the next couple of days.

Best regards,

Martijn


[jira] [Created] (FLINK-28767) SqlGatewayServiceITCase.testCancelOperation failed with AssertionFailedError

2022-08-01 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-28767:


 Summary: SqlGatewayServiceITCase.testCancelOperation failed with 
AssertionFailedError
 Key: FLINK-28767
 URL: https://issues.apache.org/jira/browse/FLINK-28767
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.16.0
Reporter: Huang Xingbo



{code:java}
2022-07-31T03:32:09.3148584Z Jul 31 03:32:09 [ERROR] Tests run: 16, Failures: 
1, Errors: 0, Skipped: 0, Time elapsed: 7.268 s <<< FAILURE! - in 
org.apache.flink.table.gateway.service.SqlGatewayServiceITCase
2022-07-31T03:32:09.3205977Z Jul 31 03:32:09 [ERROR] 
org.apache.flink.table.gateway.service.SqlGatewayServiceITCase.testCancelOperation
  Time elapsed: 0.008 s  <<< FAILURE!
2022-07-31T03:32:09.3207151Z Jul 31 03:32:09 
org.opentest4j.AssertionFailedError: expected: 
 but was: 

2022-07-31T03:32:09.3207956Z Jul 31 03:32:09at 
org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
2022-07-31T03:32:09.3211582Z Jul 31 03:32:09at 
org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
2022-07-31T03:32:09.3212267Z Jul 31 03:32:09at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
2022-07-31T03:32:09.3212945Z Jul 31 03:32:09at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
2022-07-31T03:32:09.3213607Z Jul 31 03:32:09at 
org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
2022-07-31T03:32:09.3216761Z Jul 31 03:32:09at 
org.apache.flink.table.gateway.service.SqlGatewayServiceITCase.testCancelOperation(SqlGatewayServiceITCase.java:245)
2022-07-31T03:32:09.3217645Z Jul 31 03:32:09at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-07-31T03:32:09.3218243Z Jul 31 03:32:09at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-07-31T03:32:09.3218971Z Jul 31 03:32:09at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-07-31T03:32:09.3219622Z Jul 31 03:32:09at 
java.lang.reflect.Method.invoke(Method.java:498)
2022-07-31T03:32:09.3227901Z Jul 31 03:32:09at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
2022-07-31T03:32:09.3228714Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
2022-07-31T03:32:09.3229561Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
2022-07-31T03:32:09.3230353Z Jul 31 03:32:09at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
2022-07-31T03:32:09.3231409Z Jul 31 03:32:09at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
2022-07-31T03:32:09.3235059Z Jul 31 03:32:09at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
2022-07-31T03:32:09.3236100Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
2022-07-31T03:32:09.3236978Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
2022-07-31T03:32:09.3237841Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
2022-07-31T03:32:09.3241384Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
2022-07-31T03:32:09.3242557Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
2022-07-31T03:32:09.3243372Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
2022-07-31T03:32:09.3244155Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
2022-07-31T03:32:09.3244899Z Jul 31 03:32:09at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
2022-07-31T03:32:09.3249715Z Jul 31 03:32:09at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
2022-07-31T03:32:09.3250594Z Jul 31 03:32:09at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2022-07-31T03:32:09.3251432Z Jul 31 03:32:09at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
2022-07-31T03:32:09.3252291Z Jul 31 03:32:09at 

[jira] [Created] (FLINK-28766) UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException

2022-08-01 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-28766:


 Summary: UnalignedCheckpointStressITCase.runStressTest failed with 
NoSuchFileException
 Key: FLINK-28766
 URL: https://issues.apache.org/jira/browse/FLINK-28766
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.16.0
Reporter: Huang Xingbo



{code:java}
2022-08-01T01:36:16.0563880Z Aug 01 01:36:16 [ERROR] 
org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest
  Time elapsed: 12.579 s  <<< ERROR!
2022-08-01T01:36:16.0565407Z Aug 01 01:36:16 java.io.UncheckedIOException: 
java.nio.file.NoSuchFileException: 
/tmp/junit1058240190382532303/f0f99754a53d2c4633fed75011da58dd/chk-7/61092e4a-5b9a-4f56-83f7-d9960c53ed3e
2022-08-01T01:36:16.0566296Z Aug 01 01:36:16at 
java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88)
2022-08-01T01:36:16.0566972Z Aug 01 01:36:16at 
java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104)
2022-08-01T01:36:16.0567600Z Aug 01 01:36:16at 
java.util.Iterator.forEachRemaining(Iterator.java:115)
2022-08-01T01:36:16.0568290Z Aug 01 01:36:16at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
2022-08-01T01:36:16.0569172Z Aug 01 01:36:16at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
2022-08-01T01:36:16.0569877Z Aug 01 01:36:16at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
2022-08-01T01:36:16.0570554Z Aug 01 01:36:16at 
java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
2022-08-01T01:36:16.0571371Z Aug 01 01:36:16at 
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2022-08-01T01:36:16.0572417Z Aug 01 01:36:16at 
java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546)
2022-08-01T01:36:16.0573618Z Aug 01 01:36:16at 
org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:289)
2022-08-01T01:36:16.0575187Z Aug 01 01:36:16at 
org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:262)
2022-08-01T01:36:16.0576540Z Aug 01 01:36:16at 
org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:158)
2022-08-01T01:36:16.0577684Z Aug 01 01:36:16at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-08-01T01:36:16.0578546Z Aug 01 01:36:16at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-08-01T01:36:16.0579374Z Aug 01 01:36:16at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-08-01T01:36:16.0580298Z Aug 01 01:36:16at 
java.lang.reflect.Method.invoke(Method.java:498)
2022-08-01T01:36:16.0581243Z Aug 01 01:36:16at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
2022-08-01T01:36:16.0582029Z Aug 01 01:36:16at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2022-08-01T01:36:16.0582766Z Aug 01 01:36:16at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
2022-08-01T01:36:16.0583488Z Aug 01 01:36:16at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2022-08-01T01:36:16.0584203Z Aug 01 01:36:16at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2022-08-01T01:36:16.0585087Z Aug 01 01:36:16at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2022-08-01T01:36:16.0585778Z Aug 01 01:36:16at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
2022-08-01T01:36:16.0586482Z Aug 01 01:36:16at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
2022-08-01T01:36:16.0587155Z Aug 01 01:36:16at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2022-08-01T01:36:16.0587809Z Aug 01 01:36:16at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
2022-08-01T01:36:16.0588434Z Aug 01 01:36:16at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
2022-08-01T01:36:16.0589203Z Aug 01 01:36:16at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
2022-08-01T01:36:16.0589867Z Aug 01 01:36:16at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
2022-08-01T01:36:16.0590672Z Aug 01 01:36:16at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
2022-08-01T01:36:16.0591534Z Aug 01 01:36:16at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
2022-08-01T01:36:16.0592209Z Aug 01 01:36:16at 

[jira] [Created] (FLINK-28765) Create a doc for protobuf format

2022-08-01 Thread Suhan Mao (Jira)
Suhan Mao created FLINK-28765:
-

 Summary: Create a doc for protobuf format
 Key: FLINK-28765
 URL: https://issues.apache.org/jira/browse/FLINK-28765
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Suhan Mao
 Fix For: 1.16.0


Now that FLINK-18202 (https://issues.apache.org/jira/browse/FLINK-18202) is
done, we should write a doc introducing how to use the protobuf format in
SQL.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28764) Support more than 64 distinct aggregate function calls in one aggregate SQL query

2022-08-01 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-28764:
-

 Summary: Support more than 64 distinct aggregate function calls in 
one aggregate SQL query
 Key: FLINK-28764
 URL: https://issues.apache.org/jira/browse/FLINK-28764
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.15.1, 1.14.5, 1.13.6
Reporter: Wei Zhong


Currently Flink SQL does not support more than 64 distinct aggregate function
calls in one aggregate SQL query. We encountered this problem while migrating
batch jobs from Spark to Flink; the Spark job has 79 distinct aggregate
function calls in one aggregate SQL query.

Reproduce code:
{code:java}
public class Test64Distinct {
    public static void main(String[] args) {
        TableEnvironment tableEnv = 
TableEnvironment.create(EnvironmentSettings.inBatchMode());
        tableEnv.executeSql("create table datagen_source(id BIGINT, val BIGINT) 
with " +
                "('connector'='datagen', 'number-of-rows'='1000')");
        tableEnv.executeSql("select " +
                "count(distinct val * 1), " +
                "count(distinct val * 2), " +
                "count(distinct val * 3), " +
                "count(distinct val * 4), " +
                "count(distinct val * 5), " +
                "count(distinct val * 6), " +
                "count(distinct val * 7), " +
                "count(distinct val * 8), " +
                "count(distinct val * 9), " +
                "count(distinct val * 10), " +
                "count(distinct val * 11), " +
                "count(distinct val * 12), " +
                "count(distinct val * 13), " +
                "count(distinct val * 14), " +
                "count(distinct val * 15), " +
                "count(distinct val * 16), " +
                "count(distinct val * 17), " +
                "count(distinct val * 18), " +
                "count(distinct val * 19), " +
                "count(distinct val * 20), " +
                "count(distinct val * 21), " +
                "count(distinct val * 22), " +
                "count(distinct val * 23), " +
                "count(distinct val * 24), " +
                "count(distinct val * 25), " +
                "count(distinct val * 26), " +
                "count(distinct val * 27), " +
                "count(distinct val * 28), " +
                "count(distinct val * 29), " +
                "count(distinct val * 30), " +
                "count(distinct val * 31), " +
                "count(distinct val * 32), " +
                "count(distinct val * 33), " +
                "count(distinct val * 34), " +
                "count(distinct val * 35), " +
                "count(distinct val * 36), " +
                "count(distinct val * 37), " +
                "count(distinct val * 38), " +
                "count(distinct val * 39), " +
                "count(distinct val * 40), " +
                "count(distinct val * 41), " +
                "count(distinct val * 42), " +
                "count(distinct val * 43), " +
                "count(distinct val * 44), " +
                "count(distinct val * 45), " +
                "count(distinct val * 46), " +
                "count(distinct val * 47), " +
                "count(distinct val * 48), " +
                "count(distinct val * 49), " +
                "count(distinct val * 50), " +
                "count(distinct val * 51), " +
                "count(distinct val * 52), " +
                "count(distinct val * 53), " +
                "count(distinct val * 54), " +
                "count(distinct val * 55), " +
                "count(distinct val * 56), " +
                "count(distinct val * 57), " +
                "count(distinct val * 58), " +
                "count(distinct val * 59), " +
                "count(distinct val * 60), " +
                "count(distinct val * 61), " +
                "count(distinct val * 62), " +
                "count(distinct val * 63), " +
                "count(distinct val * 64), " +
                "count(distinct val * 65) from datagen_source").print();
    }
} {code}
Exception:
{code:java}
Exception in thread "main" org.apache.flink.table.api.TableException: Sql 
optimization: Cannot generate a valid execution plan for the given query: 
LogicalSink(table=[*anonymous_collect$1*], fields=[EXPR$0, EXPR$1, EXPR$2, 
EXPR$3, EXPR$4, EXPR$5, EXPR$6, EXPR$7, EXPR$8, EXPR$9, EXPR$10, EXPR$11, 
EXPR$12, EXPR$13, EXPR$14, EXPR$15, EXPR$16, EXPR$17, EXPR$18, EXPR$19, 
EXPR$20, EXPR$21, EXPR$22, EXPR$23, EXPR$24, EXPR$25, EXPR$26, EXPR$27, 
EXPR$28, EXPR$29, EXPR$30, EXPR$31, EXPR$32, EXPR$33, EXPR$34, EXPR$35, 
EXPR$36, EXPR$37, EXPR$38, EXPR$39, EXPR$40, EXPR$41, EXPR$42, EXPR$43, 
EXPR$44, EXPR$45, EXPR$46, EXPR$47, EXPR$48, EXPR$49, EXPR$50, EXPR$51, 
EXPR$52, EXPR$53, EXPR$54, EXPR$55, EXPR$56, EXPR$57, EXPR$58, EXPR$59, 
EXPR$60, 
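A plausible origin for this kind of hard 64-call limit is a 64-bit flag word used to track the distinct aggregate calls. Whether Flink's planner uses exactly this encoding is an assumption, not confirmed by this report; the sketch below only demonstrates the generic limitation.

```java
// The arithmetic below only demonstrates the generic limitation of a 64-bit
// flag word; whether Flink's planner caps distinct aggregates this way is an
// assumption, not confirmed by this report.
public class DistinctFlagSketch {

    // Set the flag for one distinct aggregate call in a 64-bit word.
    public static long setFlag(long flags, int aggIndex) {
        if (aggIndex < 0 || aggIndex >= 64) {
            throw new IllegalArgumentException("only 64 flags fit in a long");
        }
        return flags | (1L << aggIndex);
    }

    public static void main(String[] args) {
        long flags = 0L;
        for (int i = 0; i < 64; i++) {
            flags = setFlag(flags, i); // indices 0..63 fit
        }
        System.out.println(Long.bitCount(flags)); // 64
        try {
            setFlag(flags, 64); // the 65th call does not fit
        } catch (IllegalArgumentException e) {
            System.out.println("65th distinct aggregate does not fit");
        }
    }
}
```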

[jira] [Created] (FLINK-28763) Make savepoint format configurable for upgrades/savepoint operations

2022-08-01 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-28763:
--

 Summary: Make savepoint format configurable for upgrades/savepoint 
operations
 Key: FLINK-28763
 URL: https://issues.apache.org/jira/browse/FLINK-28763
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.2.0


We should make the savepoint format configurable for Flink 1.15+ so users
can select canonical or native. This should be configured directly in
flinkConfiguration.

[https://stackoverflow.com/questions/73169474/apache-flink-k8s-operator-and-native-savepoint-format]
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28762) Support AvroParquetWriters in PyFlink

2022-08-01 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-28762:
-

 Summary: Support AvroParquetWriters in PyFlink
 Key: FLINK-28762
 URL: https://issues.apache.org/jira/browse/FLINK-28762
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Affects Versions: 1.15.1
Reporter: Juntao Hu
 Fix For: 1.16.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28761) BinaryClassificationEvaluator cannot work with unaligned checkpoint

2022-08-01 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-28761:


 Summary: BinaryClassificationEvaluator cannot work with unaligned 
checkpoint
 Key: FLINK-28761
 URL: https://issues.apache.org/jira/browse/FLINK-28761
 Project: Flink
  Issue Type: Bug
  Components: Library / Machine Learning
Affects Versions: ml-2.1.0
Reporter: Yunfeng Zhou


If we make {{BinaryClassificationEvaluatorTest}} extend {{AbstractTestBase}}, 
this test class throws the following exception during execution:
 
{code:java}
org.apache.flink.table.api.TableException: Failed to execute sql

at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:854)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1317)
at org.apache.flink.table.api.internal.TableImpl.execute(TableImpl.java:605)
at org.apache.flink.ml.evaluation.BinaryClassificationEvaluatorTest.testEvaluateWithWeight(BinaryClassificationEvaluatorTest.java:305)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: java.lang.UnsupportedOperationException: Unaligned checkpoints are currently not supported for custom partitioners, as rescaling is not guaranteed to work correctly.
The user can force Unaligned Checkpoints by using 'execution.checkpointing.unaligned.forced'
at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.preValidate(StreamingJobGraphGenerator.java:390)
at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:166)
at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:121)
at org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:993)
at org.apache.flink.client.StreamGraphTranslator.translateToJobGraph(StreamGraphTranslator.java:50)
at 
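The root cause names the workaround switch directly. A minimal configuration sketch (assuming the standard Flink 1.15 option keys) that forces unaligned checkpoints despite the custom partitioner:

```yaml
# flink-conf.yaml sketch; forcing unaligned checkpoints with custom
# partitioners is explicitly at the user's own risk, per the exception text.
execution.checkpointing.unaligned: true
execution.checkpointing.unaligned.forced: true
```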

[jira] [Created] (FLINK-28760) Hive table store support can't be fully enabled by ADD JAR statement

2022-08-01 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-28760:
---

 Summary: Hive table store support can't be fully enabled by ADD 
JAR statement
 Key: FLINK-28760
 URL: https://issues.apache.org/jira/browse/FLINK-28760
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.2.0, table-store-0.3.0
Reporter: Caizhi Weng
 Fix For: table-store-0.2.0, table-store-0.3.0


The current table store Hive documentation states that, to enable support in Hive, 
the ADD JAR statement should be used to add the Hive connector jar.

However, some types of queries, for example join statements on the MR engine, may 
still throw a ClassNotFoundException. The correct way to enable table store 
support is to create an auxlib directory under the root directory of Hive and 
copy the jar into that directory.
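A minimal shell sketch of the auxlib approach (the Hive path and the jar version are illustrative; adjust them to your installation):

```shell
# Illustrative HIVE_HOME; point this at your actual Hive root directory.
HIVE_HOME=${HIVE_HOME:-/tmp/hive}

# Jars under auxlib are put on the classpath of every Hive session,
# including the MR tasks that ADD JAR does not reach.
mkdir -p "$HIVE_HOME/auxlib"

# Copy the table store Hive connector jar (version is illustrative):
#   cp flink-table-store-hive-connector-0.2.0.jar "$HIVE_HOME/auxlib/"
echo "auxlib ready at $HIVE_HOME/auxlib"
```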





[jira] [Created] (FLINK-28759) Enable speculative execution in AdaptiveBatchScheduler TPC-DS e2e tests

2022-08-01 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28759:
---

 Summary: Enable speculative execution in AdaptiveBatchScheduler 
TPC-DS e2e tests
 Key: FLINK-28759
 URL: https://issues.apache.org/jira/browse/FLINK-28759
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Tests
Reporter: Zhu Zhu
 Fix For: 1.16.0


To verify the correctness of speculative execution, we can enable it in the 
AdaptiveBatchScheduler TPC-DS e2e tests, which run many different batch 
jobs and verify the results.
Note that we need to disable the blocklist (by setting the block duration to 0) in 
such single-machine e2e tests.
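As a sketch, the configuration for such an e2e run could look like the following (the option keys follow the speculative execution FLIP and are assumptions until the 1.16 release is finalized):

```yaml
# Assumed option keys; verify against the final 1.16 documentation.
jobmanager.scheduler: AdaptiveBatch
execution.batch.speculative.enabled: true
# Disable the slow-node blocklist for single-machine e2e tests,
# otherwise the only available machine could end up blocked.
execution.batch.speculative.block-slow-node-duration: 0min
```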


