Re: [DISCUSS] FLIP-268: Rack Awareness for Kafka Sources

2022-11-10 Thread Martijn Visser
Hi Yanfei,

The version for Kafka Clients has been upgraded to 3.2.3 since 1.16 via
FLINK-29513 [1]

Best regards,

Martijn

[1] https://issues.apache.org/jira/browse/FLINK-29513

On Fri, Nov 11, 2022 at 4:11 AM Yanfei Lei  wrote:

> Hi Jeremy,
>
> Thanks for the proposal.
> I'm curious if this feature can adapt to the current kafka-client version
> in the connector. The rack awareness feature requires a version>=3.2.0 to
> use, while the kafka-client version of Kafka connector is 2.8.1.  Will we
> upgrade the Kafka-client version first?
>
> Best,
> Yanfei
>
> Jeremy DeGroot  于2022年11月11日周五 05:35写道:
>
> > Kafak has a rack awareness feature that allows brokers and consumers to
> > communicate about the rack (or AWS Availability Zone) they're located in.
> > Reading from a local broker can save money in bandwidth and improve
> latency
> > for your consumers.
> >
> > This improvement proposes that a Kafka Consumer could be configured with
> a
> > callback that could be run when it's being configured on the task
> manager,
> > that will set the appropriate value at runtime if a value is provided.
> >
> > More detail about this proposal can be found at
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-268%3A+Kafka+Rack+Awareness
> >
> >
> > More information about the Kafka rack awareness feature is at
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
> >
> >
> > Best,
> >
> > Jeremy
> >
>


[jira] [Created] (FLINK-29990) Unparsed SQL for SqlTableLike cannot be parsed correctly

2022-11-10 Thread Shuo Cheng (Jira)
Shuo Cheng created FLINK-29990:
--

 Summary: Unparsed SQL for SqlTableLike cannot be parsed correctly
 Key: FLINK-29990
 URL: https://issues.apache.org/jira/browse/FLINK-29990
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.16.0
Reporter: Shuo Cheng
 Fix For: 1.17.0


Consider the following DDL sql:
{code:java}
create table source_table(
  a int,
  b bigint,
  c string
)
LIKE parent_table{code}
After unparsed by sql parser, we get the following result:
{code:java}
CREATE TABLE `SOURCE_TABLE` (
  `A` INTEGER,
  `B` BIGINT,
  `C` STRING
)
LIKE `PARENT_TABLE` (
) {code}
Exception will be thrown, when you trying to parse the above sql.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29989) Enable FlameGraph for arbitrary thread on TaskManager

2022-11-10 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-29989:
---

 Summary: Enable FlameGraph for arbitrary thread on TaskManager
 Key: FLINK-29989
 URL: https://issues.apache.org/jira/browse/FLINK-29989
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Zhu Zhu


FlameGraph for arbitrary thread on TaskManager can be helpful for tasks which 
will spawn other worker threads. See FLINK-29629 for more details.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29988) Improve upper case fields for hive metastore

2022-11-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29988:


 Summary: Improve upper case fields for hive metastore
 Key: FLINK-29988
 URL: https://issues.apache.org/jira/browse/FLINK-29988
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee


If the fields in the fts table are uppercase, there will be a mismatched 
exception when used in the Hive.

1. If it is not supported at the beginning, throw an exception when flink 
creates a table to the hive metastore.
2. If it is supported, so that no error is reported in the whole process, but 
save lower case in hive metastore. We can check columns with the same name when 
creating a table in Flink with hive metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29987) PartialUpdateITCase.testForeignKeyJo is unstable

2022-11-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29987:


 Summary: PartialUpdateITCase.testForeignKeyJo is unstable
 Key: FLINK-29987
 URL: https://issues.apache.org/jira/browse/FLINK-29987
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-268: Rack Awareness for Kafka Sources

2022-11-10 Thread Yanfei Lei
Hi Jeremy,

Thanks for the proposal.
I'm curious if this feature can adapt to the current kafka-client version
in the connector. The rack awareness feature requires a version>=3.2.0 to
use, while the kafka-client version of Kafka connector is 2.8.1.  Will we
upgrade the Kafka-client version first?

Best,
Yanfei

Jeremy DeGroot  于2022年11月11日周五 05:35写道:

> Kafak has a rack awareness feature that allows brokers and consumers to
> communicate about the rack (or AWS Availability Zone) they're located in.
> Reading from a local broker can save money in bandwidth and improve latency
> for your consumers.
>
> This improvement proposes that a Kafka Consumer could be configured with a
> callback that could be run when it's being configured on the task manager,
> that will set the appropriate value at runtime if a value is provided.
>
> More detail about this proposal can be found at
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-268%3A+Kafka+Rack+Awareness
>
>
> More information about the Kafka rack awareness feature is at
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
>
>
> Best,
>
> Jeremy
>


Re: SQL Gateway and SQL Client

2022-11-10 Thread Shengkai Fang
Hi, Jim.

> how to pass additional headers when sending REST requests

Could you share what headers do you want to send when using SQL Client?  I
think there are two cases we need to consider. Please correct me if I am
wrong.

# Case 1

If users wants to connect to the SQL Gateway with its password, I think the
users should type
```
./sql-client.sh --user xxx --password xxx
```
in the terminal and the OpenSessionRequest should be enough.

# Case 2

If users  wants to modify the execution config, users should type
```
Flink SQL> SET  `` = ``;
```
in the terminal. The Client can send ExecuteStatementRequest to the Gateway.

> As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey know.

It would be nice you can join us to finish the feature. I think the
modification about the SQL Gateway side is ready to review.

Best,
Shengkai


Jim Hughes  于2022年11月11日周五 05:19写道:

> Hi Yu Zelin,
>
> I have read through your draft and it looks good.  I am new to Flink, so I
> haven't learned about everything which needs to be done yet.
>
> One of the considerations that I'm interested in understanding is how to
> pass additional headers when sending REST requests.  From looking at the
> code, it looks like a custom `OutboundChannelHandlerFactory` could be
> created to read additional configuration and set headers.  Does that make
> sense?
>
> Thank you very much for sharing the proof of concept code and the
> document.  As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey
> know.  We'll be happy to review them.
>
> Cheers,
>
> Jim
>
> On Wed, Nov 9, 2022 at 11:43 PM yu zelin  wrote:
>
> > Hi, all
> > Sorry for late response. As Shengkai mentioned, Currently I’m working
> with
> > him on SQL Client, dedicating to implement the Remote Mode of SQL
> Client. I
> > have written a draft of implementation plan and will write a FLIP about
> it
> > ASAP. If you are interested in, please take a look at the draft and it’s
> > nice if you give me some feedback.
> > The doc is at:
> >
> https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing
> >
> > > 2022年11月7日 11:19,Shengkai Fang  写道:
> > >
> > > Hi, all. Sorry for the late reply.
> > >
> > > > Is the gateway mode planned to be supported for SQL Client in 1.17?
> > > > Do you have anything you can already share so we can start with your
> > work or just play around with it.
> > >
> > > Yes. @yzl is working on it and he will list the implementation plan
> > later and share the progress. I think the change is not very large and I
> > think it's not a big problem to finish this in the release-1.17. I will
> > join to develop this in the mid of November.
> > >
> > > Best,
> > > Shengkai
> > >
> > >
> > >
> > >
> > > Jamie Grier mailto:jgr...@apache.org>>
> > 于2022年11月5日周六 00:48写道:
> > >> Hi Shengkai,
> > >>
> > >> We're doing more and more Flink development at Confluent these days
> and
> > we're currently trying to bootstrap a prototype that relies on the SQL
> > Client and Gateway.  We will be using the the components in some of our
> > projects and would like to co-develop them with you and the rest of the
> > Flink community.
> > >>
> > >> As of right now it's a pretty big blocker for our upcoming milestone
> > that the SQL Client has not yet been modified to talk to the SQL Gateway
> > and we want to help with this effort ASAP!  We would be even willing to
> > take over the work if it's not yet started but I suspect it already is.
> > >>
> > >> Anyway, rather than start working immediately on the SQL Client and
> > adding a the new Gateway mode ourselves we wanted to start a conversation
> > with you and see where you're at with things and offer to help.
> > >>
> > >> Do you have anything you can already share so we can start with your
> > work or just play around with it.  Like I said, we just want to get
> started
> > and are very able to help in this area.  We see both the SQL Client and
> > Gateway being very important for us and have a good team to help develop
> it.
> > >>
> > >> Let me know if there is a branch you can share, etc.  It would be much
> > appreciated!
> > >>
> > >> -Jamie Grier
> > >>
> > >>
> > >> On 2022/10/28 06:06:49 Shengkai Fang wrote:
> > >> > Hi.
> > >> >
> > >> > > Is there a possibility for us to get engaged and at least
> introduce
> > >> > initial changes to support authentication/authorization?
> > >> >
> > >> > Yes. You can write a FLIP about the design and change. We can
> discuss
> > this
> > >> > in the dev mail. If the FLIP passes, we can develop it together.
> > >> >
> > >> > > Another question about persistent Gateway: did you have any
> specific
> > >> > thoughts about it or some draft design?
> > >> >
> > >> > We don't have any detailed plan about this. But I know Livy has a
> > similar
> > >> > feature.
> > >> >
> > >> > Best,
> > >> > Shengkai
> > >> >
> > >> >
> > >> > Alexey Leonov-Vendrovskiy  > vendrov...@gmail.com>> 于2022年10月27日周四 15:12写道:
> > >> >
> > >> > > Apologies 

[jira] [Created] (FLINK-29986) How to persist flink catalogs configuration?

2022-11-10 Thread Alvin Ge (Jira)
Alvin Ge created FLINK-29986:


 Summary: How to persist flink catalogs configuration?
 Key: FLINK-29986
 URL: https://issues.apache.org/jira/browse/FLINK-29986
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.16.0
 Environment: Cent OS

Flink 1.16

Hive 3.1.3

Hadoop 3.3.4
Reporter: Alvin Ge


Hello every one.

 

When I closed sql terminal(./sql-client.sh), The Catalogs I created are all 
gone when I open sql terminal(./sql-client.sh) next time.

Do I have some other way to persist these catalogs?

 

Thanks.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] FLIP-268: Rack Awareness for Kafka Sources

2022-11-10 Thread Jeremy DeGroot
Kafak has a rack awareness feature that allows brokers and consumers to
communicate about the rack (or AWS Availability Zone) they're located in.
Reading from a local broker can save money in bandwidth and improve latency
for your consumers.

This improvement proposes that a Kafka Consumer could be configured with a
callback that could be run when it's being configured on the task manager,
that will set the appropriate value at runtime if a value is provided.

More detail about this proposal can be found at
https://cwiki.apache.org/confluence/display/FLINK/FLIP-268%3A+Kafka+Rack+Awareness


More information about the Kafka rack awareness feature is at
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams


Best,

Jeremy


Re: SQL Gateway and SQL Client

2022-11-10 Thread Jim Hughes
Hi Yu Zelin,

I have read through your draft and it looks good.  I am new to Flink, so I
haven't learned about everything which needs to be done yet.

One of the considerations that I'm interested in understanding is how to
pass additional headers when sending REST requests.  From looking at the
code, it looks like a custom `OutboundChannelHandlerFactory` could be
created to read additional configuration and set headers.  Does that make
sense?

Thank you very much for sharing the proof of concept code and the
document.  As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey
know.  We'll be happy to review them.

Cheers,

Jim

On Wed, Nov 9, 2022 at 11:43 PM yu zelin  wrote:

> Hi, all
> Sorry for late response. As Shengkai mentioned, Currently I’m working with
> him on SQL Client, dedicating to implement the Remote Mode of SQL Client. I
> have written a draft of implementation plan and will write a FLIP about it
> ASAP. If you are interested in, please take a look at the draft and it’s
> nice if you give me some feedback.
> The doc is at:
> https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing
>
> > 2022年11月7日 11:19,Shengkai Fang  写道:
> >
> > Hi, all. Sorry for the late reply.
> >
> > > Is the gateway mode planned to be supported for SQL Client in 1.17?
> > > Do you have anything you can already share so we can start with your
> work or just play around with it.
> >
> > Yes. @yzl is working on it and he will list the implementation plan
> later and share the progress. I think the change is not very large and I
> think it's not a big problem to finish this in the release-1.17. I will
> join to develop this in the mid of November.
> >
> > Best,
> > Shengkai
> >
> >
> >
> >
> > Jamie Grier mailto:jgr...@apache.org>>
> 于2022年11月5日周六 00:48写道:
> >> Hi Shengkai,
> >>
> >> We're doing more and more Flink development at Confluent these days and
> we're currently trying to bootstrap a prototype that relies on the SQL
> Client and Gateway.  We will be using the the components in some of our
> projects and would like to co-develop them with you and the rest of the
> Flink community.
> >>
> >> As of right now it's a pretty big blocker for our upcoming milestone
> that the SQL Client has not yet been modified to talk to the SQL Gateway
> and we want to help with this effort ASAP!  We would be even willing to
> take over the work if it's not yet started but I suspect it already is.
> >>
> >> Anyway, rather than start working immediately on the SQL Client and
> adding a the new Gateway mode ourselves we wanted to start a conversation
> with you and see where you're at with things and offer to help.
> >>
> >> Do you have anything you can already share so we can start with your
> work or just play around with it.  Like I said, we just want to get started
> and are very able to help in this area.  We see both the SQL Client and
> Gateway being very important for us and have a good team to help develop it.
> >>
> >> Let me know if there is a branch you can share, etc.  It would be much
> appreciated!
> >>
> >> -Jamie Grier
> >>
> >>
> >> On 2022/10/28 06:06:49 Shengkai Fang wrote:
> >> > Hi.
> >> >
> >> > > Is there a possibility for us to get engaged and at least introduce
> >> > initial changes to support authentication/authorization?
> >> >
> >> > Yes. You can write a FLIP about the design and change. We can discuss
> this
> >> > in the dev mail. If the FLIP passes, we can develop it together.
> >> >
> >> > > Another question about persistent Gateway: did you have any specific
> >> > thoughts about it or some draft design?
> >> >
> >> > We don't have any detailed plan about this. But I know Livy has a
> similar
> >> > feature.
> >> >
> >> > Best,
> >> > Shengkai
> >> >
> >> >
> >> > Alexey Leonov-Vendrovskiy  vendrov...@gmail.com>> 于2022年10月27日周四 15:12写道:
> >> >
> >> > > Apologies from the delayed response on my side.
> >> > >
> >> > >  I think the authentication module is not part of our plan in 1.17
> because
> >> > >> of the busy work. I think we'll start the design at the end of the
> >> > >> release-1.17.
> >> > >
> >> > >
> >> > > Is there a possibility for us to get engaged and at least introduce
> >> > > initial changes to support authentication/authorization?
> Specifically,
> >> > > changes in the API and in SQL Client.
> >> > >
> >> > > We expect the following authentication flow:
> >> > >
> >> > > On the SQL gateway we want to be able to use a delegation token.
> >> > > SQL client should be able to supply an API key.
> >> > > The SQL Gateway *would not *be submitting jobs on behalf of the
> client.
> >> > >
> >> > > Ideally it would be nice to introduce some interfaces in the SQL
> Gateway
> >> > > that would allow implementing custom authentication and
> authorization.
> >> > >
> >> > > Another question about persistent Gateway: did you have any specific
> >> > > thoughts about it or some draft design?
> >> > >
> >> > > Thanks,
> >> > > Alexey
> >> > >
> >> 

Re: [DISCUSS] Release Flink 1.15.3

2022-11-10 Thread Martijn Visser
Hi Fabian,

I've added 1.15.4 as a new release version.

Thanks, Martijn

On Thu, Nov 10, 2022 at 5:18 PM Fabian Paul
 wrote:

> I conclude that the community has accepted another release, and I will open
> the voting thread shortly. Can someone with PMC rights add 1.15.4 as a new
> release version in JIRA [1] so that I can update the still open tickets?
>
> Best,
> Fabian
>
> [1]
>
> https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions
>
> On Wed, Nov 2, 2022 at 2:07 PM Fabian Paul  wrote:
>
> > Thanks for all the replies. @xintong I'll definitely come back to your
> > offer when facing steps that require PMC rights for the release.
> >
> > I checked the JIRA and found four blocking/critical issues affecting
> 1.15.2
> >
> > - FLINK-29830 
> > - FLINK-29492 
> > - FLINK-29315 
> > - FLINK-29234 
> >
> > I'll reach out to the ticket owners to get their opinion about the
> current
> > status. In case, someone knows of some pending fixes that I haven't
> > mentioned please let me know.
> >
> > Best,
> > Fabian
> >
> > On Wed, Oct 26, 2022 at 2:01 PM Konstantin Knauf 
> > wrote:
> >
> >> +1, thanks Fabian.
> >>
> >> Am Mi., 26. Okt. 2022 um 08:26 Uhr schrieb Danny Cranmer <
> >> dannycran...@apache.org>:
> >>
> >> > +1, thanks for driving this Fabian.
> >> >
> >> > Danny,
> >> >
> >> > On Wed, Oct 26, 2022 at 2:22 AM yuxia 
> >> wrote:
> >> >
> >> > > Thanks for driving this.
> >> > > +1 for release 1.15.3
> >> > >
> >> > > Best regards,
> >> > > Yuxia
> >> > >
> >> > > - 原始邮件 -
> >> > > 发件人: "Leonard Xu" 
> >> > > 收件人: "dev" 
> >> > > 发送时间: 星期二, 2022年 10 月 25日 下午 10:00:47
> >> > > 主题: Re: [DISCUSS] Release Flink 1.15.3
> >> > >
> >> > > Thanks Fabian for driving this.
> >> > >
> >> > > +1 to release 1.15.3.
> >> > >
> >> > > The bug tickets FLINK-26394 and FLINK-27148 should be fixed as well,
> >> I’ll
> >> > > help to address them soon.
> >> > >
> >> > > Best,
> >> > > Leonard Xu
> >> > >
> >> > >
> >> > >
> >> > > > 2022年10月25日 下午8:28,Jing Ge  写道:
> >> > > >
> >> > > > +1 The timing is good to have 1.15.3 release. Thanks Fabian for
> >> > bringing
> >> > > > this to our attention.
> >> > > >
> >> > > > I just checked PRs and didn't find the 1.15 backport of
> FLINK-29567
> >> > > > . Please be
> >> aware
> >> > of
> >> > > it.
> >> > > > Thanks!
> >> > > >
> >> > > > Best regards,
> >> > > > Jing
> >> > > >
> >> > > > On Tue, Oct 25, 2022 at 11:44 AM Xintong Song <
> >> tonysong...@gmail.com>
> >> > > wrote:
> >> > > >
> >> > > >> Thanks for bringing this up, Fabian.
> >> > > >>
> >> > > >> +1 for creating a 1.15.3 release. I've also seen users requiring
> >> this
> >> > > >> version [1].
> >> > > >>
> >> > > >> I can help with actions that require a PMC role, if needed.
> >> > > >>
> >> > > >> Best,
> >> > > >>
> >> > > >> Xintong
> >> > > >>
> >> > > >>
> >> > > >> [1]
> >> https://lists.apache.org/thread/501q4l1c6gs8hwh433bw3v1y8fs9cw2n
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >> On Tue, Oct 25, 2022 at 5:11 PM Fabian Paul 
> >> wrote:
> >> > > >>
> >> > > >>> Hi all,
> >> > > >>>
> >> > > >>> I want to start the discussion of creating a new 1.15 patch
> >> release
> >> > > >>> (1.15.3). The last 1.15 release is almost two months old, and
> >> since
> >> > > then,
> >> > > >>> ~60 tickets have been closed, targeting 1.15.3. It includes
> >> critical
> >> > > >>> changes to the sink architecture, including:
> >> > > >>>
> >> > > >>> - Reverting the sink metric naming [1]
> >> > > >>> - Recovery problems for sinks using the GlobalCommitter
> [2][3][4]
> >> > > >>>
> >> > > >>> If the community agrees to create a new patch release, I could
> >> > > volunteer
> >> > > >> as
> >> > > >>> the release manager.
> >> > > >>>
> >> > > >>> Best,
> >> > > >>> Fabian
> >> > > >>>
> >> > > >>> [1] https://issues.apache.org/jira/browse/FLINK-29567
> >> > > >>> [2] https://issues.apache.org/jira/browse/FLINK-29509
> >> > > >>> [3] https://issues.apache.org/jira/browse/FLINK-29512
> >> > > >>> [4] https://issues.apache.org/jira/browse/FLINK-29627
> >> > > >>>
> >> > > >>
> >> > >
> >> >
> >>
> >>
> >> --
> >> https://twitter.com/snntrable
> >> https://github.com/knaufk
> >>
> >
>


[jira] [Created] (FLINK-29985) SlotTable not close on TM termination

2022-11-10 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-29985:
-

 Summary: SlotTable not close on TM termination
 Key: FLINK-29985
 URL: https://issues.apache.org/jira/browse/FLINK-29985
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.16.0, 1.15.3
Reporter: Roman Khachatryan


When a slot is released, the associated resources are released as well, in 
particular, MemoryManager. MemoryManager might hold not only memory, but also 
some arbitrary shared resources (currently, PythonSharedResources and 
RocksDBSharedResources).

When TM is stopped by JManager, its slot table is closed, causing all its slot 
to be released

When TM is stopped by SIGTERM (i.e. external resource manager), its slot table 
is NOT closed.

That means that in standalone clusters, some resources might not be released.

 

As of now, RocksDBSharedResources contains only ephemeral resources.

Not sure about PythonSharedResources, but likely it is associated with a 
separate process.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29984) Flink Histogram not emitting min and max when using Prometheus Reporter

2022-11-10 Thread Lim Qing Wei (Jira)
Lim Qing Wei created FLINK-29984:


 Summary: Flink Histogram not emitting min and max when using 
Prometheus Reporter
 Key: FLINK-29984
 URL: https://issues.apache.org/jira/browse/FLINK-29984
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Metrics
Affects Versions: 1.13.6
Reporter: Lim Qing Wei


I am currently registering a Flink Histogram and using the Prometheus Metrics 
Reporter to send this metric to our Time Series Data Storage. When Prometheus 
grabs this metric and converts it to the "summary" type, there is no sum found 
(only the streaming quantiles and count). This is causing an issue when our 
metrics agent is attempting to capture the Flink Histogram/Prometheus Summary.

I was wondering if in a newer version (than 1.13.6) the histogram sum is 
computed by Flink and what that version would be? If not, is there any work 
around so that a Flink histogram can emit all 3 elements (quantiles, sum, and 
count) in Prometheus format?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] Release 1.15.3, release candidate #1

2022-11-10 Thread Fabian Paul
Hi everyone, Please review and vote on the release candidate #1 for the
version 1.15.3, as follows: [ ] +1, Approve the release [ ] -1, Do not
approve the release (please provide specific comments) The complete staging
area is available for your review, which includes: - JIRA release notes
[1], - the official Apache source release and binary convenience releases
to be deployed to dist.apache.org [2], which are signed with the key with
fingerprint 90755B0A184BD9FFD22B6BE19D4F76C84EC11E37 [3], - all artifacts
to be deployed to the Maven Central Repository [4], - source code tag
"release-1.15.3-rc1" [5], - website pull request listing the new release
and adding announcement blog post [6]. The vote will be open for at least
72 hours. It is adopted by majority approval, with at least 3 PMC
affirmative votes.

Best, Fabian
[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352210

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.15.3-r
c1
[3] https://dist.apache.org/repos/dist/release/flink/KEYS [4]
https://repository.apache.org/content/repositories/orgapacheflink-1548

[5] https://github.com/apache/flink/tree/release-1.15.3-rc
1 [6]
https://github.com/apache/flink-web/pull/581



Re: [DISCUSS] Release Flink 1.15.3

2022-11-10 Thread Fabian Paul
I conclude that the community has accepted another release, and I will open
the voting thread shortly. Can someone with PMC rights add 1.15.4 as a new
release version in JIRA [1] so that I can update the still open tickets?

Best,
Fabian

[1]
https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions

On Wed, Nov 2, 2022 at 2:07 PM Fabian Paul  wrote:

> Thanks for all the replies. @xintong I'll definitely come back to your
> offer when facing steps that require PMC rights for the release.
>
> I checked the JIRA and found four blocking/critical issues affecting 1.15.2
>
> - FLINK-29830 
> - FLINK-29492 
> - FLINK-29315 
> - FLINK-29234 
>
> I'll reach out to the ticket owners to get their opinion about the current
> status. In case, someone knows of some pending fixes that I haven't
> mentioned please let me know.
>
> Best,
> Fabian
>
> On Wed, Oct 26, 2022 at 2:01 PM Konstantin Knauf 
> wrote:
>
>> +1, thanks Fabian.
>>
>> Am Mi., 26. Okt. 2022 um 08:26 Uhr schrieb Danny Cranmer <
>> dannycran...@apache.org>:
>>
>> > +1, thanks for driving this Fabian.
>> >
>> > Danny,
>> >
>> > On Wed, Oct 26, 2022 at 2:22 AM yuxia 
>> wrote:
>> >
>> > > Thanks for driving this.
>> > > +1 for release 1.15.3
>> > >
>> > > Best regards,
>> > > Yuxia
>> > >
>> > > - 原始邮件 -
>> > > 发件人: "Leonard Xu" 
>> > > 收件人: "dev" 
>> > > 发送时间: 星期二, 2022年 10 月 25日 下午 10:00:47
>> > > 主题: Re: [DISCUSS] Release Flink 1.15.3
>> > >
>> > > Thanks Fabian for driving this.
>> > >
>> > > +1 to release 1.15.3.
>> > >
>> > > The bug tickets FLINK-26394 and FLINK-27148 should be fixed as well,
>> I’ll
>> > > help to address them soon.
>> > >
>> > > Best,
>> > > Leonard Xu
>> > >
>> > >
>> > >
>> > > > 2022年10月25日 下午8:28,Jing Ge  写道:
>> > > >
>> > > > +1 The timing is good to have 1.15.3 release. Thanks Fabian for
>> > bringing
>> > > > this to our attention.
>> > > >
>> > > > I just checked PRs and didn't find the 1.15 backport of FLINK-29567
>> > > > . Please be
>> aware
>> > of
>> > > it.
>> > > > Thanks!
>> > > >
>> > > > Best regards,
>> > > > Jing
>> > > >
>> > > > On Tue, Oct 25, 2022 at 11:44 AM Xintong Song <
>> tonysong...@gmail.com>
>> > > wrote:
>> > > >
>> > > >> Thanks for bringing this up, Fabian.
>> > > >>
>> > > >> +1 for creating a 1.15.3 release. I've also seen users requiring
>> this
>> > > >> version [1].
>> > > >>
>> > > >> I can help with actions that require a PMC role, if needed.
>> > > >>
>> > > >> Best,
>> > > >>
>> > > >> Xintong
>> > > >>
>> > > >>
>> > > >> [1]
>> https://lists.apache.org/thread/501q4l1c6gs8hwh433bw3v1y8fs9cw2n
>> > > >>
>> > > >>
>> > > >>
>> > > >> On Tue, Oct 25, 2022 at 5:11 PM Fabian Paul 
>> wrote:
>> > > >>
>> > > >>> Hi all,
>> > > >>>
>> > > >>> I want to start the discussion of creating a new 1.15 patch
>> release
>> > > >>> (1.15.3). The last 1.15 release is almost two months old, and
>> since
>> > > then,
>> > > >>> ~60 tickets have been closed, targeting 1.15.3. It includes
>> critical
>> > > >>> changes to the sink architecture, including:
>> > > >>>
>> > > >>> - Reverting the sink metric naming [1]
>> > > >>> - Recovery problems for sinks using the GlobalCommitter [2][3][4]
>> > > >>>
>> > > >>> If the community agrees to create a new patch release, I could
>> > > volunteer
>> > > >> as
>> > > >>> the release manager.
>> > > >>>
>> > > >>> Best,
>> > > >>> Fabian
>> > > >>>
>> > > >>> [1] https://issues.apache.org/jira/browse/FLINK-29567
>> > > >>> [2] https://issues.apache.org/jira/browse/FLINK-29509
>> > > >>> [3] https://issues.apache.org/jira/browse/FLINK-29512
>> > > >>> [4] https://issues.apache.org/jira/browse/FLINK-29627
>> > > >>>
>> > > >>
>> > >
>> >
>>
>>
>> --
>> https://twitter.com/snntrable
>> https://github.com/knaufk
>>
>


Re: ASF Slack

2022-11-10 Thread Ryan Skraba
I have to admit, I also (weakly) prefer a single workspace where possible.
It's easy to miss (or just not look at) the apache-flink workspace because
I didn't think of it.

The spam and spam account issues are probably inevitable for any slack
workspace as it grows, and we can probably reasonably expect to deal with
it in apache-flink as well if it grows significantly popular.  Have we seen
any spammy issues so far at 1362 members (as opposed to the-asf's 11433
members?)

All my best, Ryan

On Thu, Nov 10, 2022 at 4:51 PM Maximilian Michels  wrote:

> >On the other hand, could you explain a bit more about what are the
> problems / drawbacks that you see in the current Flink Slack?
> >- I assume having to join too many workspaces counts one
>
> I like the idea of having a single workspace for all ASF projects,
> similarly
> to how we share JIRA, mail servers, or other infrastructure. Sharing
> resources usually means there are some constraints but it also has the
> upside of solving problems once for all projects. Arguably that's less true
> for a cloud product like Slack but some customizations can still be applied
> to Slack workspaces to streamline the experience. It looks like the ASF
> hasn't come up with a good workflow for projects, e.g. channel moderation.
>
> It might not be worth migrating back at this point but we can continue the
> evaluation at a later time.
>
> -Max
>
>
> On Thu, Nov 10, 2022 at 3:31 PM Chesnay Schepler 
> wrote:
>
> > https://issues.apache.org/jira/browse/INFRA-22573
> >
> > On 10/11/2022 11:17, Martijn Visser wrote:
> > > It was discussed at the latest Apachecon conference by Infra during one
> > of
> > > the lightning talks. If I recall correctly, it was primarily turned to
> > > invite-only due to spam. But definitely good to validate that.
> > >
> > > On Thu, Nov 10, 2022 at 11:09 AM Maximilian Michels 
> > wrote:
> > >
> > >> The registration problem should be solvable. Maybe it is due to the
> > Slack
> > >> pricing model that the ASF Slack is invite-only. I'll ping the
> community
> > >> mailing list.
> > >>
> > >> Have these issues at any point been discussed with the ASF? I feel
> like
> > >> this is one of the examples where a community spins off to do its own
> > thing
> > >> instead of working together with the foundation.
> > >>
> > >> -Max
> > >>
> > >> On Wed, Nov 9, 2022 at 10:46 AM Konstantin Knauf 
> > >> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> I agree with Xintong in the sense that I don't see what has changed
> > since
> > >>> the original decision on this topic. In my opinion, there is a high
> > cost
> > >> in
> > >>> moving to ASF now, namely I fear we will loose many of the >1200
> > members
> > >>> and the momentum that I see in the workspace. To me there would need
> to
> > >> be
> > >>> strong reason for reverting this decision now.
> > >>>
> > >>> Cheers,
> > >>>
> > >>> Kosntantin
> > >>>
> > >>> Am Di., 8. Nov. 2022 um 10:35 Uhr schrieb Xintong Song <
> > >>> tonysong...@gmail.com>:
> > >>>
> >  Hi Max,
> > 
> >  Thanks for bringing this up. I'm open to a re-evaluation of the
> Slack
> >  usages.
> > 
> >  In the initial discussion for creating the Slack workspace [1],
> > >>> leveraging
> >  the ASF Slack was indeed brought up as an alternative by many
> folks. I
> >  think we have chosen a dedicated Flink Slack over the ASF Slack
> mainly
> > >>> for
> >  two reasons.
> >  - ASF Slack is limited to people with an @apache.org email address
> >  - With a dedicated Flink Slack, we have the full authority to manage
> > >> and
> >  customize it. E.g., archiving / removing improper channels,
> reporting
> > >> the
> >  build and benchmark reports to channels, subscribing and re-post
> Flink
> > >>> blog
> >  posts.
> >  As far as I can see, these concerns for the ASF slack have not
> changed
> >  since the previous decision.
> > 
> >  On the other hand, could you explain a bit more about what are the
> > >>> problems
> >  / drawbacks that you see in the current Flink Slack?
> >  - I assume having to join too many workspaces counts one
> > 
> >  Best,
> > 
> >  Xintong
> > 
> > 
> >  [1]
> https://lists.apache.org/thread/n43r4qmwprhdmzrj494dbbwr9w7bbdcv
> > 
> >  On Tue, Nov 8, 2022 at 4:51 PM Martijn Visser <
> > >> martijnvis...@apache.org>
> >  wrote:
> > 
> > > If you click on the link from Beam via an incognito window/logged
> out
> > >>> of
> > > Slack, you will be prompted to provide the workspace URL of the
> ASF.
> > >> If
> >  you
> > > do that, you're prompted for a login screen or you can create an
> > >>> account.
> > > Creating an account prompts you to have an @apache.org email
> > >> address.
> >  See
> > > https://imgur.com/a/jXvr5Ai
> > >
> > > So for me that's a -1 for switching to the ASF workspace.
> > >
> > > On Mon, Nov 7, 2022 at 10:52 PM Austin 

[jira] [Created] (FLINK-29983) Table Store with Hive3 profile lacks hive-standalone-metastore dependency

2022-11-10 Thread Jane Chan (Jira)
Jane Chan created FLINK-29983:
-

 Summary: Table Store with Hive3 profile lacks 
hive-standalone-metastore dependency
 Key: FLINK-29983
 URL: https://issues.apache.org/jira/browse/FLINK-29983
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.3.0
Reporter: Jane Chan
 Fix For: table-store-0.3.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ACCOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Ryan Skraba
Excellent news -- welcome to the new era of easier, more timely and more
feature-rich releases for everyone!

Great job!  Ryan

On Thu, Nov 10, 2022 at 3:15 PM Leonard Xu  wrote:

> Thanks Chesnay and Martijn for the great work!   I believe the
> flink-connector-shared-utils[1] you built will help Flink connector
> developers a lot.
>
>
> Best,
> Leonard
> [1] https://github.com/apache/flink-connector-shared-utils
>
> 2022年11月10日 下午9:53,Martijn Visser  写道:
>
> Really happy with the first externalized connector for Flink. Thanks a lot
> to all of you involved!
>
> On Thu, Nov 10, 2022 at 12:51 PM Chesnay Schepler 
> wrote:
>
>> The Apache Flink community is very happy to announce the release of
>> Apache Flink Elasticsearch Connector 3.0.0.
>>
>> Apache Flink® is an open-source stream processing framework for
>> distributed, high-performing, always-available, and accurate data
>> streaming applications.
>>
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>>
>> This release marks the first time we have released a connector
>> separately from the main Flink release.
>> Over time more connectors will be migrated to this release model.
>>
>> This release is equivalent to the connector version released alongside
>> Flink 1.16.0 and acts as a drop-in replacement.
>>
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/projects/FLINK/versions/12352291
>>
>> We would like to thank all contributors of the Apache Flink community
>> who made this release possible!
>>
>> Regards,
>> Chesnay
>>
>
>


Re: ASF Slack

2022-11-10 Thread Maximilian Michels
>On the other hand, could you explain a bit more about what are the
problems / drawbacks that you see in the current Flink Slack?
>- I assume having to join too many workspaces counts one

I like the idea of having a single workspace for all ASF projects, similarly
to how we share JIRA, mail servers, or other infrastructure. Sharing
resources usually means there are some constraints but it also has the
upside of solving problems once for all projects. Arguably that's less true
for a cloud product like Slack but some customizations can still be applied
to Slack workspaces to streamline the experience. It looks like the ASF
hasn't come up with a good workflow for projects, e.g. channel moderation.

It might not be worth migrating back at this point but we can continue the
evaluation at a later time.

-Max


On Thu, Nov 10, 2022 at 3:31 PM Chesnay Schepler  wrote:

> https://issues.apache.org/jira/browse/INFRA-22573
>
> On 10/11/2022 11:17, Martijn Visser wrote:
> > It was discussed at the latest Apachecon conference by Infra during one
> of
> > the lightning talks. If I recall correctly, it was primarily turned to
> > invite-only due to spam. But definitely good to validate that.
> >
> > On Thu, Nov 10, 2022 at 11:09 AM Maximilian Michels 
> wrote:
> >
> >> The registration problem should be solvable. Maybe it is due to the
> Slack
> >> pricing model that the ASF Slack is invite-only. I'll ping the community
> >> mailing list.
> >>
> >> Have these issues at any point been discussed with the ASF? I feel like
> >> this is one of the examples where a community spins off to do its own
> thing
> >> instead of working together with the foundation.
> >>
> >> -Max
> >>
> >> On Wed, Nov 9, 2022 at 10:46 AM Konstantin Knauf 
> >> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I agree with Xintong in the sense that I don't see what has changed
> since
> >>> the original decision on this topic. In my opinion, there is a high
> cost
> >> in
> >>> moving to ASF now, namely I fear we will loose many of the >1200
> members
> >>> and the momentum that I see in the workspace. To me there would need to
> >> be
> >>> strong reason for reverting this decision now.
> >>>
> >>> Cheers,
> >>>
> >>> Kosntantin
> >>>
> >>> Am Di., 8. Nov. 2022 um 10:35 Uhr schrieb Xintong Song <
> >>> tonysong...@gmail.com>:
> >>>
>  Hi Max,
> 
>  Thanks for bringing this up. I'm open to a re-evaluation of the Slack
>  usages.
> 
>  In the initial discussion for creating the Slack workspace [1],
> >>> leveraging
>  the ASF Slack was indeed brought up as an alternative by many folks. I
>  think we have chosen a dedicated Flink Slack over the ASF Slack mainly
> >>> for
>  two reasons.
>  - ASF Slack is limited to people with an @apache.org email address
>  - With a dedicated Flink Slack, we have the full authority to manage
> >> and
>  customize it. E.g., archiving / removing improper channels, reporting
> >> the
>  build and benchmark reports to channels, subscribing and re-post Flink
> >>> blog
>  posts.
>  As far as I can see, these concerns for the ASF slack have not changed
>  since the previous decision.
> 
>  On the other hand, could you explain a bit more about what are the
> >>> problems
>  / drawbacks that you see in the current Flink Slack?
>  - I assume having to join too many workspaces counts one
> 
>  Best,
> 
>  Xintong
> 
> 
>  [1] https://lists.apache.org/thread/n43r4qmwprhdmzrj494dbbwr9w7bbdcv
> 
>  On Tue, Nov 8, 2022 at 4:51 PM Martijn Visser <
> >> martijnvis...@apache.org>
>  wrote:
> 
> > If you click on the link from Beam via an incognito window/logged out
> >>> of
> > Slack, you will be prompted to provide the workspace URL of the ASF.
> >> If
>  you
> > do that, you're prompted for a login screen or you can create an
> >>> account.
> > Creating an account prompts you to have an @apache.org email
> >> address.
>  See
> > https://imgur.com/a/jXvr5Ai
> >
> > So for me that's a -1 for switching to the ASF workspace.
> >
> > On Mon, Nov 7, 2022 at 10:52 PM Austin Bennett 
>  wrote:
> >> +1 to leveraging the larger ASF Community/Resources Slack Channel
>  rather
> >> than an independant one ... ASSUMING ANYONE CAN JOIN [ so that
> >> needs
> >>> to
> > be
> >> verified ].
> >>
> >> On Mon, Nov 7, 2022 at 9:05 AM Maximilian Michels 
> > wrote:
> >>> There is a way to work around the invite issue. For example, the
> >>> Beam
> >>> project has a direct invite link which sends you to the #beam
> >>> channel:
> >>> https://app.slack.com/client/T4S1WH2J3/C9H0YNP3P I'm not 100%
> >> sure
> >>> whether
> >>> this link actually works. I've take it from:
> >>> https://beam.apache.org/community/join-beam/
> >>>
> >>> -Max
> >>>
> >>> On Fri, Nov 4, 2022 at 1:48 PM Martijn Visser <
>  

[jira] [Created] (FLINK-29982) Move cassandra connector to dedicated repo

2022-11-10 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-29982:


 Summary: Move cassandra connector to dedicated repo
 Key: FLINK-29982
 URL: https://issues.apache.org/jira/browse/FLINK-29982
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Cassandra
Reporter: Etienne Chauchot


[This repo|https://github.com/apache/flink-connector-cassandra] has just been 
created to host the new Cassandra Source. But before merging the new source, 
the whole connector-cassandra module needs to be migrated to this dedicated 
repo.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: ASF Slack

2022-11-10 Thread Chesnay Schepler

https://issues.apache.org/jira/browse/INFRA-22573

On 10/11/2022 11:17, Martijn Visser wrote:

It was discussed at the latest Apachecon conference by Infra during one of
the lightning talks. If I recall correctly, it was primarily turned to
invite-only due to spam. But definitely good to validate that.

On Thu, Nov 10, 2022 at 11:09 AM Maximilian Michels  wrote:


The registration problem should be solvable. Maybe it is due to the Slack
pricing model that the ASF Slack is invite-only. I'll ping the community
mailing list.

Have these issues at any point been discussed with the ASF? I feel like
this is one of the examples where a community spins off to do its own thing
instead of working together with the foundation.

-Max

On Wed, Nov 9, 2022 at 10:46 AM Konstantin Knauf 
wrote:


Hi everyone,

I agree with Xintong in the sense that I don't see what has changed since
the original decision on this topic. In my opinion, there is a high cost

in

moving to ASF now, namely I fear we will loose many of the >1200 members
and the momentum that I see in the workspace. To me there would need to

be

strong reason for reverting this decision now.

Cheers,

Kosntantin

Am Di., 8. Nov. 2022 um 10:35 Uhr schrieb Xintong Song <
tonysong...@gmail.com>:


Hi Max,

Thanks for bringing this up. I'm open to a re-evaluation of the Slack
usages.

In the initial discussion for creating the Slack workspace [1],

leveraging

the ASF Slack was indeed brought up as an alternative by many folks. I
think we have chosen a dedicated Flink Slack over the ASF Slack mainly

for

two reasons.
- ASF Slack is limited to people with an @apache.org email address
- With a dedicated Flink Slack, we have the full authority to manage

and

customize it. E.g., archiving / removing improper channels, reporting

the

build and benchmark reports to channels, subscribing and re-post Flink

blog

posts.
As far as I can see, these concerns for the ASF slack have not changed
since the previous decision.

On the other hand, could you explain a bit more about what are the

problems

/ drawbacks that you see in the current Flink Slack?
- I assume having to join too many workspaces counts one

Best,

Xintong


[1] https://lists.apache.org/thread/n43r4qmwprhdmzrj494dbbwr9w7bbdcv

On Tue, Nov 8, 2022 at 4:51 PM Martijn Visser <

martijnvis...@apache.org>

wrote:


If you click on the link from Beam via an incognito window/logged out

of

Slack, you will be prompted to provide the workspace URL of the ASF.

If

you

do that, you're prompted for a login screen or you can create an

account.

Creating an account prompts you to have an @apache.org email

address.

See

https://imgur.com/a/jXvr5Ai

So for me that's a -1 for switching to the ASF workspace.

On Mon, Nov 7, 2022 at 10:52 PM Austin Bennett 

wrote:

+1 to leveraging the larger ASF Community/Resources Slack Channel

rather

than an independant one ... ASSUMING ANYONE CAN JOIN [ so that

needs

to

be

verified ].

On Mon, Nov 7, 2022 at 9:05 AM Maximilian Michels 

wrote:

There is a way to work around the invite issue. For example, the

Beam

project has a direct invite link which sends you to the #beam

channel:

https://app.slack.com/client/T4S1WH2J3/C9H0YNP3P I'm not 100%

sure

whether
this link actually works. I've take it from:
https://beam.apache.org/community/join-beam/

-Max

On Fri, Nov 4, 2022 at 1:48 PM Martijn Visser <

martijnvis...@apache.org

wrote:


Hi Max,


I wonder how that has played out since the creation of the

Slack

workspace? I haven't been following the Slack communication.

I think it has primarily played a role next to the existing User

mailing

list: lots of user questions. There were maybe a handful of

conversations

where the result of the conversation was a request to open a

Jira,

create a

FLIP or open up a discussion on the Dev list.


There is an invite link that they can use. Maybe not the most

straight-forward process but I think it doesn't stop users from

joining.

That's only possible to use for those with an @apache.org

e-mail

address,

see https://infra.apache.org/slack.html. Anyone else would need

to

be

invited by a committer, but I wouldn't be in favour of needing

to

spend

committers time in adding/inviting members on an ASF Slack

instance.

Best regards,

Martijn

On Fri, Nov 4, 2022 at 12:33 PM Maximilian Michels <

m...@apache.org>

wrote:

Hey Martijn, hi Chesnay,


The big problem of using the ASF Slack instance is that users

can

join

any Slack channel, including ones outside of Flink.

That is one of the main motivations for proposing to move to

the

ASF

Slack. We can create an unlimited number of "flink-XY" channels

in

the

ASF

Slack, although one or two are probably all we need. It seems

logical

that

we share the Slack workspace, just like the other

infrastructure

like

JIRA,

mail, Jenkins, web server, etc. I guess I'm just in too many

Slack

workspaces already.

 From an ASF standpoint, I think it would be desirable 

Re: [ACCOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Leonard Xu
Thanks Chesnay and Martijn for the great work!   I believe the 
flink-connector-shared-utils[1] you built will help Flink connector developers 
a lot.


Best,
Leonard
[1] https://github.com/apache/flink-connector-shared-utils

> 2022年11月10日 下午9:53,Martijn Visser  写道:
> 
> Really happy with the first externalized connector for Flink. Thanks a lot to 
> all of you involved!
> 
> On Thu, Nov 10, 2022 at 12:51 PM Chesnay Schepler  > wrote:
> The Apache Flink community is very happy to announce the release of 
> Apache Flink Elasticsearch Connector 3.0.0.
> 
> Apache Flink® is an open-source stream processing framework for 
> distributed, high-performing, always-available, and accurate data 
> streaming applications.
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html 
> 
> 
> This release marks the first time we have released a connector 
> separately from the main Flink release.
> Over time more connectors will be migrated to this release model.
> 
> This release is equivalent to the connector version released alongside 
> Flink 1.16.0 and acts as a drop-in replacement.
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/projects/FLINK/versions/12352291 
> 
> 
> We would like to thank all contributors of the Apache Flink community 
> who made this release possible!
> 
> Regards,
> Chesnay



Re: [ACCOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Martijn Visser
Really happy with the first externalized connector for Flink. Thanks a lot
to all of you involved!

On Thu, Nov 10, 2022 at 12:51 PM Chesnay Schepler 
wrote:

> The Apache Flink community is very happy to announce the release of
> Apache Flink Elasticsearch Connector 3.0.0.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data
> streaming applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> This release marks the first time we have released a connector
> separately from the main Flink release.
> Over time more connectors will be migrated to this release model.
>
> This release is equivalent to the connector version released alongside
> Flink 1.16.0 and acts as a drop-in replacement.
>
> The full release notes are available in Jira:
> https://issues.apache.org/jira/projects/FLINK/versions/12352291
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
>
> Regards,
> Chesnay
>


[jira] [Created] (FLINK-29981) Improve WatermarkAssignerChangelogNormalizeTransposeRule

2022-11-10 Thread godfrey he (Jira)
godfrey he created FLINK-29981:
--

 Summary: Improve WatermarkAssignerChangelogNormalizeTransposeRule
 Key: FLINK-29981
 URL: https://issues.apache.org/jira/browse/FLINK-29981
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: godfrey he


 

WatermarkAssignerChangelogNormalizeTransposeRule is too complex to maintain. 
It's better we can do some improvement, such as splitting 
WatermarkAssignerChangelogNormalizeTransposeRule into two rules



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29980) Wrap the HiveSource's bulkFormat to handle the partition keys

2022-11-10 Thread Aitozi (Jira)
Aitozi created FLINK-29980:
--

 Summary: Wrap the HiveSource's bulkFormat to handle the partition 
keys
 Key: FLINK-29980
 URL: https://issues.apache.org/jira/browse/FLINK-29980
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Aitozi


As described in https://issues.apache.org/jira/browse/FLINK-25113 to clean up 
the partition keys logic in the parquet and orc formats, hive source should 
leverage the {{FileInfoExtractorBulkFormat}} to handle the partition keys 
internally



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [blog article] Howto migrate a real-life batch pipeline from the DataSet API to the DataStream API

2022-11-10 Thread Jing Ge
Hi Etienne,

Nice blog! Thanks for sharing!

Best regards,
Jing


On Wed, Nov 9, 2022 at 5:49 PM Etienne Chauchot 
wrote:

> Hi Yun Gao,
>
> FYI I just updated the article after your review:
> https://echauchot.blogspot.com/2022/11/flink-howto-migrate-real-life-batch.html
>
> Best
>
> Etienne
> Le 09/11/2022 à 10:04, Etienne Chauchot a écrit :
>
> Hi Yun Gao,
>
> thanks for your email and your review !
>
> My comments are inline
> Le 08/11/2022 à 06:51, Yun Gao a écrit :
>
> Hi Etienne,
>
> Very thanks for the article! Flink is currently indeed keeping increasing
> the
> ability of unified batch / stream processing with the same api, and its a
> great
> pleasure that more and more users are trying this functionality. But I also
> have some questions regarding some details.
>
> First IMO, as a whole for the long run Flink will have two unified APIs,
> namely Table / SQL
> API and DataStream API. Users could express the computation logic with
> these two APIs
> for both bounded and unbounded data processing.
>
>
> Yes that is what I understood also throughout the discussions and jiras.
> And I also think IMHO that reducing the number of APIs to 2 was the good
> move.
>
>
> Underlying Flink provides two
> execution modes:  the streaming mode works with both bounded and unbounded
> data,
> and it executes in a way of incremental processing based on state; the
> batch mode works
> only with bounded data, and it executes in a ways level-by-level similar
> to the traditional
> batch processing frameworks. Users could switch the execution mode via
> EnvironmentSettings.inBatchMode() for
> StreamExecutionEnvironment.setRuntimeMode().
>
> As recommended in Flink docs(1) I have enabled the batch mode as I though
> it would be more efficient on my bounded pipeline but as a matter of fact
> the streaming mode seems to be more efficient on my use case. I'll test
> with higher volumes to confirm.
>
>
>
> Specially for DataStream, as implemented in FLIP-140, currently all the
> existing DataStream
> operation supports the batch execution mode in a unified way[1]:  data
> will be sorted for the
> keyBy() edges according to the key, then the following operations like
> reduce() could receive
> all the data belonging to the same key consecutively, then it could
> directly reducing the records
> of the same key without maintaining the intermediate states. In this way
> users could write the
> same code for both streaming and batch processing with the same code.
>
>
> Yes I have no doubt that my resulting Query3ViaFlinkRowDatastream pipeline
> will work with no modification if I plug an unbounded source to it.
>
>
>
> # Regarding the migration of Join / Reduce
>
> First I think Reduce is always supported and users could write
> dataStream.keyBy().reduce(xx)
> directly, and  if batch  execution mode is set, the reduce will not be
> executed in a incremental way,
> instead is acts much  like sort-based  aggregation in the traditional
> batch processing framework.
>
> Regarding Join, although the issue of FLINK-22587 indeed exists: current
> join has to be bound
> to a window and the GlobalWindow does not work properly, but with some
> more try currently
> it does not need users to  re-write the whole join from scratch: Users
> could write a dedicated
> window assigner that assigns all the  records to the same window instance
> and return
> EventTimeTrigger.create() as the default event-time trigger [2]. Then it
> works
>
> source1.join(source2)
> .where(a -> a.f0)
> .equalTo(b -> b.f0)
> .window(new EndOfStreamWindows())
> .apply();
>
> It does not requires records have event-time attached since the trigger of
> window is only
> relying on the time range of the window and the assignment does not need
> event-time either.
>
> The behavior of the join is also similar to sort-based join if batch mode
> is enabled.
>
> Of course it is not easy to use to let users do the workaround and we'll
> try to fix this issue in 1.17.
>
>
> Yes, this is a better workaround than the manual state-based join that I
> proposed. I tried it and it works perfectly with similar performance.
> Thanks.
>
>
> # Regarding support of Sort / Limit
>
> Currently these two operators are indeed not supported in the DataStream
> API directly. One initial
> though for these two operations are that users may convert the DataStream
> to Table API and use
> Table API for these two operators:
>
> DataStream xx = ... // Keeps the customized logic in DataStream
> Table tableXX = tableEnv.fromDataStream(dataStream);
> tableXX.orderBy($("a").asc());
>
>
> Yes I knew that workaround but I decided not to use it because I have a
> special SQL based implementation (for comparison reasons) so I did not want
> to mix SQL and DataStream APIs in the same pipeline.
>
>
> How do you think about this option? We are also assessing if the
> combination of DataStream
> API / Table API is sufficient for all the batch 

[GitHub] [flink-connector-shared-utils] zentol merged pull request #1: [FLINK-29472] Add first version of release scripts

2022-11-10 Thread GitBox


zentol merged PR #1:
URL: https://github.com/apache/flink-connector-shared-utils/pull/1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[ACCOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Chesnay Schepler
The Apache Flink community is very happy to announce the release of 
Apache Flink Elasticsearch Connector 3.0.0.


Apache Flink® is an open-source stream processing framework for 
distributed, high-performing, always-available, and accurate data 
streaming applications.


The release is available for download at:
https://flink.apache.org/downloads.html

This release marks the first time we have released a connector 
separately from the main Flink release.

Over time more connectors will be migrated to this release model.

This release is equivalent to the connector version released alongside 
Flink 1.16.0 and acts as a drop-in replacement.


The full release notes are available in Jira:
https://issues.apache.org/jira/projects/FLINK/versions/12352291

We would like to thank all contributors of the Apache Flink community 
who made this release possible!


Regards,
Chesnay


[jira] [Created] (FLINK-29979) connector_artifact should only work for stable releases

2022-11-10 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-29979:


 Summary: connector_artifact should only work for stable releases
 Key: FLINK-29979
 URL: https://issues.apache.org/jira/browse/FLINK-29979
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation
Affects Versions: 1.17.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-shaded 16.1, release candidate #1

2022-11-10 Thread Martijn Visser
+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PR


On Tue, Nov 8, 2022 at 11:48 AM Chesnay Schepler  wrote:

> Hi everyone,
> Please review and vote on the release candidate #1 for the version 16.1,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which are signed with the key with fingerprint C2EED7B111D464BA [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-16.1-rc1" [5],
> * website pull request listing the new release [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Release Manager
>
> [1]
> https://repository.apache.org/content/repositories/orgapacheflink-1546/
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-16.1-rc1
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1546/
> [5] https://github.com/apache/flink-shaded/releases/tag/release-16.1-rc1
> [6] https://github.com/apache/flink-web/pull/580
>


Re: ASF Slack

2022-11-10 Thread Martijn Visser
It was discussed at the latest Apachecon conference by Infra during one of
the lightning talks. If I recall correctly, it was primarily turned to
invite-only due to spam. But definitely good to validate that.

On Thu, Nov 10, 2022 at 11:09 AM Maximilian Michels  wrote:

> The registration problem should be solvable. Maybe it is due to the Slack
> pricing model that the ASF Slack is invite-only. I'll ping the community
> mailing list.
>
> Have these issues at any point been discussed with the ASF? I feel like
> this is one of the examples where a community spins off to do its own thing
> instead of working together with the foundation.
>
> -Max
>
> On Wed, Nov 9, 2022 at 10:46 AM Konstantin Knauf 
> wrote:
>
> > Hi everyone,
> >
> > I agree with Xintong in the sense that I don't see what has changed since
> > the original decision on this topic. In my opinion, there is a high cost
> in
> > moving to ASF now, namely I fear we will loose many of the >1200 members
> > and the momentum that I see in the workspace. To me there would need to
> be
> > strong reason for reverting this decision now.
> >
> > Cheers,
> >
> > Kosntantin
> >
> > Am Di., 8. Nov. 2022 um 10:35 Uhr schrieb Xintong Song <
> > tonysong...@gmail.com>:
> >
> > > Hi Max,
> > >
> > > Thanks for bringing this up. I'm open to a re-evaluation of the Slack
> > > usages.
> > >
> > > In the initial discussion for creating the Slack workspace [1],
> > leveraging
> > > the ASF Slack was indeed brought up as an alternative by many folks. I
> > > think we have chosen a dedicated Flink Slack over the ASF Slack mainly
> > for
> > > two reasons.
> > > - ASF Slack is limited to people with an @apache.org email address
> > > - With a dedicated Flink Slack, we have the full authority to manage
> and
> > > customize it. E.g., archiving / removing improper channels, reporting
> the
> > > build and benchmark reports to channels, subscribing and re-post Flink
> > blog
> > > posts.
> > > As far as I can see, these concerns for the ASF slack have not changed
> > > since the previous decision.
> > >
> > > On the other hand, could you explain a bit more about what are the
> > problems
> > > / drawbacks that you see in the current Flink Slack?
> > > - I assume having to join too many workspaces counts one
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://lists.apache.org/thread/n43r4qmwprhdmzrj494dbbwr9w7bbdcv
> > >
> > > On Tue, Nov 8, 2022 at 4:51 PM Martijn Visser <
> martijnvis...@apache.org>
> > > wrote:
> > >
> > > > If you click on the link from Beam via an incognito window/logged out
> > of
> > > > Slack, you will be prompted to provide the workspace URL of the ASF.
> If
> > > you
> > > > do that, you're prompted for a login screen or you can create an
> > account.
> > > > Creating an account prompts you to have an @apache.org email
> address.
> > > See
> > > > https://imgur.com/a/jXvr5Ai
> > > >
> > > > So for me that's a -1 for switching to the ASF workspace.
> > > >
> > > > On Mon, Nov 7, 2022 at 10:52 PM Austin Bennett 
> > > wrote:
> > > >
> > > > > +1 to leveraging the larger ASF Community/Resources Slack Channel
> > > rather
> > > > > than an independant one ... ASSUMING ANYONE CAN JOIN [ so that
> needs
> > to
> > > > be
> > > > > verified ].
> > > > >
> > > > > On Mon, Nov 7, 2022 at 9:05 AM Maximilian Michels 
> > > > wrote:
> > > > >
> > > > >> There is a way to work around the invite issue. For example, the
> > Beam
> > > > >> project has a direct invite link which sends you to the #beam
> > channel:
> > > > >> https://app.slack.com/client/T4S1WH2J3/C9H0YNP3P I'm not 100%
> sure
> > > > >> whether
> > > > >> this link actually works. I've take it from:
> > > > >> https://beam.apache.org/community/join-beam/
> > > > >>
> > > > >> -Max
> > > > >>
> > > > >> On Fri, Nov 4, 2022 at 1:48 PM Martijn Visser <
> > > martijnvis...@apache.org
> > > > >
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Max,
> > > > >> >
> > > > >> > > I wonder how that has played out since the creation of the
> Slack
> > > > >> > workspace? I haven't been following the Slack communication.
> > > > >> >
> > > > >> > I think it has primarily played a role next to the existing User
> > > > mailing
> > > > >> > list: lots of user questions. There were maybe a handful of
> > > > >> conversations
> > > > >> > where the result of the conversation was a request to open a
> Jira,
> > > > >> create a
> > > > >> > FLIP or open up a discussion on the Dev list.
> > > > >> >
> > > > >> > > There is an invite link that they can use. Maybe not the most
> > > > >> > straight-forward process but I think it doesn't stop users from
> > > > joining.
> > > > >> >
> > > > >> > That's only possible to use for those with an @apache.org
> e-mail
> > > > >> address,
> > > > >> > see https://infra.apache.org/slack.html. Anyone else would need
> > to
> > > be
> > > > >> > invited by a committer, but I wouldn't be in favour of needing
> to
> > > > spend
> > > > >> > committers time in 

Re: ASF Slack

2022-11-10 Thread Maximilian Michels
The registration problem should be solvable. Maybe it is due to the Slack
pricing model that the ASF Slack is invite-only. I'll ping the community
mailing list.

Have these issues at any point been discussed with the ASF? I feel like
this is one of the examples where a community spins off to do its own thing
instead of working together with the foundation.

-Max

On Wed, Nov 9, 2022 at 10:46 AM Konstantin Knauf  wrote:

> Hi everyone,
>
> I agree with Xintong in the sense that I don't see what has changed since
> the original decision on this topic. In my opinion, there is a high cost in
> moving to ASF now, namely I fear we will loose many of the >1200 members
> and the momentum that I see in the workspace. To me there would need to be
> strong reason for reverting this decision now.
>
> Cheers,
>
> Kosntantin
>
> Am Di., 8. Nov. 2022 um 10:35 Uhr schrieb Xintong Song <
> tonysong...@gmail.com>:
>
> > Hi Max,
> >
> > Thanks for bringing this up. I'm open to a re-evaluation of the Slack
> > usages.
> >
> > In the initial discussion for creating the Slack workspace [1],
> leveraging
> > the ASF Slack was indeed brought up as an alternative by many folks. I
> > think we have chosen a dedicated Flink Slack over the ASF Slack mainly
> for
> > two reasons.
> > - ASF Slack is limited to people with an @apache.org email address
> > - With a dedicated Flink Slack, we have the full authority to manage and
> > customize it. E.g., archiving / removing improper channels, reporting the
> > build and benchmark reports to channels, subscribing and re-post Flink
> blog
> > posts.
> > As far as I can see, these concerns for the ASF slack have not changed
> > since the previous decision.
> >
> > On the other hand, could you explain a bit more about what are the
> problems
> > / drawbacks that you see in the current Flink Slack?
> > - I assume having to join too many workspaces counts one
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/n43r4qmwprhdmzrj494dbbwr9w7bbdcv
> >
> > On Tue, Nov 8, 2022 at 4:51 PM Martijn Visser 
> > wrote:
> >
> > > If you click on the link from Beam via an incognito window/logged out
> of
> > > Slack, you will be prompted to provide the workspace URL of the ASF. If
> > you
> > > do that, you're prompted for a login screen or you can create an
> account.
> > > Creating an account prompts you to have an @apache.org email address.
> > See
> > > https://imgur.com/a/jXvr5Ai
> > >
> > > So for me that's a -1 for switching to the ASF workspace.
> > >
> > > On Mon, Nov 7, 2022 at 10:52 PM Austin Bennett 
> > wrote:
> > >
> > > > +1 to leveraging the larger ASF Community/Resources Slack Channel
> > rather
> > > > than an independant one ... ASSUMING ANYONE CAN JOIN [ so that needs
> to
> > > be
> > > > verified ].
> > > >
> > > > On Mon, Nov 7, 2022 at 9:05 AM Maximilian Michels 
> > > wrote:
> > > >
> > > >> There is a way to work around the invite issue. For example, the
> Beam
> > > >> project has a direct invite link which sends you to the #beam
> channel:
> > > >> https://app.slack.com/client/T4S1WH2J3/C9H0YNP3P I'm not 100% sure
> > > >> whether
> > > >> this link actually works. I've take it from:
> > > >> https://beam.apache.org/community/join-beam/
> > > >>
> > > >> -Max
> > > >>
> > > >> On Fri, Nov 4, 2022 at 1:48 PM Martijn Visser <
> > martijnvis...@apache.org
> > > >
> > > >> wrote:
> > > >>
> > > >> > Hi Max,
> > > >> >
> > > >> > > I wonder how that has played out since the creation of the Slack
> > > >> > workspace? I haven't been following the Slack communication.
> > > >> >
> > > >> > I think it has primarily played a role next to the existing User
> > > mailing
> > > >> > list: lots of user questions. There were maybe a handful of
> > > >> conversations
> > > >> > where the result of the conversation was a request to open a Jira,
> > > >> create a
> > > >> > FLIP or open up a discussion on the Dev list.
> > > >> >
> > > >> > > There is an invite link that they can use. Maybe not the most
> > > >> > straight-forward process but I think it doesn't stop users from
> > > joining.
> > > >> >
> > > >> > That's only possible to use for those with an @apache.org e-mail
> > > >> address,
> > > >> > see https://infra.apache.org/slack.html. Anyone else would need
> to
> > be
> > > >> > invited by a committer, but I wouldn't be in favour of needing to
> > > spend
> > > >> > committers time in adding/inviting members on an ASF Slack
> instance.
> > > >> >
> > > >> > Best regards,
> > > >> >
> > > >> > Martijn
> > > >> >
> > > >> > On Fri, Nov 4, 2022 at 12:33 PM Maximilian Michels <
> m...@apache.org>
> > > >> wrote:
> > > >> >
> > > >> >> Hey Martijn, hi Chesnay,
> > > >> >>
> > > >> >> >The big problem of using the ASF Slack instance is that users
> can
> > > join
> > > >> >> any Slack channel, including ones outside of Flink.
> > > >> >>
> > > >> >> That is one of the main motivations for proposing to move to the
> > ASF
> > > >> >> Slack. We can create an unlimited 

Re: [DISCUSS] Repeatable cleanup of checkpoint data

2022-11-10 Thread Matthias Pohl
Thanks for sharing your opinions on the proposal. The concerns sound
reasonable. I guess, I'm going to follow-up on Chesnay's idea about
combining multiple requests into one for the k8s implementation. Having a
performance test for the k8s API server access sounds like a good idea,
too. Both action items are a prerequisite before continuing with FLIP-270
[1].

@Yang Wang: Do we have some Jira issue or ML discussion on the k8s API
server performance issues? I couldn't come up with a good search query
myself. :-D

@Robert (CC'd): was it you who worked on the k8s API server overload issue
in 1.15? Do you have some memory about it or some starting point with
source code or something similar?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-270%3A+Repeatable+Cleanup+of+Checkpoints

On Mon, Nov 7, 2022 at 12:59 PM Chesnay Schepler  wrote:

> This is a nice FLIP. I particular like how much background it provides
> on the issue; something that other FLIPs could certainly benefit from...
>
> I went over the FLIP and had a chat with Matthias about it.
>
> Somewhat unrelated to the FLIP we found a flaw in the current cleanup
> mechanism of failed checkpoints, where the JM deletes files while a TM
> may still be in the process of writing checkpoint data. This is because
> we never wait for an ack from the TMs that that have aborted the
> checkpoint.
> We additionally noted that when incremental checkpoints are enabled we
> might be storing a large number of checkpoints in HA, without a
> conclusion on what to do about it.
>
>
> As for the FLIP itself, I'm concerned about proposal #2 because it
> requires iterating over the entire checkpoint directory on /any/
> failover to find checkpoints that can be deleted. This can be an
> expensive operation for certain filesystems (S3), particularly when
> incremental checkpoints are being used.
> In the interest of fast failovers we ideally don't use mechanisms that
> scale with.../anything/, really.
>
> However, storing more data in HA is also concerning, as Yang Wang
> pointed out.
> To not increase the number of requests made against HA we could maybe
> consider looking into piggy-backing delete operations on other HA
> operations, like the checkpoint counter increments.
>
> On that note, do we have any benchmarks for HA? I remember we looked
> into that for...1.15 I believe at some point. With HA load being such a
> major concern for this FLIP it would be good to have _something_ to
> measure that.
>
> On 27/10/2022 14:20, Matthias Pohl wrote:
> > I would like to bring this topic up one more time. I put some more
> thought
> > into it and created FLIP-270 [1] as a follow-up of FLIP-194 [2] with an
> > updated version of what I summarized in my previous email. It would be
> > interesting to get some additional perspectives on this; more
> specifically,
> > the two included proposals about either just repurposing the
> > CompletedCheckpointStore into a more generic CheckpointStore or
> refactoring
> > the StateHandleStore interface moving all the cleanup logic from the
> > CheckpointsCleaner and StateHandleStore into what's currently called
> > CompletedCheckpointStore.
> >
> > Looking forward to feedback on that proposal.
> >
> > Best,
> > Matthias
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-270%3A+Repeatable+Cleanup+of+Checkpoints
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+Introduce+the+JobResultStore
> >
> > On Wed, Sep 28, 2022 at 4:07 PM Matthias Pohl
> > wrote:
> >
> >> Hi everyone,
> >>
> >> I’d like to start a discussion on repeatable cleanup of checkpoint data.
> >> In FLIP-194 [1] we introduced repeatable cleanup of HA data along the
> >> introduction of the JobResultStore component. The goal was to make Flink
> >> being in charge of cleanup for the data it owns. The Flink cluster
> should
> >> only shutdown gracefully after all its artifacts are removed. That way,
> one
> >> would not miss abandoned artifacts accidentally.
> >>
> >> We forgot to cover one code path around cleaning up checkpoint data.
> >> Currently, in case of an error (e.g. permission issues), checkpoints are
> >> tried to be cleaned up in the CheckpointsCleaner and left like that if
> >> that cleanup failed. A log message is printed. The user would be
> >> responsible for cleaning up the data. This was discussed as part of the
> >> release testing efforts for Flink 1.15 in FLINK-26388 [2].
> >>
> >> We could add repeatable cleanup in the CheckpointsCleaner. We would have
> >> to make sure that all StateObject#discardState implementations are
> >> idempotent. This is not necessarily the case right now (see FLINK-26606
> >> [3]).
> >>
> >> Additionally, there is the problem of losing information about what
> >> Checkpoints are subject to cleanup in case of JobManager failovers.
> These
> >> Checkpoints are not stored as part of the HA data. Additionally,
> >> PendingCheckpoints are not serialized in any way, either. None of 

[jira] [Created] (FLINK-29978) FlinkKafkaInternalProducer not compatible with kafka-client-3.3.x

2022-11-10 Thread Owen Lee (Jira)
Owen Lee created FLINK-29978:


 Summary: FlinkKafkaInternalProducer not compatible with 
kafka-client-3.3.x
 Key: FLINK-29978
 URL: https://issues.apache.org/jira/browse/FLINK-29978
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.16.0
Reporter: Owen Lee


FlinkKafkaInternalProducer _resumeTransaction_ fetches 
_topicPartitionBookkeeper_ field from _TransactionManager_ which has been 
renamed to _TxnPartitionMap_ in 3.3.x. Failing to retrieve the field raises an 
exception (Incompatible KafkaProducer version)

 
{code:java}
public void resumeTransaction(long producerId, short epoch) {
  ...
  Object topicPartitionBookkeeper = getField(transactionManager, 
"topicPartitionBookkeeper");
...
}
{code}
 

Users should be advised not to use version 3.3.x or variable name should be 
corrected.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29977) Kafka connector not compatible with kafka-clients 3.3x

2022-11-10 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-29977:


 Summary: Kafka connector not compatible with kafka-clients 3.3x
 Key: FLINK-29977
 URL: https://issues.apache.org/jira/browse/FLINK-29977
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Kafka
Reporter: Chesnay Schepler


In 
https://github.com/apache/kafka/commit/3ea7b418fb3d7e9fc74c27751c1b02b04877f197 
the TransactionManager was modified and no longer has a 
{{topicPartitionBookkeeper}} that we access via reflection.

The {{TopicPartitionBookkeeper}} was refactored to a {{TxnPartitionMap}} class.
The {{reset}} method we require is still present though; so this could 
potentially be fixed by just falling back to another field name if required.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29976) The cron connectors test run out of given timeout in Azure

2022-11-10 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-29976:
--

 Summary: The cron connectors test run out of given timeout in Azure
 Key: FLINK-29976
 URL: https://issues.apache.org/jira/browse/FLINK-29976
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.15.0
Reporter: Leonard Xu


The cron connector tests run out of the available time 222minutes.

1.15 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43000=logs=e1276d0f-df12-55ec-86b5-c0ad597d83c9




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29975) Let hybrid full spilling strategy supports partition reuse

2022-11-10 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-29975:
--

 Summary: Let hybrid full spilling strategy supports partition reuse
 Key: FLINK-29975
 URL: https://issues.apache.org/jira/browse/FLINK-29975
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.17.0
Reporter: Weijie Guo
Assignee: Weijie Guo


Partition reuse is a very useful optimization in some topologies. In essence, 
multiple downstream tasks consume the same subpartition's data. Therefore, 
hybrid shuffle should also enjoy the benefits it brings. After FLINK-28889, we 
are finally able to achieve repeated consumption at the subpartition level for 
hybrid full spilling strategy, so let's make it better.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29974) Session jobs in FINISHED/FAILED state cannot be upgraded

2022-11-10 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-29974:
--

 Summary: Session jobs in FINISHED/FAILED state cannot be upgraded
 Key: FLINK-29974
 URL: https://issues.apache.org/jira/browse/FLINK-29974
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.2.0
Reporter: Gyula Fora


The AbstractFlinkService#cancelSessionJob method currently does not take into 
consideration the current job state.

This means that if we call this on an already failed/canceld job we will get an 
exception from Flink:


ava.util.concurrent.ExecutionException: 
org.apache.flink.runtime.messages.FlinkJobNotFoundException



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29973) connector_artifact should append Flink minor version

2022-11-10 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-29973:


 Summary: connector_artifact should append Flink minor version
 Key: FLINK-29973
 URL: https://issues.apache.org/jira/browse/FLINK-29973
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.17.0, 1.16.1






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29972) Pin Flink docs to Elasticsearch Connector 3.0.0

2022-11-10 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-29972:


 Summary: Pin Flink docs to Elasticsearch Connector 3.0.0
 Key: FLINK-29972
 URL: https://issues.apache.org/jira/browse/FLINK-29972
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.17.0, 1.16.1






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29971) Hbase sink will lose data at extreme case

2022-11-10 Thread wenchao.wu (Jira)
wenchao.wu created FLINK-29971:
--

 Summary: Hbase sink will lose data at extreme case
 Key: FLINK-29971
 URL: https://issues.apache.org/jira/browse/FLINK-29971
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase
Affects Versions: 1.15.2, 1.13.6
Reporter: wenchao.wu
 Attachments: image-2022-11-10-16-02-51-402.png, 
image-2022-11-10-16-08-23-325.png, image-2022-11-10-16-10-50-711.png, 
image-2022-11-10-16-24-01-396.png

h2. Situation:

When I use kafka as source and hbase as sink but the hbase table I didn't have 
the permission, I send data to kafka one message with a long time gap.

In this situation the normal result will be when trigger checkpoint the job 
will failed. But actually the jobs will continue to run and can trigger 
checkpoint successfully.
h2. Analysis

The hbase sink will throw exception in *checkErrorAndRethrow()* funciton. And 
this function will be called in two function, *invoke()* and {*}flush(){*}. 
Beside {*}invoke(){*}, *flush()* will be called at two place, one is 
{*}snapshot(){*}, one is in the scheduledThread as the follow snippet of code:

!image-2022-11-10-16-02-51-402.png!

We can see that in the  scheduledThread the exception throw by *flush()* will 
be catch and reset to failureThrowable.

So if there's no message come, the only way to throw the exception is in 
{*}snapshot(){*}. But the snapshot function call flush() is conditional as  the 
follow snippet of code:

!image-2022-11-10-16-08-23-325.png!

But the scheduledThread will called flush() periodically and set 
numPendingRequests as 0.

!image-2022-11-10-16-10-50-711.png!

So if no other message comes the snapshot will run successfully which means the 
checkpoint will be success but that message was not written to hbase, the 
message is loss.

 
h2. Solution

I think the reason is that when trigger checkpoint and call snapshot function, 
need to call *checkErrorAndRethrow()* first as follow: 

!image-2022-11-10-16-24-01-396.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)