[jira] [Created] (FLINK-25068) Show the maximum parallelism (number of key groups) of a job in Web UI

2021-11-25 Thread zlzhang0122 (Jira)
zlzhang0122 created FLINK-25068:
---

 Summary: Show the maximum parallelism (number of key groups) of a 
job in Web UI
 Key: FLINK-25068
 URL: https://issues.apache.org/jira/browse/FLINK-25068
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.14.0
Reporter: zlzhang0122


Flink uses the maximum parallelism as the number of key groups for distributing
keys. The maximum parallelism can be set manually, or Flink will derive it
automatically. Since this value is sometimes useful to know, we could expose it
in the Web UI.

By doing this, we can easily tell users the maximum parallelism up to which they
can scale when they are facing a lack of throughput, and we can determine which
subtask will process a specific key value, which makes it much quicker to find
the relevant log.
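For context, the mapping from a key to a key group and then to a subtask is
roughly the following (a simplified paraphrase of Flink's
KeyGroupRangeAssignment, not the exact code). Knowing the maximum parallelism
lets us compute which subtask handles a given key:

{code:java}
// MathUtils is org.apache.flink.util.MathUtils.
static int assignToKeyGroup(Object key, int maxParallelism) {
    // murmurHash over the key's hashCode spreads keys across the key groups
    return MathUtils.murmurHash(key.hashCode()) % maxParallelism;
}

static int computeOperatorIndex(int maxParallelism, int parallelism, int keyGroupId) {
    // each subtask owns a contiguous range of key groups
    return keyGroupId * parallelism / maxParallelism;
}
{code}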



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25067) Correct the description of RocksDB's background threads

2021-11-25 Thread Yun Tang (Jira)
Yun Tang created FLINK-25067:


 Summary: Correct the description of RocksDB's background threads
 Key: FLINK-25067
 URL: https://issues.apache.org/jira/browse/FLINK-25067
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Runtime / State Backends
Reporter: Yun Tang
Assignee: Yun Tang
 Fix For: 1.15.0, 1.14.1, 1.13.4


RocksDB changed the default maximum number of concurrent background flush and
compaction jobs to 2 a long time ago; we should fix the related documentation
accordingly.
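For reference, the knob in question is RocksDB's max_background_jobs. A minimal
sketch of how a user could override it in Flink through a custom options
factory (assuming Flink's RocksDBOptionsFactory extension point and the
RocksJava setter; simplified):

{code:java}
import java.util.Collection;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

public class MoreBackgroundJobsFactory implements RocksDBOptionsFactory {
    @Override
    public DBOptions createDBOptions(DBOptions current, Collection<AutoCloseable> handlesToClose) {
        // Raise the limit whose default (2) the documentation should reflect.
        return current.setMaxBackgroundJobs(4);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions current, Collection<AutoCloseable> handlesToClose) {
        return current;
    }
}
{code}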



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE][FLIP-195] Improve the name and structure of vertex and operator name for job

2021-11-25 Thread Yun Gao

+1 (binding).

Very thanks Wenlong for the proposal!

Best,
Yun

--
Sender: Xianxun Ye
Date: 2021/11/24 16:49:44
Recipient: dev@flink.apache.org
Subject: [VOTE][FLIP-195] Improve the name and structure of vertex and operator
name for job

+1 (non-binding)




On 11/24/2021 15:50, Sergey Nuyanzin wrote:
+1 (non-binding)

On Wed, Nov 24, 2021 at 8:38 AM godfrey he  wrote:

+1 (binding)

Best,
Godfrey

Jark Wu wrote on Wed, Nov 24, 2021 at 12:02 PM:

+1 (binding)

Btw, @JingZhang I think your vote can be counted into binding now.

Best,
Jark

On Tue, 23 Nov 2021 at 20:19, Jing Zhang  wrote:

+1 (non-binding)

Best,
Jing Zhang

Martijn Visser wrote on Tue, Nov 23, 2021 at 7:42 PM:

+1 (non-binding)

On Tue, 23 Nov 2021 at 12:13, Aitozi  wrote:

+1 (non-binding)

Best,
Aitozi

wenlong.lwl wrote on Tue, Nov 23, 2021 at 4:00 PM:

Hi everyone,

Based on the discussion [1], we seem to have consensus, so I would like to
start a vote on FLIP-195 [2]. Thanks for all of your feedback.

The vote will last for at least 72 hours (until Nov 26th 16:00 GMT) unless
there is an objection or insufficient votes.

[1] https://lists.apache.org/thread/kvdxr8db0l5s6wk7hwlt0go5fms99b8t
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+job

Best,
Wenlong Lyu







--
Best regards,
Sergey



[jira] [Created] (FLINK-25066) Support using multiple HDFS namenodes to resolve dependencies when submitting jobs to YARN

2021-11-25 Thread jocean.shi (Jira)
jocean.shi created FLINK-25066:
--

 Summary: Support using multiple HDFS namenodes to resolve dependencies
when submitting jobs to YARN
 Key: FLINK-25066
 URL: https://issues.apache.org/jira/browse/FLINK-25066
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Reporter: jocean.shi


If the hdfs-site.xml is like this:

{code:xml}
<property>
    <name>dfs.nameservices</name>
    <value>namenode1,namenode2</value>
</property>
{code}

and the core-site.xml is like this:

{code:xml}
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode1</value>
</property>
{code}

then Flink can only use namenode1 to resolve dependencies when we submit a job
to YARN (including YARN session, YARN per-job, and YARN application mode).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25065) Update lookup document for mysql-connector

2021-11-25 Thread Gaurav Miglani (Jira)
Gaurav Miglani created FLINK-25065:
--

 Summary: Update lookup document for mysql-connector
 Key: FLINK-25065
 URL: https://issues.apache.org/jira/browse/FLINK-25065
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.15.0
Reporter: Gaurav Miglani


Update the `lookup.cache.caching-missing-key` documentation for the MySQL connector.
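For context, a sketch of where that option applies in a JDBC lookup table
definition (the table name and connection details are made up for
illustration):

{code:java}
tableEnv.executeSql(
    "CREATE TABLE customers (" +
    "  id BIGINT," +
    "  name STRING" +
    ") WITH (" +
    "  'connector' = 'jdbc'," +
    "  'url' = 'jdbc:mysql://localhost:3306/mydb'," +
    "  'table-name' = 'customers'," +
    "  'lookup.cache.max-rows' = '1000'," +
    "  'lookup.cache.ttl' = '10min'," +
    "  'lookup.cache.caching-missing-key' = 'true'" + // whether lookup misses are cached
    ")");
{code}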



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Drop Zookeeper 3.4

2021-11-25 Thread Chesnay Schepler

I included the user ML in the thread.

@users Are you still using Zookeeper 3.4? If so, were you planning to 
upgrade Zookeeper in the near future?


I'm not sure about ZK compatibility, but we'd also upgrade Curator to 
5.x, which doesn't support ZooKeeper 3.4 anymore.


On 25/11/2021 21:56, Till Rohrmann wrote:

Should we ask on the user mailing list whether anybody is still using
ZooKeeper 3.4 and thus needs support for this version or can a ZooKeeper
3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect that not a
lot of users depend on it but just to make sure that we aren't annoying a
lot of our users with this change. Apart from that +1 for removing it if
not a lot of users depend on it.

Cheers,
Till

On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl 
wrote:


Thanks for starting this discussion, Chesnay. +1 from my side. It's time to
move forward with the ZK support considering the EOL of 3.4 you already
mentioned. The benefits we gain from upgrading Curator to 5.x as a
consequence is another plus point. Just for reference on the inconsistent
state issue you mentioned: FLINK-24543 [1].

Matthias

[1] https://issues.apache.org/jira/browse/FLINK-24543

On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler 
wrote:


Hello,

I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading the
default to 3.5 with an opt-in for 3.6.

Supporting Zookeeper 3.4 (which is already EOL) prevents us from
upgrading Curator to 5.x, which would allow us to properly fix an issue
with inconsistent state. It is also required to eventually support ZK

3.6.





Re: [DISCUSS] Deprecate Java 8 support

2021-11-25 Thread Till Rohrmann
+1 for the deprecation and reaching out to the user ML to ask for feedback
from our users. Thanks for driving this Chesnay!

Cheers,
Till

On Thu, Nov 25, 2021 at 10:15 AM Roman Khachatryan  wrote:

> The situation is probably a bit different now compared to the previous
> upgrade: some users might be using Amazon Corretto (or other builds)
> which have longer support.
>
> Still +1 for deprecation to trigger migration, and thanks for bringing
> this up!
>
> Regards,
> Roman
>
> On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise  wrote:
> >
> > +1 to deprecate Java 8, so we can hopefully incorporate the module
> concept
> > in Flink.
> >
> > On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler 
> wrote:
> >
> > > Users can already use APIs from Java 8/11.
> > >
> > > On 25/11/2021 09:35, Francesco Guardiani wrote:
> > > > +1 with what both Ingo and Matthias said; personally, I cannot wait to
> > > start using some of
> > > > the APIs introduced in Java 9. And I'm pretty sure that's the same
> for
> > > our users as well.
> > > >
> > > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> > > >> Hi everyone,
> > > >>
> > > >> continued support for Java 8 can also create project risks, e.g. if
> a
> > > >> vulnerability arises in Flink's dependencies and we cannot upgrade
> them
> > > >> because they no longer support Java 8. Some projects already started
> > > >> deprecating support as well, like Kafka, and other projects will
> likely
> > > >> follow.
> > > >> Let's also keep in mind that the proposal here is not to drop
> support
> > > right
> > > >> away, but to deprecate it, send the message, and motivate users to
> start
> > > >> migrating. Delaying this process could ironically mean users have
> less
> > > time
> > > >> to prepare for it.
> > > >>
> > > >>
> > > >> Ingo
> > > >>
> > > >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl <
> matth...@ververica.com>
> > > >>
> > > >> wrote:
> > > >>> Thanks for constantly driving these maintenance topics, Chesnay. +1
> > > from
> > > >>> my
> > > >>> side for deprecating Java 8. I see the point Jingsong is raising.
> But I
> > > >>> agree with what David already said here. Deprecating the Java
> version
> > > is a
> > > >>> tool to make users aware of it (same as starting this discussion
> > > thread).
> > > >>> If there's no major opposition against deprecating it in the
> community
> > > we
> > > >>> should move forward in this regard to make the users who do not
> > > >>> regularly browse the mailing list aware of it. That said,
> deprecating
> > > Java
> > > >>> 8 in 1.15 does not necessarily mean that it is dropped in 1.16.
> > > >>>
> > > >>> Best,
> > > >>> Matthias
> > > >>>
> > > >>> On Tue, Nov 23, 2021 at 8:46 AM David Morávek 
> wrote:
> > >  Thank you Chesnay for starting the discussion! This will generate
> bit
> > > of
> > > >>> a
> > > >>>
> > >  work for some users, but it's a good thing to keep moving the
> project
> > >  forward. Big +1 for this.
> > > 
> > >  Jingsong:
> > > 
> > >  Receiving this signal, the user may be unhappy because his
> application
> > > 
> > > > may be all on Java 8. Upgrading is a big job, after all, many
> systems
> > > > have not been upgraded yet. (Like you said, HBase and Hive)
> > >  The whole point of deprecation is to raise awareness, that this
> will
> > > be
> > >  happening eventually and users should take some steps to address
> this
> > > in
> > >  medium-term. If I understand Chesnay correctly, we'd still keep
> Java 8
> > >  around for quite some time to give users enough time to upgrade,
> but
> > >  without raising awareness we'd fight the very same argument later
> in
> > > >>> time.
> > > >>>
> > >  All of the prerequisites from 3rd party projects for both HBase
> [1]
> > > and
> > >  Hive [2] to fully support Java 11 have been completed, so the
> ball is
> > > on
> > >  their side and there doesn't seem to be much activity. Generating
> bit
> > > >>> more
> > > >>>
> > >  pressure on these efforts might be a good thing.
> > > 
> > >  It would be great to identify some of these users and learn bit
> more
> > > >>> about
> > > >>>
> > >  their situation. Are they keeping up with latest Flink
> developments or
> > > >>> are
> > > >>>
> > >  they lagging behind (this would also give them way more time for
> > >  eventual
> > >  upgrade)?
> > > 
> > >  [1] https://issues.apache.org/jira/browse/HBASE-22972
> > >  [2] https://issues.apache.org/jira/browse/HIVE-22415
> > > 
> > >  Best,
> > >  D.
> > > 
> > >  On Tue, Nov 23, 2021 at 3:08 AM Jingsong Li <
> jingsongl...@gmail.com>
> > > 
> > >  wrote:
> > > > Hi Chesnay,
> > > >
> > > > Thanks for bringing this for discussion.
> > > >
> > > > We should dig deeper into the current Java version of Flink
> users. At
> > > > least make sure Java 8 is not a mainstream version.
> > > >
> > > 

Re: [DISCUSS] Drop Zookeeper 3.4

2021-11-25 Thread Till Rohrmann
Should we ask on the user mailing list whether anybody is still using
ZooKeeper 3.4 and thus needs support for this version or can a ZooKeeper
3.5/3.6 client talk to a ZooKeeper 3.4 cluster? I would expect that not a
lot of users depend on it but just to make sure that we aren't annoying a
lot of our users with this change. Apart from that +1 for removing it if
not a lot of users depend on it.

Cheers,
Till

On Wed, Nov 24, 2021 at 11:03 AM Matthias Pohl 
wrote:

> Thanks for starting this discussion, Chesnay. +1 from my side. It's time to
> move forward with the ZK support considering the EOL of 3.4 you already
> mentioned. The benefits we gain from upgrading Curator to 5.x as a
> consequence is another plus point. Just for reference on the inconsistent
> state issue you mentioned: FLINK-24543 [1].
>
> Matthias
>
> [1] https://issues.apache.org/jira/browse/FLINK-24543
>
> On Wed, Nov 24, 2021 at 10:19 AM Chesnay Schepler 
> wrote:
>
> > Hello,
> >
> > I'd like to drop support for Zookeeper 3.4 in 1.15, upgrading the
> > default to 3.5 with an opt-in for 3.6.
> >
> > Supporting Zookeeper 3.4 (which is already EOL) prevents us from
> > upgrading Curator to 5.x, which would allow us to properly fix an issue
> > with inconsistent state. It is also required to eventually support ZK
> 3.6.
> >
>


Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-11-25 Thread Francesco Guardiani
Hi Timo,

Thanks for putting this amazing work together. I have some
considerations/questions about the FLIP:
*Proposed changes #6*: Other than defining this rule of thumb, we must also make
sure that compiling a plan fails hard when it contains objects that cannot be
serialized into the plan, so users don't bite themselves with such issues; or at
least we need to output warning logs. In general, whenever the user uses the
CompiledPlan APIs and at the same time tries to do something "illegal" for the
plan, we should immediately either log or fail, depending on the issue, in order
to avoid any surprises once the user upgrades. I would say the same for things
like registering a function or registering a DataStream: for everything that
won't end up in the plan, we should log such info to the user by default.

*General JSON Plan Assumptions #9*: When thinking of connectors and formats, I
think it's reasonable to assume, and keep out of the feature design, that no
feature/ability can be deleted from a connector/format. I also don't think new
features/abilities can influence this FLIP, given the plan is static: if, for
example, MyCoolTableSink implements SupportsProjectionPushDown in the next Flink
version, it shouldn't be a problem for the upgrade story, since the plan is
still configured as computed by the previous Flink version. What worries me is
breaking changes, in particular behavioural changes that might happen in
connectors/formats. Although this argument doesn't seem relevant for the
connectors shipped by the Flink project itself, because we try to keep them as
stable as possible and avoid eventual breaking changes, it is compelling for
external connectors and formats, which might be decoupled from the Flink
release cycle and might have different backward compatibility guarantees. It's
totally reasonable if we don't want to tackle this in the first iteration of
the feature, but it's something we need to keep in mind for the future.


*Functions*: It's not clear to me what you mean by "identifier", because
elsewhere in the same context you talk about "name". Are we talking about the
function name or the complete function signature? Let's assume for example we
have these function definitions:

* TO_TIMESTAMP_LTZ(BIGINT)
* TO_TIMESTAMP_LTZ(STRING)
* TO_TIMESTAMP_LTZ(STRING, STRING)

These for me are very different functions with different implementations, where
each of them might evolve separately at a different pace. Hence, when we store
them in the JSON plan, we should perhaps use a logically defined unique id like
bigIntToTimestamp, stringToTimestamp and stringToTimestampWithFormat. This also
solves the issue of correctly referencing the functions when restoring the
plan, without running the inference logic again (which might have changed in
the meantime), and it might also solve the versioning: the function identifier
can contain the function version, like stringToTimestampWithFormat_1_1 or
stringToTimestampWithFormat_1_2. An alternative could be to use the string
signature representation, which might not be trivial to compute, given the
complexity of our type inference logic.

*The term "JSON plan"*: I think we should rather keep JSON out of the concept 
and just 
name it "Compiled Plan" (like the proposed API) or something similar, as I see 
how in 
future we might decide to support/modify our persistence format to something 
more 
efficient storage wise like BSON. For example, I would rename /
CompiledPlan.fromJsonFile/ to simply /CompiledPlan.fromFile/.
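To illustrate the user-facing flow (a sketch only: the method names follow the
API surface discussed in the FLIP plus the fromFile rename suggested above, and
are not the final API):

{code:java}
// Compile once and persist; the on-disk encoding stays an implementation detail.
CompiledPlan plan = tableEnv.compilePlanSql("INSERT INTO sink SELECT * FROM source");
plan.writeToFile("/plans/my-query.plan");

// After upgrading Flink, restore and execute the very same plan.
tableEnv.executePlan(CompiledPlan.fromFile("/plans/my-query.plan"));
{code}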

*Who is the owner of the plan file?* I asked myself this question when reading
this:

> For simplification of the design, we assume that upgrades use a step size of
> a single minor version. We don't guarantee skipping minor versions (e.g. 1.11
> to 1.14).

My understanding of this statement is that a user can upgrade between minors
and then, following all the minors, the same query can remain up and running.
E.g. I upgrade from 1.15 to 1.16, and then from 1.16 to 1.17, and I still
expect my original query to work without recomputing the plan. This necessarily
means that at some point in future releases we'll need some basic "migration"
tool to keep the queries up and running, which would end up modifying the
compiled plan. So I guess Flink should write it back to the original plan file,
perhaps after taking a backup of the previous one? Can you please clarify this
aspect?

Apart from these considerations, the proposal looks good to me, and I'm eagerly
waiting to see it in play.

Thanks,
FG



Re: [DISCUSS] PR component labeling

2021-11-25 Thread Martijn Visser
Hi Chesnay,

It would be fine for me to have more high-level labels, like SQL,
Connectors, Formats, FileSystems.

Thanks, Martijn

On Thu, 25 Nov 2021 at 16:33, Chesnay Schepler  wrote:

> @Martijn do you need all the various sublabels, or would one label like SQL
> or Connectors suffice?
>
> On 25/11/2021 16:28, Martijn Visser wrote:
> > Hi Nico,
> >
> > I'm using the labels to filter on PRs that I find interesting (for
> example,
> > I look at all the new or updated PRs that are related to SQL or
> > connectors).
> >
> > Best regards,
> >
> > Martijn
> >
> > On Thu, 25 Nov 2021 at 16:17, Nicolaus Weidner <
> > nicolaus.weid...@ververica.com> wrote:
> >
> >> Hi all,
> >>
> >> since we are currently working on several infrastructure topics, I would
> >> like to start a discussion on whether the component labeling on PRs is
> >> actually helping people. To be clear, I am talking about the blue labels
> >> like "component=Runtime/metrics" that are added by the omnipresent
> Robert
> >> to each PR on Github. They are currently copied over from Jira.
> >>
> >> So:
> >> - Does anyone rely on the labels being present on Github in any way?
> >> - If no, do you at least notice them and find them helpful, e.g. when
> >> browsing open PRs?
> >>
> >> If it turns out that no one uses this feature anyway, we can drop this
> >> mirroring of labels instead of spending time making improvements to it
> (the
> >> labels in Jira would stay, of course).
> >>
> >> Best,
> >> Nico
> >>
>
>


Re: [DISCUSS] PR component labeling

2021-11-25 Thread Chesnay Schepler
@Martijn do you need all the various sublabels, or would one label like SQL
or Connectors suffice?


On 25/11/2021 16:28, Martijn Visser wrote:

Hi Nico,

I'm using the labels to filter on PRs that I find interesting (for example,
I look at all the new or updated PRs that are related to SQL or
connectors).

Best regards,

Martijn

On Thu, 25 Nov 2021 at 16:17, Nicolaus Weidner <
nicolaus.weid...@ververica.com> wrote:


Hi all,

since we are currently working on several infrastructure topics, I would
like to start a discussion on whether the component labeling on PRs is
actually helping people. To be clear, I am talking about the blue labels
like "component=Runtime/metrics" that are added by the omnipresent Robert
to each PR on Github. They are currently copied over from Jira.

So:
- Does anyone rely on the labels being present on Github in any way?
- If no, do you at least notice them and find them helpful, e.g. when
browsing open PRs?

If it turns out that no one uses this feature anyway, we can drop this
mirroring of labels instead of spending time making improvements to it (the
labels in Jira would stay, of course).

Best,
Nico





Re: [DISCUSS] PR component labeling

2021-11-25 Thread Martijn Visser
Hi Nico,

I'm using the labels to filter on PRs that I find interesting (for example,
I look at all the new or updated PRs that are related to SQL or
connectors).

Best regards,

Martijn

On Thu, 25 Nov 2021 at 16:17, Nicolaus Weidner <
nicolaus.weid...@ververica.com> wrote:

> Hi all,
>
> since we are currently working on several infrastructure topics, I would
> like to start a discussion on whether the component labeling on PRs is
> actually helping people. To be clear, I am talking about the blue labels
> like "component=Runtime/metrics" that are added by the omnipresent Robert
> to each PR on Github. They are currently copied over from Jira.
>
> So:
> - Does anyone rely on the labels being present on Github in any way?
> - If no, do you at least notice them and find them helpful, e.g. when
> browsing open PRs?
>
> If it turns out that no one uses this feature anyway, we can drop this
> mirroring of labels instead of spending time making improvements to it (the
> labels in Jira would stay, of course).
>
> Best,
> Nico
>


[DISCUSS] PR component labeling

2021-11-25 Thread Nicolaus Weidner
Hi all,

since we are currently working on several infrastructure topics, I would
like to start a discussion on whether the component labeling on PRs is
actually helping people. To be clear, I am talking about the blue labels
like "component=Runtime/metrics" that are added by the omnipresent Robert
to each PR on Github. They are currently copied over from Jira.

So:
- Does anyone rely on the labels being present on Github in any way?
- If no, do you at least notice them and find them helpful, e.g. when
browsing open PRs?

If it turns out that no one uses this feature anyway, we can drop this
mirroring of labels instead of spending time making improvements to it (the
labels in Jira would stay, of course).

Best,
Nico


[jira] [Created] (FLINK-25064) Remove @VisibleForTesting from RestServerEndpoint#createUploadDir

2021-11-25 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-25064:


 Summary: Remove @VisibleForTesting from 
RestServerEndpoint#createUploadDir
 Key: FLINK-25064
 URL: https://issues.apache.org/jira/browse/FLINK-25064
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / REST
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25063) Allow calls to @VisibleForTesting from enclosing class

2021-11-25 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-25063:


 Summary: Allow calls to @VisibleForTesting from enclosing class
 Key: FLINK-25063
 URL: https://issues.apache.org/jira/browse/FLINK-25063
 Project: Flink
  Issue Type: Technical Debt
  Components: Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0


Pretty much FLINK-25042, just in reverse.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25062) Implement all catalog methods for PostgresCatalog

2021-11-25 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-25062:
--

 Summary: Implement all catalog methods for PostgresCatalog
 Key: FLINK-25062
 URL: https://issues.apache.org/jira/browse/FLINK-25062
 Project: Flink
  Issue Type: Improvement
Reporter: Martijn Visser


PostgresCatalog doesn't support all `Catalog` methods; currently it only supports:

{code:java}
PostgresCatalog.databaseExists(String databaseName)
PostgresCatalog.listDatabases()
PostgresCatalog.getDatabase(String databaseName)
PostgresCatalog.listTables(String databaseName)
PostgresCatalog.getTable(ObjectPath tablePath)
PostgresCatalog.tableExists(ObjectPath tablePath)
{code}

PostgresCatalog should support all `Catalog` methods. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Creating an external connector repository

2021-11-25 Thread Arvid Heise
Hi Brian,

Thank you for sharing. I think your approach is very valid and is in line
with what I had in mind.

Basically Pravega community aligns the connector releases with the Pravega
> mainline release
>
This certainly would mean that there is little value in coupling connector
versions. So it's making a good case for having separate connector repos.


> and maintains the connector with the latest 3 Flink versions(CI will
> publish snapshots for all these 3 branches)
>
I'd like to give connector devs a simple way to express to which Flink
versions the current branch is compatible. From there we can generate the
compatibility matrix automatically and optionally also create different
releases per supported Flink version. Not sure if the latter is indeed
better than having just one artifact that happens to run with multiple
Flink versions. I guess it depends on what dependencies we are exposing. If
the connector uses flink-connector-base, then we probably need separate
artifacts with poms anyways.

Best,

Arvid

On Fri, Nov 19, 2021 at 10:55 AM Zhou, Brian  wrote:

> Hi Arvid,
>
> For branching model, the Pravega Flink connector has some experience what
> I would like to share. Here[1][2] is the compatibility matrix and wiki
> explaining the branching model and releases. Basically Pravega community
> aligns the connector releases with the Pravega mainline release, and
> maintains the connector with the latest 3 Flink versions(CI will publish
> snapshots for all these 3 branches).
> For example, recently we had the 0.10.1 release [3], and in Maven Central we
> needed to upload three artifacts (for Flink 1.13, 1.12, 1.11) for the 0.10.1
> version [4].
>
> There are some alternatives. Another solution that we once discussed but
> finally got abandoned is to have an independent version just like the
> current CDC connector, and then give a big compatibility matrix to users.
> We think it would be too confusing when the connector develops. On the
> contrary, we can also do the opposite way to align with Flink version and
> maintain several branches for different system version.
>
> I would say this is only a fairly-OK solution because it is a bit painful
> for maintainers as cherry-picks are very common and releases would require
> much work. However, if neither system has nice backward
> compatibility, there seems to be no comfortable solution for their
> connector.
>
> [1] https://github.com/pravega/flink-connectors#compatibility-matrix
> [2]
> https://github.com/pravega/flink-connectors/wiki/Versioning-strategy-for-Flink-connector
> [3] https://github.com/pravega/flink-connectors/releases/tag/v0.10.1
> [4] https://search.maven.org/search?q=pravega-connectors-flink
>
> Best Regards,
> Brian
>
>
> Internal Use - Confidential
>
> -Original Message-
> From: Arvid Heise 
> Sent: Friday, November 19, 2021 4:12 PM
> To: dev
> Subject: Re: [DISCUSS] Creating an external connector repository
>
>
> [EXTERNAL EMAIL]
>
> Hi everyone,
>
> we are currently in the process of setting up the flink-connectors repo
> [1] for new connectors but we hit a wall that we currently cannot take:
> branching model.
> To reiterate the original motivation of the external connector repo: We
> want to decouple the release cycle of a connector with Flink. However, if
> we want to support semantic versioning in the connectors with the ability
> to introduce breaking changes through major version bumps and support
> bugfixes on old versions, then we need release branches similar to how
> Flink core operates.
> Consider two connectors, let's call them kafka and hbase. We have kafka in
> version 1.0.X, 1.1.Y (small improvement), 2.0.Z (config option change) and
> hbase only on 1.0.A.
>
> Now our current assumption was that we can work with a mono-repo under ASF
> (flink-connectors). Then, for release-branches, we found 3 options:
> 1. We would need to create some ugly mess with the cross product of
> connector and version: so you have kafka-release-1.0, kafka-release-1.1,
> kafka-release-2.0, hbase-release-1.0. The main issue is not the amount of
> branches (that's something that git can handle) but that the state of
> kafka is undefined in hbase-release-1.0. That's a recipe for disaster and
> makes releasing connectors very cumbersome (CI would only execute and
> publish hbase SNAPSHOTS on hbase-release-1.0).
> 2. We could avoid the undefined state by having an empty master and each
> release branch really only holds the code of the connector. But that's also
> not great: any user that looks at the repo and sees no connector would
> assume that it's dead.
> 3. We could have synced releases similar to the CDC connectors [2]. That
> means that if any connector introduces a breaking change, all connectors
> get a new major. I find that quite confusing to a user if hbase gets a new
> release without any change because kafka introduced a breaking change.
>
> To fully decouple release cycles and CI of connectors, we could add
> individual 

回复: RocksDBMapState get the binary key bytes

2021-11-25 Thread Zen4YYDS
Hi Pengfei Li:

I have not encountered this in production yet; I just checked the code and
found this is not the same as the case where no user key is supplied.

I checked the MapSerializer and found the serialized format for each Map.Entry
is [key – isValueNull(boolean) – value]. When the boolean is serialized, byte 1
stands for true and byte 0 for false. If we change the serialized format to
[key – flag(byte) – value], i.e. we no longer view the byte as a boolean but as
a plain byte, I think we can migrate the key too (a sketch follows the list):

  Flag == 0: the value is null.
  Flag == 1: the key is in the old format.
  Flag == 2: we can use this to write keys in the new format.
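For reference, a simplified paraphrase of the current per-entry layout
(assuming Flink's DataOutputView and serializer types; not the exact
MapSerializer code):

{code:java}
void writeEntry(DataOutputView out, K key, V value) throws IOException {
    keySerializer.serialize(key, out);
    if (value == null) {
        out.writeBoolean(true);               // byte 1: value is null
    } else {
        out.writeBoolean(false);              // byte 0: value follows
        valueSerializer.serialize(value, out);
    }
}
// The proposal above reinterprets that boolean byte as a plain flag, reserving
// the value 2 to mark entries whose key is written in the new format.
{code}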

Sent from Mail for Windows

From: Pengfei Li
Sent: 2021-11-25 18:17
To: dev@flink.apache.org
Subject: Re: RocksDBMapState get the binary key bytes

This problem is tracked in FLINK-11141, but there is no solution yet that
preserves state compatibility. Have you encountered the problem in production?

Zen4YYDS wrote on Thu, Nov 25, 2021 at 3:30 PM:

> Hi devs:
>
> Using RocksDB, when the key and namespace both have variable binary
> length, to prevent two different [key, namespace] pairs from having equal
> binary representations, we append the key length and namespace length after
> the key and namespace respectively. The format then is:
>
> Keygroup – key – keyLength – namespace – namespaceLength
>
> Then what if we use a fixed-length key and a variable-length namespace
> and user key? In the current implementation, I found the binary key format
> is as below:
>
> Keygroup – key – namespace – userkey
>
> Consider the following situation: I think we may get the same value for
> different [namespace, userkey] pairs. Or did I get something wrong?
>
> Keygroup | key | namespace | userkey
> 1        | 1   | 11        | 1
> 1        | 1   | 1         | 11
>
> Sent from Mail for Windows
>
>




Re: [DISCUSS] Conventions on assertions to use in tests

2021-11-25 Thread Marios Trivyzas
As @Matthias Pohl mentioned, I agree that goal number one is to end up with
consistency regarding the assertions in our tests, but I also like how those
assertions shape up with the AssertJ approach.
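For anyone following along, a small illustration of the difference (purely
illustrative; the fluent calls are standard AssertJ API):

{code:java}
// JUnit 4 / Hamcrest style:
assertEquals(3, rows.size());
assertThat(rows, hasItem(expectedRow));

// AssertJ style: one fluent, composable chain with richer failure messages.
assertThat(rows)
    .hasSize(3)
    .contains(expectedRow);
{code}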

On Thu, Nov 25, 2021 at 9:38 AM Francesco Guardiani 
wrote:

> This is the result of experimenting around creating custom assertions for
> Table API types:
> https://github.com/slinkydeveloper/flink/commit/d1ce37a62c2200b2c3008a9cc2cac91234222fd5
> I will PR it once the two PRs in the previous mail get merged.
>
> On Monday, 22 November 2021 17:59:29 CET Francesco Guardiani wrote:
> > Hi all,
> >
> > Given I see generally consensus around having a convention and using
> > assertj, I propose to merge these 2 PRs:
> >
> > * Add the explanation of this convention in our code quality guide:
> > https://github.com/apache/flink-web/pull/482
> > * Add assertj to dependency management in the parent pom and link in the
> PR
> > template the code quality guide:
> https://github.com/apache/flink/pull/17871
> >
> > WDYT?
> >
> > Once we merge those, I'll work in the next days to add some custom
> > assertions in table-common for RowData and Row (commonly asserted
> > everywhere in the table codebase).
> >
> > @Matthias Pohl  about the confluence page, it
> seems
> > a bit outdated, judging from the last modified date. I propose to
> continue
> > to use this guide
> > https://flink.apache.org/contributing/code-style-and-quality-common.html
> as
> > it seems more complete.
> >
> >
> > On Mon, Nov 22, 2021 at 8:58 AM Matthias Pohl 
> >
> > wrote:
> > > Agree. Clarifying once more what our preferred option is here, is a
> good
> > > idea. So, +1 for unification. I don't have a strong opinion on what
> > > framework to use. But we may want to add this at the end of the
> discussion
> > > to our documentation (e.g. [1] or maybe the PR description?) to make
> users
> > > aware of it and be able to provide a reference in case it comes up
> again
> > > (besides this ML thread). Or do we already have something like that
> > > somewhere in the docs where I missed it?
> > >
> > > Matthias
> > >
> > > [1]
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Best+Practices+and+Lessons+Learned
> > > On Wed, Nov 17, 2021 at 11:13 AM Marios Trivyzas 
> wrote:
> > >> I'm also +1 both for unification and specifically for assertJ.
> > >> I think it covers a wide variety of assertions and as Francesco
> mentioned
> > >> it's easily extensible, so that
> > >> we can create custom assertions where needed, and avoid repeating test
> > >> code.
> > >>
> > >> On Tue, Nov 16, 2021 at 9:57 AM David Morávek 
> wrote:
> > >> > I don't have any strong opinions on the asserting framework that we
> > >> > use,
> > >> > but big +1 for the unification.
> > >> >
> > >> > Best,
> > >> > D.
> > >> >
> > >> > On Tue, Nov 16, 2021 at 9:37 AM Till Rohrmann  >
> > >> >
> > >> > wrote:
> > >> > > Using JUnit5 with assertJ is fine with me if the community agrees.
> > >>
> > >> Having
> > >>
> > >> > > guides for best practices would definitely help with the
> transition.
> > >> > >
> > >> > > Cheers,
> > >> > > Till
> > >> > >
> > >> > > On Mon, Nov 15, 2021 at 5:34 PM Francesco Guardiani <
> > >> > > france...@ververica.com>
> > >> > >
> > >> > > wrote:
> > >> > > > > It is a bit unfortunate that we have tests that follow
> different
> > >> > > >
> > >> > > > patterns.
> > >> > > > This, however, is mainly due to organic growth. I think the
> > >>
> > >> community
> > >>
> > >> > > > started with Junit4, then we chose to use Hamcrest because of
> its
> > >> >
> > >> > better
> > >> >
> > >> > > > expressiveness.
> > >> > > >
> > >> > > > That is fine, I'm sorry if my mail felt like a rant :)
> > >> > > >
> > >> > > > > Personally, I don't have a strong preference for which testing
> > >>
> > >> tools
> > >>
> > >> > to
> > >> >
> > >> > > > use. The important bit is that we agree as a community, then
> > >>
> > >> document
> > >>
> > >> > the
> > >> >
> > >> > > > choice and finally stick to it. So before starting to use
> assertj,
> > >>
> > >> we
> > >>
> > >> > > > should probably align with the folks working on the Junit5
> effort
> > >> >
> > >> > first.
> > >> >
> > >> > > > As Arvid pointed out, using assertj might help the people
> working



-- 
Marios


[jira] [Created] (FLINK-25061) Add assertj as dependency in flink-parent

2021-11-25 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25061:
---

 Summary: Add assertj as dependency in flink-parent
 Key: FLINK-25061
 URL: https://issues.apache.org/jira/browse/FLINK-25061
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Francesco Guardiani
Assignee: Francesco Guardiani


We recently discussed test assertions on the ML
(https://lists.apache.org/thread/33t7hz8w873p1bc5msppk65792z08rgt) and came to
the conclusion that we want to encourage contributions to use AssertJ as much
as possible. In order to do that, we should add AssertJ to the flink-parent
pom as a test dependency.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25060) Replace DataType.projectFields with Projection type

2021-11-25 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25060:
---

 Summary: Replace DataType.projectFields with Projection type
 Key: FLINK-25060
 URL: https://issues.apache.org/jira/browse/FLINK-25060
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Francesco Guardiani
Assignee: Francesco Guardiani


FLINK-24399 introduced new methods to perform data type projections in
DataType. Note: no release has included these changes yet.

FLINK-24776 introduced a new, more powerful type to perform operations on
projections: not only projecting types, but also computing differences and
complements.

To avoid providing different entrypoints for the same functionality, we should
clean up the new methods introduced by FLINK-24399 and replace them with the
new Projection type. We should also deprecate the functions in DataTypeUtils.
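For context, a sketch of what this could look like with the Projection type
from FLINK-24776 (method names as introduced there; a sketch, not a spec):

{code:java}
// Project a row type down to fields 0 and 2, instead of DataType.projectFields:
Projection projection = Projection.of(new int[][] {{0}, {2}});
DataType projected = projection.project(physicalRowType);

// The new type also supports set-like operations:
Projection remaining = Projection.all(physicalRowType).difference(projection);
{code}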



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25059) Update code style guide to encourage usage of AssertJ

2021-11-25 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-25059:


 Summary: Update code style guide to encourage usage of AssertJ
 Key: FLINK-25059
 URL: https://issues.apache.org/jira/browse/FLINK-25059
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation, Tests
Reporter: Chesnay Schepler
Assignee: Francesco Guardiani
 Fix For: 1.15.0


We recently discussed test assertions on the ML
(https://lists.apache.org/thread/33t7hz8w873p1bc5msppk65792z08rgt) and came to
the conclusion to:
a) update the code style guide to explicitly mention the currently favored
style of assertions, for consistency, and
b) use AssertJ for new tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25058) Declared exceptions should be considered in architectural tests

2021-11-25 Thread Jira
Ingo Bürk created FLINK-25058:
-

 Summary: Declared exceptions should be considered in architectural 
tests
 Key: FLINK-25058
 URL: https://issues.apache.org/jira/browse/FLINK-25058
 Project: Flink
  Issue Type: Improvement
Reporter: Ingo Bürk
Assignee: Ingo Bürk






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25057) Streaming File Sink writing to the HDFS

2021-11-25 Thread hanjie (Jira)
hanjie created FLINK-25057:
--

 Summary: Streaming File Sink writing to the HDFS
 Key: FLINK-25057
 URL: https://issues.apache.org/jira/browse/FLINK-25057
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Affects Versions: 1.12.1
Reporter: hanjie


When I first start the Flink task:

*First part file example:*

    part-0-0
    part-0-1
    .part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c

I cancel the Flink task, then restart it without a savepoint or checkpoint.
The task runs for a while.

*Second part file example:*

    part-0-0
    part-0-1
    .part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c
    .part-0-0.inprogress.0e2f234b-042d-4232-a5f7-c980f04ca82d

'.part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c' is never renamed,
and the bucket index starts again from zero.

Looking at the related code, restarting the task properly needs a savepoint or
checkpoint. I chose a savepoint, and the problem above disappeared when I ran
a third test.

But if I use an expired savepoint, the task throws an exception:

{code}
java.io.FileNotFoundException: File does not exist: /ns-hotel/hotel_sa_log/stream/sa_cpc_ad_log_list_detail_dwd/2021-11-25/.part-6-1537.inprogress.cd9c756a-1756-4dc5-9325-485fe99a2803
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
	at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:752)
	at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:153)
	at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.resolvePath(ChRootedFileSystem.java:373)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem.resolvePath(ViewFileSystem.java:243)
	at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.revokeLeaseByFileSystem(HadoopRecoverableFsDataOutputStream.java:327)
	at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.safelyTruncateFile(HadoopRecoverableFsDataOutputStream.java:163)
	at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.<init>(HadoopRecoverableFsDataOutputStream.java:88)
	at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.recover(HadoopRecoverableWriter.java:86)
	at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.resumeInProgressFileFrom(OutputStreamBasedPartFileWriter.java:104)
	at org.apache.flink.streaming.api.functions.sink.filesyst
{code}

The task sets 'execution.checkpointing.interval' to 1 min, and I trigger a
savepoint every five minutes.

Does anyone have a suggestion for how to solve this?
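For anyone reproducing this: in-progress part files are only finalized
(renamed) when a checkpoint completes, which is why restarting without a
checkpoint/savepoint leaves them behind. A minimal setup sketch (paths and the
element type are placeholders):

{code:java}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60_000L); // matches 'execution.checkpointing.interval: 1min'

DataStream<String> stream = env.fromElements("a", "b", "c");

StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("hdfs:///ns-hotel/hotel_sa_log/stream"),
                      new SimpleStringEncoder<String>("UTF-8"))
        .build();

// Part files move from .inprogress to finished only on checkpoint completion.
stream.addSink(sink);
{code}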



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25056) Modify the Flink dashboard task manager page's "Path, ID" column to support sorting by ID

2021-11-25 Thread john (Jira)
john created FLINK-25056:


 Summary: Modify the Flink dashboard task manager page's "Path, ID"
column to support sorting by ID.
 Key: FLINK-25056
 URL: https://issues.apache.org/jira/browse/FLINK-25056
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: john


Modify the Flink dashboard task manager page's "Path, ID" column to support
sorting by ID.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: RocksDBMapState get the binary key bytes

2021-11-25 Thread Pengfei Li
This problem is tracked in FLINK-11141, but there is no solution yet that
preserves state compatibility. Have you encountered the problem in production?

Zen4YYDS wrote on Thu, Nov 25, 2021 at 3:30 PM:

> Hi devs:
>
> Using RocksDB, when the key and namespace both have variable binary
> length, to prevent two different [key, namespace] pairs from having equal
> binary representations, we append the key length and namespace length after
> the key and namespace respectively. The format then is:
>
> Keygroup – key – keyLength – namespace – namespaceLength
>
> Then what if we use a fixed-length key and a variable-length namespace
> and user key? In the current implementation, I found the binary key format
> is as below:
>
> Keygroup – key – namespace – userkey
>
> Consider the following situation: I think we may get the same value for
> different [namespace, userkey] pairs. Or did I get something wrong?
>
> Keygroup | key | namespace | userkey
> 1        | 1   | 11        | 1
> 1        | 1   | 1         | 11
>
> Sent from Mail for Windows
>
>


Re: [DISCUSS] Definition of Done for Apache Flink

2021-11-25 Thread Martijn Visser
> * 187 contain "yes / no / don't know" and thus have not answered at least
one of the questions

I think there are quite some PRs that highlight in bold the answer to this
question while keeping all options there. I'm all in favour of replacing
this with a checkbox.



On Thu, 25 Nov 2021 at 10:47, Ingo Bürk  wrote:

> Hi Till,
>
> > * I agree with Ingo that the "Verifying this change" section can be
> > cumbersome to fill in. On the other hand it reminds contributors to
> verify
> > that his/her changes are covered by tests. Therefore, I would keep it.
>
> IMO it could be replaced with a checkbox "I have made sure all changes are
> covered by tests", though. I still see no benefit in painstakingly picking
> out all individual test cases that have been touched or affected.
>
> > In order to verify this I went through
> > the first two pages of open PRs and I was very positively surprised that
> > almost all PRs filled out the current template.
>
> Just to try and complete the picture here, counting open PRs by including
> search terms like "yes / no / don't know" the rough statistics are
>
> * 731 open PRs
> * 187 contain "yes / no / don't know" and thus have not answered at least
> one of the questions
> * 134 PRs seem to have deleted the PR template altogether as the question
> text does not appear in it
>
> Given that the latter two have to be largely mutually exclusive, that would
> mean that 40+% of all PRs do not fill out the PR template.
>
> > because what is not used can be removed.
>
> Just because someone answers a question doesn't make it a useful question,
> though. Some questions undoubtedly have a purpose for PR authors, like
> having to decide whether the public API is affected. But what purpose do
> "Anything that affects deployment or recovery" or "The S3 file system
> connector" serve? I don't think this is useful to authors in any way, so
> the other question is whether any committer actually looks at these answers
> and does something with it. Even if we don't remove everything (which I
> wouldn't want to, either), we can still remove those things that aren't
> needed (if that is the case).
>
> (Also, in any case I would really love if all questions could be converted
> to checkboxes instead)
>
>
> Ingo
>
> On Thu, Nov 25, 2021 at 10:31 AM Till Rohrmann 
> wrote:
>
> > When I started writing this response I thought that the current PR
> template
> > is mostly ignored by the community. In order to verify this I went
> through
> > the first two pages of open PRs and I was very positively surprised that
> > almost all PRs filled out the current template. Hence, I had to redact my
> > original response to get rid of the template because what is not used can
> > be removed.
> >
> > What I like about the current template is that it gives structure and
> > reminds contributors about certain things. At some places, the current
> > template is a bit verbose imo and could be shortened. Especially, with
> > Joe's proposal to add some more text to the template we might want to use
> > the chance to consolidate things. Here are some ideas:
> >
> > * Having a description of what the PR changes is very important. Whether
> > every PR needs additionally the "Brief changelog" is not clear to me.
> Maybe
> > we can fuse those two sections.
> > * I agree with Ingo that the "Verifying this change" section can be
> > cumbersome to fill in. On the other hand it reminds contributors to
> verify
> > that his/her changes are covered by tests. Therefore, I would keep it.
> > * The section "Does this pull request potentially affect one of the
> > following parts" could be shortened. Some of the points are obvious when
> > looking at the code, others are really hard to know for a contributor and
> > others are niche. Maybe keeping whether we changed dependencies and
> changed
> > the public API (even though this should be automatically verifiable)
> could
> > be a start.
> > * The section "Documentation" could be replaced by Joe's checklist.
> >
> > Cheers,
> > Till
> >
> > On Mon, Nov 22, 2021 at 11:13 AM Matthias Pohl 
> > wrote:
> >
> > > I also like the checklist provided by our current PR template. One
> > annoying
> > > thing in my opinion, though, is that we do not rely on checkboxes. Ingo
> > > already proposed such a change in [1]. Chesnay had some good points on
> > > certain items that were meant to be added to the template but are
> > actually
> > > already checked automatically [2]. In the end it comes down to noticing
> > > these checks and acting on it if one of them fails. I see the benefits
> of
> > > having an explicit check for something like that in the PR. But again,
> > > adding more items increases the risk of users just ignoring it.
> > >
> > > One other thing to raise awareness for users might be to move the
> > > CONTRIBUTING.md into the root folder. Github still recognizes the file
> if
> > > it is located in the project's root [3]. Hence, I don't see a need to
> > > "hide" it in the 

Re: [DISCUSS] Definition of Done for Apache Flink

2021-11-25 Thread Ingo Bürk
Hi Till,

> * I agree with Ingo that the "Verifying this change" section can be
> cumbersome to fill in. On the other hand it reminds contributors to verify
> that his/her changes are covered by tests. Therefore, I would keep it.

IMO it could be replaced with a checkbox "I have made sure all changes are
covered by tests", though. I still see no benefit in painstakingly picking
out all individual test cases that have been touched or affected.

> In order to verify this I went through
> the first two pages of open PRs and I was very positively surprised that
> almost all PRs filled out the current template.

Just to try and complete the picture here, counting open PRs by including
search terms like "yes / no / don't know" the rough statistics are

* 731 open PRs
* 187 contain "yes / no / don't know" and thus have not answered at least
one of the questions
* 134 PRs seem to have deleted the PR template altogether as the question
text does not appear in it

Given that the latter two have to be largely mutually exclusive, that would
mean that 40+% of all PRs do not fill out the PR template.

> because what is not used can be removed.

Just because someone answers a question doesn't make it a useful question,
though. Some questions undoubtedly have a purpose for PR authors, like
having to decide whether the public API is affected. But what purpose do
"Anything that affects deployment or recovery" or "The S3 file system
connector" serve? I don't think this is useful to authors in any way, so
the other question is whether any committer actually looks at these answers
and does something with it. Even if we don't remove everything (which I
wouldn't want to, either), we can still remove those things that aren't
needed (if that is the case).

(Also, in any case I would really love if all questions could be converted
to checkboxes instead)


Ingo

On Thu, Nov 25, 2021 at 10:31 AM Till Rohrmann  wrote:

> When I started writing this response I thought that the current PR template
> is mostly ignored by the community. In order to verify this I went through
> the first two pages of open PRs and I was very positively surprised that
> almost all PRs filled out the current template. Hence, I had to redact my
> original response to get rid of the template because what is not used can
> be removed.
>
> What I like about the current template is that it gives structure and
> reminds contributors about certain things. At some places, the current
> template is a bit verbose imo and could be shortened. Especially, with
> Joe's proposal to add some more text to the template we might want to use
> the chance to consolidate things. Here are some ideas:
>
> * Having a description of what the PR changes is very important. Whether
> every PR needs additionally the "Brief changelog" is not clear to me. Maybe
> we can fuse those two sections.
> * I agree with Ingo that the "Verifying this change" section can be
> cumbersome to fill in. On the other hand it reminds contributors to verify
> that his/her changes are covered by tests. Therefore, I would keep it.
> * The section "Does this pull request potentially affect one of the
> following parts" could be shortened. Some of the points are obvious when
> looking at the code, others are really hard to know for a contributor and
> others are niche. Maybe keeping whether we changed dependencies and changed
> the public API (even though this should be automatically verifiable) could
> be a start.
> * The section "Documentation" could be replaced by Joe's checklist.
>
> Cheers,
> Till
>
> On Mon, Nov 22, 2021 at 11:13 AM Matthias Pohl 
> wrote:
>
> > I also like the checklist provided by our current PR template. One
> annoying
> > thing in my opinion, though, is that we do not rely on checkboxes. Ingo
> > already proposed such a change in [1]. Chesnay had some good points on
> > certain items that were meant to be added to the template but are
> actually
> > already checked automatically [2]. In the end it comes down to noticing
> > these checks and acting on it if one of them fails. I see the benefits of
> > having an explicit check for something like that in the PR. But again,
> > adding more items increases the risk of users just ignoring it.
> >
> > One other thing to raise awareness for users might be to move the
> > CONTRIBUTING.md into the root folder. Github still recognizes the file if
> > it is located in the project's root [3]. Hence, I don't see a need to
> > "hide" it in the .github subfolder. Or was there another reason to put
> the
> > file into that folder?
> >
> > Matthias
> >
> > [1] https://github.com/apache/flink/pull/17801#issuecomment-969363303
> > [2] https://github.com/apache/flink/pull/17801#issuecomment-970048058
> > [3]
> >
> >
> https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/setting-guidelines-for-repository-contributors#about-contributing-guidelines
> >
> > On Thu, Nov 18, 2021 at 12:03 PM Yun Tang  wrote:
> >
> > > Hi Joe,

Re: [DISCUSS] Definition of Done for Apache Flink

2021-11-25 Thread Till Rohrmann
When I started writing this response I thought that the current PR template
is mostly ignored by the community. In order to verify this I went through
the first two pages of open PRs and I was very positively surprised that
almost all PRs filled out the current template. Hence, I had to redact my
original response to get rid of the template because what is not used can
be removed.

What I like about the current template is that it gives structure and
reminds contributors about certain things. At some places, the current
template is a bit verbose imo and could be shortened. Especially, with
Joe's proposal to add some more text to the template we might want to use
the chance to consolidate things. Here are some ideas:

* Having a description of what the PR changes is very important. Whether
every PR needs additionally the "Brief changelog" is not clear to me. Maybe
we can fuse those two sections.
* I agree with Ingo that the "Verifying this change" section can be
cumbersome to fill in. On the other hand it reminds contributors to verify
that his/her changes are covered by tests. Therefore, I would keep it.
* The section "Does this pull request potentially affect one of the
following parts" could be shortened. Some of the points are obvious when
looking at the code, others are really hard to know for a contributor and
others are niche. Maybe keeping whether we changed dependencies and changed
the public API (even though this should be automatically verifiable) could
be a start.
* The section "Documentation" could be replaced by Joe's checklist.

Cheers,
Till

On Mon, Nov 22, 2021 at 11:13 AM Matthias Pohl 
wrote:

> I also like the checklist provided by our current PR template. One annoying
> thing in my opinion, though, is that we do not rely on checkboxes. Ingo
> already proposed such a change in [1]. Chesnay had some good points on
> certain items that were meant to be added to the template but are actually
> already checked automatically [2]. In the end it comes down to noticing
> these checks and acting on it if one of them fails. I see the benefits of
> having an explicit check for something like that in the PR. But again,
> adding more items increases the risk of users just ignoring it.
>
> One other thing to raise awareness for users might be to move the
> CONTRIBUTING.md into the root folder. Github still recognizes the file if
> it is located in the project's root [3]. Hence, I don't see a need to
> "hide" it in the .github subfolder. Or was there another reason to put the
> file into that folder?
>
> Matthias
>
> [1] https://github.com/apache/flink/pull/17801#issuecomment-969363303
> [2] https://github.com/apache/flink/pull/17801#issuecomment-970048058
> [3]
>
> https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/setting-guidelines-for-repository-contributors#about-contributing-guidelines
>
> On Thu, Nov 18, 2021 at 12:03 PM Yun Tang  wrote:
>
> > Hi Joe,
> >
> > Thanks for bringing this to our attention.
> >
> > In general, I agreed with Chesnay's reply on PR [1]. For the rule-3, we
> > might indeed create another PR to add documentation previously. And I
> think
> > if forcing to obey it to include the documentation in the same PR, that
> > could benefit the review progress. Thus, I am not against for this rule.
> >
> > For the rule related to the PR description, the current flinkbot already
> > has tools that let committers run commands like "@flinkbot approve
> > description". However, many committers do not leverage this, which makes
> > the bot useless most of the time. I think this discussion draws attention
> > to whether we should strictly follow the review process via flinkbot or
> > keep not forcing committers to use it.
> >
> > [1] https://github.com/apache/flink/pull/17801#issuecomment-970048058
> >
> > Best
> > Yun Tang
> >
> > On 2021/11/16 10:38:39 Ingo Bürk wrote:
> > > > On the other hand I am a silent fan of the current PR template
> because
> > > > it also provides a summary of the PR to make it easier for committers
> > > > to determine the impacts.
> > >
> > > I 100% agree that part of a PR (and thus the template) should be the
> > > summary of the what, why, and how of the changes. I also see value in
> > > marking a PR as a breaking change if the author is aware of it being
> one
> > > (of course a committer needs to verify this nonetheless).
> > >
> > > But apart from that, there's a lot of questions in there that no one
> > seems
> > > to care about, and e.g. the question of how a change can be verified
> > seems
> > > fairly useless to me: if tests have been changed, that can trivially be
> > > seen in the PR. The CI runs on top of that anyway as well. So I never
> > > really understood why I need to manually list all the tests I have
> > touched
> > > here (or maybe I misunderstood this question the entire time?).
> > >
> > > If the template is supposed to be useful for the committer rather than
> > the
> > 

[jira] [Created] (FLINK-25055) Support listen and notify mechanism for PartitionRequest

2021-11-25 Thread Shammon (Jira)
Shammon created FLINK-25055:
---

 Summary: Support listen and notify mechanism for PartitionRequest
 Key: FLINK-25055
 URL: https://issues.apache.org/jira/browse/FLINK-25055
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.13.3, 1.12.5, 1.14.0
Reporter: Shammon


We submit batch jobs to a Flink session cluster with the eager scheduler for 
OLAP. The JM deploys subtasks to TaskManagers independently, so a downstream 
subtask may start before its upstream ones are ready. The downstream subtask 
sends a PartitionRequest to its upstream subtasks and may receive a 
PartitionNotFoundException from them. It then retries the PartitionRequest 
after a few milliseconds, until the timeout is reached.

The current approach raises two problems. First, there are too many retried 
PartitionRequest messages: each downstream subtask sends a PartitionRequest 
to all of its upstream subtasks, so the total number of messages is O(N*N), 
where N is the parallelism of the subtasks. Second, the interval between 
polling retries increases the delay until upstream and downstream tasks 
confirm the PartitionRequest.

We want to support a listen-and-notify mechanism for PartitionRequest when 
the job needs no failover: the upstream TaskManager adds the PartitionRequest 
to a listener list with a timeout checker, and notifies the request once the 
producing task registers its partition with the TaskManager. A sketch of the 
idea follows.
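
A minimal, self-contained sketch of such a listener registry (the class and 
method names here are hypothetical and partition ids are simplified to 
strings; this is not Flink's actual network stack):

{code:java}
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class PartitionRequestListenerRegistry {

    private static final class PendingRequest {
        final Runnable onAvailable;
        final Runnable onTimeout;
        final AtomicBoolean completed = new AtomicBoolean(false);

        PendingRequest(Runnable onAvailable, Runnable onTimeout) {
            this.onAvailable = onAvailable;
            this.onTimeout = onTimeout;
        }
    }

    // Parked requests, keyed by a simplified string partition id.
    private final Map<String, Queue<PendingRequest>> pending = new ConcurrentHashMap<>();
    private final ScheduledExecutorService timeoutChecker =
            Executors.newSingleThreadScheduledExecutor();

    /** Park a request instead of answering with PartitionNotFoundException. */
    public void listen(
            String partitionId, Runnable onAvailable, Runnable onTimeout, long timeoutMs) {
        PendingRequest request = new PendingRequest(onAvailable, onTimeout);
        pending.computeIfAbsent(partitionId, id -> new ConcurrentLinkedQueue<>()).add(request);
        // The timeout path fires only if the request was not already notified.
        timeoutChecker.schedule(
                () -> {
                    if (request.completed.compareAndSet(false, true)) {
                        request.onTimeout.run();
                    }
                },
                timeoutMs,
                TimeUnit.MILLISECONDS);
    }

    /** Called when the producing task registers its partition: answer each parked request once. */
    public void notifyPartitionRegistered(String partitionId) {
        Queue<PendingRequest> waiters = pending.remove(partitionId);
        if (waiters == null) {
            return;
        }
        for (PendingRequest request : waiters) {
            if (request.completed.compareAndSet(false, true)) {
                request.onAvailable.run();
            }
        }
    }
}
{code}

The downstream side would then send a single PartitionRequest and either get 
notified or fall back to the PartitionNotFoundException path on timeout.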

[~nkubicek] I noticed that your scenario of using Flink is similar to ours. 
What do you think? I would also like to hear from you, [~trohrmann]. Thanks!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Deprecate Java 8 support

2021-11-25 Thread Roman Khachatryan
The situation is probably a bit different now compared to the previous
upgrade: some users might be using Amazon Corretto (or other builds)
which have longer support.

Still +1 for deprecation to trigger migration, and thanks for bringing this up!

Regards,
Roman

On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise  wrote:
>
> +1 to deprecate Java 8, so we can hopefully incorporate the module concept
> in Flink.
>
> On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler  wrote:
>
> > Users can already use APIs from Java 8/11.
> >
> > On 25/11/2021 09:35, Francesco Guardiani wrote:
> > > +1 with what both Ingo and Matthias said, personally, I cannot wait to
> > start using some of
> > > the APIs introduced in Java 9. And I'm pretty sure that's the same for
> > our users as well.
> > >
> > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> > >> Hi everyone,
> > >>
> > >> continued support for Java 8 can also create project risks, e.g. if a
> > >> vulnerability arises in Flink's dependencies and we cannot upgrade them
> > >> because they no longer support Java 8. Some projects already started
> > >> deprecating support as well, like Kafka, and other projects will likely
> > >> follow.
> > >> Let's also keep in mind that the proposal here is not to drop support
> > right
> > >> away, but to deprecate it, send the message, and motivate users to start
> > >> migrating. Delaying this process could ironically mean users have less
> > time
> > >> to prepare for it.
> > >>
> > >>
> > >> Ingo
> > >>
> > >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl 
> > >>
> > >> wrote:
> > >>> Thanks for constantly driving these maintenance topics, Chesnay. +1
> > from
> > >>> my
> > >>> side for deprecating Java 8. I see the point Jingsong is raising. But I
> > >>> agree with what David already said here. Deprecating the Java version
> > is a
> > >>> tool to make users aware of it (same as starting this discussion
> > thread).
> > >>> If there's no major opposition against deprecating it in the community
> > we
> > >>> should move forward in this regard to make the users who do not
> > >>> regularly browse the mailing list aware of it. That said, deprecating
> > Java
> > >>> 8 in 1.15 does not necessarily mean that it is dropped in 1.16.
> > >>>
> > >>> Best,
> > >>> Matthias
> > >>>
> > >>> On Tue, Nov 23, 2021 at 8:46 AM David Morávek  wrote:
> >  Thank you Chesnay for starting the discussion! This will generate bit
> > of
> > >>> a
> > >>>
> >  work for some users, but it's a good thing to keep moving the project
> >  forward. Big +1 for this.
> > 
> >  Jingsong:
> > 
> >  Receiving this signal, the user may be unhappy because his application
> > 
> > > may be all on Java 8. Upgrading is a big job, after all, many systems
> > > have not been upgraded yet. (Like you said, HBase and Hive)
> >  The whole point of deprecation is to raise awareness, that this will
> > be
> >  happening eventually and users should take some steps to address this
> > in
> >  medium-term. If I understand Chesnay correctly, we'd still keep Java 8
> >  around for quite some time to give users enough time to upgrade, but
> >  without raising awareness we'd fight the very same argument later in
> > >>> time.
> > >>>
> >  All of the prerequisites from 3rd party projects for both HBase [1]
> > and
> >  Hive [2] to fully support Java 11 have been completed, so the ball is
> > on
> >  their side and there doesn't seem to be much activity. Generating bit
> > >>> more
> > >>>
> >  pressure on these efforts might be a good thing.
> > 
> >  It would be great to identify some of these users and learn bit more
> > >>> about
> > >>>
> >  their situation. Are they keeping up with latest Flink developments or
> > >>> are
> > >>>
> >  they lagging behind (this would also give them way more time for
> >  eventual
> >  upgrade)?
> > 
> >  [1] https://issues.apache.org/jira/browse/HBASE-22972
> >  [2] https://issues.apache.org/jira/browse/HIVE-22415
> > 
> >  Best,
> >  D.
> > 
> >  On Tue, Nov 23, 2021 at 3:08 AM Jingsong Li 
> > 
> >  wrote:
> > > Hi Chesnay,
> > >
> > > Thanks for bringing this for discussion.
> > >
> > > We should dig deeper into the current Java version of Flink users. At
> > > least make sure Java 8 is not a mainstream version.
> > >
> > > Receiving this signal, the user may be unhappy because his
> > application
> > > may be all on Java 8. Upgrading is a big job, after all, many systems
> > > have not been upgraded yet. (Like you said, HBase and Hive)
> > >
> > > In my opinion, it is too early to deprecate support for Java 8. We
> > > should wait for a safer point in time.
> > >
> > > On Mon, Nov 22, 2021 at 11:45 PM Ingo Bürk 
> > wrote:
> > >> Hi,
> > >>
> > >> also a +1 from me because of everything Chesnay already said.
> > >>
> > 

Re: [DISCUSS] Deprecate Java 8 support

2021-11-25 Thread Arvid Heise
+1 to deprecate Java 8, so we can hopefully incorporate the module concept
in Flink.

On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler  wrote:

> Users can already use APIs from Java 8/11.
>
> On 25/11/2021 09:35, Francesco Guardiani wrote:
> > +1 with what both Ingo and Matthias said, personally, I cannot wait to
> start using some of
> > the APIs introduced in Java 9. And I'm pretty sure that's the same for
> our users as well.
> >
> > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> >> Hi everyone,
> >>
> >> continued support for Java 8 can also create project risks, e.g. if a
> >> vulnerability arises in Flink's dependencies and we cannot upgrade them
> >> because they no longer support Java 8. Some projects already started
> >> deprecating support as well, like Kafka, and other projects will likely
> >> follow.
> >> Let's also keep in mind that the proposal here is not to drop support
> right
> >> away, but to deprecate it, send the message, and motivate users to start
> >> migrating. Delaying this process could ironically mean users have less
> time
> >> to prepare for it.
> >>
> >>
> >> Ingo
> >>
> >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl 
> >>
> >> wrote:
> >>> Thanks for constantly driving these maintenance topics, Chesnay. +1
> from
> >>> my
> >>> side for deprecating Java 8. I see the point Jingsong is raising. But I
> >>> agree with what David already said here. Deprecating the Java version
> is a
> >>> tool to make users aware of it (same as starting this discussion
> thread).
> >>> If there's no major opposition against deprecating it in the community
> we
> >>> should move forward in this regard to make the users who do not
> >>> regularly browse the mailing list aware of it. That said, deprecating
> Java
> >>> 8 in 1.15 does not necessarily mean that it is dropped in 1.16.
> >>>
> >>> Best,
> >>> Matthias
> >>>
> >>> On Tue, Nov 23, 2021 at 8:46 AM David Morávek  wrote:
>  Thank you Chesnay for starting the discussion! This will generate bit
> of
> >>> a
> >>>
>  work for some users, but it's a good thing to keep moving the project
>  forward. Big +1 for this.
> 
>  Jingsong:
> 
>  Receiving this signal, the user may be unhappy because his application
> 
> > may be all on Java 8. Upgrading is a big job, after all, many systems
> > have not been upgraded yet. (Like you said, HBase and Hive)
>  The whole point of deprecation is to raise awareness, that this will
> be
>  happening eventually and users should take some steps to address this
> in
>  medium-term. If I understand Chesnay correctly, we'd still keep Java 8
>  around for quite some time to give users enough time to upgrade, but
>  without raising awareness we'd fight the very same argument later in
> >>> time.
> >>>
>  All of the prerequisites from 3rd party projects for both HBase [1]
> and
>  Hive [2] to fully support Java 11 have been completed, so the ball is
> on
>  their side and there doesn't seem to be much activity. Generating bit
> >>> more
> >>>
>  pressure on these efforts might be a good thing.
> 
>  It would be great to identify some of these users and learn bit more
> >>> about
> >>>
>  their situation. Are they keeping up with latest Flink developments or
> >>> are
> >>>
>  they lagging behind (this would also give them way more time for
>  eventual
>  upgrade)?
> 
>  [1] https://issues.apache.org/jira/browse/HBASE-22972
>  [2] https://issues.apache.org/jira/browse/HIVE-22415
> 
>  Best,
>  D.
> 
>  On Tue, Nov 23, 2021 at 3:08 AM Jingsong Li 
> 
>  wrote:
> > Hi Chesnay,
> >
> > Thanks for bringing this for discussion.
> >
> > We should dig deeper into the current Java version of Flink users. At
> > least make sure Java 8 is not a mainstream version.
> >
> > Receiving this signal, the user may be unhappy because his
> application
> > may be all on Java 8. Upgrading is a big job, after all, many systems
> > have not been upgraded yet. (Like you said, HBase and Hive)
> >
> > In my opinion, it is too early to deprecate support for Java 8. We
> > should wait for a safer point in time.
> >
> > On Mon, Nov 22, 2021 at 11:45 PM Ingo Bürk 
> wrote:
> >> Hi,
> >>
> >> also a +1 from me because of everything Chesnay already said.
> >>
> >>
> >> Ingo
> >>
> >> On Mon, Nov 22, 2021 at 4:41 PM Martijn Visser <
> >>> mart...@ververica.com>
>
>
>


[jira] [Created] (FLINK-25054) flink sql System (Built-in) Functions 【SHA2】, hashLength validation unsupported

2021-11-25 Thread chenbowen (Jira)
chenbowen created FLINK-25054:
-

 Summary: flink sql System (Built-in) Functions 【SHA2】, hashLength 
validation unsupported
 Key: FLINK-25054
 URL: https://issues.apache.org/jira/browse/FLINK-25054
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.12.3
Reporter: chenbowen
 Attachments: image-2021-11-25-16-59-56-699.png

【exception sql】
SELECT
SHA2(, 128)
FROM
 
【effect】
When the SQL is long, it is hard to tell where the problem is from this error.
【reason】
The built-in function SHA2 does not support a hashLength of “128”, but this 
is not clear from the exception log below.
【Exception log】
!image-2021-11-25-16-59-56-699.png!
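
For illustration, a minimal sketch (assuming SHA2 follows the documented 
semantics where the supported bit lengths are 224, 256, 384 and 512; the 
inline VALUES table below is made up):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class Sha2HashLengthExample {
    public static void main(String[] args) {
        TableEnvironment env =
                TableEnvironment.create(EnvironmentSettings.newInstance().inBatchMode().build());

        // OK: 256 is one of the supported SHA-2 bit lengths (224, 256, 384, 512).
        env.sqlQuery("SELECT SHA2(s, 256) FROM (VALUES ('abc')) AS t(s)").execute().print();

        // Fails: 128 is not a supported bit length. Per this issue, the resulting
        // exception does not clearly point at the offending argument.
        env.sqlQuery("SELECT SHA2(s, 128) FROM (VALUES ('abc')) AS t(s)").execute().print();
    }
}
{code}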



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25053) Document how to use the usrlib to load code in the user code class loader

2021-11-25 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-25053:
-

 Summary: Document how to use the usrlib to load code in the user 
code class loader
 Key: FLINK-25053
 URL: https://issues.apache.org/jira/browse/FLINK-25053
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Runtime / Coordination
Affects Versions: 1.13.3, 1.12.5, 1.14.0, 1.15.0
Reporter: Till Rohrmann
 Fix For: 1.15.0, 1.14.1, 1.13.4


With FLINK-13993 we introduced the {{usrlib}} directory that can be used to 
load code in the user code class loader. This functionality has not been 
properly documented, which makes it very hard to use. I would suggest 
changing this so that our users can benefit from this cool feature.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25052) Port row to row cast logic to CastRule

2021-11-25 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25052:
---

 Summary: Port row to row cast logic to CastRule
 Key: FLINK-25052
 URL: https://issues.apache.org/jira/browse/FLINK-25052
 Project: Flink
  Issue Type: Sub-task
Reporter: Francesco Guardiani






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25051) Port raw <-> binary logic to CastRule

2021-11-25 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25051:
---

 Summary: Port raw <-> binary logic to CastRule
 Key: FLINK-25051
 URL: https://issues.apache.org/jira/browse/FLINK-25051
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Francesco Guardiani
Assignee: Francesco Guardiani


More details on the parent task



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Deprecate Java 8 support

2021-11-25 Thread Chesnay Schepler

Users can already use APIs from Java 8/11.

On 25/11/2021 09:35, Francesco Guardiani wrote:

+1 with what both Ingo and Matthias said, personally, I cannot wait to start 
using some of
the APIs introduced in Java 9. And I'm pretty sure that's the same for our 
users as well.

On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:

Hi everyone,

continued support for Java 8 can also create project risks, e.g. if a
vulnerability arises in Flink's dependencies and we cannot upgrade them
because they no longer support Java 8. Some projects already started
deprecating support as well, like Kafka, and other projects will likely
follow.
Let's also keep in mind that the proposal here is not to drop support right
away, but to deprecate it, send the message, and motivate users to start
migrating. Delaying this process could ironically mean users have less time
to prepare for it.


Ingo

On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl 

wrote:

Thanks for constantly driving these maintenance topics, Chesnay. +1 from
my
side for deprecating Java 8. I see the point Jingsong is raising. But I
agree with what David already said here. Deprecating the Java version is a
tool to make users aware of it (same as starting this discussion thread).
If there's no major opposition against deprecating it in the community we
should move forward in this regard to make the users who do not
regularly browse the mailing list aware of it. That said, deprecating Java
8 in 1.15 does not necessarily mean that it is dropped in 1.16.

Best,
Matthias

On Tue, Nov 23, 2021 at 8:46 AM David Morávek  wrote:

Thank you Chesnay for starting the discussion! This will generate a bit of
work for some users, but it's a good thing to keep moving the project
forward. Big +1 for this.

Jingsong:

Receiving this signal, the user may be unhappy because his application
may be all on Java 8. Upgrading is a big job, after all, many systems
have not been upgraded yet. (Like you said, HBase and Hive)

The whole point of deprecation is to raise awareness that this will be
happening eventually and that users should take some steps to address this
in the medium term. If I understand Chesnay correctly, we'd still keep Java
8 around for quite some time to give users enough time to upgrade, but
without raising awareness we'd fight the very same argument later in time.


All of the prerequisites from 3rd party projects for both HBase [1] and
Hive [2] to fully support Java 11 have been completed, so the ball is on
their side and there doesn't seem to be much activity. Generating a bit
more pressure on these efforts might be a good thing.

It would be great to identify some of these users and learn a bit more
about their situation. Are they keeping up with the latest Flink
developments or are they lagging behind (this would also give them way more
time for an eventual upgrade)?

[1] https://issues.apache.org/jira/browse/HBASE-22972
[2] https://issues.apache.org/jira/browse/HIVE-22415

Best,
D.

On Tue, Nov 23, 2021 at 3:08 AM Jingsong Li 

wrote:

Hi Chesnay,

Thanks for bringing this for discussion.

We should dig deeper into the current Java version of Flink users. At
least make sure Java 8 is not a mainstream version.

Receiving this signal, the user may be unhappy because his application
may be all on Java 8. Upgrading is a big job, after all, many systems
have not been upgraded yet. (Like you said, HBase and Hive)

In my opinion, it is too early to deprecate support for Java 8. We
should wait for a safer point in time.

On Mon, Nov 22, 2021 at 11:45 PM Ingo Bürk  wrote:

Hi,

also a +1 from me because of everything Chesnay already said.


Ingo

On Mon, Nov 22, 2021 at 4:41 PM Martijn Visser <

mart...@ververica.com>





Re: [DISCUSS] Conventions on assertions to use in tests

2021-11-25 Thread Francesco Guardiani
This is the result of experimenting around creating custom assertions for
Table API types:
https://github.com/slinkydeveloper/flink/commit/d1ce37a62c2200b2c3008a9cc2cac91234222fd5
I will PR it once the two PRs in the previous mail get merged.
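
As a teaser, here is a minimal sketch of what such a custom assertion could
look like (the names below are illustrative and not necessarily what the
commit above contains):

import java.util.Objects;
import org.apache.flink.types.Row;
import org.assertj.core.api.AbstractAssert;

public class RowAssert extends AbstractAssert<RowAssert, Row> {

    private RowAssert(Row actual) {
        super(actual, RowAssert.class);
    }

    public static RowAssert assertThatRow(Row actual) {
        return new RowAssert(actual);
    }

    /** Assert on the number of fields in the row. */
    public RowAssert hasArity(int expectedArity) {
        isNotNull();
        if (actual.getArity() != expectedArity) {
            failWithMessage(
                    "Expected row arity <%s> but was <%s>", expectedArity, actual.getArity());
        }
        return this;
    }

    /** Assert on the value of a single field. */
    public RowAssert hasField(int position, Object expectedValue) {
        isNotNull();
        Object actualValue = actual.getField(position);
        if (!Objects.equals(actualValue, expectedValue)) {
            failWithMessage(
                    "Expected field %s to be <%s> but was <%s>",
                    position, expectedValue, actualValue);
        }
        return this;
    }
}

so that a test can read assertThatRow(row).hasArity(2).hasField(0, "flink").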

On Monday, 22 November 2021 17:59:29 CET Francesco Guardiani wrote:
> Hi all,
> 
> Given I see generally consensus around having a convention and using
> assertj, I propose to merge these 2 PRs:
> 
> * Add the explanation of this convention in our code quality guide:
> https://github.com/apache/flink-web/pull/482
> * Add assertj to dependency management in the parent pom and link in the PR
> template the code quality guide: https://github.com/apache/flink/pull/17871
> 
> WDYT?
> 
> Once we merge those, I'll work in the next days to add some custom
> assertions in table-common for RowData and Row (commonly asserted
> everywhere in the table codebase).
> 
> @Matthias Pohl  about the confluence page, it seems
> a bit outdated, judging from the last modified date. I propose to continue
> to use this guide
> https://flink.apache.org/contributing/code-style-and-quality-common.html as
> it seems more complete.
> 
> 
> On Mon, Nov 22, 2021 at 8:58 AM Matthias Pohl 
> 
> wrote:
> > Agree. Clarifying once more what our preferred option is here, is a good
> > idea. So, +1 for unification. I don't have a strong opinion on what
> > framework to use. But we may want to add this at the end of the discussion
> > to our documentation (e.g. [1] or maybe the PR description?) to make users
> > aware of it and be able to provide a reference in case it comes up again
> > (besides this ML thread). Or do we already have something like that
> > somewhere in the docs where I missed it?
> > 
> > Matthias
> > 
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/Best+Practices+and+Lesso
> > ns+Learned> 
> > On Wed, Nov 17, 2021 at 11:13 AM Marios Trivyzas  wrote:
> >> I'm also +1 both for unification and specifically for assertJ.
> >> I think it covers a wide variety of assertions and as Francesco mentioned
> >> it's easily extensible, so that
> >> we can create custom assertions where needed, and avoid repeating test
> >> code.
> >> 
> >> On Tue, Nov 16, 2021 at 9:57 AM David Morávek  wrote:
> >> > I don't have any strong opinions on the asserting framework that we
> >> > use,
> >> > but big +1 for the unification.
> >> > 
> >> > Best,
> >> > D.
> >> > 
> >> > On Tue, Nov 16, 2021 at 9:37 AM Till Rohrmann 
> >> > 
> >> > wrote:
> >> > > Using JUnit5 with assertJ is fine with me if the community agrees.
> >> 
> >> Having
> >> 
> >> > > guides for best practices would definitely help with the transition.
> >> > > 
> >> > > Cheers,
> >> > > Till
> >> > > 
> >> > > On Mon, Nov 15, 2021 at 5:34 PM Francesco Guardiani <
> >> > > france...@ververica.com>
> >> > > 
> >> > > wrote:
> >> > > > > It is a bit unfortunate that we have tests that follow different
> >> > > > 
> >> > > > patterns.
> >> > > > This, however, is mainly due to organic growth. I think the
> >> 
> >> community
> >> 
> >> > > > started with Junit4, then we chose to use Hamcrest because of its
> >> > 
> >> > better
> >> > 
> >> > > > expressiveness.
> >> > > > 
> >> > > > That is fine, I'm sorry if my mail felt like a rant :)
> >> > > > 
> >> > > > > Personally, I don't have a strong preference for which testing
> >> 
> >> tools
> >> 
> >> > to
> >> > 
> >> > > > use. The important bit is that we agree as a community, then
> >> 
> >> document
> >> 
> >> > the
> >> > 
> >> > > > choice and finally stick to it. So before starting to use assertj,
> >> 
> >> we
> >> 
> >> > > > should probably align with the folks working on the Junit5 effort
> >> > 
> >> > first.
> >> > 
> >> > > > As Arvid pointed out, using assertj might help the people working

Re: [DISCUSS] Deprecate Java 8 support

2021-11-25 Thread Francesco Guardiani
+1 with what both Ingo and Matthias said, personally, I cannot wait to start 
using some of 
the APIs introduced in Java 9. And I'm pretty sure that's the same for our 
users as well.
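
Just to give a flavor, a few of the JDK 9-11 additions I have in mind
(purely illustrative, everything below is plain JDK):

import java.util.List;
import java.util.Map;
import java.util.Optional;

public class Java9PlusSamples {
    public static void main(String[] args) {
        // Java 9: immutable collection factories.
        List<String> versions = List.of("1.14", "1.15");
        Map<String, Integer> slots = Map.of("tm-1", 4, "tm-2", 4);

        // Java 9: Optional.ifPresentOrElse.
        Optional.of("flink").ifPresentOrElse(
                name -> System.out.println("hello " + name),
                () -> System.out.println("empty"));

        // Java 11: String convenience methods.
        System.out.println("  ".isBlank());          // true
        System.out.println("a\nb".lines().count());  // 2

        System.out.println(versions + " " + slots);
    }
}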

On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> Hi everyone,
> 
> continued support for Java 8 can also create project risks, e.g. if a
> vulnerability arises in Flink's dependencies and we cannot upgrade them
> because they no longer support Java 8. Some projects already started
> deprecating support as well, like Kafka, and other projects will likely
> follow.
> Let's also keep in mind that the proposal here is not to drop support right
> away, but to deprecate it, send the message, and motivate users to start
> migrating. Delaying this process could ironically mean users have less time
> to prepare for it.
> 
> 
> Ingo
> 
> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl 
> 
> wrote:
> > Thanks for constantly driving these maintenance topics, Chesnay. +1 from
> > my
> > side for deprecating Java 8. I see the point Jingsong is raising. But I
> > agree with what David already said here. Deprecating the Java version is a
> > tool to make users aware of it (same as starting this discussion thread).
> > If there's no major opposition against deprecating it in the community we
> > should move forward in this regard to make the users who do not
> > regularly browse the mailing list aware of it. That said, deprecating Java
> > 8 in 1.15 does not necessarily mean that it is dropped in 1.16.
> > 
> > Best,
> > Matthias
> > 
> > On Tue, Nov 23, 2021 at 8:46 AM David Morávek  wrote:
> > > Thank you Chesnay for starting the discussion! This will generate bit of
> > 
> > a
> > 
> > > work for some users, but it's a good thing to keep moving the project
> > > forward. Big +1 for this.
> > > 
> > > Jingsong:
> > > 
> > > Receiving this signal, the user may be unhappy because his application
> > > 
> > > > may be all on Java 8. Upgrading is a big job, after all, many systems
> > > > have not been upgraded yet. (Like you said, HBase and Hive)
> > > 
> > > The whole point of deprecation is to raise awareness, that this will be
> > > happening eventually and users should take some steps to address this in
> > > medium-term. If I understand Chesnay correctly, we'd still keep Java 8
> > > around for quite some time to give users enough time to upgrade, but
> > > without raising awareness we'd fight the very same argument later in
> > 
> > time.
> > 
> > > All of the prerequisites from 3rd party projects for both HBase [1] and
> > > Hive [2] to fully support Java 11 have been completed, so the ball is on
> > > their side and there doesn't seem to be much activity. Generating bit
> > 
> > more
> > 
> > > pressure on these efforts might be a good thing.
> > > 
> > > It would be great to identify some of these users and learn bit more
> > 
> > about
> > 
> > > their situation. Are they keeping up with latest Flink developments or
> > 
> > are
> > 
> > > they lagging behind (this would also give them way more time for
> > > eventual
> > > upgrade)?
> > > 
> > > [1] https://issues.apache.org/jira/browse/HBASE-22972
> > > [2] https://issues.apache.org/jira/browse/HIVE-22415
> > > 
> > > Best,
> > > D.
> > > 
> > > On Tue, Nov 23, 2021 at 3:08 AM Jingsong Li 
> > > 
> > > wrote:
> > > > Hi Chesnay,
> > > > 
> > > > Thanks for bringing this for discussion.
> > > > 
> > > > We should dig deeper into the current Java version of Flink users. At
> > > > least make sure Java 8 is not a mainstream version.
> > > > 
> > > > Receiving this signal, the user may be unhappy because his application
> > > > may be all on Java 8. Upgrading is a big job, after all, many systems
> > > > have not been upgraded yet. (Like you said, HBase and Hive)
> > > > 
> > > > In my opinion, it is too early to deprecate support for Java 8. We
> > > > should wait for a safer point in time.
> > > > 
> > > > On Mon, Nov 22, 2021 at 11:45 PM Ingo Bürk  wrote:
> > > > > Hi,
> > > > > 
> > > > > also a +1 from me because of everything Chesnay already said.
> > > > > 
> > > > > 
> > > > > Ingo
> > > > > 
> > > > > On Mon, Nov 22, 2021 at 4:41 PM Martijn Visser <
> > 
> > mart...@ververica.com>

Re: [DISCUSS] Releasing Flink 1.14.1

2021-11-25 Thread Dawid Wysakowicz
Hey Martijn,

+1 for releasing 1.14.1

As for https://issues.apache.org/jira/browse/FLINK-24328 I removed the
1.14.1 fix version. It definitely should not block the release. If we
decide to backport it to 1.14.x it can safely land in 1.14.2.

Best,

Dawid

On 24/11/2021 19:40, Martijn Visser wrote:
> Hi all,
>
> I would like to start a discussion on releasing Flink 1.14.1. Flink 1.14
> was released on the 29th of September [1] and so far 107 issues have been
> resolved, including multiple blockers and critical priorities [2].
>
> There are currently 169 open tickets which contain a fixVersion for 1.14.1
> [3]. I'm including the ones that are currently marked as critical or a
> blocker to verify if these should be included in Flink 1.14.1. It would be
> great if those that are assigned or working on one or more of these tickets
> can give an update on its status.
>
> * https://issues.apache.org/jira/browse/FLINK-24543 - Zookeeper connection
> issue causes inconsistent state in Flink -> I think this depends on the
> outcome of dropping Zookeeper 3.4 as was proposed on the Dev mailing list
> * https://issues.apache.org/jira/browse/FLINK-25027 - Allow GC of a
> finished job's JobMaster before the slot timeout is reached
> * https://issues.apache.org/jira/browse/FLINK-25022 - ClassLoader leak with
> ThreadLocals on the JM when submitting a job through the REST API
> * https://issues.apache.org/jira/browse/FLINK-24789 - IllegalStateException
> with CheckpointCleaner being closed already
> * https://issues.apache.org/jira/browse/FLINK-24328 - Long term fix for
> receiving new buffer size before network reader configured -> I'm not sure
> if this would end up in Flink 1.14.1, I think it's more likely that it
> would be Flink 1.15. Anton/Dawid, could you confirm this?
> * https://issues.apache.org/jira/browse/FLINK-23946 - Application mode
> fails fatally when being shut down -> This depends on
> https://issues.apache.org/jira/browse/FLINK-24038 and I don't see much
> happening there, so I also expect that this would move to Flink 1.15.
> David, could you confirm?
> * https://issues.apache.org/jira/browse/FLINK-22113 - UniqueKey constraint
> is lost with multiple sources join in SQL
> * https://issues.apache.org/jira/browse/FLINK-21788 - Throw
> PartitionNotFoundException if the partition file has been lost for blocking
> shuffle -> I'm also expecting that this would move to Flink 1.15, can you
> confirm Yingjie ?
>
> There are quite some other tickets that I've excluded from this list,
> because they are either test instabilities or are not depending on a Flink
> release to be resolved.
>
> Note: there are quite a few test instabilities in the list and help on
> those is always appreciated. You can check all unassigned tickets
> instabilities in Jira [4].
>
> Are there any other open tickets that we should wait for? Is there a PMC
> member who would like to manage the release? I'm more than happy to help
> with monitoring the status of the tickets.
>
> Best regards,
>
> Martijn
>
> [1] https://flink.apache.org/news/2021/09/29/release-1.14.0.html
> [2]
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.14.1%20ORDER%20BY%20priority%20DESC%2C%20created%20DESC
> [3]
> https://issues.apache.org/jira/issues?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.14.1%20ORDER%20BY%20priority%20DESC%2C%20created%20DESC
>
> [4]
> https://issues.apache.org/jira/issues?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.14.1%20AND%20labels%20%3D%20test-stability%20AND%20assignee%20in%20(EMPTY)%20ORDER%20BY%20priority%20DESC%2C%20created%20DESC
>
> Martijn Visser | Product Manager
>
> mart...@ververica.com
>
> 
>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward  - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>


