Re: [VOTE] FLIP-179: Expose Standardized Operator Metrics

2021-07-30 Thread Steven Wu
+1 (non-binding)

On Fri, Jul 30, 2021 at 3:55 AM Arvid Heise  wrote:

> Dear devs,
>
> I'd like to open a vote on FLIP-179: Expose Standardized Operator Metrics
> [1] which was discussed in this thread [2].
> The vote will be open for at least 72 hours unless there is an objection
> or not enough votes.
>
> The proposal excludes the implementation for the currentFetchEventTimeLag
> metric, which caused a bit of discussion without a clear convergence. We
> will implement that metric in a generic way at a later point and encourage
> sources to implement it themselves in the meantime.
>
> Best,
>
> Arvid
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics
> [2]
>
> https://lists.apache.org/thread.html/r856920cbfe6a262b521109c5bdb9e904e00a9b3f1825901759c24d85%40%3Cdev.flink.apache.org%3E
>


[jira] [Created] (FLINK-23568) Plaintext Java Keystore Password Risks in the flink-conf.yaml File

2021-07-30 Thread Hui Wang (Jira)
Hui Wang created FLINK-23568:


 Summary: Plaintext Java Keystore Password Risks in the 
flink-conf.yaml File
 Key: FLINK-23568
 URL: https://issues.apache.org/jira/browse/FLINK-23568
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission, Runtime / REST
Affects Versions: 1.11.3
Reporter: Hui Wang


When REST SSL is enabled, the plaintext password of the Java keystore must be 
configured in Flink's flink-conf.yaml, which poses a significant security risk. 
It would be good if the community could provide a way to encrypt the passwords 
stored in the flink-conf.yaml file.

{code:java}
security.ssl.internal.keystore-password: keystore_password
security.ssl.internal.key-password: key_password
security.ssl.internal.truststore-password: truststore_password{code}
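For completeness, the REST endpoint mentioned above has analogous keys, which 
carry the same plaintext risk:

{code:java}
security.ssl.rest.keystore-password: keystore_password
security.ssl.rest.key-password: key_password
security.ssl.rest.truststore-password: truststore_password{code}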



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23567) Hive 1.1.0 failed to write using Flink SQL 1.13.1 because the JSON class was not found

2021-07-30 Thread wuyang (Jira)
wuyang created FLINK-23567:
--

 Summary: Hive 1.1.0 failed to write using Flink SQL 1.13.1 because the JSON class was not found
 Key: FLINK-23567
 URL: https://issues.apache.org/jira/browse/FLINK-23567
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.13.1
Reporter: wuyang
 Fix For: 1.13.1
 Attachments: image-2021-07-31-10-39-52-126.png

*First: after I added flink-sql-connector-hive-3.1.2 under the lib directory, the 
following error was raised when submitting a Flink SQL job:*

java.lang.NoClassDefFoundError: org/json/JSONException
    at org.apache.flink.table.planner.delegation.hive.parse.HiveParserDDLSemanticAnalyzer.analyzeCreateTable(HiveParserDDLSemanticAnalyzer.java:646)
    at org.apache.flink.table.planner.delegation.hive.parse.HiveParserDDLSemanticAnalyzer.analyzeInternal(HiveParserDDLSemanticAnalyzer.java:373)
    at org.apache.flink.table.planner.delegation.hive.HiveParser.processCmd(HiveParser.java:235)
    at org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:217)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:724)
    at me.ddmc.bigdata.sqlsubmit.helper.SqlSubmitHelper.callSql(SqlSubmitHelper.java:201)
    at me.ddmc.bigdata.sqlsubmit.helper.SqlSubmitHelper.callCommand(SqlSubmitHelper.java:182)
    at me.ddmc.bigdata.sqlsubmit.helper.SqlSubmitHelper.run(SqlSubmitHelper.java:124)
    at me.ddmc.bigdata.sqlsubmit.SqlSubmit.main(SqlSubmit.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: java.lang.ClassNotFoundException: org.json.JSONException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 25 more

*Second: after investigation, I found that an exclusion is added to the POM of 
the flink-sql-connector-hive-1.2.2 module, but not to the POMs of the other 
Hive connectors.*

!image-2021-07-31-10-37-21-813.png!

*However, I don't understand this remark in the POM. Is this a problem?*
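A possible workaround (untested; the coordinates below are an assumption 
derived from the missing class) is to bundle the org.json classes with the 
job jar:

{code:xml}
<!-- hypothetical workaround: put org/json/JSONException on the classpath -->
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20090211</version>
</dependency>{code}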




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.12.5, release candidate #3

2021-07-30 Thread godfrey he
+1 (non-binding)

- Checked checksums and signatures: OK
- Built from source: OK
- Checked the flink-web PR
   - found one typo in the Flink version
- Submit some jobs from sql-client to local cluster, checked the web-ui,
cp, sp, log, etc: OK

Best,
Godfrey

Robert Metzger  于2021年7月30日周五 下午4:33写道:

> Thanks a lot for providing the new staging repository. I dropped the 1440
> and 1441 staging repositories to avoid other RC reviewers accidentally
> looking into them, or us accidentally releasing them.
>
> +1 (binding)
>
> Checks:
> - I didn't find any additional issues in the release announcement
> - the pgp signatures on the source archive seem fine
> - source archive compilation starts successfully (rat check passes etc.)
> - standalone mode, job submission and cli cancellation works. logs look
> fine
> - maven staging repository looks fine
>
> On Fri, Jul 30, 2021 at 7:30 AM Jingsong Li 
> wrote:
>
> > Hi everyone,
> >
> > Thanks Robert, I created a new one.
> >
> > all artifacts to be deployed to the Maven Central Repository [4],
> >
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1444/
> >
> > Best,
> > Jingsong
> >
> > On Thu, Jul 29, 2021 at 9:50 PM Robert Metzger 
> > wrote:
> >
> > > The difference is that the 1440 staging repository contains the Scala
> > _2.11
> > > files, the 1441 repo contains Scala 2.12. I'm not sure if this works,
> > > because things like "flink-core:1.12.5" will be released twice?
> > > I would prefer to have a single staging repository containing all
> > binaries
> > > we intend to release to maven central, to avoid complications in the
> > > release process.
> > >
> > > Since only the convenience binaries are affected by this, we don't need
> > to
> > > cancel the release. We just need to create a new staging repository.
> > >
> > >
> > > On Thu, Jul 29, 2021 at 3:36 PM Robert Metzger 
> > > wrote:
> > >
> > > > Thanks a lot for creating a release candidate!
> > > >
> > > > What is the difference between the two maven staging repos?
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1440/
> > > >  and
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1441/
> > > ?
> > > >
> > > > On Thu, Jul 29, 2021 at 1:52 PM Xingbo Huang 
> > wrote:
> > > >
> > > >> +1 (non-binding)
> > > >>
> > > >> - Verified checksums and signatures
> > > >> - Built from sources
> > > >> - Verified Python wheel package contents
> > > >> - Pip install Python wheel package in Mac
> > > >> - Run Python UDF job in Python REPL
> > > >>
> > > >> Best,
> > > >> Xingbo
> > > >>
> > > >> Zakelly Lan  于2021年7月29日周四 下午5:57写道:
> > > >>
> > > >> > +1 (non-binding)
> > > >> >
> > > >> > * Built from source.
> > > >> > * Run wordcount datastream job on yarn
> > > >> > * Web UI and checkpoint seem good.
> > > >> > * Kill a container to make job failover, everything is good.
> > > >> > * Try run job from checkpoint, everything is good.
> > > >> >
> > > >> > On Thu, Jul 29, 2021 at 2:34 PM Yun Tang 
> wrote:
> > > >> >
> > > >> > > +1 (non-binding)
> > > >> > >
> > > >> > > Checked the signature.
> > > >> > >
> > > >> > > Reviewed the PR of flink-web.
> > > >> > >
> > > >> > > Download the pre-built tar package and launched an application
> > mode
> > > >> > > standalone job successfully.
> > > >> > >
> > > >> > > Best
> > > >> > > Yun Tang
> > > >> > >
> > > >> > >
> > > >> > > 
> > > >> > > From: Jingsong Li 
> > > >> > > Sent: Tuesday, July 27, 2021 11:54
> > > >> > > To: dev 
> > > >> > > Subject: [VOTE] Release 1.12.5, release candidate #3
> > > >> > >
> > > >> > > Hi everyone,
> > > >> > >
> > > >> > > Please review and vote on the release candidate #3 for the
> version
> > > >> > 1.12.5,
> > > >> > > as follows:
> > > >> > > [ ] +1, Approve the release
> > > >> > > [ ] -1, Do not approve the release (please provide specific
> > > comments)
> > > >> > >
> > > >> > > The complete staging area is available for your review, which
> > > >> includes:
> > > >> > > * JIRA release notes [1],
> > > >> > > * the official Apache source release and binary convenience
> > releases
> > > >> to
> > > >> > be
> > > >> > > deployed to dist.apache.org [2], which are signed with the key
> > with
> > > >> > > fingerprint FBB83C0A4FFB9CA8 [3],
> > > >> > > * all artifacts to be deployed to the Maven Central Repository
> > [4],
> > > >> > > * source code tag "release-1.12.5-rc3" [5],
> > > >> > > * website pull request listing the new release and adding
> > > announcement
> > > >> > blog
> > > >> > > post [6].
> > > >> > >
> > > >> > > The vote will be open for at least 72 hours. It is adopted by
> > > majority
> > > >> > > approval, with at least 3 PMC affirmative votes.
> > > >> > >
> > > >> > > Best,
> > > >> > > Jingsong Lee
> > > >> > >
> > > >> > > [1]
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350166
> > > >> > > [2]
> > 

Re: [VOTE] Release 1.13.2, release candidate #3

2021-07-30 Thread godfrey he
+1 (non-binding)

- Checked checksums and signatures: OK
- Built from source: OK
- Checked the flink-web PR
   - found one typo in the number of fixes and improvements
- Submit some jobs from sql-client to local cluster, checked the web-ui,
cp, sp, log, etc: OK

Best,
Godfrey


Xingbo Huang  于2021年7月30日周五 下午2:51写道:

> +1 (non-binding)
>
> - Verified checksums and signatures
> - Verified Python wheel package contents
> - Pip install apache-flink-libraries source package and apache-flink wheel
> package in Mac
> - Write and Run a Simple Python UDF job in Python REPL
>
> Best,
> Xingbo
>
> Yu Li  于2021年7月30日周五 下午2:33写道:
>
> > +1 (binding)
> >
> > - Checked the diff between 1.13.1 and 1.13.2-rc3: OK (
> >
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> > )
> >   - commons-io version has been bumped to 2.8.0 through FLINK-22747 and
> all
> > NOTICE files updated correctly
> >   - guava version has been bumped to 29.0 for kinesis connector through
> > FLINK-23009 and all NOTICE files updated correctly
> > - Checked release notes: OK
> >   - minor: I've moved FLINK-23315 and FLINK-23418 out of 1.13.2 to stay
> > in accordance with the RC status
> > - Checked sums and signatures: OK
> > - Maven clean install from source: OK
> > - Checked the jars in the staging repo: OK
> > - Checked the website updates: OK
> >   - minor: left some minor comments in PR (such as RN needs update, etc.)
> > and please remember to address them before merging
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 30 Jul 2021 at 14:00, Jingsong Li 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > - Check if checksums and GPG files match the corresponding release
> files
> > > - staging repository looks fine
> > > - Start a local cluster (start-cluster.sh), logs fine
> > > - Run sql-client and run a job, looks fine
> > >
> > > I found an unexpected log in sql-client:
> > > "Searching for
> > >
> > >
> >
> '/Users/lijingsong/Downloads/tmp/flink-1.13.2/conf/sql-client-defaults.yaml'...not
> > > found"
> > > This log should be removed. I created a JIRA for this:
> > > https://issues.apache.org/jira/browse/FLINK-23552
> > > (This should not be a blocker)
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Thu, Jul 29, 2021 at 10:44 PM Robert Metzger 
> > > wrote:
> > >
> > > > Thanks a lot for creating this release candidate
> > > >
> > > > +1 (binding)
> > > >
> > > > - staging repository looks fine
> > > > - Diff to 1.13.1 looks fine wrt to dependency changes:
> > > >
> > >
> >
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> > > > - standalone mode works locally
> > > >- I found this issue, which is not specific to 1.13.2:
> > > > https://issues.apache.org/jira/browse/FLINK-23546
> > > > - src archive signature is matched; sha512 is correct
> > > >
> > > > On Thu, Jul 29, 2021 at 9:10 AM Zakelly Lan 
> > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > * Built from source.
> > > > > * Run wordcount datastream job on yarn
> > > > > * Web UI and checkpoint seem good.
> > > > > * Kill a container to make job failover, everything is good.
> > > > > * Try run job from checkpoint, everything is good.
> > > > >
> > > > > On Fri, Jul 23, 2021 at 10:04 PM Yun Tang 
> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > > Please review and vote on the release candidate #3 for the
> version
> > > > > 1.13.2,
> > > > > > as follows:
> > > > > > [ ] +1, Approve the release
> > > > > > [ ] -1, Do not approve the release (please provide specific
> > comments)
> > > > > >
> > > > > >
> > > > > > The complete staging area is available for your review, which
> > > includes:
> > > > > > * JIRA release notes [1],
> > > > > > * the official Apache source release and binary convenience
> > releases
> > > to
> > > > > be
> > > > > > deployed to dist.apache.org [2], which are signed with the key
> > with
> > > > > > fingerprint 78A306590F1081CC6794DC7F62DAD618E07CF996 [3],
> > > > > > * all artifacts to be deployed to the Maven Central Repository
> [4],
> > > > > > * source code tag "release-1.13.2-rc3" [5],
> > > > > > * website pull request listing the new release and adding
> > > announcement
> > > > > > blog post [6].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. It is adopted by
> > > majority
> > > > > > approval, with at least 3 PMC affirmative votes.
> > > > > >
> > > > > > Best,
> > > > > > Yun Tang
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218==12315522
> > > > > > [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-1.13.2-rc3/
> > > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > > [4]
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1439/
> > > > > > [5]
> > https://github.com/apache/flink/releases/tag/release-1.13.2-rc3
> > > > > > [6] https://github.com/apache/flink-web/pull/453
> > > > > >
> 

[jira] [Created] (FLINK-23566) MySQL 8.0 Public Key Retrieval is not allowed

2021-07-30 Thread MING (Jira)
MING created FLINK-23566:


 Summary: MySQL 8.0 Public Key Retrieval is not allowed
 Key: FLINK-23566
 URL: https://issues.apache.org/jira/browse/FLINK-23566
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.13.1
Reporter: MING


How can this MySQL 8.0 problem be solved?
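For context: this error is raised by MySQL Connector/J 8.x when the server uses 
the caching_sha2_password plugin and the client connects without SSL. The usual 
remedy is to enable SSL or to allow public key retrieval explicitly in the JDBC 
URL, for example:

{code:java}
jdbc:mysql://host:3306/mydb?allowPublicKeyRetrieval=true&useSSL=false{code}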



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23565) Window TVF Supports session window in runtime

2021-07-30 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23565:
--

 Summary: Window TVF Supports session window in runtime
 Key: FLINK-23565
 URL: https://issues.apache.org/jira/browse/FLINK-23565
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Runtime
Reporter: JING ZHANG






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-180: Adjust StreamStatus and Idleness definition

2021-07-30 Thread Arvid Heise
Hi Martijn,

1. Good question. The watermarks and statuses of the splits are first
aggregated before being emitted through the reader. The user's watermark
strategy is actually applied on all SourceOutputs (=splits). Since one split
is active and one is idle, the watermark of the reader will not advance until
the user-defined idleness is triggered on the idle split (see the sketch
after point 4). At this point, the combined watermark depends solely on the
active split. The combined status remains ACTIVE.
2. Kafka has no dynamic partitions. This is a complete misnomer on the Flink
side. In fact, if you search for Kafka and partition discovery, you will
only find Flink resources. What we actually do is dynamic topic discovery,
and that can only be triggered through a topic pattern afaik. We could go
for topic discovery on all patterns by default if we don't do that already.
3. Yes, idleness on assigned partitions would even work with dynamic
assignments. I will update the FLIP to reflect that.
4. Afaik it was only meant for scenario 2 (and your question 3) and it
should be this way after the FLIP. I don't know of any source
implementation that uses the user-specified idleness to handle scenario 3.
What is currently extra is that some readers go idle when the reader doesn't
have an active assignment.
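To illustrate 1., a minimal sketch of a user-defined idleness timeout
(MyEvent, env, and kafkaSource are placeholders):

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// the strategy is applied per SourceOutput (=split); a split that receives
// no records for 1 minute is marked idle and stops holding back the
// combined watermark of the reader
WatermarkStrategy<MyEvent> strategy =
    WatermarkStrategy.<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        .withIdleness(Duration.ofMinutes(1));
env.fromSource(kafkaSource, strategy, "kafka-source");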

Best,

Arvid

On Fri, Jul 30, 2021 at 12:17 PM Martijn Visser 
wrote:

> Hi all,
>
> I have a couple of questions after studying the FLIP and the docs:
>
> 1. What happens when one of the readers has two splits assigned and one of
> the splits actually receives data?
>
> 2. If I understand it correctly the Kinesis Source uses dynamic shard
> discovery by default (so in case of idleness scenario 3 would happen there)
> and the FileSource also has a dynamic assignment. The Kafka Source doesn't
> use dynamic partition discovery by default (so scenario 2 would be the
> default to happen there). Why did we choose to not enable dynamic partition
> discovery by default and should we actually change that?
>
> 3. To be sure, is it correct that in case of a dynamic assignment and there
> is temporarily no data, that scenario 2 is applicable?
>
> 4. Does WatermarkStrategy#withIdleness currently cover scenario 2, 3 and
> the one from my 3rd question? (edited)
>
> Best regards,
>
> Martijn
>
> On Fri, 23 Jul 2021 at 15:57, Till Rohrmann  wrote:
>
> > Hi everyone,
> >
> > I would be in favour of what Arvid said about not exposing the
> > WatermarkStatus to the Sink. Unless there is a very strong argument that
> > this is required I think that keeping this concept internal seems to me
> the
> > better choice right now. Moreover, as Arvid said the downstream
> application
> > can derive the WatermarkStatus on their own depending on its business
> > logic.
> >
> > Cheers,
> > Till
> >
> > On Fri, Jul 23, 2021 at 2:15 PM Arvid Heise  wrote:
> >
> > > Hi Eron,
> > >
> > > thank you very much for your feedback.
> > >
> > > Please mention that the "temporary status toggle" code will be removed.
> > > >
> > > This code is already removed but there is still some automation of
> going
> > > idle when temporarily no splits are assigned. I will include it in the
> > FLIP.
> > >
> > > I agree with adding the markActive() functionality, for symmetry.
> > Speaking
> > > > of symmetry, could we now include the minor enhancement we discussed
> in
> > > > FLIP-167, the exposure of watermark status changes on the Sink
> > interface.
> > > > I drafted a PR and would be happy to revisit it.
> > > >
> > > >
> > >
> >
> https://github.com/streamnative/flink/pull/2/files#diff-64d9c652ffc3c60b6d838200a24b106eeeda4b2d853deae94dbbdf16d8d694c2R62-R70
> > >
> > > I'm still not sure if that's a good idea.
> > >
> > > If we have now refined idleness to be a user-specified,
> > > application-specific way to handle temporarily stalled partitions,
> > > then downstream applications should actually implement their own
> idleness
> > > definition. Let's see what other devs think. I'm pinging the ones that
> > have
> > > been most involved in the discussion: @Stephan Ewen 
> > > @Till
> > > Rohrmann  @Dawid Wysakowicz <
> > dwysakow...@apache.org>
> > > .
> > >
> > > The flip mentions a 'watermarkstatus' package for the WatermarkStatus
> > > > class.  Should it be 'eventtime' package?
> > > >
> > > Are you proposing org.apache.flink.api.common.eventtime? I was simply
> > > suggesting to rename
> > > org.apache.flink.streaming.runtime.streamstatus but I'm very open for
> > other
> > > suggestions (given that there are only 2 classes in the package).
> > >
> > >
> > > > Regarding the change of 'streamStatus' to 'watermarkStatus', could
> you
> > > > spell out what the new method names will be on each interface? May I
> > > > suggest that Input.emitStreamStatus be Input.processStreamStatus?
> This
> > > is
> > > > to help decouple the input's watermark status from the output's
> > watermark
> > > > status.
> > > >
> > > I haven't found
> > > 

Re: [DISCUSS] FLIP-177: Extend Sink API

2021-07-30 Thread Hausmann, Steffen
Hey Guowei,

there is one additional aspect I want to highlight that is relevant for the 
types of destinations we had in mind when designing the AsyncSink.

I'll again use Kinesis as an example, but the same argument applies to other 
destinations. We are using the PutRecords API to persist up to 500 events with 
a single API call, to reduce the overhead compared to individual calls per 
event. But not all of the 500 events may be persisted successfully, e.g., a 
single event may fail to be persisted due to server-side throttling. With the 
MailboxExecutor-based implementation, we can just add this event back to the 
internal queue. The event will then be retried with the next batch of up to 500 events.
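To make that concrete, here is a rough sketch of the re-queueing logic (AWS 
SDK v1 names; the queue field and the batching around it are assumed):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.model.PutRecordsRequest;
import com.amazonaws.services.kinesis.model.PutRecordsRequestEntry;
import com.amazonaws.services.kinesis.model.PutRecordsResult;
import com.amazonaws.services.kinesis.model.PutRecordsResultEntry;

class PutRecordsRetrySketch {
    private final Deque<PutRecordsRequestEntry> queue = new ArrayDeque<>();

    void flush(AmazonKinesis kinesis, PutRecordsRequest request) {
        PutRecordsResult result = kinesis.putRecords(request);
        if (result.getFailedRecordCount() > 0) {
            List<PutRecordsRequestEntry> sent = request.getRecords();
            List<PutRecordsResultEntry> statuses = result.getRecords();
            for (int i = 0; i < statuses.size(); i++) {
                // a non-null error code marks an entry that was rejected,
                // e.g. due to server-side throttling
                if (statuses.get(i).getErrorCode() != null) {
                    // back to the head of the queue; retried with the next
                    // batch of up to 500 records instead of a separate call
                    queue.addFirst(sent.get(i));
                }
            }
        }
    }
}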

In my understanding, that's not possible with the AsyncIO-based approach. 
During a retry, we can only retry the failed events of the original batch, 
which means we would need to send a single event with a separate PutRecords 
call. Depending on how often that happens, this could add up.

Does that make sense?

Cheers, Steffen


On 30.07.21, 05:51, "Guowei Ma"  wrote:


Hi, Arvid & Piotr
Sorry for the late reply.
1. Thank you all very much for your patience and explanation. Recently, I
have also studied the related code of 'MailBox', which may not be as
serious as I thought, after all, it is very similar to Java's `Executor`;
2. Whether to use AsyncIO or MailBox to implement Kinesis connector is more
up to the contributor to decide (after all, `Mailbox` has decided to be
exposed :-) ). It’s just that I personally prefer to combine some simple
functions to complete a more advanced function.
Best,
Guowei


On Sat, Jul 24, 2021 at 3:38 PM Arvid Heise  wrote:

> Just to reiterate on Piotr's point: MailboxExecutor is pretty much an
> Executor [1] with named lambdas, except for the name MailboxExecutor
> nothing is hinting at a specific threading model.
>
> Currently, we expose it on StreamOperator API. Afaik the idea is to make
> the StreamOperator internal and beef up ProcessFunction but for several 
use
> cases (e.g., AsyncIO), we actually need to expose the executor anyways.
>
> We could rename MailboxExecutor to avoid exposing the name of the 
threading
> model. For example, we could rename it to TaskThreadExecutor (but that's
> pretty much the same), to CooperativeExecutor (again implies Mailbox), to
> o.a.f.Executor, to DeferredExecutor... Ideas are welcome.
>
> We could also simply use Java's Executor interface, however, when working
> with that interface, I found that the missing context of async executed
> lambdas made debugging much much harder. So that's why I designed
> MailboxExecutor to force the user to give some debug string to each
> invokation. In the sink context, we could, however, use an adaptor from
> MailboxExecutor to Java's Executor and simply bind the sink name to the
> invokations.
>
> [1]
>
> 
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executor.html
>
> On Fri, Jul 23, 2021 at 5:36 PM Piotr Nowojski 
> wrote:
>
> > Hi,
> >
> > Regarding the question whether to expose the MailboxExecutor or not:
> > 1. We have plans on exposing it in the ProcessFunction (in short we want
> to
> > make StreamOperator API private/internal only, and move all of it's 
extra
> > functionality in one way or another to the ProcessFunction). I don't
> > remember and I'm not sure if *Dawid* had a different idea about this (do
> > not expose Mailbox but wrap it somehow?)
> > 2. If we provide a thin wrapper around MailboxExecutor, I'm not sure how
> > helpful it will be for keeping backward compatibility in the future.
> > `MailboxExecutor` is already a very generic interface that doesn't 
expose
> > much about the current threading model. Note that the previous threading
> > model (multi threaded with checkpoint lock), should be easy to implement
> > using the `MailboxExecutor` interface (use a direct executor that
> acquires
> > checkpoint lock).
> >
> > Having said that, I haven't spent too much time thinking about whether
> it's
> > better to enrich AsyncIO or provide the AsyncSink. If we can just as
> > efficiently provide the same functionality using the existing/enhanced
> > AsyncIO API, that may be a good idea if it indeed reduces our
> > maintenance costs.
> >
> > Piotrek
> >
> > pt., 23 lip 2021 o 12:55 Guowei Ma  napisał(a):
> >
> > > Hi, Arvid
> > >
> > > >>>The main question here is what do you think is the harm of exposing
> > > Mailbox? Is it the complexity or the maintenance overhead?
> > >
> > > I think 

Re: [DISCUSS] FLIP-177: Extend Sink API

2021-07-30 Thread Arvid Heise
Hi Guowei, hi all,

The main drawback of the AsyncIO approach is the decreased flexibility. In
particular, as you mentioned for the advanced backpressure use cases, you
would need to chain several AsyncIOs:

>>>But whether a sink is overloaded not only depends on the queue size. It
> also depends on the number of in-flight async requests
> 1. How about chaining two AsyncIOs? One is for controlling the size of the
> buffer elements; The other is for controlling the in-flight async requests.
>

If we need an AsyncIO for each dimension of backpressure, we also might end
up with an incompatible state when a dimension is added or removed through
a configuration change.

With that being said, I'd like to start a vote on the proposal as your
strong objection disappeared. We can continue the discussion here but I'd
also appreciate any vote on [1].

[1]
https://lists.apache.org/thread.html/r7194846ec671e9e0e64908a7ae4cf32c2bccf1dd6ee7db107a52cf04%40%3Cdev.flink.apache.org%3E
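For reference, the MailboxExecutor-to-Executor adaptor mentioned in the quoted
mail below could look roughly like this (a sketch assuming the
execute(ThrowingRunnable, String, Object...) signature of
org.apache.flink.api.common.operators.MailboxExecutor; the class name is made
up):

import java.util.concurrent.Executor;
import org.apache.flink.api.common.operators.MailboxExecutor;

class SinkExecutorAdapter implements Executor {
    private final MailboxExecutor mailbox;
    private final String sinkName;

    SinkExecutorAdapter(MailboxExecutor mailbox, String sinkName) {
        this.mailbox = mailbox;
        this.sinkName = sinkName;
    }

    @Override
    public void execute(Runnable command) {
        // every enqueued action carries the sink name as its debug description
        mailbox.execute(command::run, "async action of sink %s", sinkName);
    }
}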

On Fri, Jul 30, 2021 at 5:51 AM Guowei Ma  wrote:

> Hi, Arvid & Piotr
> Sorry for the late reply.
> 1. Thank you all very much for your patience and explanation. Recently, I
> have also studied the related code of 'MailBox', which may not be as
> serious as I thought, after all, it is very similar to Java's `Executor`;
> 2. Whether to use AsyncIO or MailBox to implement Kinesis connector is more
> up to the contributor to decide (after all, `Mailbox` has decided to be
> exposed :-) ). It’s just that I personally prefer to combine some simple
> functions to complete a more advanced function.
> Best,
> Guowei
>
>
> On Sat, Jul 24, 2021 at 3:38 PM Arvid Heise  wrote:
>
> > Just to reiterate on Piotr's point: MailboxExecutor is pretty much an
> > Executor [1] with named lambdas, except for the name MailboxExecutor
> > nothing is hinting at a specific threading model.
> >
> > Currently, we expose it on StreamOperator API. Afaik the idea is to make
> > the StreamOperator internal and beef up ProcessFunction but for several
> use
> > cases (e.g., AsyncIO), we actually need to expose the executor anyways.
> >
> > We could rename MailboxExecutor to avoid exposing the name of the
> threading
> > model. For example, we could rename it to TaskThreadExecutor (but that's
> > pretty much the same), to CooperativeExecutor (again implies Mailbox), to
> > o.a.f.Executor, to DeferredExecutor... Ideas are welcome.
> >
> > We could also simply use Java's Executor interface, however, when working
> > with that interface, I found that the missing context of async executed
> > lambdas made debugging much much harder. So that's why I designed
> > MailboxExecutor to force the user to give some debug string to each
> > invocation. In the sink context, we could, however, use an adaptor from
> > MailboxExecutor to Java's Executor and simply bind the sink name to the
> > invocations.
> >
> > [1]
> >
> >
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executor.html
> >
> > On Fri, Jul 23, 2021 at 5:36 PM Piotr Nowojski 
> > wrote:
> >
> > > Hi,
> > >
> > > Regarding the question whether to expose the MailboxExecutor or not:
> > > 1. We have plans on exposing it in the ProcessFunction (in short we
> want
> > to
> > > make StreamOperator API private/internal only, and move all of it's
> extra
> > > functionality in one way or another to the ProcessFunction). I don't
> > > remember and I'm not sure if *Dawid* had a different idea about this
> (do
> > > not expose Mailbox but wrap it somehow?)
> > > 2. If we provide a thin wrapper around MailboxExecutor, I'm not sure
> how
> > > helpful it will be for keeping backward compatibility in the future.
> > > `MailboxExecutor` is already a very generic interface that doesn't
> expose
> > > much about the current threading model. Note that the previous
> threading
> > > model (multi threaded with checkpoint lock), should be easy to
> implement
> > > using the `MailboxExecutor` interface (use a direct executor that
> > acquires
> > > checkpoint lock).
> > >
> > > Having said that, I haven't spent too much time thinking about whether
> > it's
> > > better to enrich AsyncIO or provide the AsyncSink. If we can just as
> > > efficiently provide the same functionality using the existing/enhanced
> > > AsyncIO API, that may be a good idea if it indeed reduces our
> > > maintenance costs.
> > >
> > > Piotrek
> > >
> > > pt., 23 lip 2021 o 12:55 Guowei Ma  napisał(a):
> > >
> > > > Hi, Arvid
> > > >
> > > > >>>The main question here is what do you think is the harm of
> exposing
> > > > Mailbox? Is it the complexity or the maintenance overhead?
> > > >
> > > > I think that exposing the internal threading model might be risky. In
> > > case
> > > > the threading model changes, it will affect the user's api and bring
> > the
> > > > burden of internal modification. (Of course, you may have more say in
> > how
> > > > the MailBox model will develop in the future) Therefore, I think that
> > if
> > > an
> > > > alternative 

[VOTE] FLIP-177: Extend Sink API

2021-07-30 Thread Arvid Heise
Hi all,

I'd like to start a vote on FLIP-177: Extend Sink API [1] which provides
small extensions to the Sink API introduced through FLIP-143.
The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

Note that the FLIP was larger initially and I cut down all
advanced/breaking changes.

Best,

Arvid

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API


[jira] [Created] (FLINK-23564) Make taskmanager.out and taskmanager.err rollable

2021-07-30 Thread zlzhang0122 (Jira)
zlzhang0122 created FLINK-23564:
---

 Summary: Make taskmanager.out and taskmanager.err rollable
 Key: FLINK-23564
 URL: https://issues.apache.org/jira/browse/FLINK-23564
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.13.1
Reporter: zlzhang0122
 Fix For: 1.14.0


Currently, users can write to taskmanager.out and taskmanager.err as much as 
they want via System.out.print/println, System.err.print/println, or 
e.printStackTrace(). This may consume a large amount of disk space, cause disk 
problems, and affect Flink's checkpointing and even the stability of Flink or 
of other applications on the same node. I propose to make taskmanager.out and 
taskmanager.err rollable, just like taskmanager.log. By doing this, the disk 
consumption of taskmanager.out and taskmanager.err can be controlled.
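For comparison, taskmanager.log is already rolled through the log4j 
configuration shipped with Flink, roughly like this (abbreviated from the 
default log4j.properties from memory; exact keys may differ between versions):

{code:java}
appender.main.type = RollingFile
appender.main.fileName = ${sys:log.file}
appender.main.filePattern = ${sys:log.file}.%i
appender.main.policies.size.type = SizeBasedTriggeringPolicy
appender.main.policies.size.size = 100MB
appender.main.strategy.type = DefaultRolloverStrategy
appender.main.strategy.max = 10{code}

taskmanager.out and taskmanager.err, in contrast, are plain shell redirections 
of stdout/stderr, which is why the log4j settings do not apply to them.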



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23563) Sometimes ‘Stop’ cannot stop the job

2021-07-30 Thread Han Yin (Jira)
Han Yin created FLINK-23563:
---

 Summary: Sometimes ‘Stop’ cannot stop the job
 Key: FLINK-23563
 URL: https://issues.apache.org/jira/browse/FLINK-23563
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Han Yin


Sometimes the 'Stop' command does not stop the job after the savepoint is 
finished.

This is because we currently set _syncSavepointId_ to null whenever we 
abort/complete a checkpoint, even if the aborted/completed checkpoint is not 
the latest one.

In some rare cases, it is possible that during a 'Stop' process we trigger a 
savepoint, and then _syncSavepointId_ is set to null because a previous 
checkpoint was aborted. As a result, the subtasks are not stopped after 
completing the savepoint.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23562) Update CI docker image to latest java version (1.8.0_292)

2021-07-30 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-23562:
--

 Summary: Update CI docker image to latest java version (1.8.0_292)
 Key: FLINK-23562
 URL: https://issues.apache.org/jira/browse/FLINK-23562
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System / Azure Pipelines
Reporter: Robert Metzger
 Fix For: 1.14.0


The java version we are using on our CI is outdated (1.8.0_282 vs 1.8.0_292). 
The latest java version has TLSv1 disabled, which makes the 
KubernetesClusterDescriptorTest fail.

This will be fixed by FLINK-22802.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23561) Detail the container completed message

2021-07-30 Thread zlzhang0122 (Jira)
zlzhang0122 created FLINK-23561:
---

 Summary: Detail the container completed message
 Key: FLINK-23561
 URL: https://issues.apache.org/jira/browse/FLINK-23561
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Affects Versions: 1.13.1
Reporter: zlzhang0122
 Fix For: 1.14.0


Use the ContainerStatus to detail the reason why a container completed, so 
that users can see explicitly why the container completed.
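For illustration, the information YARN already provides for a completed 
container (a sketch using Hadoop's ContainerStatus; the helper itself is 
hypothetical):

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerStatus;

class ContainerCompletionLogger {
    // exit status and diagnostics together explain why YARN completed the container
    static String describe(ContainerStatus status) {
        return String.format(
                "Container %s completed with exit status %d: %s",
                status.getContainerId(),
                status.getExitStatus(),
                status.getDiagnostics());
    }
}{code}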



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-179: Expose Standardized Operator Metrics

2021-07-30 Thread Arvid Heise
Hi everyone,

I started the voting thread [1]. Please cast your vote there or ask
additional questions here.

Best,

Arvid

[1]
https://lists.apache.org/thread.html/r70d321b6aa62ab4e31c8b73552b2de7846c4d31ed6f08d6541a9b36e%40%3Cdev.flink.apache.org%3E

On Fri, Jul 30, 2021 at 10:46 AM Becket Qin  wrote:

> Hi Arvid,
>
> I think it is OK to leave eventTimeFetchLag out of the scope of this FLIP
> given that it may involve additional API changes.
>
> 5. RecordMetadata is currently not simplifying any code. By the current
> > design RecordMetadata is a read-only data structure that is constant for
> > all records in a batch. So in Kafka, we still need to pass Tuple3 because
> > offset and timestamp are per record.
>
> Does this depend on whether we will get the RecordMetadata per record or
> per batch? We can make the semantic of RecordsWithSplitIds.metadata() to be
> the metadata associated with the last record returned by
> RecordsWithSplitIds.nextRecordsFromSplit(). In this case, individual
> implementations can decide whether to return different metadata for each
> record or not. In case of Kafka, the Tuple3 can be replaced with three
> lists of records, timestamps and offsets respectively. It probably saves
> some object instantiation, assuming the RecordMetadata object itself can be
> reused.
>
> 6. We might rename and change the semantics into
>
> public interface RecordsWithSplitIds {
> > /**
> >  * Returns the record metadata. The metadata is shared for all
> > records in the current split.
> >  */
> > @Nullable
> > default RecordMetadata metadataOfCurrentSplit() {
> > return null;
> > }
> > ...
> > }
>
> Maybe we can move one step further to make it "metadataOfCurrentRecord()"
> as I mentioned above.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Jul 30, 2021 at 3:00 PM Arvid Heise  wrote:
>
> > Hi folks,
> >
> > To move on with the FLIP, I will cut out eventTimeFetchLag out of scope
> and
> > go ahead with the remainder.
> >
> > I will open a VOTE later to today.
> >
> > Best,
> >
> > Arvid
> >
> > On Wed, Jul 28, 2021 at 8:44 AM Arvid Heise  wrote:
> >
> > > Hi Becket,
> > >
> > > I have updated the PR according to your suggestion (note that this
> commit
> > > contains the removal of the previous approach) [1]. Here are my
> > > observations:
> > > 1. Adding the type of RecordMetadata to emitRecord would require adding
> > > another type parameter to RecordEmitter and SourceReaderBase. So I left
> > > that out for now as it would break things completely.
> > > 2. RecordEmitter implementations that want to pass it to SourceOutput
> > need
> > > to be changed in a boilerplate fashion. (passing the metadata to the
> > > SourceOutput)
> > > 3. RecordMetadata as an interface (as in the commit) probably requires
> > > boilerplate implementations in using sources as well.
> > > 4. SourceOutput would also require an additional collect
> > >
> > > default void collect(T record, RecordMetadata metadata) {
> > > collect(record, TimestampAssigner.NO_TIMESTAMP, metadata);
> > > }
> > >
> > > 5. RecordMetadata is currently not simplifying any code. By the current
> > > design RecordMetadata is a read-only data structure that is constant
> for
> > > all records in a batch. So in Kafka, we still need to pass Tuple3
> because
> > > offset and timestamp are per record.
> > > 6. RecordMetadata is currently the same for all splits in
> > > RecordsWithSplitIds.
> > >
> > > Some ideas for the above points:
> > > 3. We should accompany it with a default implementation to avoid the
> > trivial
> > > POJO implementations as the KafkaRecordMetadata of my commit. Can we
> skip
> > > the interface and just have RecordMetadata as a base class?
> > > 1.,2.,4. We could also set the metadata only once in an orthogonal
> method
> > > that need to be called before collect like
> > SourceOutput#setRecordMetadata.
> > > Then we can implement it entirely in SourceReaderBase without changing
> > any
> > > code. The clear downside is that it introduces some implicit state in
> > > SourceOutput (which we implement) and is harder to use in
> > > non-SourceReaderBase classes: Source devs need to remember to call
> > > setRecordMetadata before collect for a respective record.
> > > 6. We might rename and change the semantics into
> > >
> > > public interface RecordsWithSplitIds {
> > > /**
> > >  * Returns the record metadata. The metadata is shared for all
> > records in the current split.
> > >  */
> > > @Nullable
> > > default RecordMetadata metadataOfCurrentSplit() {
> > > return null;
> > > }
> > > ...
> > > }
> > >
> > >
> > > Re global variable
> > >
> > >> To explain a bit more on the metric being a global variable, I think
> in
> > >> general there are two ways to pass a value from one code block to
> > another.
> > >> The first way is direct passing. That means the variable is explicitly
> > >> passed from one code block to another via arguments, be them in the

[VOTE] FLIP-179: Expose Standardized Operator Metrics

2021-07-30 Thread Arvid Heise
Dear devs,

I'd like to open a vote on FLIP-179: Expose Standardized Operator Metrics
[1] which was discussed in this thread [2].
The vote will be open for at least 72 hours unless there is an objection
or not enough votes.

The proposal excludes the implementation for the currentFetchEventTimeLag
metric, which caused a bit of discussion without a clear convergence. We
will implement that metric in a generic way at a later point and encourage
sources to implement it themselves in the meantime.

Best,

Arvid

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics
[2]
https://lists.apache.org/thread.html/r856920cbfe6a262b521109c5bdb9e904e00a9b3f1825901759c24d85%40%3Cdev.flink.apache.org%3E


Re: [DISCUSS] FLIP-180: Adjust StreamStatus and Idleness definition

2021-07-30 Thread Martijn Visser
Hi all,

I have a couple of questions after studying the FLIP and the docs:

1. What happens when one of the readers has two splits assigned and one of
the splits actually receives data?

2. If I understand it correctly the Kinesis Source uses dynamic shard
discovery by default (so in case of idleness scenario 3 would happen there)
and the FileSource also has a dynamic assignment. The Kafka Source doesn't
use dynamic partition discovery by default (so scenario 2 would be the
default to happen there). Why did we choose to not enable dynamic partition
discovery by default and should we actually change that?

3. To be sure, is it correct that in case of a dynamic assignment and there
is temporarily no data, that scenario 2 is applicable?

4. Does WatermarkStrategy#withIdleness currently cover scenario 2, 3 and
the one from my 3rd question? (edited)

Best regards,

Martijn

On Fri, 23 Jul 2021 at 15:57, Till Rohrmann  wrote:

> Hi everyone,
>
> I would be in favour of what Arvid said about not exposing the
> WatermarkStatus to the Sink. Unless there is a very strong argument that
> this is required I think that keeping this concept internal seems to me the
> better choice right now. Moreover, as Arvid said the downstream application
> can derive the WatermarkStatus on their own depending on its business
> logic.
>
> Cheers,
> Till
>
> On Fri, Jul 23, 2021 at 2:15 PM Arvid Heise  wrote:
>
> > Hi Eron,
> >
> > thank you very much for your feedback.
> >
> > Please mention that the "temporary status toggle" code will be removed.
> > >
> > This code is already removed but there is still some automation of going
> > idle when temporarily no splits are assigned. I will include it in the
> FLIP.
> >
> > I agree with adding the markActive() functionality, for symmetry.
> Speaking
> > > of symmetry, could we now include the minor enhancement we discussed in
> > > FLIP-167, the exposure of watermark status changes on the Sink
> interface.
> > > I drafted a PR and would be happy to revisit it.
> > >
> > >
> >
> https://github.com/streamnative/flink/pull/2/files#diff-64d9c652ffc3c60b6d838200a24b106eeeda4b2d853deae94dbbdf16d8d694c2R62-R70
> >
> > I'm still not sure if that's a good idea.
> >
> > If we have now refined idleness to be a user-specified,
> > application-specific way to handle temporarily stalled partitions,
> > then downstream applications should actually implement their own idleness
> > definition. Let's see what other devs think. I'm pinging the ones that
> have
> > been most involved in the discussion: @Stephan Ewen 
> > @Till
> > Rohrmann  @Dawid Wysakowicz <
> dwysakow...@apache.org>
> > .
> >
> > The flip mentions a 'watermarkstatus' package for the WatermarkStatus
> > > class.  Should it be 'eventtime' package?
> > >
> > Are you proposing org.apache.flink.api.common.eventtime? I was simply
> > suggesting to rename
> > org.apache.flink.streaming.runtime.streamstatus but I'm very open for
> other
> > suggestions (given that there are only 2 classes in the package).
> >
> >
> > > Regarding the change of 'streamStatus' to 'watermarkStatus', could you
> > > spell out what the new method names will be on each interface? May I
> > > suggest that Input.emitStreamStatus be Input.processStreamStatus?  This
> > is
> > > to help decouple the input's watermark status from the output's
> watermark
> > > status.
> > >
> > I haven't found
> > org.apache.flink.streaming.api.operators.Input#emitStreamStatus in
> master.
> > Could you double-check if I'm looking at the correct class?
> >
> > The current idea was mainly to grep+replace
> /streamStatus/watermarkStatus/
> > and /StreamStatus/WatermarkStatus/. But again I'm very open for more
> > descriptive names. I can add an explicit list later. I'm assuming you are
> > only interested in (semi-)public classes.
> >
> >
> > > I observe that AbstractStreamOperator is hardcoded to derive the output
> > > channel's status from the input channel's status.  May I suggest
> > > we refactor
> "AbstractStreamOperator::emitStreamStatus(StreamStatus,int)"
> > to
> > > allow for the operator subclass to customize the processing of the
> > > aggregated watermark and watermark status.
> > >
> > Can you add a motivation for that?
> > @Dawid Wysakowicz  , I think you are the last
> > person that touched the code. Do you have some example operators in your
> > head that would change it?
> >
> > Maybe the FLIP should spell out the expected behavior of the generic
> > > watermark generator (TimestampsAndWatermarksOperator).  Should the
> > > generator ignore the upstream idleness signal?  I believe it propagates
> > the
> > > signal, even though it also generates its own signals.   Given that
> > > source-based and generic watermark generation shouldn't be combined,
> one
> > > could argue that the generic watermark generator should activate only
> > when
> > > its input channel's watermark status is idle.
> > >
> > I will add a section. In general, we assume that we only have
> 

[jira] [Created] (FLINK-23560) Performance regression on 29.07.2021

2021-07-30 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-23560:
--

 Summary: Performance regression on 29.07.2021
 Key: FLINK-23560
 URL: https://issues.apache.org/jira/browse/FLINK-23560
 Project: Flink
  Issue Type: Bug
  Components: Benchmarks
Affects Versions: 1.14.0
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.14.0


http://codespeed.dak8s.net:8000/timeline/?ben=remoteFilePartition=2
http://codespeed.dak8s.net:8000/timeline/?ben=uncompressedMmapPartition=2
http://codespeed.dak8s.net:8000/timeline/?ben=compressedFilePartition=2
http://codespeed.dak8s.net:8000/timeline/?ben=tupleKeyBy=2
http://codespeed.dak8s.net:8000/timeline/?ben=arrayKeyBy=2
http://codespeed.dak8s.net:8000/timeline/?ben=uncompressedFilePartition=2
http://codespeed.dak8s.net:8000/timeline/?ben=sortedTwoInput=2
http://codespeed.dak8s.net:8000/timeline/?ben=sortedMultiInput=2
http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
(And potentially other benchmarks)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-179: Expose Standardized Operator Metrics

2021-07-30 Thread Becket Qin
Hi Arvid,

I think it is OK to leave eventTimeFetchLag out of the scope of this FLIP
given that it may involve additional API changes.

5. RecordMetadata is currently not simplifying any code. By the current
> design RecordMetadata is a read-only data structure that is constant for
> all records in a batch. So in Kafka, we still need to pass Tuple3 because
> offset and timestamp are per record.

Does this depend on whether we will get the RecordMetadata per record or
per batch? We can make the semantic of RecordsWithSplitIds.metadata() to be
the metadata associated with the last record returned by
RecordsWithSplitIds.nextRecordsFromSplit(). In this case, individual
implementations can decide whether to return different metadata for each
record or not. In case of Kafka, the Tuple3 can be replaced with three
lists of records, timestamps and offsets respectively. It probably saves
some object instantiation, assuming the RecordMetadata object itself can be
reused.

6. We might rename and change the semantics into

public interface RecordsWithSplitIds {
> /**
>  * Returns the record metadata. The metadata is shared for all
> records in the current split.
>  */
> @Nullable
> default RecordMetadata metadataOfCurrentSplit() {
> return null;
> }
> ...
> }

Maybe we can move one step further to make it "metadataOfCurrentRecord()"
as I mentioned above.
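A sketch of what consuming code might look like under those semantics (all
names are proposals from this thread, not committed API):

// reader loop; metadataOfCurrentRecord() would return the metadata
// associated with the record obtained by the preceding call
E record;
while ((record = records.nextRecordFromSplit()) != null) {
    RecordMetadata meta = records.metadataOfCurrentRecord();
    output.collect(deserialize(record), meta);
}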

Thanks,

Jiangjie (Becket) Qin

On Fri, Jul 30, 2021 at 3:00 PM Arvid Heise  wrote:

> Hi folks,
>
> To move on with the FLIP, I will cut out eventTimeFetchLag out of scope and
> go ahead with the remainder.
>
> I will open a VOTE later to today.
>
> Best,
>
> Arvid
>
> On Wed, Jul 28, 2021 at 8:44 AM Arvid Heise  wrote:
>
> > Hi Becket,
> >
> > I have updated the PR according to your suggestion (note that this commit
> > contains the removal of the previous approach) [1]. Here are my
> > observations:
> > 1. Adding the type of RecordMetadata to emitRecord would require adding
> > another type parameter to RecordEmitter and SourceReaderBase. So I left
> > that out for now as it would break things completely.
> > 2. RecordEmitter implementations that want to pass it to SourceOutput
> need
> > to be changed in a boilerplate fashion. (passing the metadata to the
> > SourceOutput)
> > 3. RecordMetadata as an interface (as in the commit) probably requires
> > boilerplate implementations in using sources as well.
> > 4. SourceOutput would also require an additional collect
> >
> > default void collect(T record, RecordMetadata metadata) {
> > collect(record, TimestampAssigner.NO_TIMESTAMP, metadata);
> > }
> >
> > 5. RecordMetadata is currently not simplifying any code. By the current
> > design RecordMetadata is a read-only data structure that is constant for
> > all records in a batch. So in Kafka, we still need to pass Tuple3 because
> > offset and timestamp are per record.
> > 6. RecordMetadata is currently the same for all splits in
> > RecordsWithSplitIds.
> >
> > Some ideas for the above points:
> > 3. We should accompany it with a default implementation to avoid the
> trivial
> > POJO implementations as the KafkaRecordMetadata of my commit. Can we skip
> > the interface and just have RecordMetadata as a base class?
> > 1.,2.,4. We could also set the metadata only once in an orthogonal method
> > that needs to be called before collect like
> SourceOutput#setRecordMetadata.
> > Then we can implement it entirely in SourceReaderBase without changing
> any
> > code. The clear downside is that it introduces some implicit state in
> > SourceOutput (which we implement) and is harder to use in
> > non-SourceReaderBase classes: Source devs need to remember to call
> > setRecordMetadata before collect for a respective record.
> > 6. We might rename and change the semantics into
> >
> > public interface RecordsWithSplitIds {
> > /**
> >  * Returns the record metadata. The metadata is shared for all
> records in the current split.
> >  */
> > @Nullable
> > default RecordMetadata metadataOfCurrentSplit() {
> > return null;
> > }
> > ...
> > }
> >
> >
> > Re global variable
> >
> >> To explain a bit more on the metric being a global variable, I think in
> >> general there are two ways to pass a value from one code block to
> another.
> >> The first way is direct passing. That means the variable is explicitly
> >> passed from one code block to another via arguments, be they in the
> >> constructor or methods. Another way is indirect passing through context,
> >> that means the information is stored in some kind of context or
> >> environment, and everyone can have access to it. And there is no
> explicit
> >> value passing from one code block to another because everyone just reads
> >> from/writes to the context or environment. This is basically the "global
> >> variable" pattern I am talking about.
> >>
> >> In general people would avoid having a mutable global value shared
> across
> >> code blocks, because it is 

[jira] [Created] (FLINK-23559) Enable periodic materialisation in tests

2021-07-30 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-23559:
--

 Summary: Enable periodic materialisation in tests
 Key: FLINK-23559
 URL: https://issues.apache.org/jira/browse/FLINK-23559
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan
 Fix For: 1.14.0


FLINK-21448 adds the capability (test randomization), but it can't be turned on 
yet as there are some test failures: FLINK-23276, FLINK-23277, FLINK-23278 (it 
should be enabled after those bugs are fixed).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.12.5, release candidate #3

2021-07-30 Thread Robert Metzger
Thanks a lot for providing the new staging repository. I dropped the 1440
and 1441 staging repositories to avoid other RC reviewers accidentally
looking into them, or us accidentally releasing them.

+1 (binding)

Checks:
- I didn't find any additional issues in the release announcement
- the pgp signatures on the source archive seem fine
- source archive compilation starts successfully (rat check passes etc.)
- standalone mode, job submission and cli cancellation works. logs look fine
- maven staging repository looks fine

On Fri, Jul 30, 2021 at 7:30 AM Jingsong Li  wrote:

> Hi everyone,
>
> Thanks Robert, I created a new one.
>
> all artifacts to be deployed to the Maven Central Repository [4],
>
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1444/
>
> Best,
> Jingsong
>
> On Thu, Jul 29, 2021 at 9:50 PM Robert Metzger 
> wrote:
>
> > The difference is that the 1440 staging repository contains the Scala
> _2.11
> > files, the 1441 repo contains Scala 2.12. I'm not sure if this works,
> > because things like "flink-core:1.12.5" will be released twice?
> > I would prefer to have a single staging repository containing all
> binaries
> > we intend to release to maven central, to avoid complications in the
> > release process.
> >
> > Since only the convenience binaries are affected by this, we don't need
> to
> > cancel the release. We just need to create a new staging repository.
> >
> >
> > On Thu, Jul 29, 2021 at 3:36 PM Robert Metzger 
> > wrote:
> >
> > > Thanks a lot for creating a release candidate!
> > >
> > > What is the difference between the two maven staging repos?
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1440/
> > >  and
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1441/
> > ?
> > >
> > > On Thu, Jul 29, 2021 at 1:52 PM Xingbo Huang 
> wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> - Verified checksums and signatures
> > >> - Built from sources
> > >> - Verified Python wheel package contents
> > >> - Pip install Python wheel package in Mac
> > >> - Run Python UDF job in Python REPL
> > >>
> > >> Best,
> > >> Xingbo
> > >>
> > >> Zakelly Lan  于2021年7月29日周四 下午5:57写道:
> > >>
> > >> > +1 (non-binding)
> > >> >
> > >> > * Built from source.
> > >> > * Run wordcount datastream job on yarn
> > >> > * Web UI and checkpoint seem good.
> > >> > * Kill a container to make job failover, everything is good.
> > >> > * Try run job from checkpoint, everything is good.
> > >> >
> > >> > On Thu, Jul 29, 2021 at 2:34 PM Yun Tang  wrote:
> > >> >
> > >> > > +1 (non-binding)
> > >> > >
> > >> > > Checked the signature.
> > >> > >
> > >> > > Reviewed the PR of flink-web.
> > >> > >
> > >> > > Download the pre-built tar package and launched an application
> mode
> > >> > > standalone job successfully.
> > >> > >
> > >> > > Best
> > >> > > Yun Tang
> > >> > >
> > >> > >
> > >> > > 
> > >> > > From: Jingsong Li 
> > >> > > Sent: Tuesday, July 27, 2021 11:54
> > >> > > To: dev 
> > >> > > Subject: [VOTE] Release 1.12.5, release candidate #3
> > >> > >
> > >> > > Hi everyone,
> > >> > >
> > >> > > Please review and vote on the release candidate #3 for the version
> > >> > 1.12.5,
> > >> > > as follows:
> > >> > > [ ] +1, Approve the release
> > >> > > [ ] -1, Do not approve the release (please provide specific
> > comments)
> > >> > >
> > >> > > The complete staging area is available for your review, which
> > >> includes:
> > >> > > * JIRA release notes [1],
> > >> > > * the official Apache source release and binary convenience
> releases
> > >> to
> > >> > be
> > >> > > deployed to dist.apache.org [2], which are signed with the key
> with
> > >> > > fingerprint FBB83C0A4FFB9CA8 [3],
> > >> > > * all artifacts to be deployed to the Maven Central Repository
> [4],
> > >> > > * source code tag "release-1.12.5-rc3" [5],
> > >> > > * website pull request listing the new release and adding
> > announcement
> > >> > blog
> > >> > > post [6].
> > >> > >
> > >> > > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > >> > > approval, with at least 3 PMC affirmative votes.
> > >> > >
> > >> > > Best,
> > >> > > Jingsong Lee
> > >> > >
> > >> > > [1]
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350166
> > >> > > [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-1.12.5-rc3/
> > >> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >> > > [4]
> > >> > >
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1440/
> > >> > >
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1441/
> > >> > > [5]
> https://github.com/apache/flink/releases/tag/release-1.12.5-rc3
> > >> > > [6] https://github.com/apache/flink-web/pull/455
> > >> > >
> > >> >
> > >>
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


[jira] [Created] (FLINK-23558) E2e tests fail because of quiesced system timers service

2021-07-30 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23558:


 Summary: E2e tests fail because of quiesced system timers service
 Key: FLINK-23558
 URL: https://issues.apache.org/jira/browse/FLINK-23558
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.14.0
Reporter: Dawid Wysakowicz


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21180&view=logs&j=739e6eac-8312-5d31-d437-294c4d26fced&t=2a8cc459-df7a-5e6f-12bf-96efcc369aa9&l=10484

{code}
Jul 29 21:41:15 Caused by: 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailbox$MailboxClosedException:
 Mailbox is in state QUIESCED, but is required to be in state OPEN for put 
operations.
Jul 29 21:41:15 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkPutStateConditions(TaskMailboxImpl.java:269)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.put(TaskMailboxImpl.java:197)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.execute(MailboxExecutorImpl.java:74)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
org.apache.flink.runtime.mailbox.MailboxExecutor.submit(MailboxExecutor.java:163)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$throughputCalculationSetup$3(StreamTask.java:688)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$ScheduledTask.run(SystemProcessingTimeService.java:317)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Jul 29 21:41:15 at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_302]
Jul 29 21:41:15 at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_302]
Jul 29 21:41:15 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 ~[?:1.8.0_302]
Jul 29 21:41:15 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 ~[?:1.8.0_302]
Jul 29 21:41:15 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_302]
Jul 29 21:41:15 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_302]
{code}
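
The trace shows the shape of the race: a processing-time timer scheduled by
SystemProcessingTimeService fires after the task has already quiesced its
mailbox during shutdown, so the resulting put() is rejected. A minimal
self-contained sketch of that race plus a defensive guard (the classes below
are illustrative stand-ins, not Flink's actual TaskMailbox API):

{code}
import java.util.ArrayDeque;
import java.util.Queue;

public class QuiescedMailboxDemo {

    // Illustrative stand-in for TaskMailbox; not Flink's real class.
    static class MiniMailbox {
        enum State { OPEN, QUIESCED }
        private State state = State.OPEN;
        private final Queue<Runnable> mails = new ArrayDeque<>();

        synchronized void put(Runnable mail) {
            if (state != State.OPEN) {
                throw new IllegalStateException("Mailbox is in state " + state
                        + ", but is required to be in state OPEN for put operations.");
            }
            mails.add(mail);
        }

        // Called during task shutdown; no new mails are accepted afterwards.
        synchronized void quiesce() {
            state = State.QUIESCED;
        }

        // Defensive variant a late timer callback could use: reject instead of throw.
        synchronized boolean tryPut(Runnable mail) {
            if (state != State.OPEN) {
                return false;
            }
            mails.add(mail);
            return true;
        }
    }

    public static void main(String[] args) {
        MiniMailbox mailbox = new MiniMailbox();
        mailbox.quiesce();                                // shutdown wins the race
        boolean accepted = mailbox.tryPut(() -> { });     // timer fires afterwards
        System.out.println("mail accepted: " + accepted); // false, no exception
    }
}
{code}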



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23557) 'Resuming Externalized Checkpoint (hashmap, sync, no parallelism change) end-to-end test' fails on Azure

2021-07-30 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23557:


 Summary: 'Resuming Externalized Checkpoint (hashmap, sync, no 
parallelism change) end-to-end test' fails on Azure
 Key: FLINK-23557
 URL: https://issues.apache.org/jira/browse/FLINK-23557
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.0
Reporter: Dawid Wysakowicz


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21129&view=logs&j=6caf31d6-847a-526e-9624-468e053467d6&t=1fdd9d50-31f7-5383-5578-49e27385b5f1&l=785
{code}
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$9(RestClusterClient.java:405)
at 
java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
at 
java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:373)
at 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
at 
java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1085)
at 
java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [File upload 
failed.]
at 
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:486)
at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:466)
at 
java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)

{code}
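
Note the FutureUtils.retryOperationWithDelay frame: the REST client already
retries the upload a few times before surfacing the error, so this failure
means all attempts were exhausted. A rough self-contained sketch of that
retry-with-delay pattern over CompletableFuture (simplified; not Flink's
actual implementation):

{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class RetryWithDelay {

    // Retries an asynchronous operation a fixed number of times with a delay.
    static <T> CompletableFuture<T> retry(
            Supplier<CompletableFuture<T>> operation,
            int retries,
            long delayMs,
            ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        operation.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (retries > 0) {
                // Schedule the next attempt instead of failing immediately.
                scheduler.schedule(
                        () -> {
                            retry(operation, retries - 1, delayMs, scheduler)
                                    .whenComplete((v, e) -> {
                                        if (e == null) {
                                            result.complete(v);
                                        } else {
                                            result.completeExceptionally(e);
                                        }
                                    });
                        },
                        delayMs,
                        TimeUnit.MILLISECONDS);
            } else {
                // Out of retries: surface the last error, as in the log above.
                result.completeExceptionally(error);
            }
        });
        return result;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        int[] attempts = {0};
        // Fails twice, then succeeds; stands in for a flaky job-graph file upload.
        Supplier<CompletableFuture<String>> upload = () -> {
            if (attempts[0]++ < 2) {
                CompletableFuture<String> f = new CompletableFuture<>();
                f.completeExceptionally(new RuntimeException("File upload failed."));
                return f;
            }
            return CompletableFuture.completedFuture("submitted");
        };
        System.out.println(retry(upload, 3, 100, scheduler).get());
        scheduler.shutdown();
    }
}
{code}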



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23556) SQLClientSchemaRegistryITCase fails with " Subject ... not found"

2021-07-30 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23556:


 Summary: SQLClientSchemaRegistryITCase fails with " Subject ... 
not found"
 Key: FLINK-23556
 URL: https://issues.apache.org/jira/browse/FLINK-23556
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Ecosystem
Affects Versions: 1.14.0
Reporter: Dawid Wysakowicz


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21129&view=logs&j=91bf6583-3fb2-592f-e4d4-d79d79c3230a&t=cc5499f8-bdde-5157-0d76-b6528ecd808e&l=25337
{code}
Jul 28 23:37:48 [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time 
elapsed: 209.44 s <<< FAILURE! - in 
org.apache.flink.tests.util.kafka.SQLClientSchemaRegistryITCase
Jul 28 23:37:48 [ERROR] 
testWriting(org.apache.flink.tests.util.kafka.SQLClientSchemaRegistryITCase)  
Time elapsed: 81.146 s  <<< ERROR!
Jul 28 23:37:48 
io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: 
Subject 'test-user-behavior-d18d4af2-3830-4620-9993-340c13f50cc2-value' not 
found.; error code: 40401
Jul 28 23:37:48 at 
io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:292)
Jul 28 23:37:48 at 
io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
Jul 28 23:37:48 at 
io.confluent.kafka.schemaregistry.client.rest.RestService.getAllVersions(RestService.java:769)
Jul 28 23:37:48 at 
io.confluent.kafka.schemaregistry.client.rest.RestService.getAllVersions(RestService.java:760)
Jul 28 23:37:48 at 
io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getAllVersions(CachedSchemaRegistryClient.java:364)
Jul 28 23:37:48 at 
org.apache.flink.tests.util.kafka.SQLClientSchemaRegistryITCase.getAllVersions(SQLClientSchemaRegistryITCase.java:230)
Jul 28 23:37:48 at 
org.apache.flink.tests.util.kafka.SQLClientSchemaRegistryITCase.testWriting(SQLClientSchemaRegistryITCase.java:195)
Jul 28 23:37:48 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jul 28 23:37:48 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 28 23:37:48 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jul 28 23:37:48 at java.lang.reflect.Method.invoke(Method.java:498)
Jul 28 23:37:48 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Jul 28 23:37:48 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Jul 28 23:37:48 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Jul 28 23:37:48 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Jul 28 23:37:48 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
Jul 28 23:37:48 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
Jul 28 23:37:48 at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
Jul 28 23:37:48 at java.lang.Thread.run(Thread.java:748)
Jul 28 23:37:48 

{code}
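
Error code 40401 just means the subject is not registered (yet), so this
looks like a race between the job writing to Kafka and the test reading the
subject's versions. A hedged sketch of how the test could poll instead of
doing a single lookup (the Confluent client calls match the ones in the
trace; the polling helper itself is hypothetical):

{code}
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException;

import java.util.List;

public class SubjectPoller {

    // Polls getAllVersions until the subject exists or the deadline passes.
    static List<Integer> waitForVersions(
            SchemaRegistryClient client, String subject, long timeoutMs) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            try {
                return client.getAllVersions(subject);
            } catch (RestClientException e) {
                // 40401 = "Subject not found": keep polling until the producer registers it.
                if (e.getErrorCode() != 40401 || System.currentTimeMillis() > deadline) {
                    throw e;
                }
                Thread.sleep(500);
            }
        }
    }
}
{code}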



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-179: Expose Standardized Operator Metrics

2021-07-30 Thread Arvid Heise
Hi folks,

To move on with the FLIP, I will cut eventTimeFetchLag out of scope and
go ahead with the remainder.

I will open a VOTE later today.

Best,

Arvid

On Wed, Jul 28, 2021 at 8:44 AM Arvid Heise  wrote:

> Hi Becket,
>
> I have updated the PR according to your suggestion (note that this commit
> contains the removal of the previous approach) [1]. Here are my
> observations:
> 1. Adding the type of RecordMetadata to emitRecord would require adding
> another type parameter to RecordEmitter and SourceReaderBase. So I left
> that out for now as it would break things completely.
> 2. RecordEmitter implementations that want to pass it to SourceOutput need
> to be changed in a boilerplate fashion (passing the metadata on to the
> SourceOutput).
> 3. RecordMetadata as an interface (as in the commit) probably requires
> boilerplate implementations in using sources as well.
> 4. SourceOutput would also require an additional collect
>
> default void collect(T record, RecordMetadata metadata) {
> collect(record, TimestampAssigner.NO_TIMESTAMP, metadata);
> }
>
> 5. RecordMetadata is currently not simplifying any code. By the current
> design RecordMetadata is a read-only data structure that is constant for
> all records in a batch. So in Kafka, we still need to pass Tuple3 because
> offset and timestamp are per record.
> 6. RecordMetadata is currently the same for all splits in
> RecordsWithSplitIds.
>
> Some ideas for the above points:
> 3. We should accompany it with a default implementation to avoid the trivial
> POJO implementations like the KafkaRecordMetadata of my commit. Can we skip
> the interface and just have RecordMetadata as a base class? (See the sketch
> after this list.)
> 1.,2.,4. We could also set the metadata only once in an orthogonal method
> that needs to be called before collect, like SourceOutput#setRecordMetadata.
> Then we can implement it entirely in SourceReaderBase without changing any
> code. The clear downside is that it introduces some implicit state in
> SourceOutput (which we implement) and is harder to use in
> non-SourceReaderBase classes: Source devs need to remember to call
> setRecordMetadata before collect for a respective record.
> 6. We might rename and change the semantics into
>
> public interface RecordsWithSplitIds {
> /**
>  * Returns the record metadata. The metadata is shared for all records in 
> the current split.
>  */
> @Nullable
> default RecordMetadata metadataOfCurrentSplit() {
> return null;
> }
> ...
> }
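
A minimal sketch of what point 3 could look like if RecordMetadata were a
plain base class rather than an interface. All of this is a design sketch on
top of the snippets in this thread, not a settled API, and the fetch-time
field is only an assumption about what the lag metric would need:

public class RecordMetadata {
    private final long fetchTimeMillis;

    public RecordMetadata(long fetchTimeMillis) {
        this.fetchTimeMillis = fetchTimeMillis;
    }

    // Wall-clock time at which the batch was fetched (assumed input for the
    // fetch-lag metric); constant for all records of the batch.
    public long getFetchTimeMillis() {
        return fetchTimeMillis;
    }
}

// A source-specific variant only adds its own fields instead of
// re-implementing a trivial POJO against an interface.
class KafkaRecordMetadata extends RecordMetadata {
    private final int partition;

    KafkaRecordMetadata(long fetchTimeMillis, int partition) {
        super(fetchTimeMillis);
        this.partition = partition;
    }

    int getPartition() {
        return partition;
    }
}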
>
>
> Re global variable
>
>> To explain a bit more on the metric being a global variable, I think in
>> general there are two ways to pass a value from one code block to another.
>> The first way is direct passing. That means the variable is explicitly
>> passed from one code block to another via arguments, be them in the
>> constructor or methods. Another way is indirect passing through context,
>> that means the information is stored in some kind of context or
>> environment, and everyone can have access to it. And there is no explicit
>> value passing from one code block to another because everyone just reads
>> from/writes to the context or environment. This is basically the "global
>> variable" pattern I am talking about.
>>
>> In general people would avoid having a mutable global value shared across
>> code blocks, because it is usually less deterministic and therefore more
>> difficult to understand or debug.
>>
> Since the first approach was using a Gauge, it's a callback and not a
> global value. The actual value is passed when invoking the callback. It's
> the same as a supplier. However, the gauge itself is stored in the context,
> so your argument holds on that level.
>
>
>> Moreover, generally speaking, the Metrics in systems are usually perceived
>> as a reporting mechanism. People usually think of it as a way to expose
>> some internal values to the external system, and don't expect the program
>> itself to read the reported values again in the main logic, which is
>> essentially using the MetricGroup as a context to pass values across code
>> block, i.e. the "global variable" pattern. Instead, people would usually
>> use the "direct passing" to do this.
>>
> Here I still don't see a difference in how we calculate the meter values
> from the byteIn/Out counters. We also need to read the counters
> periodically and calculate a secondary metric. So it can't be that
> unexpected to users.
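
For reference, that periodic read-and-derive pattern fits in a few lines; a
simplified stand-alone sketch, not Flink's actual metrics code:

class CounterRateMeter {
    private final java.util.concurrent.atomic.AtomicLong counter;
    private long lastCount;
    private volatile double ratePerSecond;

    CounterRateMeter(java.util.concurrent.atomic.AtomicLong counter) {
        this.counter = counter;
        this.lastCount = counter.get();
    }

    // Called periodically by a metrics updater thread, e.g. once per second.
    void update(long intervalMillis) {
        long current = counter.get();
        ratePerSecond = (current - lastCount) * 1000.0 / intervalMillis;
        lastCount = current;
    }

    double getRate() {
        return ratePerSecond;
    }
}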
>
> [1]
> https://github.com/apache/flink/commit/71212e6baf2906444987253d0cf13b5a5978a43b
>
> On Tue, Jul 27, 2021 at 3:19 AM Becket Qin  wrote:
>
>> Hi Arvid,
>>
>> Thanks for the patient discussion.
>>
>> To explain a bit more on the metric being a global variable, I think in
>> general there are two ways to pass a value from one code block to another.
>> The first way is direct passing. That means the variable is explicitly
>> passed from one code block to another via arguments, be them in the
>> constructor or methods. Another way is indirect passing 

Re: [VOTE] Release 1.13.2, release candidate #3

2021-07-30 Thread Xingbo Huang
+1 (non-binding)

- Verified checksums and signatures
- Verified Python wheel package contents
- Pip install apache-flink-libraries source package and apache-flink wheel
package in Mac
- Write and Run a Simple Python UDF job in Python REPL

Best,
Xingbo

On Fri, Jul 30, 2021 at 2:33 PM Yu Li  wrote:

> +1 (binding)
>
> - Checked the diff between 1.13.1 and 1.13.2-rc3: OK (
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> )
>   - commons-io version has been bumped to 2.8.0 through FLINK-22747 and all
> NOTICE files updated correctly
>   - guava version has been bumped to 29.0 for kinesis connector through
> FLINK-23009 and all NOTICE files updated correctly
> - Checked release notes: OK
>   - minor: I've moved FLINK-23315 and FLINK-23418 out of 1.13.2 to keep
> them in accordance with the RC status
> - Checked sums and signatures: OK
> - Maven clean install from source: OK
> - Checked the jars in the staging repo: OK
> - Checked the website updates: OK
>   - minor: left some minor comments in the PR (such as the RN needing an
> update, etc.); please remember to address them before merging
>
> Best Regards,
> Yu
>
>
> On Fri, 30 Jul 2021 at 14:00, Jingsong Li  wrote:
>
> > +1 (non-binding)
> >
> > - Check if checksums and GPG files match the corresponding release files
> > - staging repository looks fine
> > - Start a local cluster (start-cluster.sh), logs fine
> > - Run sql-client and run a job, looks fine
> >
> > I found an unexpected log in sql-client:
> > "Searching for
> >
> >
> '/Users/lijingsong/Downloads/tmp/flink-1.13.2/conf/sql-client-defaults.yaml'...not
> > found"
> > This log should be removed. I created a JIRA for this:
> > https://issues.apache.org/jira/browse/FLINK-23552
> > (This should not be a blocker)
> >
> > Best,
> > Jingsong
> >
> > On Thu, Jul 29, 2021 at 10:44 PM Robert Metzger 
> > wrote:
> >
> > > Thanks a lot for creating this release candidate
> > >
> > > +1 (binding)
> > >
> > > - staging repository looks fine
> > > - Diff to 1.13.1 looks fine wrt dependency changes:
> > >
> >
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> > > - standalone mode works locally
> > >- I found this issue, which is not specific to 1.13.2:
> > > https://issues.apache.org/jira/browse/FLINK-23546
> > > - src archive signature is matched; sha512 is correct
> > >
> > > On Thu, Jul 29, 2021 at 9:10 AM Zakelly Lan 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > * Built from source.
> > > > * Run wordcount datastream job on yarn
> > > > * Web UI and checkpoint seem good.
> > > > * Kill a container to make job failover, everything is good.
> > > > * Try run job from checkpoint, everything is good.
> > > >
> > > > On Fri, Jul 23, 2021 at 10:04 PM Yun Tang  wrote:
> > > >
> > > > > Hi everyone,
> > > > > Please review and vote on the release candidate #3 for the version
> > > > 1.13.2,
> > > > > as follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > > > >
> > > > >
> > > > > The complete staging area is available for your review, which
> > includes:
> > > > > * JIRA release notes [1],
> > > > > * the official Apache source release and binary convenience
> releases
> > to
> > > > be
> > > > > deployed to dist.apache.org [2], which are signed with the key
> with
> > > > > fingerprint 78A306590F1081CC6794DC7F62DAD618E07CF996 [3],
> > > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > > * source code tag "release-1.13.2-rc3" [5],
> > > > > * website pull request listing the new release and adding
> > announcement
> > > > > blog post [6].
> > > > >
> > > > > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > > > > approval, with at least 3 PMC affirmative votes.
> > > > >
> > > > > Best,
> > > > > Yun Tang
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218&projectId=12315522
> > > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.2-rc3/
> > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [4]
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1439/
> > > > > [5]
> https://github.com/apache/flink/releases/tag/release-1.13.2-rc3
> > > > > [6] https://github.com/apache/flink-web/pull/453
> > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


Re: [VOTE] Release 1.13.2, release candidate #3

2021-07-30 Thread Yu Li
+1 (binding)

- Checked the diff between 1.13.1 and 1.13.2-rc3: OK (
https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3)
  - commons-io version has been bumped to 2.8.0 through FLINK-22747 and all
NOTICE files updated correctly
  - guava version has been bumped to 29.0 for kinesis connector through
FLINK-23009 and all NOTICE files updated correctly
- Checked release notes: OK
  - minor: I've moved FLINK-23315 and FLINK-23418 out of 1.13.2 to keep
them in accordance with the RC status
- Checked sums and signatures: OK
- Maven clean install from source: OK
- Checked the jars in the staging repo: OK
- Checked the website updates: OK
  - minor: left some minor comments in the PR (such as the RN needing an
update, etc.); please remember to address them before merging

Best Regards,
Yu


On Fri, 30 Jul 2021 at 14:00, Jingsong Li  wrote:

> +1 (non-binding)
>
> - Check if checksums and GPG files match the corresponding release files
> - staging repository looks fine
> - Start a local cluster (start-cluster.sh), logs fine
> - Run sql-client and run a job, looks fine
>
> I found an unexpected log in sql-client:
> "Searching for
>
> '/Users/lijingsong/Downloads/tmp/flink-1.13.2/conf/sql-client-defaults.yaml'...not
> found"
> This log should be removed. I created a JIRA for this:
> https://issues.apache.org/jira/browse/FLINK-23552
> (This should not be a blocker)
>
> Best,
> Jingsong
>
> On Thu, Jul 29, 2021 at 10:44 PM Robert Metzger 
> wrote:
>
> > Thanks a lot for creating this release candidate
> >
> > +1 (binding)
> >
> > - staging repository looks fine
> > - Diff to 1.13.1 looks fine wrt dependency changes:
> >
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> > - standalone mode works locally
> >- I found this issue, which is not specific to 1.13.2:
> > https://issues.apache.org/jira/browse/FLINK-23546
> > - src archive signature is matched; sha512 is correct
> >
> > On Thu, Jul 29, 2021 at 9:10 AM Zakelly Lan 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > * Built from source.
> > > * Run wordcount datastream job on yarn
> > > * Web UI and checkpoint seem good.
> > > * Kill a container to make job failover, everything is good.
> > > * Try run job from checkpoint, everything is good.
> > >
> > > On Fri, Jul 23, 2021 at 10:04 PM Yun Tang  wrote:
> > >
> > > > Hi everyone,
> > > > Please review and vote on the release candidate #3 for the version
> > > 1.13.2,
> > > > as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release and binary convenience releases
> to
> > > be
> > > > deployed to dist.apache.org [2], which are signed with the key with
> > > > fingerprint 78A306590F1081CC6794DC7F62DAD618E07CF996 [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag "release-1.13.2-rc3" [5],
> > > > * website pull request listing the new release and adding
> announcement
> > > > blog post [6].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Best,
> > > > Yun Tang
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218&projectId=12315522
> > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.2-rc3/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1439/
> > > > [5] https://github.com/apache/flink/releases/tag/release-1.13.2-rc3
> > > > [6] https://github.com/apache/flink-web/pull/453
> > > >
> > > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


[jira] [Created] (FLINK-23555) Improve common subexpression elimination by using local references

2021-07-30 Thread weibowen (Jira)
weibowen created FLINK-23555:


 Summary: Improve common subexpression elimination by using local 
references
 Key: FLINK-23555
 URL: https://issues.apache.org/jira/browse/FLINK-23555
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: weibowen
 Fix For: 1.14.0
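
The description is still empty; as a rough illustration of what the title
suggests (binding a repeated subexpression to a local reference in generated
code), consider this hand-written before/after sketch (hypothetical code
shape, not the planner's actual output):

{code}
// Hypothetical shape of generated projection code, before and after.
public class CseExample {

    static Object[] projectBefore(String name) {
        // The common subexpression UPPER(name) is evaluated twice per row:
        return new Object[] {name.toUpperCase().trim(), name.toUpperCase().startsWith("A")};
    }

    static Object[] projectAfter(String name) {
        // Eliminated: evaluated once and reused through a local reference.
        String localRef$0 = name.toUpperCase();
        return new Object[] {localRef$0.trim(), localRef$0.startsWith("A")};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(projectAfter("apache")));
    }
}
{code}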






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.4, release candidate #1

2021-07-30 Thread JING ZHANG
Thanks @godfrey for driving this.

+1 (non-binding)

1. Built from the source code flink-1.11.4-src.tgz successfully
2. Started a local Flink cluster, ran the WordCount example; WebUI looks
good, no suspicious output/log
3. Started a cluster and ran some e2e SQL queries using the SQL Client;
query results are as expected. Found a minor bug, FLINK-23554, which can
occur in some corner cases.
4. Repeated steps 2 and 3 with flink-1.11.4-bin-scala_2.11.tgz



On Thu, Jul 29, 2021 at 9:26 PM Robert Metzger  wrote:

> Thanks a lot for creating the release candidate!
>
> +1 (binding)
>
> Checks:
> - manually checked the diff [1]. License documentation seems to be properly
> maintained in all changes (mostly a jackson version bump, and some ES +
> Kinesis bumps)
> - checked standalone mode, job submission, logs locally.
> - checked the flink-web PR
> - checked the maven staging repo
>
>
> [1]
> https://github.com/apache/flink/compare/release-1.11.3...release-1.11.4-rc1
> and
>
> https://github.com/apache/flink/compare/ca2ac0108bf4050ba7efc4fa729e5f7fdf3da459...release-1.11.4-rc1
> (which excludes the code reformatting)
>
> On Mon, Jul 26, 2021 at 5:26 PM godfrey he  wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #1 for the version
> 1.11.4,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint 4A978875E56AA2100EB0CF12A244D52CF0A40279 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.11.4-rc1" [5],
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Best,
> > Godfrey
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349404
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.4-rc1/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1438
> > [5] https://github.com/apache/flink/releases/tag/release-1.11.4-rc1
> > [6] https://github.com/apache/flink-web/pull/459
> >
>


[jira] [Created] (FLINK-23554) SqlCli throws an exception and hangs

2021-07-30 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23554:
--

 Summary: SqlCli throws an exception and hangs
 Key: FLINK-23554
 URL: https://issues.apache.org/jira/browse/FLINK-23554
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.12.4, 1.13.1, 1.11.3
Reporter: JING ZHANG
 Attachments: image-2021-07-30-14-12-07-817.png

SqlCli throws an exception like the following, and then hangs forever until 
the process is killed from outside.

You can reproduce the exception with the following steps:
 # submit a SQL command in SQLCli
 # without waiting for its response, input another SQL command in SQLCli
 # an exception is thrown.

!image-2021-07-30-14-12-07-817.png|width=1706,height=527!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23553) Trigger global failover for a synchronous savepoint failure

2021-07-30 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23553:


 Summary: Trigger global failover for a synchronous savepoint 
failure
 Key: FLINK-23553
 URL: https://issues.apache.org/jira/browse/FLINK-23553
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.14.0


We should trigger a global job failover in case a {{stop-with-savepoint 
--drain}} fails.

The situation is obvious for the drain mode: if the savepoint fails, we simply 
cannot continue, as we have already flushed all data and prepared the state 
for finishing. We cannot just resume processing records.

It is more debatable for the mode without drain, where we could theoretically 
continue processing records; however, unifying the two modes is also a good 
approach.
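
A rough sketch of the proposed handling (the scheduler hook names here are
illustrative stand-ins, not Flink's actual API):

{code}
import java.util.concurrent.CompletableFuture;

public class StopWithSavepointSketch {

    // Stand-in for the scheduler's stop-with-savepoint completion handling.
    static CompletableFuture<String> stopWithSavepoint(
            CompletableFuture<String> savepointFuture, Runnable triggerGlobalFailover) {
        return savepointFuture.whenComplete((path, error) -> {
            if (error != null) {
                // With --drain the tasks have already flushed their data and
                // prepared to finish, so processing cannot simply resume;
                // restart the whole job instead.
                triggerGlobalFailover.run();
            }
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> savepoint = new CompletableFuture<>();
        stopWithSavepoint(savepoint, () -> System.out.println("global failover triggered"));
        savepoint.completeExceptionally(new RuntimeException("savepoint failed"));
    }
}
{code}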



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.4, release candidate #1

2021-07-30 Thread Jingsong Li
+1 (non-binding)

- Verified checksums and signatures
- Verified local cluster
- Verified sql-client and run a job

Best,
Jingsong

On Thu, Jul 29, 2021 at 9:26 PM Robert Metzger  wrote:

> Thanks a lot for creating the release candidate!
>
> +1 (binding)
>
> Checks:
> - manually checked the diff [1]. License documentation seems to be properly
> maintained in all changes (mostly a jackson version bump, and some ES +
> Kinesis bumps)
> - checked standalone mode, job submission, logs locally.
> - checked the flink-web PR
> - checked the maven staging repo
>
>
> [1]
> https://github.com/apache/flink/compare/release-1.11.3...release-1.11.4-rc1
> and
>
> https://github.com/apache/flink/compare/ca2ac0108bf4050ba7efc4fa729e5f7fdf3da459...release-1.11.4-rc1
> (which excludes the code reformatting)
>
> On Mon, Jul 26, 2021 at 5:26 PM godfrey he  wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #1 for the version
> 1.11.4,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint 4A978875E56AA2100EB0CF12A244D52CF0A40279 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.11.4-rc1" [5],
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Best,
> > Godfrey
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349404
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.4-rc1/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1438
> > [5] https://github.com/apache/flink/releases/tag/release-1.11.4-rc1
> > [6] https://github.com/apache/flink-web/pull/459
> >
>


-- 
Best, Jingsong Lee


Re: [VOTE] Release 1.13.2, release candidate #3

2021-07-30 Thread Jingsong Li
+1 (non-binding)

- Check if checksums and GPG files match the corresponding release files
- staging repository looks fine
- Start a local cluster (start-cluster.sh), logs fine
- Run sql-client and run a job, looks fine

I found an unexpected log in sql-client:
"Searching for
'/Users/lijingsong/Downloads/tmp/flink-1.13.2/conf/sql-client-defaults.yaml'...not
found"
This log should be removed. I created a JIRA for this:
https://issues.apache.org/jira/browse/FLINK-23552
(This should not be a blocker)

Best,
Jingsong

On Thu, Jul 29, 2021 at 10:44 PM Robert Metzger  wrote:

> Thanks a lot for creating this release candidate
>
> +1 (binding)
>
> - staging repository looks fine
> - Diff to 1.13.1 looks fine wrt dependency changes:
> https://github.com/apache/flink/compare/release-1.13.1...release-1.13.2-rc3
> - standalone mode works locally
>- I found this issue, which is not specific to 1.13.2:
> https://issues.apache.org/jira/browse/FLINK-23546
> - src archive signature is matched; sha512 is correct
>
> On Thu, Jul 29, 2021 at 9:10 AM Zakelly Lan  wrote:
>
> > +1 (non-binding)
> >
> > * Built from source.
> > * Run wordcount datastream job on yarn
> > * Web UI and checkpoint seem good.
> > * Kill a container to make job failover, everything is good.
> > * Try run job from checkpoint, everything is good.
> >
> > On Fri, Jul 23, 2021 at 10:04 PM Yun Tang  wrote:
> >
> > > Hi everyone,
> > > Please review and vote on the release candidate #3 for the version
> > 1.13.2,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint 78A306590F1081CC6794DC7F62DAD618E07CF996 [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "release-1.13.2-rc3" [5],
> > > * website pull request listing the new release and adding announcement
> > > blog post [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Best,
> > > Yun Tang
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218&projectId=12315522
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.2-rc3/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1439/
> > > [5] https://github.com/apache/flink/releases/tag/release-1.13.2-rc3
> > > [6] https://github.com/apache/flink-web/pull/453
> > >
> > >
> >
>


-- 
Best, Jingsong Lee