Re: [VOTE] FLIP-199: Change some default config values of blocking shuffle for better usability

2022-01-11 Thread Jingsong Li
+1 Thanks Yingjie for driving.

Best,
Jingsong Lee

On Wed, Jan 12, 2022 at 3:16 PM 刘建刚  wrote:
>
> +1 for the proposal. In fact, we have used these params in our internal
> Flink version with good performance.
>
> Yun Gao wrote on Wed, Jan 12, 2022 at 10:42:
>
> > +1 since it would greatly improve the out-of-the-box experience for batch jobs.
> >
> > Thanks Yingjie for drafting the PR and initiating the discussion.
> >
> > Best,
> > Yun
> >
> >
> >
> >  --Original Mail --
> > Sender:Yingjie Cao 
> > Send Date:Tue Jan 11 15:15:01 2022
> > Recipients:dev 
> > Subject:[VOTE] FLIP-199: Change some default config values of blocking
> > shuffle for better usability
> > Hi all,
> >
> > I'd like to start a vote on FLIP-199: Change some default config values of
> > blocking shuffle for better usability [1] which has been discussed in this
> > thread [2].
> >
> > The vote will be open for at least 72 hours unless there is an objection or
> > not enough votes.
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability
> > [2] https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p
> >
> > Best,
> > Yingjie
> >


[jira] [Created] (FLINK-25625) Introduce FileFormat for table-store

2022-01-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-25625:


 Summary: Introduce FileFormat for table-store
 Key: FLINK-25625
 URL: https://issues.apache.org/jira/browse/FLINK-25625
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.1.0


Introduce a file format class that creates reader and writer factories for a
specific file format.
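A minimal sketch of what such a class could look like is given below; the class name comes from this ticket, but the method names, the generic parameters, and the reuse of Flink's BulkFormat/BulkWriter abstractions are assumptions for illustration, not the final table-store API:

{code:java}
// Sketch only; everything except the class name is an assumption.
import org.apache.flink.api.common.serialization.BulkWriter;
import org.apache.flink.connector.file.src.FileSourceSplit;
import org.apache.flink.connector.file.src.reader.BulkFormat;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.types.logical.RowType;

/** Factory for readers and writers of one specific file format (e.g. ORC, Avro, Parquet). */
public abstract class FileFormat {

    /** Creates a reader factory that decodes files of this format into rows of the given type. */
    public abstract BulkFormat<RowData, FileSourceSplit> createReaderFactory(RowType rowType);

    /** Creates a writer factory that encodes rows of the given type into files of this format. */
    public abstract BulkWriter.Factory<RowData> createWriterFactory(RowType rowType);
}
{code}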



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25624) KafkaSinkITCase.testRecoveryWithExactlyOnceGuarantee blocked on azure pipeline

2022-01-11 Thread Yun Gao (Jira)
Yun Gao created FLINK-25624:
---

 Summary: KafkaSinkITCase.testRecoveryWithExactlyOnceGuarantee 
blocked on azure pipeline
 Key: FLINK-25624
 URL: https://issues.apache.org/jira/browse/FLINK-25624
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.2
Reporter: Yun Gao


{code:java}
"main" #1 prio=5 os_prio=0 tid=0x7fda8c00b000 nid=0x21b2 waiting on 
condition [0x7fda92dd7000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x826165c0> (a 
java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
at 
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at 
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1989)
at 
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:69)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1951)
at 
org.apache.flink.connector.kafka.sink.KafkaSinkITCase.testRecoveryWithAssertion(KafkaSinkITCase.java:335)
at 
org.apache.flink.connector.kafka.sink.KafkaSinkITCase.testRecoveryWithExactlyOnceGuarantee(KafkaSinkITCase.java:190)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=29285=logs=c5612577-f1f7-5977-6ff6-7432788526f7=ffa8837a-b445-534e-cdf4-db364cf8235d=42106



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)

2022-01-11 Thread Becket Qin
Hi folks,

Till and I had an offline discussion about this FLIP. It looks like the
OperatorCoordinator (OC) is currently still coupled with the Source operator in
some cases, so it is not really ready for the CEP use case at this point. In
order to provide the CEP dynamic pattern update feature to users, a connected
stream seems to be the only viable solution at the moment. So here is what we agreed:

1. For this FLIP, we will use the connected stream to update the dynamic
pattern. API wise, it probably means there will be a method in the CEP
class such as
public static  DataStream patternProcessors(
DataStream input, DataStream commandStream)
{...}
This means a proper pattern update command API needs to be designed so
users can provide the pattern update commands in the command stream.

2. Meanwhile, we will improve the OC protocol to make it generic to support
arbitrary operators. After that, we will revisit the option of CEP and may
add a new API to CEP which may look like something below.
public static  DataStream patternProcessors(
DataStream input, PatternProcessorDiscovererFactory factory) {...}

In this FLIP, we will mention the OC effort in the future work section to make
users aware of the potential future addition to the CEP API for dynamic pattern
updates. People who are willing to wait for the OC-based API may choose to hold
on a little bit. If later on we see one of the two API flavors become dominant,
we can deprecate the other.

Some other things that may be worth mentioning:

1. The command streams will likely just use processing/ingestion time
instead of event time to avoid unexpected watermark pauses.
2. In order to reprocess a record stream and produce repeatable results, it
is recommended to fully load the pattern change log before starting to process
the record stream.
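For illustration, a rough usage sketch of the connected-stream flavor from point 1 is given below. CEP.patternProcessors and the pattern update command type do not exist yet, so every name, signature, and generic parameter here is an assumption based on this thread, not a final API:

```java
// Hypothetical usage sketch of the connected-stream API proposed above; nothing here is final.
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DynamicCepSketch {

    /** Placeholder for the pattern update command API this FLIP still needs to design. */
    public static class PatternUpdateCommand {
        // e.g. ADD / REMOVE plus a serialized PatternProcessor definition
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Record stream to match patterns against.
        DataStream<String> events = env.fromElements("login", "payment", "logout");

        // Command stream carrying pattern updates; as noted above it would
        // typically be driven by processing/ingestion time.
        DataStream<PatternUpdateCommand> commands =
                env.fromElements(new PatternUpdateCommand());

        // Proposed (not yet existing) API from this discussion:
        // DataStream<R> matches = CEP.patternProcessors(events, commands);

        events.print(); // stand-in sink so the sketch runs on its own
        env.execute("dynamic-cep-pattern-update-sketch");
    }
}
```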

Thanks,

Jiangjie (Becket) Qin

On Thu, Jan 6, 2022 at 3:51 PM Till Rohrmann  wrote:

> Hi Becket,
>
> I might be missing something but having to define interfaces/formats for
> the CEP patterns should be necessary for either approach. The OC approach
> needs to receive and understand the pattern data from somewhere as well and
> will probably also have to deal with evolving formats. Hence, I believe
> that this work wouldn't be wasted.
>
> I might misjudge the willingness of our users to do some extra set up work,
> but I think that some of them would already be happy with state 1.
>
> My understanding so far was that the OC approach also requires the CEP
> operator infrastructure (making the operator accept new patterns) work that
> I proposed to do as a first step. The only difference is where the new
> patterns/commands are coming from. If we design this correctly, then this
> should change only very little depending on whether you read from a
> side-input or from an OC.
>
> Another benefit of downscoping the FLIP is to make it more realistic to be
> completed. Smaller and incremental steps are usually easier to realize. If
> we now say that this FLIP requires a general purpose user controlled
> control plane that gives you hard guarantees, then I am pretty sure that
> this will take at least half a year.
>
> Cheers,
> Till
>
> On Thu, Jan 6, 2022 at 4:45 AM Becket Qin  wrote:
>
> > Thanks for the explanation, Till. I like the idea, but have a question
> > about the first step.
> >
> > After the first step, would users be able to actually use the dynamic
> > patterns in CEP?
> >
> > In the first step you mentioned, the commands and formats for a new CEP
> > pattern seem related to how users would ingest the commands. If we go
> with
> > the OC, these commands and formats would become internal interfaces. The
> > end users would just use the REST API, or in the beginning, implement a
> > Java plugin of dynamic pattern provider. In this case our focus would be
> on
> > designing a good plugin interface. On the other hand, if we go with the
> > side-input, users would need to know the command format so they can send
> > the commands to the CEP operator. Therefore we need to think about stuff
> > like versioning, request validation and backwards compatibility.
> >
> > Also, because the public interface is all about how the users can ingest
> > the dynamic patterns. It seems we still need to figure that out before we
> > can conclude the FLIP.
> >
> > Assuming the first step closes this FLIP and after that users would be
> able
> > to use the CEP dynamic pattern, are you suggesting the following?
> >
> > 1. We will design the commands and format of CEP dynamic pattern, and
> also
> > how CEP operators take them into effect. This design would assume that
> > users can send the commands directly to the CEP operator via side-input.
> So
> > the protocol would take versioning and format evolution into account.
> After
> > the first step, the users CAN make dynamic pattern work with the help
> from
> > some external dependencies and maintenance effort.
> >
> > 2. Discuss about 

[jira] [Created] (FLINK-25623) TPC-DS end-to-end test (Blink planner) failed on azure due to download file tpcds.idx failed.

2022-01-11 Thread Yun Gao (Jira)
Yun Gao created FLINK-25623:
---

 Summary: TPC-DS end-to-end test (Blink planner) failed on azure 
due to download file tpcds.idx failed. 
 Key: FLINK-25623
 URL: https://issues.apache.org/jira/browse/FLINK-25623
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines
Affects Versions: 1.13.5
Reporter: Yun Gao


 
{code:java}
Jan 12 02:22:14 [WARN] Download file tpcds.idx failed.
Jan 12 02:22:14 Command: download_and_validate tpcds.idx ../target/generator 
https://raw.githubusercontent.com/ververica/tpc-ds-generators/f5d6c11681637908ce15d697ae683676a5383641/generators/tpcds.idx
 376152c9aa150c59a386b148f954c47d linux failed. Retrying...
Jan 12 02:22:19 Command: download_and_validate tpcds.idx ../target/generator 
https://raw.githubusercontent.com/ververica/tpc-ds-generators/f5d6c11681637908ce15d697ae683676a5383641/generators/tpcds.idx
 376152c9aa150c59a386b148f954c47d linux failed 3 times.
Jan 12 02:22:19 [WARN] Download file tpcds.idx failed.
Jan 12 02:22:19 [ERROR] Download and validate data generator files fail, please 
check the network.
Jan 12 02:22:19 [FAIL] Test script contains errors.
Jan 12 02:22:19 Checking for errors...
Jan 12 02:22:19 No errors in log files.
Jan 12 02:22:19 Checking for exceptions...
Jan 12 02:22:19 No exceptions in log files.
Jan 12 02:22:19 Checking for non-empty .out files...
grep: /home/vsts/work/_temp/debug_files/flink-logs/*.out: No such file or 
directory
Jan 12 02:22:19 No non-empty .out files.
Jan 12 02:22:19 
Jan 12 02:22:19 [FAIL] 'TPC-DS end-to-end test (Blink planner)' failed after 1 
minutes and 12 seconds! Test exited with exit code 1
Jan 12 02:22:19 
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=29284=logs=08866332-78f7-59e4-4f7e-49a56faa3179=7f606211-1454-543c-70ab-c7a028a1ce8c=19463

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25622) Throws NPE in Python UDTF

2022-01-11 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-25622:


 Summary: Throws NPE in Python UDTF
 Key: FLINK-25622
 URL: https://issues.apache.org/jira/browse/FLINK-25622
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.3
Reporter: Huang Xingbo


The failed case is

{code:python}
   source_table = """
CREATE TABLE ad_track_android_source (
  `Rows` ARRAY>,
  `id` AS CAST(`Rows`[CASE WHEN `Rows`[2].`id` > 0 THEN 2 ELSE 1 
END].`id` AS INTEGER)
) WITH (
'connector' = 'datagen'
)
"""
self.t_env.execute_sql(source_table)

@udf(result_type=DataTypes.INT())
def ug(id):
return id

self.t_env.create_temporary_function("ug", ug)
res = self.t_env.sql_query(
"select id ,ug(cast(id as int)) as s from `ad_track_android_source` 
where id>0")
print(res.to_pandas())
{code}

The traceback is 

{code:java}
E   : java.lang.NullPointerException
E   at 
org.apache.calcite.rex.RexFieldAccess.checkValid(RexFieldAccess.java:74)
E   at 
org.apache.calcite.rex.RexFieldAccess.<init>(RexFieldAccess.java:62)
E   at 
org.apache.calcite.rex.RexShuttle.visitFieldAccess(RexShuttle.java:205)
E   at 
org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitFieldAccess(RexProgramBuilder.java:904)
E   at 
org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitFieldAccess(RexProgramBuilder.java:887)
E   at 
org.apache.calcite.rex.RexFieldAccess.accept(RexFieldAccess.java:92)
E   at 
org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:295)
E   at 
org.apache.calcite.rex.RexProgramBuilder.addProject(RexProgramBuilder.java:206)
E   at 
org.apache.calcite.rex.RexProgram.create(RexProgram.java:224)
E   at 
org.apache.flink.table.planner.plan.rules.logical.PythonCalcSplitRuleBase.onMatch(PythonCalcSplitRule.scala:84)
E   at 
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
E   at 
org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
E   at 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
E   at 
org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
E   at 
org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
E   at 
org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
E   at 
org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
E   at 
org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69)
E   at 
org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87)
E   at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:62)
E   at 
scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
E   at 
scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
E   at scala.collection.Iterator.foreach(Iterator.scala:937)
E   at 
scala.collection.Iterator.foreach$(Iterator.scala:937)
E   at 
scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
E   at 
scala.collection.IterableLike.foreach(IterableLike.scala:70)
E   at 
scala.collection.IterableLike.foreach$(IterableLike.scala:69)
E   at 
scala.collection.AbstractIterable.foreach(Iterable.scala:54)
E   at 
scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
E   at 
scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
E   at 
scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
E   at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:58)
E   at 
org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:165)
E   at 
org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:83)
E   at 

[jira] [Created] (FLINK-25621) LegacyStatefulJobSavepointMigrationITCase failed on azure with exit code 127

2022-01-11 Thread Yun Gao (Jira)
Yun Gao created FLINK-25621:
---

 Summary: LegacyStatefulJobSavepointMigrationITCase failed on azure 
with exit code 127
 Key: FLINK-25621
 URL: https://issues.apache.org/jira/browse/FLINK-25621
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.14.2
Reporter: Yun Gao


{code:java}
Jan 12 05:37:09 [WARNING] The requested profile "skip-webui-build" could not be 
activated because it does not exist.
Jan 12 05:37:09 [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (integration-tests) 
on project flink-tests: There are test failures.
Jan 12 05:37:09 [ERROR] 
Jan 12 05:37:09 [ERROR] Please refer to 
/__w/1/s/flink-tests/target/surefire-reports for the individual test results.
Jan 12 05:37:09 [ERROR] Please refer to dump files (if any exist) [date].dump, 
[date]-jvmRun[N].dump and [date].dumpstream.
Jan 12 05:37:09 [ERROR] ExecutionException The forked VM terminated without 
properly saying goodbye. VM crash or System.exit called?
Jan 12 05:37:09 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target 
&& /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
-Dmvn.forkNumber=2 -XX:+UseG1GC -jar 
/__w/1/s/flink-tests/target/surefire/surefirebooter6476664513703430996.jar 
/__w/1/s/flink-tests/target/surefire 2022-01-12T04-37-00_265-jvmRun2 
surefire8749628241391231873tmp surefire_1763452000794129753394tmp
Jan 12 05:37:09 [ERROR] Error occurred in starting fork, check output in log
Jan 12 05:37:09 [ERROR] Process Exit Code: 127
Jan 12 05:37:09 [ERROR] Crashed tests:
Jan 12 05:37:09 [ERROR] 
org.apache.flink.test.checkpointing.utils.LegacyStatefulJobSavepointMigrationITCase
Jan 12 05:37:09 [ERROR] 
org.apache.maven.surefire.booter.SurefireBooterForkException: 
ExecutionException The forked VM terminated without properly saying goodbye. VM 
crash or System.exit called?
Jan 12 05:37:09 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target 
&& /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
-Dmvn.forkNumber=2 -XX:+UseG1GC -jar 
/__w/1/s/flink-tests/target/surefire/surefirebooter6476664513703430996.jar 
/__w/1/s/flink-tests/target/surefire 2022-01-12T04-37-00_265-jvmRun2 
surefire8749628241391231873tmp surefire_1763452000794129753394tmp
Jan 12 05:37:09 [ERROR] Error occurred in starting fork, check output in log
Jan 12 05:37:09 [ERROR] Process Exit Code: 127
Jan 12 05:37:09 [ERROR] Crashed tests:
Jan 12 05:37:09 [ERROR] 
org.apache.flink.test.checkpointing.utils.LegacyStatefulJobSavepointMigrationITCase
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:457)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:298)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
Jan 12 05:37:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=29283=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5423



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Seek help for making JIRA links clickable in github

2022-01-11 Thread Yun Gao
Currently it seems the issues of flink-statefun and flink-ml are 
also managed in the issues.apache.org ?

Best,
Yun


 --Original Mail --
Sender:Jingsong Li 
Send Date:Wed Jan 12 13:54:03 2022
Recipients:dev 
Subject:[DISCUSS] Seek help for making JIRA links clickable in github
Hi everyone,

We are creating flink-table-store[1] and we also find that flink-ml[2]
does not have clickable JIRA links, while flink-statefun[3] and
flink[4] do.

So I'm asking for PMC's help on how to make JIRA links clickable in github.

[1] https://github.com/apache/flink-table-store
[2] https://github.com/apache/flink-ml
[3] https://github.com/apache/flink-statefun
[4] https://github.com/apache/flink

Best,
Jingsong Lee

Re: [VOTE] FLIP-199: Change some default config values of blocking shuffle for better usability

2022-01-11 Thread 刘建刚
+1 for the proposal. In fact, we have used these params in our internal
Flink version with good performance.

Yun Gao wrote on Wed, Jan 12, 2022 at 10:42:

> +1 since it would greatly improve the out-of-the-box experience for batch jobs.
>
> Thanks Yingjie for drafting the PR and initiating the discussion.
>
> Best,
> Yun
>
>
>
>  --Original Mail --
> Sender:Yingjie Cao 
> Send Date:Tue Jan 11 15:15:01 2022
> Recipients:dev 
> Subject:[VOTE] FLIP-199: Change some default config values of blocking
> shuffle for better usability
> Hi all,
>
> I'd like to start a vote on FLIP-199: Change some default config values of
> blocking shuffle for better usability [1] which has been discussed in this
> thread [2].
>
> The vote will be open for at least 72 hours unless there is an objection or
> not enough votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability
> [2] https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p
>
> Best,
> Yingjie
>


[jira] [Created] (FLINK-25620) Upload artifacts to S3 failed on azure pipeline

2022-01-11 Thread Yun Gao (Jira)
Yun Gao created FLINK-25620:
---

 Summary: Upload artifacts to S3 failed on azure pipeline
 Key: FLINK-25620
 URL: https://issues.apache.org/jira/browse/FLINK-25620
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines
Affects Versions: 1.14.2
Reporter: Yun Gao


{code:java}
/bin/bash --noprofile --norc /__w/_temp/fdaf679a-209f-4720-8709-5c591f730c61.sh
Installing artifacts deployment script
+ upload_to_s3 ./tools/releasing/release
+ local FILES_DIR=./tools/releasing/release
+ echo 'Installing artifacts deployment script'
+ export ARTIFACTS_DEST=/home/vsts_azpcontainer/bin/artifacts
+ ARTIFACTS_DEST=/home/vsts_azpcontainer/bin/artifacts
+ curl -sL https://raw.githubusercontent.com/travis-ci/artifacts/master/install
+ bash
bash: line 1: !DOCTYPE: No such file or directory
bash: line 2: !--
: No such file or directory
bash: line 3: $'\r': command not found
bash: line 4: Hello: command not found
bash: line 6: $'\r': command not found
bash: line 7: unexpected EOF while looking for matching `''
bash: line 83: syntax error: unexpected end of file
##[error]Bash exited with code '2'.
Finishing: Upload artifacts to S3
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=29283=logs=585d8b77-fa33-51bc-8163-03e54ba9ce5b=68e20e55-906c-5c49-157c-3005667723c9=18



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25619) Init flink-table-store repository

2022-01-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-25619:


 Summary: Init flink-table-store repository
 Key: FLINK-25619
 URL: https://issues.apache.org/jira/browse/FLINK-25619
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.1.0


Create:
 * README.md
 * NOTICE LICENSE CODE_OF_CONDUCT
 * .gitignore
 * maven tools
 * releasing tools
 * github build workflow
 * pom.xml



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[DISCUSS] Seek help for making JIRA links clickable in github

2022-01-11 Thread Jingsong Li
Hi everyone,

We are creating flink-table-store[1] and we also find that flink-ml[2]
does not have clickable JIRA links, while flink-statefun[3] and
flink[4] do.

So I'm asking for PMC's help on how to make JIRA links clickable in github.

[1] https://github.com/apache/flink-table-store
[2] https://github.com/apache/flink-ml
[3] https://github.com/apache/flink-statefun
[4] https://github.com/apache/flink

Best,
Jingsong Lee


[jira] [Created] (FLINK-25618) Data quality by apache flink

2022-01-11 Thread tanjialiang (Jira)
tanjialiang created FLINK-25618:
---

 Summary: Data quality by apache flink
 Key: FLINK-25618
 URL: https://issues.apache.org/jira/browse/FLINK-25618
 Project: Flink
  Issue Type: New Feature
Reporter: tanjialiang


This discusses how to support data quality checks with Apache Flink.

For example, I have a SQL job in which a table has a column named phone, and
the data in the phone column must match a telephone-number pattern. If a value
does not match, I can choose to drop it or ignore it, and we can record it in
the metrics, so that users can monitor the data quality of sources and sinks.
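Not part of this ticket, but as a rough sketch of the idea: a DataStream job could validate the phone column against a regex, drop non-matching records, and expose the number of dropped records as a metric. The field name, metric name, and pattern below are illustrative assumptions only:

{code:java}
import java.util.regex.Pattern;

import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.types.Row;

/** Drops rows whose "phone" field does not match a telephone pattern and counts them. */
public class PhoneQualityFilter extends RichFilterFunction<Row> {

    // Hypothetical pattern; a real job would make this configurable.
    private static final Pattern PHONE = Pattern.compile("\\+?[0-9]{7,15}");

    private transient Counter invalidPhoneCounter;

    @Override
    public void open(Configuration parameters) {
        invalidPhoneCounter =
                getRuntimeContext().getMetricGroup().counter("numInvalidPhoneRecords");
    }

    @Override
    public boolean filter(Row row) {
        // Assumes the Row carries named fields, including "phone".
        Object phone = row.getField("phone");
        boolean valid = phone != null && PHONE.matcher(phone.toString()).matches();
        if (!valid) {
            invalidPhoneCounter.inc();
        }
        return valid;
    }
}
{code}

The counter could then be watched via the usual metric reporters to monitor data quality per pipeline.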



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] FLIP-199: Change some default config values of blocking shuffle for better usability

2022-01-11 Thread Yun Gao
+1 since it would greatly improve the out-of-the-box experience for batch jobs.

Thanks Yingjie for drafting the PR and initiating the discussion.

Best,
Yun



 --Original Mail --
Sender:Yingjie Cao 
Send Date:Tue Jan 11 15:15:01 2022
Recipients:dev 
Subject:[VOTE] FLIP-199: Change some default config values of blocking shuffle 
for better usability
Hi all,

I'd like to start a vote on FLIP-199: Change some default config values of
blocking shuffle for better usability [1] which has been discussed in this
thread [2].

The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability
[2] https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p

Best,
Yingjie


[jira] [Created] (FLINK-25617) Support VectorAssembler in FlinkML

2022-01-11 Thread weibo zhao (Jira)
weibo zhao created FLINK-25617:
--

 Summary: Support VectorAssembler in FlinkML
 Key: FLINK-25617
 URL: https://issues.apache.org/jira/browse/FLINK-25617
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: weibo zhao


Support VectorAssembler in FlinkML



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25616) Support VectorAssembler in FlinkML

2022-01-11 Thread weibo zhao (Jira)
weibo zhao created FLINK-25616:
--

 Summary: Support VectorAssembler in FlinkML
 Key: FLINK-25616
 URL: https://issues.apache.org/jira/browse/FLINK-25616
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: weibo zhao


Support VectorAssembler in FlinkML



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[RESULT][VOTE] Create a separate sub project for FLIP-188

2022-01-11 Thread Jingsong Li
I am happy to announce that the vote to create a separate sub project for
FLIP-188 [1][2][3] has passed.

There are 12 approving votes, 10 of which are binding, and 7 of which
explicitly voted for the name flink-table-store:
* Timo Walther (binding)
* Till Rohrmann (binding)
* Konstantin Knauf(binding)
* David Moravek (binding)
* Yun Tang  (binding) (flink-table-store)
* Yu Li  (binding) (flink-table-store)
* Jark Wu  (binding) (flink-table-store)
* Becket Qin  (binding) (flink-table-store)
* Godfrey He  (binding) (flink-table-store)
* Jingsong Lee  (binding) (flink-table-store)
* Shouwei  (non-binding)
* Neng Lu  (non-binding) (flink-table-store)

The name of the sub project is `flink-table-store`.

There are no disapproving votes.

Thanks everyone!
Jingsong Lee

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
[2] https://lists.apache.org/thread/tqyn1cro5ohl3c3fkjb1zvxbo03sofn7
[3] https://lists.apache.org/thread/wzzhr27cvrh6w107bn464m1m1ycfll1z


Re: [DISCUSS] FLIP-208: Update KafkaSource to detect EOF based on de-serialized record

2022-01-11 Thread Dong Lin
Hi Fabian,

Thanks for the comments. Please see my reply inline.

On Tue, Jan 11, 2022 at 11:46 PM Fabian Paul  wrote:

> Hi Dong,
>
> I wouldn't change the org.apache.flink.api.connector.source.Source
> interface because it either breaks existing sinks or we introduce it
> as some kind of optional. I deem both options as not great. My idea is
> to introduce a new interface that extends the Source. This way users
> who want to develop a source that stops with the record evaluator can
> implement the new interface. It also has the nice benefit that we can
> give this new type of source a lower stability guarantee than Public
> to allow some changes.
>

Currently the eofRecordEvaluator can be passed from
KafkaSourceBuilder/PulsarSourceBuilder to
SingleThreadMultiplexSourceReaderBase and SourceReaderBase. This approach
also allows developers who want to build a source that stops based on the
record evaluator to implement the new feature. Adding a new interface could
increase the complexity of our interfaces and infrastructure, and I am not
sure it has added benefits compared to the existing proposal. Could you
explain more?

I am not very sure what "new type of source a lower stability guarantee"
you are referring to. Could you explain more? It looks like a new feature
not mentioned in the FLIP. If the changes proposed in this FLIP also
support the feature you have in mind, could we discuss this in a separate
FLIP?

In the SourceOperatorFactory we can then access the record evaluator
> from the respective sources and pass it to the source operator.
>
> Hopefully, this makes sense. So far I did not find information about
> the actual stopping logic in the FLIP maybe you had something
> different in mind.
>

By "actual stopping logic", do you mean an example implementation of the
RecordEvalutor? I think the use-case is described in the motivation
section, which is about a pipeline processing stock transaction data.

We can support this use-case with this FLIP, by implementing this
RecordEvaluator that stops reading data from a split when there is a
message that says "EOF". Users can trigger this feature by sending messages
with "EOF" in the payload to all partitions of the source Kafka topic.

Does this make sense?


>
> Best,
> Fabian
>
> On Tue, Jan 11, 2022 at 1:40 AM Dong Lin  wrote:
> >
> > Hi Fabian,
> >
> > Thanks for the comments!
> >
> > By "add a source mixin interface", are you suggesting to update
> > the org.apache.flink.api.connector.source.Source interface to add the API
> > "RecordEvaluator getRecordEvaluator()"? If so, it seems to add more
> > public API and thus more complexity than the solution in the FLIP. Could
> > you help explain more about the benefits of doing this?
> >
> > Regarding the 2nd question, I think this FLIP does not change whether
> > sources are treated as bounded or unbounded. For example, the
> KafkaSource's
> > boundedness will continue to be determined with the API
> > KafkaSourceBuilder::setBounded(..) and
> > KafkaSourceBuilder::setUnbounded(..). Does this answer your question?
> >
> > Thanks,
> > Dong
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Jan 10, 2022 at 8:01 PM Fabian Paul  wrote:
> >
> > > Hi Dong,
> > >
> > > Thank you for updating the FLIP and making it applicable for all
> > > sources. I am a bit unsure about the implementation part. I would
> > > propose to add a source mixin interface that implements
> > > `getRecordEvaluator` and sources that want to allow dynamically
> > > stopping implement that interface.
> > >
> > > Another question I had was how do we treat sources using the record
> > > evaluator as bounded or unbounded?
> > >
> > > Best,
> > > Fabian
> > >
> > > On Sat, Jan 8, 2022 at 11:52 AM Dong Lin  wrote:
> > > >
> > > > Hi Martijn and Qingsheng,
> > > >
> > > > The FLIP has been updated to extend the dynamic EOF support for the
> > > > PulsarSource. I have not extended this feature to other sources yet
> > > since I
> > > > am not sure it is a requirement to ensure feature consistency across
> > > > different sources. Could you take another look?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > > On Fri, Jan 7, 2022 at 11:49 PM Dong Lin 
> wrote:
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for the comments! In general I agree we should avoid feature
> > > > > sparsity.
> > > > >
> > > > > In this particular case, connectors are a bit different than most
> other
> > > > > features in Flink. AFAIK, we plan to move connectors (including
> Kafka
> > > and
> > > > > Pulsar) out of the Flink project in the future, which means that
> the
> > > > > development of these connectors will be mostly de-centralized
> (outside
> > > of
> > > > > Flink) and be up to their respective maintainers. While I agree
> that we
> > > > > should provide API/infrastructure in Flink (as this FLIP does) to
> > > support
> > > > > feature consistency across connectors, I am not sure we should own
> the
> > > > > responsibility to actually update all 

Re: [DISCUSS] FLIP-208: Update KafkaSource to detect EOF based on de-serialized record

2022-01-11 Thread Dong Lin
Hi Martijn,

Thank you for the comments. Please find my reply inline.

On Wed, Jan 12, 2022 at 3:07 AM Martijn Visser 
wrote:

> Hi Dong,
>
> Thanks for updating the FLIP and including Pulsar. I was indeed referring
> to having a generic interface that allows connector maintainers
> to implement this capability if they think it should be supported.
>

We are on the same page :)


>
> Could you see a feature like this also be useful for a connector like
> FileSystem?
>

Regarding the use-case for eofRecordEvaluator for a connector like
FileSystem, here is one fabricated use-case: users want to stop processing
data from a FileSystem source when the data schema is found to have changed
or there is abnormal data, in order to stop emitting abnormal data to the
downstream sink.

Since this is just a fabricated use-case and we agree that the development of
particular connectors should be left to their maintainers, we won't support
the eofRecordEvaluator with the FileSystem connector in this FLIP.


>
> Best regards,
>
> Martijn
>
> On Tue, 11 Jan 2022 at 16:47, Fabian Paul  wrote:
>
> > Hi Dong,
> >
> > I wouldn't change the org.apache.flink.api.connector.source.Source
> > interface because it either breaks existing sinks or we introduce it
> > as some kind of optional. I deem both options as not great. My idea is
> > to introduce a new interface that extends the Source. This way users
> > who want to develop a source that stops with the record evaluator can
> > implement the new interface. It also has the nice benefit that we can
> > give this new type of source a lower stability guarantee than Public
> > to allow some changes.
> > In the SourceOperatorFactory we can then access the record evaluator
> > from the respective sources and pass it to the source operator.
> >
> > Hopefully, this makes sense. So far I did not find information about
> > the actual stopping logic in the FLIP maybe you had something
> > different in mind.
> >
> > Best,
> > Fabian
> >
> > On Tue, Jan 11, 2022 at 1:40 AM Dong Lin  wrote:
> > >
> > > Hi Fabian,
> > >
> > > Thanks for the comments!
> > >
> > > By "add a source mixin interface", are you suggesting to update
> > > the org.apache.flink.api.connector.source.Source interface to add the
> API
> > > "RecordEvaluator getRecordEvaluator()"? If so, it seems to add more
> > > public API and thus more complexity than the solution in the FLIP.
> Could
> > > you help explain more about the benefits of doing this?
> > >
> > > Regarding the 2nd question, I think this FLIP does not change whether
> > > sources are treated as bounded or unbounded. For example, the
> > KafkaSource's
> > > boundedness will continue to be determined with the API
> > > KafkaSourceBuilder::setBounded(..) and
> > > KafkaSourceBuilder::setUnbounded(..). Does this answer your question?
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jan 10, 2022 at 8:01 PM Fabian Paul  wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > Thank you for updating the FLIP and making it applicable for all
> > > > sources. I am a bit unsure about the implementation part. I would
> > > > propose to add a source mixin interface that implements
> > > > `getRecordEvaluator` and sources that want to allow dynamically
> > > > stopping implement that interface.
> > > >
> > > > Another question I had was how do we treat sources using the record
> > > > evaluator as bounded or unbounded?
> > > >
> > > > Best,
> > > > Fabian
> > > >
> > > > On Sat, Jan 8, 2022 at 11:52 AM Dong Lin 
> wrote:
> > > > >
> > > > > Hi Martijn and Qingsheng,
> > > > >
> > > > > The FLIP has been updated to extend the dynamic EOF support for the
> > > > > PulsarSource. I have not extended this feature to other sources yet
> > > > since I
> > > > > am not sure it is a requirement to ensure feature consistency
> across
> > > > > different sources. Could you take another look?
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > > On Fri, Jan 7, 2022 at 11:49 PM Dong Lin 
> > wrote:
> > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > Thanks for the comments! In general I agree we should avoid
> feature
> > > > > > sparsity.
> > > > > >
> > > > > > In this particular case, connectors are a bit different than most
> > other
> > > > > > features in Flink. AFAIK, we plan to move connectors (including
> > Kafka
> > > > and
> > > > > > Pulsar) out of the Flink project in the future, which means that
> > the
> > > > > > development of these connectors will be mostly de-centralized
> > (outside
> > > > of
> > > > > > Flink) and be up to their respective maintainers. While I agree
> > that we
> > > > > > should provide API/infrastructure in Flink (as this FLIP does) to
> > > > support
> > > > > > feature consistency across connectors, I am not sure we should
> own
> > the
> > > > > > responsibility to actually update all connectors to achieve
> feature
> > > > > > consistency, given that we don't plan to do it in Flink anyway
> due
> > to
> > 

Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

2022-01-11 Thread Márton Balassi
Hi G,

Thanks for taking this challenge on. Scalable Kerberos authentication
support is important for Flink, and delegation tokens are a great mechanism
to future-proof it. I second your assessment that the existing
implementation could use some improvement too, and I like the approach you
have outlined. It is crucial that the changes are self-contained and will
not affect users that do not use Kerberos, while being minimal for the ones
who do (configuration values change, but the defaults just keep working in
most cases).

Thanks,
Marton

On Tue, Jan 11, 2022 at 2:59 PM Gabor Somogyi 
wrote:

> Hi All,
>
> Hope all of you have enjoyed the holiday season.
>
> I would like to start the discussion on FLIP-211
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-211%3A+Kerberos+delegation+token+framework
> >
> which
> aims to provide a
> Kerberos delegation token framework that /obtains/renews/distributes tokens
> out-of-the-box.
>
> Please be aware that the FLIP wiki area is not fully done since the
> discussion may
> change the feature in major ways. The proposal can be found in a google doc
> here
> <
> https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ
> >
> .
> As the community agrees on the approach the content will be moved to the
> wiki page.
>
> Feel free to add your thoughts to make this feature better!
>
> BR,
> G
>


Re: [VOTE] Create a separate sub project for FLIP-188: flink-store

2022-01-11 Thread Neng Lu
+1 (non-binding) for `flink-table-store`

On Mon, Jan 10, 2022 at 11:20 PM Jingsong Li  wrote:

> Thanks everyone for your voting.
>
> If there are no objections, I'll close this vote and send a vote result
> mail:
> - create a sub project named `flink-table-store`.
>
> Best,
> Jingsong
>
> On Tue, Jan 11, 2022 at 2:51 PM Jingsong Li 
> wrote:
> >
> > Hi Fabian,
> >
> > Thanks for your information.
> >
> > If gradle is mature later, it should not be too difficult to migrate
> > from maven, we can consider it later.
> >
> > Best,
> > Jingsong
> >
> > On Tue, Jan 11, 2022 at 12:00 PM 刘首维 
> wrote:
> > >
> > > Thanks for driving this, Jingsong.
> > > +1 (non-binding) for separate repository.
> > >
> > >
> > > Best Regards,
> > > Shouwei
> > >
> > >
> > > -- Original Mail --
> > > Sender:
>   "dev"
> <
> godfre...@gmail.com;
> > > Send Date: Monday, Jan 10, 2022, 11:06 PM
> > > Recipients: "dev" > > Cc: "Jark Wu" mart...@ververica.com;
> > > Subject: Re: [VOTE] Create a separate sub project for FLIP-188:
> flink-store
> > >
> > >
> > >
> > > +1 for the separate repository, and the name "flink-table-store".
> > >
> > > Best,
> > > Godfrey
> > >
> > > Becket Qin  > > 
> > >  Thanks for the FLIP, Jingsong.
> > > 
> > >  +1 (binding)
> > > 
> > >  Naming wise, I am also slightly leaning towards calling it
> > >  "flink-table-store".
> > > 
> > >  Thanks,
> > > 
> > >  Jiangjie (Becket) Qin
> > > 
> > >  On Mon, Jan 10, 2022 at 7:39 PM Fabian Paul  wrote:
> > > 
> > >   Hi all,
> > >  
> > >   I just wanted to give my two cents for the build system
> discussion. In
> > >   general, I agree with David's opinion to start new projects
> with
> > >   Gradle but during the development of the external connector
> > >   repository, we found some difficulties that still need to be
> solved. I
> > >   do not want to force another project (with maybe limited
> Gradle
> > >   expertise) to use Gradle right now. After we fully
> established the
> > >   external connector repository with Gradle I can imagine
> converting the
> > >   other external repositories as well.
> > >  
> > >   Best,
> > >   Fabian
> > >  
> > >   On Mon, Jan 10, 2022 at 12:04 PM Jark Wu  wrote:
> > >   
> > >I'm also in favour of "flink-table-store".
> > >   
> > >Best,
> > >Jark
> > >   
> > >On Mon, 10 Jan 2022 at 16:18, David Morávek <
> d...@apache.org wrote:
> > >   
> > >Hi Jingsong,
> > >   
> > >the connector repository prototype I've seen is
> being built on top of
> > >Gradle [1], that's why I was referring to it (I
> think one idea was also
> > >   to
> > >migrate the main repository to Gradle eventually).
> I think Martijn /
> > >   Fabian
> > >may be bit more familiar with the connectors
> repository effort and could
> > >shed some light on this.
> > >   
> > >[1] https://github.com/apache/flink-connectors
> > >   
> > >Best,
> > >D.
> > >   
> > >On Mon, Jan 10, 2022 at 8:57 AM Yu Li <
> car...@gmail.com wrote:
> > >   
> > > +1 for a separate repository and release
> pipeline in the same way as
> > > flink-statefun [1], flink-ml [2] and the
> coming flink-connectors [3].
> > >
> > > +1 for naming it as "flink-table-store" (I'm
> also ok with
> > > "flink-table-storage", but slightly prefer
> "flink-table-store"
> > >   because it's
> > > shorter)
> > >
> > > Thanks for driving this Jingsong, and look
> forward to a fast
> > >   evolution of
> > > this direction!
> > >
> > > Best Regards,
> > > Yu
> > >
> > > [1] https://github.com/apache/flink-statefun
> > > [2] https://github.com/apache/flink-ml
> > > [3] https://github.com/apache/flink-connectors
> > >
> > >
> > > On Mon, 10 Jan 2022 at 10:52, Jingsong Li <
> jingsongl...@gmail.com
> > >   wrote:
> > >
> > >  Hi David, thanks for your suggestion.
> > > 
> > >  I think we should re-use as many common
> components with connectors
> > >   as
> > >  possible. I don't fully understand what
> you mean, but for this
> > >   project
> > >  I prefer to use Maven rather than Gradle.
> > > 
> > >  Best,
> > >  Jingsong
> > > 
> > >  On Fri, Jan 7, 2022 at 11:59 PM David
> Morávek  > >   wrote:
> > >  
> > >   +1 for the separate repository under
> the Flink umbrella
> > >  
> > >   as we've already started creating
> more repositories with
> > >   connectors,
> > >  would
> > >   it be possible to re-use the same
> build infrastructure for this
> > >   one?
> > > (eg.
> > >   shared set of Gradle plugins that
> unify the build experience)?
> > >  
> > >   Best,
> > >   D.
> > >  
> > >   On Fri, Jan 7, 2022 at 11:31 AM
> Jingsong Li <
> > >   jingsongl...@gmail.com
> > >  wrote:
> > >  
> > >For more references on `store`
> and `storage`:
> > >  

Use of JIRA fixVersion

2022-01-11 Thread Thomas Weise
Hi,

As part of preparing the 1.14.3 release, I observed that there were
around 200 JIRA issues with fixVersion 1.14.3 that were unresolved
(after blocking issues had been dealt with). Further cleanup resulted
in removing fixVersion 1.14.3  from most of these and we are left with
[1] - these are the tickets that rolled over to 1.14.4.

The disassociated issues broadly fell into following categories:

* test infrastructure / stability related (these can be found by label)
* stale tickets (can also be found by label)
* tickets w/o label that pertain to addition of features that don't
fit into or don't have to go into patch release

I wanted to bring this up so that we can potentially come up with
better guidance for use of the fixVersion field, since it is important
for managing releases [2]. Manual cleanup as done in this case isn't
desirable. A few thoughts:

* In most cases, it is not necessary to set fixVersion upfront.
Instead, we can set it when the issue is actually resolved, and set it
for all versions/branches for which a backport occurred after the
changes are merged.
* How to know where to backport? "Affects versions" seems to be the
right field to use for that purpose. While resolving an issue for
master, it can guide backporting.
* What if an issue should block a release? The priority of the issue
should be Blocker. Blockers are low cardinality and need to be fixed
before the release. So that would be the case where fixVersion is set
upfront.

Thanks,
Thomas

[1] 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20Flink%20and%20fixVersion%20%3D%201.14.4%20and%20resolution%20%3D%20Unresolved%20
[2] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release


Re: [DISCUSS] Creating an external connector repository

2022-01-11 Thread Martijn Visser
Good question: we want to use the same setup as we currently have for
Flink, so using the existing CI infrastructure.

On Mon, 10 Jan 2022 at 11:19, Chesnay Schepler  wrote:

> What CI resources do you actually intend use? Asking since the ASF GHA
> resources are afaik quite overloaded.
>
> On 05/01/2022 11:48, Martijn Visser wrote:
> > Hi everyone,
> >
> > I wanted to summarise the email thread and see if there are any open
> items
> > that still need to be discussed, before we can finalise the discussion in
> > this email thread:
> >
> > 1. About having multi connectors in one repo or each connector in its own
> > repository
> >
> > As explained by @Arvid Heise  we ultimately propose to
> > have a single repository per connector, which seems to be favoured in the
> > community.
> >
> > 2. About having the connector repositories under ASF or not.
> >
> > The consensus is that all connectors would remain under the ASF.
> >
> > I think we can categorise the questions or concerns that were brought
> > forward as follows:
> >
> > 3. How would we set up the testing?
> >
> > We need to make sure that we provide a proper testing framework, which
> > means that we provide a public Source- and Sink testing framework. As
> > mentioned extensively in the thread, we need to make sure that the
> > necessary interfaces are properly annotated and at least @PublicEvolving.
> > This also includes the test infrastructure, like MiniCluster. For the
> > latter, we don't know exactly yet how to balance having publicly
> available
> > test infrastructure vs being able to iterate inside of Flink, but we can
> > all agree this has to be solved.
> >
> > For testing infrastructure, we would like to use Github Actions. In the
> > current state, it probably makes sense for a connector repo to follow the
> > branching strategy of Flink. That will ensure a match between the
> released
> > connector and Flink version. This should change when all the Flink
> > interfaces have stabilised so you can use a connector with multiple Flink
> > versions. That means that we should have a nightly build test for:
> >
> > - The `main` branch of the connector (which would be the unreleased
> > version) against the `master` branch of Flink (the unreleased version of
> > Flink).
> > - Any supported `release-X.YY` branch of the connector against the
> > `release-X.YY` branch of Flink.
> >
> > We should also have smoke E2E tests in Flink (one for DataStream,
> > one for Table, one for SQL, one for Python) which load all the
> connectors
> > and run an arbitrary test (post data to the source, load it into Flink, sink
> > the output and verify that the output is as expected).
> >
> > 4. How would we integrate documentation?
> >
> > Documentation for a connector should probably end up in the connector
> > repository. The Flink website should contain one entrance to all
> connectors
> > (so not the current approach where we have connectors per DataStream API,
> > Table API etc). Each connector documentation should end up as one menu
> item
> > in connectors, containing all necessary information for all DataStream,
> > Table, SQL and Python implementations.
> >
> > 5. Which connectors should end up in the external connector repo?
> >
> > I'll open up a separate thread on this topic to have a parallel
> discussion
> > on that. We should reach consensus on both threads before we can move
> > forward on this topic as a whole.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Fri, 10 Dec 2021 at 04:47, Thomas Weise  wrote:
> >
> >> +1 for repo per connector from my side also
> >>
> >> Thanks for trying out the different approaches.
> >>
> >> Where would the common/infra pieces live? In a separate repository
> >> with its own release?
> >>
> >> Thomas
> >>
> >> On Thu, Dec 9, 2021 at 12:42 PM Till Rohrmann 
> >> wrote:
> >>> Sorry if I was a bit unclear. +1 for the single repo per connector
> >> approach.
> >>> Cheers,
> >>> Till
> >>>
> >>> On Thu, Dec 9, 2021 at 5:41 PM Till Rohrmann 
> >> wrote:
>  +1 for the single repo approach.
> 
>  Cheers,
>  Till
> 
>  On Thu, Dec 9, 2021 at 3:54 PM Martijn Visser 
>  wrote:
> 
> > I also agree that it feels more natural to go with a repo for each
> > individual connector. Each repository can be made available at
> > flink-packages.org so users can find them, next to referring to them
> >> in
> > documentation. +1 from my side.
> >
> > On Thu, 9 Dec 2021 at 15:38, Arvid Heise  wrote:
> >
> >> Hi all,
> >>
> >> We tried out Chesnay's proposal and went with Option 2.
> >> Unfortunately,
> > we
> >> experienced tough nuts to crack and feel like we hit a dead end:
> >> - The main pain point with the outlined Frankensteinian connector
> >> repo
> > is
> >> how to handle shared code / infra code. If we have it in some
> >> 
> >> branch, then we need to merge the common branch in the connector
> >> branch
> > on
> >> 

Re: [DISCUSS] FLIP-208: Update KafkaSource to detect EOF based on de-serialized record

2022-01-11 Thread Martijn Visser
Hi Dong,

Thanks for updating the FLIP and including Pulsar. I was indeed referring
to having a generic interface that allows connector maintainers
to implement this capability if they think it should be supported.

Could you see a feature like this also be useful for a connector like
FileSystem?

Best regards,

Martijn

On Tue, 11 Jan 2022 at 16:47, Fabian Paul  wrote:

> Hi Dong,
>
> I wouldn't change the org.apache.flink.api.connector.source.Source
> interface because it either breaks existing sinks or we introduce it
> as some kind of optional. I deem both options as not great. My idea is
> to introduce a new interface that extends the Source. This way users
> who want to develop a source that stops with the record evaluator can
> implement the new interface. It also has the nice benefit that we can
> give this new type of source a lower stability guarantee than Public
> to allow some changes.
> In the SourceOperatorFactory we can then access the record evaluator
> from the respective sources and pass it to the source operator.
>
> Hopefully, this makes sense. So far I did not find information about
> the actual stopping logic in the FLIP maybe you had something
> different in mind.
>
> Best,
> Fabian
>
> On Tue, Jan 11, 2022 at 1:40 AM Dong Lin  wrote:
> >
> > Hi Fabian,
> >
> > Thanks for the comments!
> >
> > By "add a source mixin interface", are you suggesting to update
> > the org.apache.flink.api.connector.source.Source interface to add the API
> > "RecordEvaluator getRecordEvaluator()"? If so, it seems to add more
> > public API and thus more complexity than the solution in the FLIP. Could
> > you help explain more about the benefits of doing this?
> >
> > Regarding the 2nd question, I think this FLIP does not change whether
> > sources are treated as bounded or unbounded. For example, the
> KafkaSource's
> > boundedness will continue to be determined with the API
> > KafkaSourceBuilder::setBounded(..) and
> > KafkaSourceBuilder::setUnbounded(..). Does this answer your question?
> >
> > Thanks,
> > Dong
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Jan 10, 2022 at 8:01 PM Fabian Paul  wrote:
> >
> > > Hi Dong,
> > >
> > > Thank you for updating the FLIP and making it applicable for all
> > > sources. I am a bit unsure about the implementation part. I would
> > > propose to add a source mixin interface that implements
> > > `getRecordEvaluator` and sources that want to allow dynamically
> > > stopping implement that interface.
> > >
> > > Another question I had was how do we treat sources using the record
> > > evaluator as bounded or unbounded?
> > >
> > > Best,
> > > Fabian
> > >
> > > On Sat, Jan 8, 2022 at 11:52 AM Dong Lin  wrote:
> > > >
> > > > Hi Martijn and Qingsheng,
> > > >
> > > > The FLIP has been updated to extend the dynamic EOF support for the
> > > > PulsarSource. I have not extended this feature to other sources yet
> > > since I
> > > > am not sure it is a requirement to ensure feature consistency across
> > > > different sources. Could you take another look?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > > On Fri, Jan 7, 2022 at 11:49 PM Dong Lin 
> wrote:
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for the comments! In general I agree we should avoid feature
> > > > > sparsity.
> > > > >
> > > > > In this particular case, connectors are a bit different than most
> other
> > > > > features in Flink. AFAIK, we plan to move connectors (including
> Kafka
> > > and
> > > > > Pulsar) out of the Flink project in the future, which means that
> the
> > > > > development of these connectors will be mostly de-centralized
> (outside
> > > of
> > > > > Flink) and be up to their respective maintainers. While I agree
> that we
> > > > > should provide API/infrastructure in Flink (as this FLIP does) to
> > > support
> > > > > feature consistency across connectors, I am not sure we should own
> the
> > > > > responsibility to actually update all connectors to achieve feature
> > > > > consistency, given that we don't plan to do it in Flink anyway due
> to
> > > its
> > > > > heavy burden.
> > > > >
> > > > > With that being said, I am happy to follow the community guideline
> if
> > > we
> > > > > decide that connector-related FLIP should update every connector's
> API
> > > to
> > > > > ensure feature consistency (to a reasonable extent). For example,
> in
> > > this
> > > > > particular case, it looks like the EOF-detection feature can be
> > > applied to
> > > > > every connector (including bounded sources). Is it still
> sufficient to
> > > just
> > > > > update Kafka, Pulsar and Kinesis?
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > > On Tue, Jan 4, 2022 at 3:31 PM Martijn Visser <
> mart...@ververica.com>
> > > > > wrote:
> > > > >
> > > > >> Hi Dong,
> > > > >>
> > > > >> Thanks for writing the FLIP. It focusses only on the KafkaSource,
> but
> > > I
> > > > >> would expect that if such a functionality is desired, it should be
> > > made
> 

Re: [DISCUSS] FLIP-206: Support PyFlink Runtime Execution in Thread Mode

2022-01-11 Thread Thomas Weise
Hi Xingbo,

+1 from my side

Thanks for the clarification. For your use case the parameter size and
therefore serialization overhead was the limiting factor. I have seen
use cases where that is not the concern, because the Python logic
itself is heavy and dwarfs the protocol overhead (for example when
interacting with external systems from the UDF). Hence it is good to
give users options to optimize for their application requirements.

Cheers,
Thomas

On Tue, Jan 11, 2022 at 3:44 AM Xingbo Huang  wrote:
>
> Hi everyone,
>
> Thanks to all of you for the discussion.
> If there are no objections, I would like to start a vote thread tomorrow.
>
> Best,
> Xingbo
>
> Xingbo Huang  于2022年1月7日周五 16:18写道:
>
> > Hi Till,
> >
> > I have written a more complicated PyFlink job. Compared with the previous
> > single python udf job, there is an extra stage of converting between table
> > and datastream. Besides, I added a python map function for the job. Because
> > python datastream has not yet implemented Thread mode, the python map
> > function operator is still running in Process Mode.
> >
> > ```
> > source = t_env.from_path("source_table")  # schema [id: String, d:int]
> >
> > @udf(result_type=DataTypes.STRING(), func_type="general")
> > def upper(x):
> > return x.upper()
> >
> > t_env.create_temporary_system_function("upper", upper)
> > # python map function
> > ds = t_env.to_data_stream(source) \
> >     .map(lambda x: x,
> >          output_type=Types.ROW_NAMED(["id", "d"],
> >                                      [Types.STRING(), Types.INT()]))
> >
> > t = t_env.from_data_stream(ds)
> > t.select('upper(id)').execute_insert('sink_table')
> > ```
> >
> > The input data size is 1k.
> >
> > Mode                       |   QPS
> > Process Mode               |   ~30,000 (3w)
> > Thread Mode + Process Mode |   ~40,000 (4w)
> >
> > From the table, we can see that the nodes running in Process Mode are the
> > performance bottleneck of the job.
> >
> > Best,
> > Xingbo
> >
> > Till Rohrmann  于2022年1月5日周三 23:16写道:
> >
> >> Thanks for the detailed answer Xingbo. Quick question on the last figure
> >> in
> >> the FLIP. You said that this is a real world Flink stream SQL job. The
> >> title of the graph says UDF(String Upper). So do I understand correctly
> >> that string upper is the real world use case you have measured? What I
> >> wanted to ask is how a slightly more complex Flink Python job (involving
> >> shuffles, with back pressure, etc.) performs using the thread and process
> >> mode respectively.
> >>
> >> If the mode solely needs changes in the Python part of Flink, then I don't
> >> have any concerns from the runtime perspective.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Wed, Jan 5, 2022 at 1:55 PM Xingbo Huang  wrote:
> >>
> >> > Hi Till and Thomas,
> >> >
> >> > Thanks a lot for joining the discussion.
> >> >
> >> > For Till:
> >> >
> >> > >>> Is the slower performance currently the biggest pain point for our
> >> > Python users? What else are our Python users mainly complaining about?
> >> >
> >> > PyFlink users are most concerned about two parts, one is better
> >> usability,
> >> > the other is performance. Users often make some benchmarks when they
> >> > investigate pyflink[1][2] at the beginning to decide whether to use
> >> > PyFlink. The performance of a PyFlink job depends on two parts, one is
> >> the
> >> > overhead of the PyFlink framework, and the other is the Python function
> >> > complexity implemented by the user. In the Python ecosystem, there are
> >> many
> >> > libraries and tools that can help Python users improve the performance
> >> of
> >> > their custom functions, such as pandas[3], numba[4] and cython[5]. So we
> >> > hope that the framework overhead of PyFlink itself can also be reduced.
> >> >
> >> > >>> Concerning the proposed changes, are there any changes required on
> >> the
> >> > runtime side (changes to Flink)? How will the deployment and memory
> >> > management be affected when using the thread execution mode?
> >> >
> >> > The changes on PyFlink Runtime mentioned here are actually only
> >> > modifications of PyFlink custom Operators, such as
> >> > PythonScalarFunctionOperator[6], which won't affect deployment and
> >> memory
> >> > management.
> >> >
> >> > >>> One more question that came to my mind: How much performance
> >> > improvement do we gain on a real-world Python use case? Were the
> >> > measurements more like micro benchmarks where the Python UDF was called
> >> w/o
> >> > the overhead of Flink? I would just be curious how much the Python
> >> > component contributes to the overall runtime of a real world job. Do we
> >> > have some data on this?
> >> >
> >> > The last figure I put in FLIP is the performance comparison of three
> >> real
> >> > Flink Stream Sql Jobs. They are a Java UDF job, a Python UDF job in
> >> Process
> >> > Mode, and a Python UDF job in Thread Mode. The calculated value of 

Re: Need help with finding inner workings of watermark stream idleness

2022-01-11 Thread Till Rohrmann
Hi Jeff,

I think this happens in the WatermarksWithIdleness [1].

[1]
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarksWithIdleness.java#L73
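
In case it helps while digging: a simplified sketch of that mechanism (not the exact Flink source; the real class keeps its bookkeeping in an internal IdlenessTimer with an event counter) looks roughly like this:

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkGenerator;
import org.apache.flink.api.common.eventtime.WatermarkOutput;

/**
 * Simplified sketch of the idleness detection: activity is recorded in onEvent(),
 * and the periodic watermark emission (driven by pipeline.auto-watermark-interval)
 * is where the "I haven't seen a record for the timeout" decision happens.
 */
public class IdlenessSketch<T> implements WatermarkGenerator<T> {

    private final WatermarkGenerator<T> delegate;
    private final long idleTimeoutNanos;

    private long lastActivityNanos = System.nanoTime();
    private boolean idle = false;

    public IdlenessSketch(WatermarkGenerator<T> delegate, Duration idleTimeout) {
        this.delegate = delegate;
        this.idleTimeoutNanos = idleTimeout.toNanos();
    }

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        lastActivityNanos = System.nanoTime(); // the stream is active again
        idle = false;
        delegate.onEvent(event, eventTimestamp, output);
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        if (System.nanoTime() - lastActivityNanos >= idleTimeoutNanos) {
            if (!idle) {
                output.markIdle(); // downstream may now advance without this stream
                idle = true;
            }
        } else {
            delegate.onPeriodicEmit(output);
        }
    }
}
```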

Cheers,
Till

On Tue, Jan 11, 2022 at 6:05 PM Jeff Carter  wrote:

> I'm looking into making a feature for flink related to watermarks and am
> digging into the inner watermark mechanisms, specifically with idleness.
> I'm familiar with idleness, but digging into the root code I can only get
> to where idlenessTimeout gets set in WatermarkStrategyWithIdleness.java.
>
>  But what I'm looking for is the pieces beyond that. If I set the idleness to
> 500 milliseconds, where in the code does it actually go "I haven't seen a
> message in 500 milliseconds. I'm setting this stream to idle."?
>
> The reason being that what I'm thinking of would need to be able to see if
> any streams are marked idle, and if so react accordingly.
>
> Thanks for any help in advance.
>


Need help with finding inner workings of watermark stream idleness

2022-01-11 Thread Jeff Carter
I'm looking into making a feature for flink related to watermarks and am
digging into the inner watermark mechanisms, specifically with idleness.
I'm familiar with idleness, but digging into the root code I can only get
to where idlenessTimeout gets set in WatermarkStrategyWithIdleness.java.

 But what I'm looking for is the pieces beyond that. If I set the idleness to
500 milliseconds, where in the code does it actually go "I haven't seen a
message in 500 milliseconds. I'm setting this stream to idle."?

The reason being that what I'm thinking of would need to be able to see if
any streams are marked idle, and if so react accordingly.

Thanks for any help in advance.


[VOTE] Release 1.14.3, release candidate #1

2022-01-11 Thread Martijn Visser
Hi everyone,
Please review and vote on the release candidate #1 for the version
1.14.3, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to
be deployed to dist.apache.org [2], which are signed with the key with
fingerprint 12DEE3E4D920A98C [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.14.3-rc1" [5],
* website pull request listing the new release and adding announcement
blog post [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks on behalf of Thomas Weise and myself,

Martijn Visser
http://twitter.com/MartijnVisser82

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351075&projectId=12315522
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.14.3-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1481/
[5] https://github.com/apache/flink/releases/tag/release-1.14.3-rc1
[6] https://github.com/apache/flink-web/pull/497


[jira] [Created] (FLINK-25615) FlinkKafkaProducer fail to correctly migrate pre Flink 1.9 state

2022-01-11 Thread Matthias Schwalbe (Jira)
Matthias Schwalbe created FLINK-25615:
-

 Summary: FlinkKafkaProducer fail to correctly migrate pre Flink 
1.9 state
 Key: FLINK-25615
 URL: https://issues.apache.org/jira/browse/FLINK-25615
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.9.0
Reporter: Matthias Schwalbe


TBD



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-208: Update KafkaSource to detect EOF based on de-serialized record

2022-01-11 Thread Fabian Paul
Hi Dong,

I wouldn't change the org.apache.flink.api.connector.source.Source
interface because it either breaks existing sources or we introduce it
as some kind of optional method. I deem both options as not great. My idea is
to introduce a new interface that extends the Source. This way users
who want to develop a source that stops with the record evaluator can
implement the new interface. It also has the nice benefit that we can
give this new type of source a lower stability guarantee than Public
to allow some changes.
In the SourceOperatorFactory we can then access the record evaluator
from the respective sources and pass it to the source operator.
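
A minimal sketch of that idea (every name below is a placeholder to illustrate the shape, not an agreed API; RecordEvaluatorSketch stands in for whatever evaluator interface FLIP-208 finally defines):

```java
/** Placeholder for the evaluator interface proposed in FLIP-208. */
@FunctionalInterface
interface RecordEvaluatorSketch<T> {
    /** True when this de-serialized record marks the end of stream for its split. */
    boolean isEndOfStream(T record);
}

/**
 * Hypothetical mixin: sources that support dynamic EOF detection implement this
 * in addition to Source. The SourceOperatorFactory could then check
 * "source instanceof SupportsEndOfStreamDetection" and pass the evaluator on to
 * the SourceOperator, leaving the Source interface itself untouched.
 */
interface SupportsEndOfStreamDetection<T> {
    RecordEvaluatorSketch<T> getRecordEvaluator();
}
```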

Hopefully, this makes sense. So far I did not find information about
the actual stopping logic in the FLIP; maybe you had something
different in mind.

Best,
Fabian

On Tue, Jan 11, 2022 at 1:40 AM Dong Lin  wrote:
>
> Hi Fabian,
>
> Thanks for the comments!
>
> By "add a source mixin interface", are you suggesting to update
> the org.apache.flink.api.connector.source.Source interface to add the API
> "RecordEvaluator getRecordEvaluator()"? If so, it seems to add more
> public API and thus more complexity than the solution in the FLIP. Could
> you help explain more about the benefits of doing this?
>
> Regarding the 2nd question, I think this FLIP does not change whether
> sources are treated as bounded or unbounded. For example, the KafkaSource's
> boundedness will continue to be determined with the API
> KafkaSourceBuilder::setBounded(..) and
> KafkaSourceBuilder::setUnbounded(..). Does this answer your question?
>
> Thanks,
> Dong
>
>
>
>
>
>
>
>
>
> On Mon, Jan 10, 2022 at 8:01 PM Fabian Paul  wrote:
>
> > Hi Dong,
> >
> > Thank you for updating the FLIP and making it applicable for all
> > sources. I am a bit unsure about the implementation part. I would
> > propose to add a source mixin interface that implements
> > `getRecordEvaluator` and sources that want to allow dynamically
> > stopping implement that interface.
> >
> > Another question I had was how do we treat sources using the record
> > evaluator as bounded or unbounded?
> >
> > Best,
> > Fabian
> >
> > On Sat, Jan 8, 2022 at 11:52 AM Dong Lin  wrote:
> > >
> > > Hi Martijn and Qingsheng,
> > >
> > > The FLIP has been updated to extend the dynamic EOF support for the
> > > PulsarSource. I have not extended this feature to other sources yet
> > since I
> > > am not sure it is a requirement to ensure feature consistency across
> > > different sources. Could you take another look?
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Fri, Jan 7, 2022 at 11:49 PM Dong Lin  wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > Thanks for the comments! In general I agree we should avoid feature
> > > > sparsity.
> > > >
> > > > In this particular case, connectors are a bit different than most other
> > > > features in Flink. AFAIK, we plan to move connectors (including Kafka
> > and
> > > > Pulsar) out of the Flink project in the future, which means that the
> > > > development of these connectors will be mostly de-centralized (outside
> > of
> > > > Flink) and be up to their respective maintainers. While I agree that we
> > > > should provide API/infrastructure in Flink (as this FLIP does) to
> > support
> > > > feature consistency across connectors, I am not sure we should own the
> > > > responsibility to actually update all connectors to achieve feature
> > > > consistency, given that we don't plan to do it in Flink anyway due to
> > its
> > > > heavy burden.
> > > >
> > > > With that being said, I am happy to follow the community guideline if
> > we
> > > > decide that connector-related FLIP should update every connector's API
> > to
> > > > ensure feature consistency (to a reasonable extent). For example, in
> > this
> > > > particular case, it looks like the EOF-detection feature can be
> > applied to
> > > > every connector (including bounded sources). Is it still sufficient to
> > just
> > > > update Kafka, Pulsar and Kinesis?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > > On Tue, Jan 4, 2022 at 3:31 PM Martijn Visser 
> > > > wrote:
> > > >
> > > >> Hi Dong,
> > > >>
> > > >> Thanks for writing the FLIP. It focusses only on the KafkaSource, but
> > I
> > > >> would expect that if such a functionality is desired, it should be
> > made
> > > >> available for all unbounded sources (for example, Pulsar and
> > Kinesis). If
> > > >> it's only available for Kafka, I see it as if we're increasing feature
> > > >> sparsity while we actually want to decrease that. What do you think?
> > > >>
> > > >> Best regards,
> > > >>
> > > >> Martijn
> > > >>
> > > >> On Tue, 4 Jan 2022 at 08:04, Dong Lin  wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > We created FLIP-208: Update KafkaSource to detect EOF based on
> > > >> > de-serialized records. Please find the KIP wiki in the link
> > > >> >
> > > >> >
> > > >>
> > 

Re: [DISCUSS] Releasing Flink 1.14.3

2022-01-11 Thread Martijn Visser
Hi Thomas,

Thanks! I'll prepare the website PR and send out the VOTE in a couple of
hours.

Best regards,

Martijn

On Tue, 11 Jan 2022 at 05:53, Thomas Weise  wrote:

> Thank you Xingbo. I meanwhile also got my Azure pipeline working and
> was able to build the artifacts. Although in general it would be nice
> if not every release volunteer had to set up their separate Azure
> environment.
>
> Martijn,
>
> The release is staged, except for the website PR:
>
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351075&projectId=12315522
> https://dist.apache.org/repos/dist/dev/flink/flink-1.14.3-rc1/
> https://repository.apache.org/content/repositories/orgapacheflink-1481/
> https://github.com/apache/flink/releases/tag/release-1.14.3-rc1
>
> Would you like to create the website PR and start a VOTE?
>
> (If not, I can look into that tomorrow.)
>
> Cheers,
> Thomas
>
>
>
> On Sun, Jan 9, 2022 at 9:17 PM Xingbo Huang  wrote:
> >
> > Hi Thomas,
> >
> > Since multiple wheel packages with different python versions for mac and
> > linux are generated, building locally requires you have multiple machines
> > with different os and Python environments. I have triggered the wheel
> > package build of release-1.14.3-rc1 in my private Azure[1] and you can
> > download the wheels after building successfully.
> >
> > [1]
> >
> https://dev.azure.com/hxbks2ks/FLINK-TEST/_build/results?buildId=1704&view=results
> >
> > Best,
> > Xingbo
> >
> > Thomas Weise  于2022年1月10日周一 11:12写道:
> >
> > > Hi Martijn,
> > >
> > > I started building the release artifacts. The Maven part is ready.
> > > Currently blocked on the Azure build for the PyFlink wheel packages.
> > >
> > > I had to submit a "Azure DevOps Parallelism Request" and that might
> > > take a couple of days.
> > >
> > > Does someone have the steps to build the wheels locally?
> > > Alternatively, if someone can build them on their existing setup and
> > > point me to the result, that would speed up things as well.
> > >
> > > The release branch:
> > > https://github.com/apache/flink/tree/release-1.14.3-rc1
> > >
> > > Thanks,
> > > Thomas
> > >
> > > On Thu, Jan 6, 2022 at 9:14 PM Martijn Visser 
> > > wrote:
> > > >
> > > > Hi Thomas,
> > > >
> > > > Thanks for volunteering! There was no volunteer yet, so would be
> great if
> > > > you could help out.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > Op vr 7 jan. 2022 om 01:54 schreef Thomas Weise 
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for preparing the release. Did a volunteer check in with
> you?
> > > > > If not, I would like to take this up.
> > > > >
> > > > > Thomas
> > > > >
> > > > > On Mon, Dec 27, 2021 at 7:11 AM Martijn Visser <
> mart...@ververica.com>
> > > > > wrote:
> > > > > >
> > > > > > Thank you all! That means that there's currently no more blocker
> to
> > > start
> > > > > > with the Flink 1.14.3 release.
> > > > > >
> > > > > > The only thing that's needed is a committer that's willing to
> follow
> > > the
> > > > > > release process [1] Any volunteers?
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> > > > > >
> > > > > > On Mon, 27 Dec 2021 at 03:17, Qingsheng Ren 
> > > wrote:
> > > > > >
> > > > > > > Hi Martjin,
> > > > > > >
> > > > > > > FLINK-25132 has been merged to master and release-1.14.
> > > > > > >
> > > > > > > Thanks for your work for releasing 1.14.3!
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Qingsheng Ren
> > > > > > >
> > > > > > > > On Dec 26, 2021, at 3:46 PM, Konstantin Knauf <
> kna...@apache.org
> > > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > Hi Martijn,
> > > > > > > >
> > > > > > > > FLINK-25375 is merged to release-1.14.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > > > On Wed, Dec 22, 2021 at 12:02 PM David Morávek <
> d...@apache.org>
> > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi Martijn, FLINK-25271 has been merged to 1.14 branch.
> > > > > > > >>
> > > > > > > >> Best,
> > > > > > > >> D.
> > > > > > > >>
> > > > > > > >> On Wed, Dec 22, 2021 at 7:27 AM 任庆盛 
> wrote:
> > > > > > > >>
> > > > > > > >>> Hi Martjin,
> > > > > > > >>>
> > > > > > > >>> Thanks for the effort on Flink 1.14.3. FLINK-25132 has been
> > > merged
> > > > > on
> > > > > > > >>> master and is waiting for CI on release-1.14. I think it
> can be
> > > > > closed
> > > > > > > >>> today.
> > > > > > > >>>
> > > > > > > >>> Cheers,
> > > > > > > >>>
> > > > > > > >>> Qingsheng Ren
> > > > > > > >>>
> > > > > > >  On Dec 21, 2021, at 6:26 PM, Martijn Visser <
> > > > > mart...@ververica.com>
> > > > > > > >>> wrote:
> > > > > > > 
> > > > > > >  Hi everyone,
> > > > > > > 
> > > > > > >  I'm restarting this thread [1] with a new subject, given
> that
> > > > > Flink
> > > > > > > >>> 

[jira] [Created] (FLINK-25614) Let LocalWindowAggregate be chained with upstream

2022-01-11 Thread Q Kang (Jira)
Q Kang created FLINK-25614:
--

 Summary: Let LocalWindowAggregate be chained with upstream
 Key: FLINK-25614
 URL: https://issues.apache.org/jira/browse/FLINK-25614
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Affects Versions: 1.14.2
Reporter: Q Kang


When enabling two-phase aggregation (local-global) strategy for Window TVF, the 
physical plan is shown as follows:
{code:java}
TableSourceScan -> Calc -> WatermarkAssigner -> Calc
||
|| [FORWARD]
||
LocalWindowAggregate
||
|| [HASH]
||
GlobalWindowAggregate
||
||
...{code}
We can let the `LocalWindowAggregate` node be chained with upstream operators 
in order to improve efficiency, just like the non-windowing counterpart 
`LocalGroupAggregate`.
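
For reference, a job along the following lines produces such a plan once the local-global strategy is forced (a sketch only; the table definition and the option value are illustrative and not part of this ticket):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TwoPhaseWindowAggExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Force the local-global (two-phase) aggregation strategy.
        tEnv.getConfig().getConfiguration()
                .setString("table.optimizer.agg-phase-strategy", "TWO_PHASE");

        // Example source with an event-time attribute for the window TVF.
        tEnv.executeSql(
                "CREATE TABLE src ("
                        + "  id STRING,"
                        + "  ts TIMESTAMP(3),"
                        + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND"
                        + ") WITH ('connector' = 'datagen')");

        // Tumbling window TVF aggregation; the optimizer splits it into
        // LocalWindowAggregate + GlobalWindowAggregate as shown above.
        tEnv.executeSql(
                "SELECT window_start, window_end, COUNT(*) AS cnt "
                        + "FROM TABLE(TUMBLE(TABLE src, DESCRIPTOR(ts), INTERVAL '1' MINUTE)) "
                        + "GROUP BY window_start, window_end")
                .print();
    }
}
{code}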

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-210: Change logging level dynamically at runtime

2022-01-11 Thread Wenhao Ji
Hi all,

Yes, indeed.
After investigating similar features provided by cloud platforms, I
found that several popular clouds already offer this.

- AWS Kinesis: Setting the Application Logging Level [1], which is
implemented by UpdateApplication API [2].
- Ververica: Logging & Metrics[3], by changing the template.
- Alicloud: Configure job logs [4], which is quite similar to
Ververica also by changing the template
- Cloudera: Enabling Flink DEBUG logging[5], by changing the
Configuration and triggering a restart

It looks like this feature is not strictly necessary in Flink itself, since
it has already been implemented in one way or another by so many
platforms in the ecosystem.

[1]: 
https://docs.aws.amazon.com/kinesisanalytics/latest/java/cloudwatch-logs.html#cloudwatch-level
[2]: 
https://docs.aws.amazon.com/kinesisanalytics/latest/apiv2/API_UpdateApplication.html
[3]: https://docs.ververica.com/platform_operations/logging_metrics.html
[4]: https://www.alibabacloud.com/help/doc-detail/173646.htm#title-1ay-hju-pka
[5]: 
https://docs.cloudera.com/csa/1.6.0/monitoring/topics/csa-ssb-enabling-debug-logging.html

Thanks,
Wenhao

On Tue, Jan 11, 2022 at 8:24 PM Martijn Visser  wrote:
>
> Hi all,
>
> I agree with Konstantin, this feels like a problem that shouldn't be solved
> via Apache Flink but via the logging ecosystem itself.
>
> Best regards,
>
> Martijn
>
> On Tue, 11 Jan 2022 at 13:11, Konstantin Knauf  wrote:
>
> > I've now read over the discussion on the ticket, and I am personally not in
> > favor of adding this functionality to Flink via the REST API or Web UI. I
> > believe that changing the logging configuration via the existing
> > configuration files (log4j or logback) is good enough, to justify not
> > increasing the scope of Flink in that direction. As you specifically
> > mention YARN: doesn't Cloudera's Hadoop platform, for example, offer means
> > to manage the configuration files for all worker nodes from a central
> > configuration management system? It overall feels like we are trying to
> > solve a problem in Apache Flink that is already solved in its ecosystem and
> > increases the scope of the project without adding core value. I also expect
> > that over time the exposed logging configuration options would become more
> > and more complex.
> >
> > I am curious to hear what others think.
> >
> > On Tue, Jan 11, 2022 at 10:34 AM Chesnay Schepler 
> > wrote:
> >
> > > Reloading the config from the filesystem  is already enabled by default;
> > > that was one of the things that made us switch to Log4j 2.
> > >
> > > The core point of contention w.r.t. this topic is whether having the
> > > admin ssh into the machine is too inconvenient.
> > >
> > Personally I still think that the current capabilities are
> > > sufficient, and I do not want us to rely on internals of the logging
> > > backends in production code.
> > >
> > > On 10/01/2022 17:26, Konstantin Knauf wrote:
> > > > Thank you for starting the discussion. Being able to change the logging
> > > > level at runtime is very valuable in my experience.
> > > >
> > > > Instead of introducing our own API (and eventually even persistence),
> > > could
> > > > we just periodically reload the log4j or logback configuration from the
> > > > environment/filesystem? I only quickly googled the topic and [1,2]
> > > suggest
> > > > that this might be possible?
> > > >
> > > > [1] https://stackoverflow.com/a/16216956/6422562?
> > > > [2] https://logback.qos.ch/manual/configuration.html#autoScan
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji 
> > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Hope you enjoyed the Holiday Season.
> > > >>
> > > >> I would like to start the discussion on the improvement purpose
> > > >> FLIP-210 [1] which aims to provide a way to change log levels at
> > > >> runtime to simplify issues and bugs detection as reported in the
> > > >> ticket FLINK-16478 [2].
> > > >> Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
> > > >> FLIP I drafted is largely influenced by their previous designs [3][4].
> > > >> Although we have reached some agreements under the jira comments about
> > > >> the scope of this feature, we still have the following questions
> > > >> listed below ready to be discussed in this thread.
> > > >>
> > > >> ## Question 1
> > > >>
> > > >>> Creating as custom DSL and implementing it for several logging
> > backend
> > > >> sounds like quite a maintenance burden. Extensions to the DSL, and
> > > >> supported backends, could become quite an effort. (by Chesnay
> > Schepler)
> > > >>
> > > >> I tried to design the API of the logging backend to stay away from the
> > > >> details of implementations but I did not find any slf4j-specific API
> > > >> that is available to change the log level of a logger. So what I did
> > > >> is to introduce another kind of abstraction on top of the slf4j /
> > > >> log4j / logback so that we will not 

[DISCUSS] FLIP-211: Kerberos delegation token framework

2022-01-11 Thread Gabor Somogyi
Hi All,

Hope all of you have enjoyed the holiday season.

I would like to start the discussion on FLIP-211, which aims to provide a
Kerberos delegation token framework that obtains/renews/distributes tokens
out-of-the-box.

Please be aware that the FLIP wiki area is not fully done since the
discussion may
change the feature in major ways. The proposal can be found in a google doc here.
As the community agrees on the approach the content will be moved to the
wiki page.

Feel free to add your thoughts to make this feature better!

BR,
G


Could not find any factory for identifier 'jdbc'

2022-01-11 Thread Ronak Beejawat (rbeejawa)
Correcting subject -> Could not find any factory for identifier 'jdbc'

From: Ronak Beejawat (rbeejawa)
Sent: Tuesday, January 11, 2022 6:43 PM
To: 'dev@flink.apache.org' ; 'commun...@flink.apache.org' 
; 'u...@flink.apache.org' 
Cc: 'Hang Ruan' ; Shrinath Shenoy K (sshenoyk) 
; Karthikeyan Muthusamy (karmuthu) ; 
Krishna Singitam (ksingita) ; Arun Yadav (aruny) 
; Jayaprakash Kuravatti (jkuravat) ; Avi 
Sanwal (asanwal) 
Subject: what is efficient way to write Left join in flink

Hi Team,

Getting below exception while using jdbc connector :

Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'jdbc' that implements 
'org.apache.flink.table.factories.DynamicTableFactory' in the classpath.

Available factory identifiers are:

blackhole
datagen
filesystem
kafka
print
upsert-kafka


I have already added dependency for jdbc connector in pom.xml as mentioned 
below:


<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_2.11</artifactId>
    <version>1.14.2</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.41</version>
</dependency>

Referred release doc link for the same 
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/table/jdbc/



Please help me on this and provide the solution for it !!!


Thanks
Ronak Beejawat


what is efficient way to write Left join in flink

2022-01-11 Thread Ronak Beejawat (rbeejawa)
Hi Team,

Getting below exception while using jdbc connector :

Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'jdbc' that implements 
'org.apache.flink.table.factories.DynamicTableFactory' in the classpath.

Available factory identifiers are:

blackhole
datagen
filesystem
kafka
print
upsert-kafka


I have already added dependency for jdbc connector in pom.xml as mentioned 
below:


<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_2.11</artifactId>
    <version>1.14.2</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.41</version>
</dependency>

Referred release doc link for the same 
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/table/jdbc/



Please help me on this and provide the solution for it !!!


Thanks
Ronak Beejawat


[jira] [Created] (FLINK-25613) Remove excessive surefire-plugin versions

2022-01-11 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-25613:


 Summary: Remove excessive surefire-plugin versions
 Key: FLINK-25613
 URL: https://issues.apache.org/jira/browse/FLINK-25613
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System
Affects Versions: 1.15.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0


Various modules overwrite the default surefire version. We should remove that 
unless there is a good reason to do so.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


RE: what is efficient way to write Left join in flink

2022-01-11 Thread Ronak Beejawat (rbeejawa)
Can someone please help / reply to the question below?

From: Ronak Beejawat (rbeejawa)
Sent: Monday, January 10, 2022 7:40 PM
To: dev@flink.apache.org; commun...@flink.apache.org; u...@flink.apache.org
Cc: Hang Ruan ; Shrinath Shenoy K (sshenoyk) 

Subject: what is efficient way to write Left join in flink

Hi Team,

We want a clarification on one real time processing scenario for below 
mentioned use case.

Use case :
1. We have topic one (testtopic1) which will get half a million data every 
minute.
2. We have topic two (testtopic2) which will get one million data every minute.

So we are doing join as testtopic1  left join  testtopic2 which has a 
correlated data 1:2

So the question is which API will be more efficient and faster for such use 
case (datastream API or sql API) for intensive joining logic?

Thanks
Ronak Beejawat


Re: [DISCUSS] Releasing Flink 1.14.3

2022-01-11 Thread Chesnay Schepler
The assumption is that everyone who works enough on Flink to volunteer 
as an RM already has a working Azure setup, in which case there isn't any 
additional setup overhead.


We may improve this in the future though.

On 11/01/2022 05:53, Thomas Weise wrote:

Thank you Xingbo. I meanwhile also got my Azure pipeline working and
was able to build the artifacts. Although in general it would be nice
if not every release volunteer had to set up their separate Azure
environment.

Martijn,

The release is staged, except for the website PR:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351075&projectId=12315522
https://dist.apache.org/repos/dist/dev/flink/flink-1.14.3-rc1/
https://repository.apache.org/content/repositories/orgapacheflink-1481/
https://github.com/apache/flink/releases/tag/release-1.14.3-rc1

Would you like to create the website PR and start a VOTE?

(If not, I can look into that tomorrow.)

Cheers,
Thomas



On Sun, Jan 9, 2022 at 9:17 PM Xingbo Huang  wrote:

Hi Thomas,

Since multiple wheel packages with different python versions for mac and
linux are generated, building locally requires you have multiple machines
with different os and Python environments. I have triggered the wheel
package build of release-1.14.3-rc1 in my private Azure[1] and you can
download the wheels after building successfully.

[1]
https://dev.azure.com/hxbks2ks/FLINK-TEST/_build/results?buildId=1704&view=results

Best,
Xingbo

Thomas Weise  于2022年1月10日周一 11:12写道:


Hi Martijn,

I started building the release artifacts. The Maven part is ready.
Currently blocked on the Azure build for the PyFlink wheel packages.

I had to submit a "Azure DevOps Parallelism Request" and that might
take a couple of days.

Does someone have the steps to build the wheels locally?
Alternatively, if someone can build them on their existing setup and
point me to the result, that would speed up things as well.

The release branch:
https://github.com/apache/flink/tree/release-1.14.3-rc1

Thanks,
Thomas

On Thu, Jan 6, 2022 at 9:14 PM Martijn Visser 
wrote:

Hi Thomas,

Thanks for volunteering! There was no volunteer yet, so would be great if
you could help out.

Best regards,

Martijn

Op vr 7 jan. 2022 om 01:54 schreef Thomas Weise 


Hi Martijn,

Thanks for preparing the release. Did a volunteer check in with you?
If not, I would like to take this up.

Thomas

On Mon, Dec 27, 2021 at 7:11 AM Martijn Visser 
wrote:

Thank you all! That means that there's currently no more blocker to

start

with the Flink 1.14.3 release.

The only thing that's needed is a committer that's willing to follow

the

release process [1] Any volunteers?

Best regards,

Martijn

[1]


https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release

On Mon, 27 Dec 2021 at 03:17, Qingsheng Ren 

wrote:

Hi Martjin,

FLINK-25132 has been merged to master and release-1.14.

Thanks for your work for releasing 1.14.3!

Cheers,

Qingsheng Ren


On Dec 26, 2021, at 3:46 PM, Konstantin Knauf 
wrote:

Hi Martijn,

FLINK-25375 is merged to release-1.14.

Cheers,

Konstantin

On Wed, Dec 22, 2021 at 12:02 PM David Morávek 

wrote:

Hi Martijn, FLINK-25271 has been merged to 1.14 branch.

Best,
D.

On Wed, Dec 22, 2021 at 7:27 AM 任庆盛  wrote:


Hi Martjin,

Thanks for the effort on Flink 1.14.3. FLINK-25132 has been

merged

on

master and is waiting for CI on release-1.14. I think it can be

closed

today.

Cheers,

Qingsheng Ren


On Dec 21, 2021, at 6:26 PM, Martijn Visser <

mart...@ververica.com>

wrote:

Hi everyone,

I'm restarting this thread [1] with a new subject, given that

Flink

1.14.1 was a (cancelled) emergency release for the Log4j

update and

we've

released Flink 1.14.2 as an emergency release for Log4j updates

[2].

To give an update, this is the current blocker for Flink

1.14.3:

* https://issues.apache.org/jira/browse/FLINK-25132 -

KafkaSource

cannot work with object-reusing DeserializationSchema -> @
renqs...@gmail.com can you provide an ETA for this ticket?

There are two critical tickets open for Flink 1.14.3. That

means

that

if

the above ticket is resolved, these two will not block the

release. If

we

can merge them in before the above ticket is completed, that's

a

bonus.

* https://issues.apache.org/jira/browse/FLINK-25199 -

fromValues

does

not emit final MAX watermark -> @Marios Trivyzas any update or

thoughts

on

this?

* https://issues.apache.org/jira/browse/FLINK-25227 -

Comparing

the

equality of the same (boxed) numeric values returns false ->

@Caizhi

Weng

any update or thoughts on this?

Best regards,

Martijn

[1]

https://lists.apache.org/thread/r0xhs9x01k8hnm0hyq2kk4ptrhkzgdw9

[2]

https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html

On Thu, 9 Dec 2021 at 17:21, David Morávek 

wrote:

Hi Martijn, I've just opened a backport PR [1] for FLINK-23946

[2].

[1] https://github.com/apache/flink/pull/18066
[2] https://issues.apache.org/jira/browse/FLINK-23946

Best,
D.

On Thu, Dec 9, 2021 at 4:59 

Re: [DISCUSS] FLIP-210: Change logging level dynamically at runtime

2022-01-11 Thread Martijn Visser
Hi all,

I agree with Konstantin, this feels like a problem that shouldn't be solved
via Apache Flink but via the logging ecosystem itself.

Best regards,

Martijn

On Tue, 11 Jan 2022 at 13:11, Konstantin Knauf  wrote:

> I've now read over the discussion on the ticket, and I am personally not in
> favor of adding this functionality to Flink via the REST API or Web UI. I
> believe that changing the logging configuration via the existing
> configuration files (log4j or logback) is good enough, to justify not
> increasing the scope of Flink in that direction. As you specifically
> mention YARN: doesn't Cloudera's Hadoop platform, for example, offer means
> to manage the configuration files for all worker nodes from a central
> configuration management system? It overall feels like we are trying to
> solve a problem in Apache Flink that is already solved in its ecosystem and
> increases the scope of the project without adding core value. I also expect
> that over time the exposed logging configuration options would become more
> and more complex.
>
> I am curious to hear what others think.
>
> On Tue, Jan 11, 2022 at 10:34 AM Chesnay Schepler 
> wrote:
>
> > Reloading the config from the filesystem  is already enabled by default;
> > that was one of the things that made us switch to Log4j 2.
> >
> > The core point of contention w.r.t. this topic is whether having the
> > admin ssh into the machine is too inconvenient.
> >
> > Personally I still think that the current capabilities are
> > sufficient, and I do not want us to rely on internals of the logging
> > backends in production code.
> >
> > On 10/01/2022 17:26, Konstantin Knauf wrote:
> > > Thank you for starting the discussion. Being able to change the logging
> > > level at runtime is very valuable in my experience.
> > >
> > > Instead of introducing our own API (and eventually even persistence),
> > could
> > > we just periodically reload the log4j or logback configuration from the
> > > environment/filesystem? I only quickly googled the topic and [1,2]
> > suggest
> > > that this might be possible?
> > >
> > > [1] https://stackoverflow.com/a/16216956/6422562?
> > > [2] https://logback.qos.ch/manual/configuration.html#autoScan
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji 
> > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Hope you enjoyed the Holiday Season.
> > >>
> > >> I would like to start the discussion on the improvement purpose
> > >> FLIP-210 [1] which aims to provide a way to change log levels at
> > >> runtime to simplify issues and bugs detection as reported in the
> > >> ticket FLINK-16478 [2].
> > >> Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
> > >> FLIP I drafted is largely influenced by their previous designs [3][4].
> > >> Although we have reached some agreements under the jira comments about
> > >> the scope of this feature, we still have the following questions
> > >> listed below ready to be discussed in this thread.
> > >>
> > >> ## Question 1
> > >>
> > >>> Creating as custom DSL and implementing it for several logging
> backend
> > >> sounds like quite a maintenance burden. Extensions to the DSL, and
> > >> supported backends, could become quite an effort. (by Chesnay
> Schepler)
> > >>
> > >> I tried to design the API of the logging backend to stay away from the
> > >> details of implementations but I did not find any slf4j-specific API
> > >> that is available to change the log level of a logger. So what I did
> > >> is to introduce another kind of abstraction on top of the slf4j /
> > >> log4j / logback so that we will not depend on the logging provider's
> > >> api directly. It will be convenient for us to adopt any other logging
> > >> providers. Please see the "Logging Abstraction" section.
> > >>
> > >> ## Question 2
> > >>
> > >>> Do we know whether other systems support this kind of feature? If
> yes,
> > >> how do they solve it for different logging backends? (by Till
> Rohrmann)
> > >>
> > >> I investigated several Java frameworks including Spark, Storm, and
> > >> Spring Boot. Here is what I found.
> > >> Spark & Storm directly depend on the log4j implementations, which
> > >> means they do not support any other slf4j implementation at all. They
> > >> simply call the log4j api directly. (see SparkContext.scala#L381 [5],
> > >> Utils.scala#L2443 [6] in Spark, and LogConfigManager.java#L144 [7] in
> > >> Storm). They are pretty different from what Flink provides.
> > >> However, I found Spring Boot has implemented what we are interested
> > >> in. Just as Flink, Spring boot also supports many slf4j
> > >> implementations. Users are not limited to log4j. They have the ability
> > >> to declare different logging frameworks by importing certain
> > >> dependencies. After that spring will decide the activated one by
> > >> scanning its classpath and context. (see LoggingSystem.java#L164 [8]
> > >> and LoggersEndpoint.java#L99 [9])
> > >>
> > >> ## Question 

Re: [VOTE][CANCELED] Release flink-shaded 15.0, release candidate #1

2022-01-11 Thread Chesnay Schepler
FLINK-25588 has been put onto the agenda for the next flink-shaded 
release, so we might as well cancel this vote to cover everything at once.


On 14/12/2021 09:57, Chesnay Schepler wrote:

Hi everyone,
Please review and vote on the release candidate #1 for the version 
15.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which are signed with the key with fingerprint C2EED7B111D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Release Manager

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350665

[2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-15.0-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1458

[5] https://github.com/apache/flink-shaded/releases/tag/release-15.0-rc1
[6] https://github.com/apache/flink-web/pull/490





Re: [DISCUSS] FLIP-210: Change logging level dynamically at runtime

2022-01-11 Thread Konstantin Knauf
I've now read over the discussion on the ticket, and I am personally not in
favor of adding this functionality to Flink via the REST API or Web UI. I
believe that changing the logging configuration via the existing
configuration files (log4j or logback) is good enough, to justify not
increasing the scope of Flink in that direction. As you specifically
mention YARN: doesn't Cloudera's Hadoop platform, for example, offer means
to manage the configuration files for all worker nodes from a central
configuration management system? It overall feels like we are trying to
solve a problem in Apache Flink that is already solved in its ecosystem and
increases the scope of the project without adding core value. I also expect
that over time the exposed logging configuration options would become more
and more complex.

I am curious to hear what others think.

On Tue, Jan 11, 2022 at 10:34 AM Chesnay Schepler 
wrote:

> Reloading the config from the filesystem  is already enabled by default;
> that was one of the things that made us switch to Log4j 2.
>
> The core point of contention w.r.t. this topic is whether having the
> admin ssh into the machine is too inconvenient.
>
> Personally I still think that the current capabilities are
> sufficient, and I do not want us to rely on internals of the logging
> backends in production code.
>
> On 10/01/2022 17:26, Konstantin Knauf wrote:
> > Thank you for starting the discussion. Being able to change the logging
> > level at runtime is very valuable in my experience.
> >
> > Instead of introducing our own API (and eventually even persistence),
> could
> > we just periodically reload the log4j or logback configuration from the
> > environment/filesystem? I only quickly googled the topic and [1,2]
> suggest
> > that this might be possible?
> >
> > [1] https://stackoverflow.com/a/16216956/6422562?
> > [2] https://logback.qos.ch/manual/configuration.html#autoScan
> >
> >
> >
> >
> >
> > On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji 
> wrote:
> >
> >> Hi everyone,
> >>
> >> Hope you enjoyed the Holiday Season.
> >>
> >> I would like to start the discussion on the improvement purpose
> >> FLIP-210 [1] which aims to provide a way to change log levels at
> >> runtime to simplify issues and bugs detection as reported in the
> >> ticket FLINK-16478 [2].
> >> Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
> >> FLIP I drafted is largely influenced by their previous designs [3][4].
> >> Although we have reached some agreements under the jira comments about
> >> the scope of this feature, we still have the following questions
> >> listed below ready to be discussed in this thread.
> >>
> >> ## Question 1
> >>
> >>> Creating as custom DSL and implementing it for several logging backend
> >> sounds like quite a maintenance burden. Extensions to the DSL, and
> >> supported backends, could become quite an effort. (by Chesnay Schepler)
> >>
> >> I tried to design the API of the logging backend to stay away from the
> >> details of implementations but I did not find any slf4j-specific API
> >> that is available to change the log level of a logger. So what I did
> >> is to introduce another kind of abstraction on top of the slf4j /
> >> log4j / logback so that we will not depend on the logging provider's
> >> api directly. It will be convenient for us to adopt any other logging
> >> providers. Please see the "Logging Abstraction" section.
> >>
> >> ## Question 2
> >>
> >>> Do we know whether other systems support this kind of feature? If yes,
> >> how do they solve it for different logging backends? (by Till Rohrmann)
> >>
> >> I investigated several Java frameworks including Spark, Storm, and
> >> Spring Boot. Here is what I found.
> >> Spark & Storm directly depend on the log4j implementations, which
> >> means they do not support any other slf4j implementation at all. They
> >> simply call the log4j api directly. (see SparkContext.scala#L381 [5],
> >> Utils.scala#L2443 [6] in Spark, and LogConfigManager.java#L144 [7] in
> >> Storm). They are pretty different from what Flink provides.
> >> However, I found Spring Boot has implemented what we are interested
> >> in. Just as Flink, Spring boot also supports many slf4j
> >> implementations. Users are not limited to log4j. They have the ability
> >> to declare different logging frameworks by importing certain
> >> dependencies. After that spring will decide the activated one by
> >> scanning its classpath and context. (see LoggingSystem.java#L164 [8]
> >> and LoggersEndpoint.java#L99 [9])
> >>
> >> ## Question 3
> >>
> >> Besides the questions raised in the jira comments, I also find another
> >> thing that has not been discussed. Considering this feature as an MVP,
> >> do we need to introduce a HighAvailabilityService to store the log
> >> settings so that they can be synced to newly-joined task managers and
> >> also job manager followers for consistency? This issue is included in
> >> the "Limitations" section in the 

[jira] [Created] (FLINK-25612) Update the outdated illustration of ExecutionState in the documentation

2022-01-11 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-25612:


 Summary: Update the outdated illustration of ExecutionState in the 
documentation
 Key: FLINK-25612
 URL: https://issues.apache.org/jira/browse/FLINK-25612
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation
Affects Versions: 1.14.2, 1.13.5, 1.15.0
Reporter: Zhilong Hong
 Fix For: 1.15.0, 1.13.6, 1.14.3
 Attachments: current-illustration-2.jpg, new-illustration.jpg

Currently, the illustration of {{ExecutionState}} located in the page "Jobs and 
Scheduling" 
([https://nightlies.apache.org/flink/flink-docs-master/docs/internals/job_scheduling/])
 is outdated. It doesn't involve the INITIALIZING state, which is introduced in 
FLINK-17102.

 

Current illustration:

!current-illustration-2.jpg|width=400!

New illustration:

!new-illustration.jpg|width=400!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25611) Remove CoordinatorExecutorThreadFactory thread creation guards

2022-01-11 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-25611:


 Summary: Remove CoordinatorExecutorThreadFactory thread creation 
guards
 Key: FLINK-25611
 URL: https://issues.apache.org/jira/browse/FLINK-25611
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.15.0, 1.13.6, 1.14.3


The CoordinatorExecutorThreadFactory of the SourceCoordinator checks that only 
a single thread is active and that no new thread can be created if the previous 
one failed.

Neither of these guards works properly. If a runnable in the ThreadPoolExecutor 
fails, the executor actually uses the worker thread of the failed runnable to spawn a 
new worker. This means that at the time the second thread is created the 
previous thread is still alive, and the exception that caused the failure 
hasn't even been propagated to the thread's exception handler.
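
That JDK behaviour is easy to reproduce with a small standalone demo (plain JDK code, independent of Flink; class and thread names are only illustrative). The factory is asked for the replacement worker from the thread that is currently failing, before the uncaught exception reaches its handler:

{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerReplacementDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadFactory factory = runnable -> {
            // Log which thread asks for a new worker thread.
            System.out.println("newThread() called from: " + Thread.currentThread().getName());
            return new Thread(runnable);
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), factory);

        // The task fails with an uncaught exception, killing its worker thread.
        pool.execute(() -> { throw new RuntimeException("boom"); });

        Thread.sleep(500);
        // The console shows two newThread() calls: the first from "main" when the
        // core worker is started, the second from the dying worker itself while it
        // replaces itself in processWorkerExit().
        pool.shutdown();
    }
}
{code}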

As these guards do not work, and to boot result in the actual failure causes 
being hidden (like in FLINK-24855), we should remove them.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-206: Support PyFlink Runtime Execution in Thread Mode

2022-01-11 Thread Xingbo Huang
Hi everyone,

Thanks to all of you for the discussion.
If there are no objections, I would like to start a vote thread tomorrow.

Best,
Xingbo

Xingbo Huang  于2022年1月7日周五 16:18写道:

> Hi Till,
>
> I have written a more complicated PyFlink job. Compared with the previous
> single python udf job, there is an extra stage of converting between table
> and datastream. Besides, I added a python map function for the job. Because
> python datastream has not yet implemented Thread mode, the python map
> function operator is still running in Process Mode.
>
> ```
> source = t_env.from_path("source_table")  # schema [id: String, d:int]
>
> @udf(result_type=DataTypes.STRING(), func_type="general")
> def upper(x):
> return x.upper()
>
> t_env.create_temporary_system_function("upper", upper)
> # python map function
> ds = t_env.to_data_stream(source) \
>     .map(lambda x: x,
>          output_type=Types.ROW_NAMED(["id", "d"],
>                                      [Types.STRING(), Types.INT()]))
>
> t = t_env.from_data_stream(ds)
> t.select('upper(id)').execute_insert('sink_table')
> ```
>
> The input data size is 1k.
>
> Mode                       |   QPS
> Process Mode               |   ~30,000 (3w)
> Thread Mode + Process Mode |   ~40,000 (4w)
>
> From the table, we can see that the nodes running in Process Mode are the
> performance bottleneck of the job.
>
> Best,
> Xingbo
>
> Till Rohrmann  于2022年1月5日周三 23:16写道:
>
>> Thanks for the detailed answer Xingbo. Quick question on the last figure
>> in
>> the FLIP. You said that this is a real world Flink stream SQL job. The
>> title of the graph says UDF(String Upper). So do I understand correctly
>> that string upper is the real world use case you have measured? What I
>> wanted to ask is how a slightly more complex Flink Python job (involving
>> shuffles, with back pressure, etc.) performs using the thread and process
>> mode respectively.
>>
>> If the mode solely needs changes in the Python part of Flink, then I don't
>> have any concerns from the runtime perspective.
>>
>> Cheers,
>> Till
>>
>> On Wed, Jan 5, 2022 at 1:55 PM Xingbo Huang  wrote:
>>
>> > Hi Till and Thomas,
>> >
>> > Thanks a lot for joining the discussion.
>> >
>> > For Till:
>> >
>> > >>> Is the slower performance currently the biggest pain point for our
>> > Python users? What else are our Python users mainly complaining about?
>> >
>> > PyFlink users are most concerned about two parts, one is better
>> usability,
>> > the other is performance. Users often make some benchmarks when they
>> > investigate pyflink[1][2] at the beginning to decide whether to use
>> > PyFlink. The performance of a PyFlink job depends on two parts, one is
>> the
>> > overhead of the PyFlink framework, and the other is the Python function
>> > complexity implemented by the user. In the Python ecosystem, there are
>> many
>> > libraries and tools that can help Python users improve the performance
>> of
>> > their custom functions, such as pandas[3], numba[4] and cython[5]. So we
>> > hope that the framework overhead of PyFlink itself can also be reduced.
>> >
>> > >>> Concerning the proposed changes, are there any changes required on
>> the
>> > runtime side (changes to Flink)? How will the deployment and memory
>> > management be affected when using the thread execution mode?
>> >
>> > The changes on PyFlink Runtime mentioned here are actually only
>> > modifications of PyFlink custom Operators, such as
>> > PythonScalarFunctionOperator[6], which won't affect deployment and
>> memory
>> > management.
>> >
>> > >>> One more question that came to my mind: How much performance
>> > improvement dowe gain on a real-world Python use case? Were the
>> > measurements more like micro benchmarks where the Python UDF was called
>> w/o
>> > the overhead of Flink? I would just be curious how much the Python
>> > component contributes to the overall runtime of a real world job. Do we
>> > have some data on this?
>> >
>> > The last figure I put in FLIP is the performance comparison of three
>> real
>> > Flink Stream Sql Jobs. They are a Java UDF job, a Python UDF job in
>> Process
>> > Mode, and a Python UDF job in Thread Mode. The calculated value of QPS
>> is
>> > the end-to-end Flink job execution result. As shown in the performance
>> > comparison chart, the performance of Python udf with the same function
>> can
>> > often only reach 20% of Java udf, so the performance of python udf will
>> > often become the performance bottleneck in a PyFlink job.
>> >
>> > For Thomas:
>> >
>> > The first time that I realized the framework overhead of various IPC
>> > (socket, grpc, shared memory) cannot be ignored in some scenarios is
>> due to
>> > an image algorithm prediction job of PyFlink. Its input parameters are a
>> > series of huge image binary arrays, and its data size is bigger than 1G.
>> > The performance overhead of serialization/deserialization has become an
>> > 

[jira] [Created] (FLINK-25610) [FLIP-171] Kinesis Firehose implementation of Async Sink Table API

2022-01-11 Thread Ahmed Hamdy (Jira)
Ahmed Hamdy created FLINK-25610:
---

 Summary: [FLIP-171] Kinesis Firehose implementation of Async Sink 
Table API
 Key: FLINK-25610
 URL: https://issues.apache.org/jira/browse/FLINK-25610
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Kinesis
Reporter: Ahmed Hamdy
Assignee: Ahmed Hamdy
 Fix For: 1.15.0


h2. Motivation

*User stories:*
As a Flink user, I’d like to use the Table API for the new Kinesis Data Streams 
 sink.

*Scope:*
 * Introduce {{AsyncDynamicTableSink}} that enables Sinking Tables into Async 
Implementations.
 * Implement a new {{KinesisDynamicTableSink}} that uses 
{{KinesisDataStreamSink}} Async Implementation and implements 
{{{}AsyncDynamicTableSink{}}}.
 * The implementation introduces Async Sink configurations as optional options 
in the table definition, with default values derived from the 
{{KinesisDataStream}} default values.
 * Unit/Integration testing. modify KinesisTableAPI tests for the new 
implementation, add unit tests for {{AsyncDynamicTableSink}} and 
{{KinesisDynamicTableSink}} and {{{}KinesisDynamicTableSinkFactory{}}}.
 * Java / code-level docs.

h2. References

More details to be found 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25609) Avoid creating temporary tables for inline tables

2022-01-11 Thread Timo Walther (Jira)
Timo Walther created FLINK-25609:


 Summary: Avoid creating temporary tables for inline tables
 Key: FLINK-25609
 URL: https://issues.apache.org/jira/browse/FLINK-25609
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Timo Walther


Currently, inline tables such as {{fromDataStream}} or 
{{from(TableDescriptor)}} as well as {{toChangelogStream}} or 
{{Table.executeInsert(TableDescriptor)}} create an artificial temporary table 
with an object identifier.

This is due to Calcite that forces {{RelBuilder.scan}} to have an object 
identifier.

Instead we should:
- Create an internal {{RelOptTable}} that wraps {{CatalogTable}}
- Various flavors of {{CatalogTable}} can store the {{DataStream}} object or 
information of the {{TableDescriptor}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


what is efficient way to write Left join in flink

2022-01-11 Thread Ronak Beejawat (rbeejawa)
Hi Team,

We want a clarification on one real time processing scenario for below 
mentioned use case.

Use case :
1. We have topic one (testtopic1) which will get half a million data every 
minute.
2. We have topic two (testtopic2) which will get one million data every minute.

So we are doing join as testtopic1  left join  testtopic2 which has a 
correlated data 1:2

So the question is which API will be more efficient and faster for such use 
case (datastream API or sql API) for intensive joining logic?

Thanks
Ronak Beejawat


Re: [DISCUSS] Deprecate MapR FS

2022-01-11 Thread Jingsong Li
+1 for dropping the MapR FS. Thanks for driving.

Best,
Jingsong

On Tue, Jan 11, 2022 at 5:22 PM Chang Li  wrote:
>
> +1 for dropping the MapR FS.
>
> Till Rohrmann  于2022年1月5日周三 18:33写道:
>
> > +1 for dropping the MapR FS.
> >
> > Cheers,
> > Till
> >
> > On Wed, Jan 5, 2022 at 10:11 AM Martijn Visser 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Thanks for your input. I've checked the MapR implementation and it has no
> > > annotation at all. Given the circumstances that we thought that MapR was
> > > already dropped, I would propose to immediately remove MapR in Flink 1.15
> > > instead of first marking it as deprecated and removing it in Flink 1.16.
> > >
> > > Please let me know what you think.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Thu, 9 Dec 2021 at 17:27, David Morávek  wrote:
> > >
> > >> +1, agreed with Seth's reasoning. There has been no real activity in
> > MapR
> > >> FS module for years [1], so the eventual users should be good with using
> > >> the jars from the older Flink versions for quite some time
> > >>
> > >> [1]
> > >>
> > https://github.com/apache/flink/commits/master/flink-filesystems/flink-mapr-fs
> > >>
> > >> Best,
> > >> D.
> > >>
> > >> On Thu, Dec 9, 2021 at 4:28 PM Konstantin Knauf 
> > >> wrote:
> > >>
> > >>> +1 (what Seth said)
> > >>>
> > >>> On Thu, Dec 9, 2021 at 4:15 PM Seth Wiesman 
> > wrote:
> > >>>
> > >>> > +1
> > >>> >
> > >>> > I actually thought we had already dropped this FS. If anyone is still
> > >>> > relying on it in production, the file system abstraction in Flink has
> > >>> been
> > >>> > incredibly stable over the years. They should be able to use the 1.14
> > >>> MapR
> > >>> > FS with later versions of Flink.
> > >>> >
> > >>> > Seth
> > >>> >
> > >>> > On Wed, Dec 8, 2021 at 10:03 AM Martijn Visser <
> > mart...@ververica.com>
> > >>> > wrote:
> > >>> >
> > >>> >> Hi all,
> > >>> >>
> > >>> >> Flink supports multiple file systems [1] which includes MapR FS.
> > MapR
> > >>> as
> > >>> >> a company doesn't exist anymore since 2019, the technology and
> > >>> intellectual
> > >>> >> property has been sold to Hewlett Packard.
> > >>> >>
> > >>> >> I don't think that there's anyone who's using MapR anymore and
> > >>> therefore
> > >>> >> I think it would be good to deprecate this for Flink 1.15 and then
> > >>> remove
> > >>> >> it in Flink 1.16. Removing this from Flink will slightly shrink the
> > >>> >> codebase and CI runtime.
> > >>> >>
> > >>> >> I'm also cross posting this to the User mailing list, in case
> > there's
> > >>> >> still anyone who's using MapR.
> > >>> >>
> > >>> >> Best regards,
> > >>> >>
> > >>> >> Martijn
> > >>> >>
> > >>> >> [1]
> > >>> >>
> > >>>
> > https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
> > >>> >>
> > >>> >
> > >>>
> > >>> --
> > >>>
> > >>> Konstantin Knauf
> > >>>
> > >>> https://twitter.com/snntrable
> > >>>
> > >>> https://github.com/knaufk
> > >>>
> > >>
> >


Re: [DISCUSS] FLIP-210: Change logging level dynamically at runtime

2022-01-11 Thread Chesnay Schepler
Reloading the config from the filesystem is already enabled by default; 
that was one of the things that made us switch to Log4j 2.


The core point of contention w.r.t. this topic is whether having the 
admin ssh into the machine is too inconvenient.


Personally I still think that the current capabilities are 
sufficient, and I do not want us to rely on internals of the logging 
backends in production code.


On 10/01/2022 17:26, Konstantin Knauf wrote:

Thank you for starting the discussion. Being able to change the logging
level at runtime is very valuable in my experience.

Instead of introducing our own API (and eventually even persistence), could
we just periodically reload the log4j or logback configuration from the
environment/filesystem? I only quickly googled the topic and [1,2] suggest
that this might be possible?

[1] https://stackoverflow.com/a/16216956/6422562?
[2] https://logback.qos.ch/manual/configuration.html#autoScan





On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji  wrote:


Hi everyone,

Hope you enjoyed the Holiday Season.

I would like to start the discussion on the improvement proposal
FLIP-210 [1], which aims to provide a way to change log levels at
runtime to simplify issue and bug detection, as reported in the
ticket FLINK-16478 [2].
Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
FLIP I drafted is largely influenced by their previous designs [3][4].
Although we have reached some agreements under the jira comments about
the scope of this feature, we still have the following questions
listed below ready to be discussed in this thread.

## Question 1


Creating as custom DSL and implementing it for several logging backend

sounds like quite a maintenance burden. Extensions to the DSL, and
supported backends, could become quite an effort. (by Chesnay Schepler)

I tried to design the API of the logging backend to stay away from the
details of implementations but I did not find any slf4j-specific API
that is available to change the log level of a logger. So what I did
is to introduce another kind of abstraction on top of the slf4j /
log4j / logback so that we will not depend on the logging provider's
api directly. It will be convenient for us to adopt any other logging
providers. Please see the "Logging Abstraction" section.
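
To make this concrete, a rough sketch of the kind of abstraction meant here; the interface and class names are illustrative only and not necessarily what the FLIP specifies:

{code:java}
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;

/** Backend-agnostic handle for changing log levels at runtime (illustrative only). */
interface LoggingBackend {
    void setLevel(String loggerName, String level);
}

/** One possible Log4j 2 binding of the abstraction above. */
class Log4j2LoggingBackend implements LoggingBackend {
    @Override
    public void setLevel(String loggerName, String level) {
        // Configurator comes from log4j-core; a logback or JUL backend
        // would provide its own implementation of LoggingBackend.
        Configurator.setLevel(loggerName, Level.valueOf(level));
    }
}
{code}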

## Question 2


Do we know whether other systems support this kind of feature? If yes,

how do they solve it for different logging backends? (by Till Rohrmann)

I investigated several Java frameworks including Spark, Storm, and
Spring Boot. Here is what I found.
Spark & Storm directly depend on the log4j implementations, which
means they do not support any other slf4j implementation at all. They
simply call the log4j api directly. (see SparkContext.scala#L381 [5],
Utils.scala#L2443 [6] in Spark, and LogConfigManager.java#L144 [7] in
Storm). They are pretty different from what Flink provides.
However, I found Spring Boot has implemented what we are interested
in. Just as Flink, Spring boot also supports many slf4j
implementations. Users are not limited to log4j. They have the ability
to declare different logging frameworks by importing certain
dependencies. After that spring will decide the activated one by
scanning its classpath and context. (see LoggingSystem.java#L164 [8]
and LoggersEndpoint.java#L99 [9])

## Question 3

Besides the questions raised in the jira comments, I also find another
thing that has not been discussed. Considering this feature as an MVP,
do we need to introduce a HighAvailabilityService to store the log
settings so that they can be synced to newly-joined task managers and
also job manager followers for consistency? This issue is included in
the "Limitations" section in the flip.

Finally, thanks for your time for joining this discussion and
reviewing this FLIP. I would appreciate it if you could have any
comments or suggestions on this.


[1]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-210%3A+Change+logging+level+dynamically+at+runtime
[2]: https://issues.apache.org/jira/browse/FLINK-16478
[3]:
https://docs.google.com/document/d/1Q02VSSBzlZaZzvxuChIo1uinw8KDQsyTZUut6_IDErY
[4]:
https://docs.google.com/document/d/19AyuTHeERP6JKmtHYnCdBw29LnZpRkbTS7K12q4OfbA
[5]:
https://github.com/apache/spark/blob/11596b3b17b5e0f54e104cd49b1397c33c34719d/core/src/main/scala/org/apache/spark/SparkContext.scala#L381
[6]:
https://github.com/apache/spark/blob/11596b3b17b5e0f54e104cd49b1397c33c34719d/core/src/main/scala/org/apache/spark/util/Utils.scala#L2433
[7]:
https://github.com/apache/storm/blob/3f96c249cbc17ce062491bfbb39d484e241ab168/storm-client/src/jvm/org/apache/storm/daemon/worker/LogConfigManager.java#L144
[8]:
https://github.com/spring-projects/spring-boot/blob/main/spring-boot-project/spring-boot/src/main/java/org/springframework/boot/logging/LoggingSystem.java#L164
[9]:

[jira] [Created] (FLINK-25608) Mark metrics as Public(Evolving)

2022-01-11 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-25608:
---

 Summary: Mark metrics as Public(Evolving)
 Key: FLINK-25608
 URL: https://issues.apache.org/jira/browse/FLINK-25608
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Affects Versions: 1.15.0
Reporter: Fabian Paul


With the introduction of architectural tests and the exposure of the metrics 
API to user-facing components, the metrics classes also need proper annotations.
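
For illustration, the marking could look like the following; the interface shown is just a stand-in, not an actual Flink metrics class:

{code:java}
import org.apache.flink.annotation.PublicEvolving;

/** Stand-in example: annotating a metrics interface as publicly evolving. */
@PublicEvolving
public interface ExampleCounterMetric {
    void inc();
    long getCount();
}
{code}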



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Deprecate MapR FS

2022-01-11 Thread Chang Li
+1 for dropping the MapR FS.

Till Rohrmann  于2022年1月5日周三 18:33写道:

> +1 for dropping the MapR FS.
>
> Cheers,
> Till
>
> On Wed, Jan 5, 2022 at 10:11 AM Martijn Visser 
> wrote:
>
> > Hi everyone,
> >
> > Thanks for your input. I've checked the MapR implementation and it has no
> > annotation at all. Given the circumstances that we thought that MapR was
> > already dropped, I would propose to immediately remove MapR in Flink 1.15
> > instead of first marking it as deprecated and removing it in Flink 1.16.
> >
> > Please let me know what you think.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Thu, 9 Dec 2021 at 17:27, David Morávek  wrote:
> >
> >> +1, agreed with Seth's reasoning. There has been no real activity in
> MapR
> >> FS module for years [1], so the eventual users should be good with using
> >> the jars from the older Flink versions for quite some time
> >>
> >> [1]
> >>
> https://github.com/apache/flink/commits/master/flink-filesystems/flink-mapr-fs
> >>
> >> Best,
> >> D.
> >>
> >> On Thu, Dec 9, 2021 at 4:28 PM Konstantin Knauf 
> >> wrote:
> >>
> >>> +1 (what Seth said)
> >>>
> >>> On Thu, Dec 9, 2021 at 4:15 PM Seth Wiesman 
> wrote:
> >>>
> >>> > +1
> >>> >
> >>> > I actually thought we had already dropped this FS. If anyone is still
> >>> > relying on it in production, the file system abstraction in Flink has
> >>> been
> >>> > incredibly stable over the years. They should be able to use the 1.14
> >>> MapR
> >>> > FS with later versions of Flink.
> >>> >
> >>> > Seth
> >>> >
> >>> > On Wed, Dec 8, 2021 at 10:03 AM Martijn Visser <
> mart...@ververica.com>
> >>> > wrote:
> >>> >
> >>> >> Hi all,
> >>> >>
> >>> >> Flink supports multiple file systems [1] which includes MapR FS.
> MapR
> >>> as
> >>> >> a company doesn't exist anymore since 2019, the technology and
> >>> intellectual
> >>> >> property has been sold to Hewlett Packard.
> >>> >>
> >>> >> I don't think that there's anyone who's using MapR anymore and
> >>> therefore
> >>> >> I think it would be good to deprecate this for Flink 1.15 and then
> >>> remove
> >>> >> it in Flink 1.16. Removing this from Flink will slightly shrink the
> >>> >> codebase and CI runtime.
> >>> >>
> >>> >> I'm also cross posting this to the User mailing list, in case
> there's
> >>> >> still anyone who's using MapR.
> >>> >>
> >>> >> Best regards,
> >>> >>
> >>> >> Martijn
> >>> >>
> >>> >> [1]
> >>> >>
> >>>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
> >>> >>
> >>> >
> >>>
> >>> --
> >>>
> >>> Konstantin Knauf
> >>>
> >>> https://twitter.com/snntrable
> >>>
> >>> https://github.com/knaufk
> >>>
> >>
>


[jira] [Created] (FLINK-25607) Sorting by duration on Flink Web UI does not work correctly

2022-01-11 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25607:
---

 Summary: Sorting by duration on Flink Web UI does not work 
correctly
 Key: FLINK-25607
 URL: https://issues.apache.org/jira/browse/FLINK-25607
 Project: Flink
  Issue Type: Bug
Reporter: Yingjie Cao
 Attachments: image-2022-01-11-16-44-10-709.png

The Flink version used is 1.14.

!image-2022-01-11-16-44-10-709.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-203: Incremental savepoints

2022-01-11 Thread Konstantin Knauf
Hi Piotr,

would it be possible to provide a table that shows the
compatibility guarantees provided by the different snapshot types going forward?
For example, type of change (topology, state schema, parallelism, ...) in one
dimension, and type of snapshot in the other dimension. Based on that, it
would be easier to discuss those guarantees, I believe.

Cheers,

Konstantin

On Mon, Jan 3, 2022 at 9:11 AM David Morávek  wrote:

> Hi Piotr,
>
> does this mean that we need to keep the checkpoints compatible across minor
> versions? Or can we say, that the minor version upgrades are only
> guaranteed with canonical savepoints?
>
> My concern is especially if we'd want to change layout of the checkpoint.
>
> D.
>
>
>
> On Wed, Dec 29, 2021 at 5:19 AM Yu Li  wrote:
>
> > Thanks for the proposal Piotr! Overall I'm +1 for the idea, and below are
> > my two cents:
> >
> > 1. How about adding a "Term Definition" section and clarify what "native
> > format" (the "native" data persistence format of the current state
> backend)
> > and "canonical format" (the "uniform" format that supports switching
> state
> > backends) means?
> >
> > 2. IIUC, currently the FLIP proposes to only support incremental
> savepoint
> > with native format, and there's no plan to add such support for canonical
> > format, right? If so, how about writing this down explicitly in the FLIP
> > doc, maybe in a "Limitations" section, plus the fact that
> > `HashMapStateBackend` cannot support incremental savepoint before
> FLIP-151
> > is done? (side note: @Roman just a kindly reminder, that please take
> > FLIP-203 into account when implementing FLIP-151)
> >
> > 3. How about changing the description of "the default configuration of
> the
> > checkpoints will be used to determine whether the savepoint should be
> > incremental or not" to something like "the `state.backend.incremental`
> > setting now denotes the type of native format snapshot and will take
> effect
> > for both checkpoint and savepoint (with native type)", to prevent concept
> > confusion between checkpoint and savepoint?
> >
> > 4. How about putting the notes of behavior change (the default type of
> > savepoint will be changed to `native` in the future, and by then the
> taken
> > savepoint cannot be used to switch state backends by default) to a more
> > obvious place, for example moving from the "CLI" section to the
> > "Compatibility" section? (although it will only happen in 1.16 release
> > based on the proposed plan)
> >
> > And all above suggestions apply for our user-facing document after the
> FLIP
> > is (partially or completely, accordingly) done, if taken (smile).
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 21 Dec 2021 at 22:23, Seth Wiesman  wrote:
> >
> > > >> AFAIK state schema evolution should work both for native and
> canonical
> > > >> savepoints.
> > >
> > > Schema evolution does technically work for both formats, it happens
> after
> > > the code paths have been unified, but the community has up until this
> > point
> > > considered that an unsupported feature. From my perspective making this
> > > supported could be as simple as adding test coverage but that's an
> active
> > > decision we'd need to make.
> > >
> > > On Tue, Dec 21, 2021 at 7:43 AM Piotr Nowojski 
> > > wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > > In this context: will the native format support state schema
> > evolution?
> > > > If
> > > > > not, I am not sure, we can let the format default to native.
> > > >
> > > > AFAIK state schema evolution should work both for native and
> canonical
> > > > savepoints.
> > > >
> > > > Regarding what is/will be supported we will document as part of this
> > > > FLIP-203. But it's not as simple as just the difference between
> native
> > > and
> > > > canonical formats.
> > > >
> > > > Best, Piotrek
> > > >
> > > > pon., 20 gru 2021 o 14:28 Konstantin Knauf 
> > > napisał(a):
> > > >
> > > > > Hi Piotr,
> > > > >
> > > > > Thanks a lot for starting the discussion. Big +1.
> > > > >
> > > > > In my understanding, this FLIP introduces the snapshot format as a
> > > > *really*
> > > > > user facing concept. IMO it is important that we document
> > > > >
> > > > > a) that it is not longer the checkpoint/savepoint characteristics
> > that
> > > > > determines the kind of changes that a snapshots allows (user code,
> > > state
> > > > > schema evolution, topology changes), but now this becomes a
> property
> > of
> > > > the
> > > > > format regardless of whether this is a snapshots or a checkpoint
> > > > > b) the exact changes that each format allows (code, state schema,
> > > > topology,
> > > > > state backend, max parallelism)
> > > > >
> > > > > In this context: will the native format support state schema
> > evolution?
> > > > If
> > > > > not, I am not sure, we can let the format default to native.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Konstantin
> > > > >
> > > > >
> > > > > On Mon, Dec 20, 2021 at 2:09 PM Piotr Nowojski <
> 

[jira] [Created] (FLINK-25606) Requesting exclusive buffers timeout when recovering from unaligned checkpoint under fine-grained resource mode

2022-01-11 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25606:
---

 Summary: Requesting exclusive buffers timeout when recovering from 
unaligned checkpoint under fine-grained resource mode
 Key: FLINK-25606
 URL: https://issues.apache.org/jira/browse/FLINK-25606
 Project: Flink
  Issue Type: Bug
Reporter: Yingjie Cao


When converting a RecoveredInputChannel to a RemoteInputChannel, there are not enough 
network buffers to initialize the input channel's exclusive buffers. Here is the 
exception stack:
{code:java}
java.io.IOException: Timeout triggered when requesting exclusive buffers: The 
total number of network buffers is currently set to 6144 of 32768 bytes each. 
You can increase this number by setting the configuration keys 
'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and 
'taskmanager.memory.network.max',  or you may increase the timeout which is 
3ms by setting the key 
'taskmanager.network.memory.exclusive-buffers-request-timeout-ms'.
  at 
org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestMemorySegments(NetworkBufferPool.java:205)
  at 
org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestMemorySegments(NetworkBufferPool.java:60)
  at 
org.apache.flink.runtime.io.network.partition.consumer.BufferManager.requestExclusiveBuffers(BufferManager.java:133)
  at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.setup(RemoteInputChannel.java:157)
  at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteRecoveredInputChannel.toInputChannelInternal(RemoteRecoveredInputChannel.java:77)
  at 
org.apache.flink.runtime.io.network.partition.consumer.RecoveredInputChannel.toInputChannel(RecoveredInputChannel.java:106)
  at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.convertRecoveredInputChannels(SingleInputGate.java:307)
  at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:290)
  at 
org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:94)
  at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
  at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
  at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:359)
  at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:323)
  at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
  at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
  at java.lang.Thread.run(Thread.java:834) {code}
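
As a possible workaround until this is fixed, the options named in the exception message can be raised when building the cluster configuration. A minimal sketch follows; the values chosen are illustrative only:

{code:java}
import org.apache.flink.configuration.Configuration;

public class NetworkMemorySketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Keys taken from the exception message above; values are illustrative.
        conf.setString("taskmanager.memory.network.fraction", "0.2");
        conf.setString("taskmanager.memory.network.min", "256mb");
        conf.setString("taskmanager.memory.network.max", "2gb");
        // Alternatively, give the exclusive-buffer request more time:
        conf.setString("taskmanager.network.memory.exclusive-buffers-request-timeout-ms", "60000");
    }
}
{code}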



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] FLIP-210: Change logging level dynamically at runtime

2022-01-11 Thread Zhilong Hong
Thank you for proposing this improvement, Wenhao. Changing the logging
level dynamically at runtime is very useful when users are trying to debug
their jobs. They can set the logging level to DEBUG and find out more
details in the logs.

1. I'm wondering if we could add a REST API to query the current logging
level. Such an API would be useful for users who want to know the current
logging level, especially those who run their own job management platform.

2. Would it be better if we added a field to specify the target
JobManager/TaskManager for the logconfig API? Currently, it seems the
modified logging level will be applied to all components in the cluster. If
we change the logging level to DEBUG, the overall size of the logs may increase
rapidly, especially for large-scale clusters, and become a heavy burden on
disk usage. Adding a field to specify the target could minimize the impact:
users could change the logging level only for the TaskManager they
are focusing on. Furthermore, if users want to change the logging level for
all components, the target field could be set to "ALL".

On Tue, Jan 11, 2022 at 12:27 AM Konstantin Knauf  wrote:

> Thank you for starting the discussion. Being able to change the logging
> level at runtime is very valuable in my experience.
>
> Instead of introducing our own API (and eventually even persistence), could
> we just periodically reload the log4j or logback configuration from the
> environment/filesystem? I only quickly googled the topic and [1,2] suggest
> that this might be possible?
>
> [1] https://stackoverflow.com/a/16216956/6422562?
> [2] https://logback.qos.ch/manual/configuration.html#autoScan
>
>
>
>
>
> On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji  wrote:
>
> > Hi everyone,
> >
> > Hope you enjoyed the Holiday Season.
> >
> > I would like to start the discussion on the improvement purpose
> > FLIP-210 [1] which aims to provide a way to change log levels at
> > runtime to simplify issues and bugs detection as reported in the
> > ticket FLINK-16478 [2].
> > Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
> > FLIP I drafted is largely influenced by their previous designs [3][4].
> > Although we have reached some agreements under the jira comments about
> > the scope of this feature, we still have the following questions
> > listed below ready to be discussed in this thread.
> >
> > ## Question 1
> >
> > > Creating as custom DSL and implementing it for several logging backend
> > sounds like quite a maintenance burden. Extensions to the DSL, and
> > supported backends, could become quite an effort. (by Chesnay Schepler)
> >
> > I tried to design the API of the logging backend to stay away from the
> > details of implementations but I did not find any slf4j-specific API
> > that is available to change the log level of a logger. So what I did
> > is to introduce another kind of abstraction on top of the slf4j /
> > log4j / logback so that we will not depend on the logging provider's
> > api directly. It will be convenient for us to adopt any other logging
> > providers. Please see the "Logging Abstraction" section.
> >
> > ## Question 2
> >
> > > Do we know whether other systems support this kind of feature? If yes,
> > how do they solve it for different logging backends? (by Till Rohrmann)
> >
> > I investigated several Java frameworks including Spark, Storm, and
> > Spring Boot. Here is what I found.
> > Spark & Storm directly depend on the log4j implementations, which
> > means they do not support any other slf4j implementation at all. They
> > simply call the log4j api directly. (see SparkContext.scala#L381 [5],
> > Utils.scala#L2443 [6] in Spark, and LogConfigManager.java#L144 [7] in
> > Storm). They are pretty different from what Flink provides.
> > However, I found Spring Boot has implemented what we are interested
> > in. Just as Flink, Spring boot also supports many slf4j
> > implementations. Users are not limited to log4j. They have the ability
> > to declare different logging frameworks by importing certain
> > dependencies. After that spring will decide the activated one by
> > scanning its classpath and context. (see LoggingSystem.java#L164 [8]
> > and LoggersEndpoint.java#L99 [9])
> >
> > ## Question 3
> >
> > Besides the questions raised in the jira comments, I also find another
> > thing that has not been discussed. Considering this feature as an MVP,
> > do we need to introduce a HighAvailabilityService to store the log
> > settings so that they can be synced to newly-joined task managers and
> > also job manager followers for consistency? This issue is included in
> > the "Limitations" section in the flip.
> >
> > Finally, thanks for your time for joining this discussion and
> > reviewing this FLIP. I would appreciate it if you could have any
> > comments or suggestions on this.
> >
> >
> > [1]:
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-210%3A+Change+logging+level+dynamically+at+runtime
> > [2]: 

[jira] [Created] (FLINK-25605) Batch get statistics of multiple partitions instead of get one by one

2022-01-11 Thread Jing Zhang (Jira)
Jing Zhang created FLINK-25605:
--

 Summary: Batch get statistics of multiple partitions instead of 
get one by one
 Key: FLINK-25605
 URL: https://issues.apache.org/jira/browse/FLINK-25605
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Jing Zhang
 Attachments: image-2022-01-11-15-59-55-894.png, 
image-2022-01-11-16-00-28-002.png

Currently, `PushPartitionIntoTableSourceScanRule` fetches the statistics of matched 
partitions one by one.
 !image-2022-01-11-15-59-55-894.png! 
If there are many matched partitions, it takes a long time to fetch all the 
statistics.
We could improve this by fetching the statistics of multiple partitions in one batch.
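
A rough sketch of the difference; the batched method at the end is hypothetical, since only the per-partition call exists on {{Catalog}} today:

{code:java}
import java.util.List;
import org.apache.flink.table.catalog.Catalog;
import org.apache.flink.table.catalog.CatalogPartitionSpec;
import org.apache.flink.table.catalog.ObjectPath;
import org.apache.flink.table.catalog.stats.CatalogTableStatistics;

class PartitionStatsSketch {
    // Current behaviour: one catalog round trip per matched partition.
    static void fetchOneByOne(Catalog catalog, ObjectPath table, List<CatalogPartitionSpec> specs)
            throws Exception {
        for (CatalogPartitionSpec spec : specs) {
            CatalogTableStatistics stats = catalog.getPartitionStatistics(table, spec);
            // ... merge stats into the accumulated table statistics ...
        }
    }

    // Proposed direction (method name is hypothetical): one batched call, e.g.
    // List<CatalogTableStatistics> stats = catalog.bulkGetPartitionStatistics(table, specs);
}
{code}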



--
This message was sent by Atlassian Jira
(v8.20.1#820001)