Re: [VOTE] Create a separate sub project for FLIP-188: flink-store

2022-01-09 Thread Yu Li
+1 for a separate repository and release pipeline in the same way as
flink-statefun [1], flink-ml [2] and the coming flink-connectors [3].

+1 for naming it as "flink-table-store" (I'm also ok with
"flink-table-storage", but slightly prefer "flink-table-store" because it's
shorter)

Thanks for driving this Jingsong, and look forward to a fast evolution of
this direction!

Best Regards,
Yu

[1] https://github.com/apache/flink-statefun
[2] https://github.com/apache/flink-ml
[3] https://github.com/apache/flink-connectors


On Mon, 10 Jan 2022 at 10:52, Jingsong Li  wrote:

> Hi David, thanks for your suggestion.
>
> I think we should re-use as many common components with connectors as
> possible. I don't fully understand what you mean, but for this project
> I prefer to use Maven rather than Gradle.
>
> Best,
> Jingsong
>
> On Fri, Jan 7, 2022 at 11:59 PM David Morávek  wrote:
> >
> > +1 for the separate repository under the Flink umbrella
> >
> > as we've already started creating more repositories with connectors,
> would
> > it be possible to re-use the same build infrastructure for this one? (eg.
> > shared set of Gradle plugins that unify the build experience)?
> >
> > Best,
> > D.
> >
> > On Fri, Jan 7, 2022 at 11:31 AM Jingsong Li 
> wrote:
> >
> > > For more references on `store` and `storage`:
> > >
> > > For example,
> > >
> > > RocksDB is a library that provides an embeddable, persistent key-value
> > > store for fast storage. [1]
> > >
> > > Apache HBase [1] is an open-source, distributed, versioned,
> > > column-oriented store modeled after Google's Bigtable. [2]
> > >
> > > [1] https://github.com/facebook/rocksdb
> > > [2] https://github.com/apache/hbase
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Fri, Jan 7, 2022 at 6:17 PM Jingsong Li 
> wrote:
> > > >
> > > > Thanks all,
> > > >
> > > > Combining everyone's comments, I recommend using `flink-table-store`:
> > > >
> > > > ## table
> > > > something to do with table storage (From Till). Not only flink-table,
> > > > but also for user-oriented tables.
> > > >
> > > > ## store vs storage
> > > > - The first point I think, store is better pronounced, storage is
> > > > three syllables while store is two syllables
> > > > - Yes, store also stands for shopping. But I think the English
> > > > polysemy is also quite interesting, a store to store various items,
> it
> > > > also feels interesting to represent the feeling that we want to do
> > > > data storage.
> > > > - The first feeling is, storage is a physical object or abstract
> > > > concept, store is a software application or entity
> > > >
> > > > So I prefer `flink-table-store`, what do you think?
> > > >
> > > > (@_@ Naming is too difficult)
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Fri, Jan 7, 2022 at 5:37 PM Konstantin Knauf 
> > > wrote:
> > > > >
> > > > > +1 to a separate repository assuming this repository will still be
> > > part of
> > > > > Apache Flink (same PMC, Committers). I am not aware we have
> something
> > > like
> > > > > "sub-projects" officially.
> > > > >
> > > > > I share Till and Timo's concerns regarding "store".
> > > > >
> > > > > On Fri, Jan 7, 2022 at 9:59 AM Till Rohrmann  >
> > > wrote:
> > > > >
> > > > > > +1 for the separate project.
> > > > > >
> > > > > > I would agree that flink-store is not the best name.
> flink-storage >
> > > > > > flink-store but I would even more prefer a name that conveys
> that it
> > > has
> > > > > > something to do with table storage.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Fri, Jan 7, 2022 at 9:14 AM Timo Walther 
> > > wrote:
> > > > > >
> > > > > > > +1 for the separate project
> > > > > > >
> > > > > > > But maybe use `flink-storage` instead of `flink-store`?
> > > > > > >
> > > > > > > I'm not a native speaker but store is defined as "A place where
> > > items
> > > > > > > may be purchased.". It almost sounds like the `flink-packages`
> > > project.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Timo
> > > > > > >
> > > > > > >
> > > > > > > On 07.01.22 08:37, Jingsong Li wrote:
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'd like to start a vote to create a separate sub project
> for
> > > > > > > > FLIP-188 [1]: `flink-store`.
> > > > > > > >
> > > > > > > > - If you agree with the name `flink-store`, please just +1
> > > > > > > > - If you have a better suggestion, please write your
> suggestion,
> > > > > > > > followed by a reply that can +1 to the name that has appeared
> > > > > > > > - If you do not want it to be a subproject of flink, just -1
> > > > > > > >
> > > > > > > > The vote will be open for at least 72 hours unless there is
> an
> > > > > > > > objection or not enough votes.
> > > > > > > >
> > > > > > > > [1]
> > > > > > >
> > > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jingsong
> > > > > > > >
> > > 

Re: [VOTE] Create a separate sub project for FLIP-188: flink-store

2022-01-09 Thread Yun Tang
+1 for the name of `flink-table-store`.

Best
Yun Tang

From: Jingsong Li 
Sent: Monday, January 10, 2022 10:46
To: dev 
Subject: Re: [VOTE] Create a separate sub project for FLIP-188: flink-store

Hi David, thanks for your suggestion.

I think we should re-use as many common components with connectors as
possible. I don't fully understand what you mean, but for this project
I prefer to use Maven rather than Gradle.

Best,
Jingsong

On Fri, Jan 7, 2022 at 11:59 PM David Morávek  wrote:
>
> +1 for the separate repository under the Flink umbrella
>
> as we've already started creating more repositories with connectors, would
> it be possible to re-use the same build infrastructure for this one? (eg.
> shared set of Gradle plugins that unify the build experience)?
>
> Best,
> D.
>
> On Fri, Jan 7, 2022 at 11:31 AM Jingsong Li  wrote:
>
> > For more references on `store` and `storage`:
> >
> > For example,
> >
> > RocksDB is a library that provides an embeddable, persistent key-value
> > store for fast storage. [1]
> >
> > Apache HBase [1] is an open-source, distributed, versioned,
> > column-oriented store modeled after Google's Bigtable. [2]
> >
> > [1] https://github.com/facebook/rocksdb
> > [2] https://github.com/apache/hbase
> >
> > Best,
> > Jingsong
> >
> > On Fri, Jan 7, 2022 at 6:17 PM Jingsong Li  wrote:
> > >
> > > Thanks all,
> > >
> > > Combining everyone's comments, I recommend using `flink-table-store`:
> > >
> > > ## table
> > > something to do with table storage (From Till). Not only flink-table,
> > > but also for user-oriented tables.
> > >
> > > ## store vs storage
> > > - The first point I think, store is better pronounced, storage is
> > > three syllables while store is two syllables
> > > - Yes, store also stands for shopping. But I think the English
> > > polysemy is also quite interesting, a store to store various items, it
> > > also feels interesting to represent the feeling that we want to do
> > > data storage.
> > > - The first feeling is, storage is a physical object or abstract
> > > concept, store is a software application or entity
> > >
> > > So I prefer `flink-table-store`, what do you think?
> > >
> > > (@_@ Naming is too difficult)
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Fri, Jan 7, 2022 at 5:37 PM Konstantin Knauf 
> > wrote:
> > > >
> > > > +1 to a separate repository assuming this repository will still be
> > part of
> > > > Apache Flink (same PMC, Committers). I am not aware we have something
> > like
> > > > "sub-projects" officially.
> > > >
> > > > I share Till and Timo's concerns regarding "store".
> > > >
> > > > On Fri, Jan 7, 2022 at 9:59 AM Till Rohrmann 
> > wrote:
> > > >
> > > > > +1 for the separate project.
> > > > >
> > > > > I would agree that flink-store is not the best name. flink-storage >
> > > > > flink-store but I would even more prefer a name that conveys that it
> > has
> > > > > something to do with table storage.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Jan 7, 2022 at 9:14 AM Timo Walther 
> > wrote:
> > > > >
> > > > > > +1 for the separate project
> > > > > >
> > > > > > But maybe use `flink-storage` instead of `flink-store`?
> > > > > >
> > > > > > I'm not a native speaker but store is defined as "A place where
> > items
> > > > > > may be purchased.". It almost sounds like the `flink-packages`
> > project.
> > > > > >
> > > > > > Regards,
> > > > > > Timo
> > > > > >
> > > > > >
> > > > > > On 07.01.22 08:37, Jingsong Li wrote:
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to start a vote to create a separate sub project for
> > > > > > > FLIP-188 [1]: `flink-store`.
> > > > > > >
> > > > > > > - If you agree with the name `flink-store`, please just +1
> > > > > > > - If you have a better suggestion, please write your suggestion,
> > > > > > > followed by a reply that can +1 to the name that has appeared
> > > > > > > - If you do not want it to be a subproject of flink, just -1
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours unless there is an
> > > > > > > objection or not enough votes.
> > > > > > >
> > > > > > > [1]
> > > > > >
> > > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Konstantin Knauf
> > > >
> > > > https://twitter.com/snntrable
> > > >
> > > > https://github.com/knaufk
> > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> >
> >
> >
> > --
> > Best, Jingsong Lee
> >



--
Best, Jingsong Lee


[jira] [Created] (FLINK-25586) ExecutionGraphInfoStore in session cluster should split failed and successful jobs

2022-01-09 Thread Shammon (Jira)
Shammon created FLINK-25586:
---

 Summary: ExecutionGraphInfoStore in session cluster should split 
failed and successful jobs
 Key: FLINK-25586
 URL: https://issues.apache.org/jira/browse/FLINK-25586
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.14.2, 1.13.5, 1.12.7
Reporter: Shammon


In a Flink session cluster, jobs are stored in `FileExecutionGraphInfoStore`. 
When the number of jobs in the store reaches `jobstore.cache-size`, or the age 
of a job reaches `jobstore.expiration-time`, the affected jobs are removed. We 
can't hold too many jobs for performance reasons, but failed jobs should be 
kept longer so that the cause of failure can be traced. So it's better to 
split failed and successful jobs in `FileExecutionGraphInfoStore` and support 
an independent max-capacity for each.
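To make the proposal concrete, here is an illustrative sketch of the split-store idea (plain Python, not Flink code; the class and method names are invented for illustration):

```python
from collections import OrderedDict

class SplitExecutionGraphInfoStore:
    """Illustrative sketch: keep failed and successful jobs in separate
    bounded LRU maps so each category gets its own max capacity."""

    def __init__(self, max_successful, max_failed):
        self._stores = {
            "FAILED": (OrderedDict(), max_failed),
            "FINISHED": (OrderedDict(), max_successful),
        }

    def put(self, job_id, status, info):
        store, cap = self._stores["FAILED" if status == "FAILED" else "FINISHED"]
        store[job_id] = info
        # Evict the oldest entry of the *same* category only, so a burst of
        # successful jobs can no longer push failure history out of the cache.
        while len(store) > cap:
            store.popitem(last=False)

    def get(self, job_id):
        for store, _ in self._stores.values():
            if job_id in store:
                return store[job_id]
        return None

store = SplitExecutionGraphInfoStore(max_successful=2, max_failed=2)
store.put("job-ok-1", "FINISHED", "ok")
store.put("job-bad", "FAILED", "boom")
store.put("job-ok-2", "FINISHED", "ok")
store.put("job-ok-3", "FINISHED", "ok")  # evicts job-ok-1, not job-bad
```

With a single shared capacity, a burst of successful jobs can evict the few failed jobs an operator needs for debugging; two independently bounded stores avoid that.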



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: Unable to update logback configuration in Flink Native Kubernetes

2022-01-09 Thread Yang Wang
Sorry for the late reply.

Flink clients ship the log4j-console.properties and
logback-console.xml via a K8s ConfigMap, which is then mounted into the
JobManager/TaskManager pods.
So if you want to update the log settings or use logback, all you need to do
is update the client-local files before deploying.
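For example, to change the JobManager/TaskManager logging before deploying, you would edit the client-local `conf/logback-console.xml`, roughly along these lines (a minimal sketch; the appender name follows the default Flink distribution, so verify it against your version):

```xml
<configuration>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss,SSS} %-5level %logger{60} - %msg%n</pattern>
        </encoder>
    </appender>
    <!-- Raise to DEBUG before running kubernetes-session.sh / flink run-application;
         the client ships this file via the ConfigMap on the next deployment. -->
    <root level="INFO">
        <appender-ref ref="console"/>
    </root>
</configuration>
```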

Best,
Yang




Raghavendar T S wrote on Fri, Dec 31, 2021 at 12:47:

> Hi Sharon
>
> Thanks a lot. I just updated the files (flink-conf.yaml and
> logback-console.xml) in the local conf folder and it worked as expected.
>
> Thanks & Regards
> Raghavendar T S
> MERAS Plugins
>
>
>
> On Thu, Dec 30, 2021 at 12:54 AM Sharon Xie 
> wrote:
>
>> I've faced the same issue before.
>>
>> I figured out that there is an internal configuration
>> `$internal.deployment.config-dir`
>> which allows me to specify a local folder which contains the logback config
>> using file `logback-console.xml`. The content of the file is then used to
>> create the config map.
>>
>> Hope it helps.
>>
>>
>> Sharon
>>
>> On Wed, Dec 29, 2021 at 7:04 AM Raghavendar T S 
>> wrote:
>>
>>> Hi
>>>
>>> I have created a Flink Native Kubernetes (1.14.2) cluster which is
>>> successful. I am trying to update the logback configuration for which I am
>>> using the configmap exposed by Flink Native Kubernetes. Flink Native
>>> Kubernetes is creating this configmap during the start of the cluster and
>>> deleting it when the cluster is stopped and this behavior is as per the
>>> official documentation.
>>>
>>> I updated the logback configmap which is also successful and this
>>> process even updates the actual logback files (conf folder) in the job
>>> manager and task manager. But Flink is not loading (hot reloading) this
>>> logback configuration.
>>>
>>> Also I want to make sure that the logback configmap configuration is
>>> persisted even during cluster restarts. But the Flink Native Kubernetes
>>> recreates the configmap each time the cluster is started.
>>>
>>> What is it that I am missing here? How can I make the updated logback
>>> configuration work?
>>>
>>>
>>> Thanks & Regards
>>> Raghavendar T S
>>>
>>>
>>>
>>
>
> --
> Raghavendar T S
> www.teknosrc.com
>


[jira] [Created] (FLINK-25585) JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure failed on the azure

2022-01-09 Thread Yun Gao (Jira)
Yun Gao created FLINK-25585:
---

 Summary: 
JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure failed on 
the azure
 Key: FLINK-25585
 URL: https://issues.apache.org/jira/browse/FLINK-25585
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.2
Reporter: Yun Gao



{code:java}
2022-01-08T02:39:54.7103772Z Jan 08 02:39:54 [ERROR] Tests run: 2, Failures: 1, 
Errors: 0, Skipped: 0, Time elapsed: 349.282 s <<< FAILURE! - in 
org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase
2022-01-08T02:39:54.7105233Z Jan 08 02:39:54 [ERROR] 
testDispatcherProcessFailure[ExecutionMode BATCH]  Time elapsed: 302.006 s  <<< 
FAILURE!
2022-01-08T02:39:54.7106478Z Jan 08 02:39:54 java.lang.AssertionError: The 
program encountered a RuntimeException : 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
while waiting for job to be initialized
2022-01-08T02:39:54.7107409Z Jan 08 02:39:54at 
org.junit.Assert.fail(Assert.java:89)
2022-01-08T02:39:54.7108084Z Jan 08 02:39:54at 
org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure(JobManagerHAProcessFailureRecoveryITCase.java:383)
2022-01-08T02:39:54.7108952Z Jan 08 02:39:54at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-01-08T02:39:54.7109491Z Jan 08 02:39:54at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-01-08T02:39:54.7110107Z Jan 08 02:39:54at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-01-08T02:39:54.7110772Z Jan 08 02:39:54at 
java.lang.reflect.Method.invoke(Method.java:498)
2022-01-08T02:39:54.7111594Z Jan 08 02:39:54at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
2022-01-08T02:39:54.7112510Z Jan 08 02:39:54at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2022-01-08T02:39:54.7113734Z Jan 08 02:39:54at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
2022-01-08T02:39:54.7114673Z Jan 08 02:39:54at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2022-01-08T02:39:54.7115423Z Jan 08 02:39:54at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2022-01-08T02:39:54.7116011Z Jan 08 02:39:54at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
2022-01-08T02:39:54.7116586Z Jan 08 02:39:54at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2022-01-08T02:39:54.7117154Z Jan 08 02:39:54at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
2022-01-08T02:39:54.7117686Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
2022-01-08T02:39:54.7118448Z Jan 08 02:39:54at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
2022-01-08T02:39:54.7119020Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
2022-01-08T02:39:54.7119571Z Jan 08 02:39:54at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
2022-01-08T02:39:54.7120180Z Jan 08 02:39:54at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
2022-01-08T02:39:54.7120754Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
2022-01-08T02:39:54.7121286Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
2022-01-08T02:39:54.7121832Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
2022-01-08T02:39:54.7122376Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
2022-01-08T02:39:54.7123179Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
2022-01-08T02:39:54.7123796Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413)
2022-01-08T02:39:54.7124304Z Jan 08 02:39:54at 
org.junit.runners.Suite.runChild(Suite.java:128)
2022-01-08T02:39:54.7125001Z Jan 08 02:39:54at 
org.junit.runners.Suite.runChild(Suite.java:27)
2022-01-08T02:39:54.7125753Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
2022-01-08T02:39:54.7126595Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
2022-01-08T02:39:54.7127354Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
2022-01-08T02:39:54.7127911Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
2022-01-08T02:39:54.7128456Z Jan 08 02:39:54at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
2022-01-08T02:39:54.7129019Z 

is task reassignment possible

2022-01-09 Thread Chen Qin
Hi there,

We run multiple large-scale applications on YARN clusters, and one
observation is that those jobs are often CPU-skewed due to topology or data
skew on subtasks. For better or worse, the skew leads to a few task managers
consuming many vcores while the majority of task managers consume much less.
Our goal is to save total infra budget while keeping the jobs running
smoothly.

Are there any ongoing discussions in this area? Naively, if we know for sure
that a few tasks (by uuid) used more vcores in previous runs, could we
request one last batch of containers with a high-vcore resource profile and
reassign those tasks to them?

Thanks,
Chen
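As a thought experiment, the idea of giving known-hot tasks their own high-vcore containers can be sketched as a two-tier greedy assignment (plain Python, not Flink code; the task names, thresholds, and capacities are invented for illustration):

```python
def plan_containers(task_vcores, hot_threshold, small_cap, big_cap):
    """Split tasks by observed vcore usage from a previous run and pack each
    group into containers of the matching size (greedy first-fit)."""
    hot = {t: v for t, v in task_vcores.items() if v >= hot_threshold}
    cold = {t: v for t, v in task_vcores.items() if v < hot_threshold}

    def first_fit(tasks, cap):
        bins = []  # each bin: [remaining_capacity, [task, ...]]
        for task, v in sorted(tasks.items(), key=lambda kv: -kv[1]):
            for b in bins:
                if b[0] >= v:  # fits into an existing container
                    b[0] -= v
                    b[1].append(task)
                    break
            else:  # open a new container of this size
                bins.append([cap - v, [task]])
        return [b[1] for b in bins]

    return first_fit(hot, big_cap), first_fit(cold, small_cap)

# Hypothetical per-task vcore usage observed in a previous run.
usage = {"t1": 7, "t2": 6, "t3": 1, "t4": 1, "t5": 2}
big, small = plan_containers(usage, hot_threshold=4, small_cap=4, big_cap=8)
```

This is only a scheduling thought experiment; whether Flink can express such per-task profiles when requesting YARN containers is exactly the open question of the thread.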


[jira] [Created] (FLINK-25584) Azure failed on install node and npm for Flink : Runtime web

2022-01-09 Thread Yun Gao (Jira)
Yun Gao created FLINK-25584:
---

 Summary: Azure failed on install node and npm for Flink : Runtime 
web
 Key: FLINK-25584
 URL: https://issues.apache.org/jira/browse/FLINK-25584
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines, Runtime / Web Frontend
Affects Versions: 1.15.0
Reporter: Yun Gao


{code:java}
[INFO] --- frontend-maven-plugin:1.11.0:install-node-and-npm (install node and 
npm) @ flink-runtime-web ---
[INFO] Installing node version v12.14.1
[INFO] Downloading 
https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz to 
/__w/1/.m2/repository/com/github/eirslett/node/12.14.1/node-12.14.1-linux-x64.tar.gz
[INFO] No proxies configured
[INFO] No proxy was configured, downloading directly
[INFO] 
[INFO] Reactor Summary:

[ERROR] Failed to execute goal 
com.github.eirslett:frontend-maven-plugin:1.11.0:install-node-and-npm (install 
node and npm) on project flink-runtime-web: Could not download Node.js: Could 
not download https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz: 
Remote host terminated the handshake: SSL peer shut down incorrectly -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-runtime-web
{code}

(The Reactor Summary is omitted)




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25583) [FLIP-191] Support compacting small files for FileSink

2022-01-09 Thread Yun Gao (Jira)
Yun Gao created FLINK-25583:
---

 Summary: [FLIP-191] Support compacting small files for FileSink
 Key: FLINK-25583
 URL: https://issues.apache.org/jira/browse/FLINK-25583
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / FileSystem
Reporter: Yun Gao
 Fix For: 1.15.0


Based on the extended sink API 
(https://issues.apache.org/jira/browse/FLINK-2), we can now support 
compacting small files in the FileSink. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Releasing Flink 1.14.3

2022-01-09 Thread Xingbo Huang
Hi Thomas,

Since wheel packages are generated for multiple Python versions on both
macOS and Linux, building them locally requires multiple machines with
different OS and Python environments. I have triggered the wheel package
build of release-1.14.3-rc1 on my private Azure [1]; you can download the
wheels once the build succeeds.

[1]
https://dev.azure.com/hxbks2ks/FLINK-TEST/_build/results?buildId=1704=results

Best,
Xingbo

Thomas Weise wrote on Mon, Jan 10, 2022 at 11:12:

> Hi Martijn,
>
> I started building the release artifacts. The Maven part is ready.
> Currently blocked on the Azure build for the PyFlink wheel packages.
>
> I had to submit a "Azure DevOps Parallelism Request" and that might
> take a couple of days.
>
> Does someone have the steps to build the wheels locally?
> Alternatively, if someone can build them on their existing setup and
> point me to the result, that would speed up things as well.
>
> The release branch:
> https://github.com/apache/flink/tree/release-1.14.3-rc1
>
> Thanks,
> Thomas
>
> On Thu, Jan 6, 2022 at 9:14 PM Martijn Visser 
> wrote:
> >
> > Hi Thomas,
> >
> > Thanks for volunteering! There was no volunteer yet, so would be great if
> > you could help out.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Fri, Jan 7, 2022 at 01:54, Thomas Weise wrote:
> >
> > > Hi Martijn,
> > >
> > > Thanks for preparing the release. Did a volunteer check in with you?
> > > If not, I would like to take this up.
> > >
> > > Thomas
> > >
> > > On Mon, Dec 27, 2021 at 7:11 AM Martijn Visser 
> > > wrote:
> > > >
> > > > Thank you all! That means that there's currently no more blocker to
> start
> > > > with the Flink 1.14.3 release.
> > > >
> > > > The only thing that's needed is a committer that's willing to follow
> the
> > > > release process [1] Any volunteers?
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > [1]
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> > > >
> > > > On Mon, 27 Dec 2021 at 03:17, Qingsheng Ren 
> wrote:
> > > >
> > > > > Hi Martjin,
> > > > >
> > > > > FLINK-25132 has been merged to master and release-1.14.
> > > > >
> > > > > Thanks for your work for releasing 1.14.3!
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Qingsheng Ren
> > > > >
> > > > > > On Dec 26, 2021, at 3:46 PM, Konstantin Knauf  >
> > > wrote:
> > > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > FLINK-25375 is merged to release-1.14.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > > On Wed, Dec 22, 2021 at 12:02 PM David Morávek 
> > > wrote:
> > > > > >
> > > > > >> Hi Martijn, FLINK-25271 has been merged to 1.14 branch.
> > > > > >>
> > > > > >> Best,
> > > > > >> D.
> > > > > >>
> > > > > >> On Wed, Dec 22, 2021 at 7:27 AM 任庆盛  wrote:
> > > > > >>
> > > > > >>> Hi Martjin,
> > > > > >>>
> > > > > >>> Thanks for the effort on Flink 1.14.3. FLINK-25132 has been
> merged
> > > on
> > > > > >>> master and is waiting for CI on release-1.14. I think it can be
> > > closed
> > > > > >>> today.
> > > > > >>>
> > > > > >>> Cheers,
> > > > > >>>
> > > > > >>> Qingsheng Ren
> > > > > >>>
> > > > >  On Dec 21, 2021, at 6:26 PM, Martijn Visser <
> > > mart...@ververica.com>
> > > > > >>> wrote:
> > > > > 
> > > > >  Hi everyone,
> > > > > 
> > > > >  I'm restarting this thread [1] with a new subject, given that
> > > Flink
> > > > > >>> 1.14.1 was a (cancelled) emergency release for the Log4j
> update and
> > > > > we've
> > > > > >>> released Flink 1.14.2 as an emergency release for Log4j updates
> > > [2].
> > > > > 
> > > > >  To give an update, this is the current blocker for Flink
> 1.14.3:
> > > > > 
> > > > >  * https://issues.apache.org/jira/browse/FLINK-25132 -
> KafkaSource
> > > > > >>> cannot work with object-reusing DeserializationSchema -> @
> > > > > >>> renqs...@gmail.com can you provide an ETA for this ticket?
> > > > > 
> > > > >  There are two critical tickets open for Flink 1.14.3. That
> means
> > > that
> > > > > >> if
> > > > > >>> the above ticket is resolved, these two will not block the
> > > release. If
> > > > > we
> > > > > >>> can merge them in before the above ticket is completed, that's
> a
> > > bonus.
> > > > > 
> > > > >  * https://issues.apache.org/jira/browse/FLINK-25199 -
> fromValues
> > > does
> > > > > >>> not emit final MAX watermark -> @Marios Trivyzas any update or
> > > thoughts
> > > > > >> on
> > > > > >>> this?
> > > > >  * https://issues.apache.org/jira/browse/FLINK-25227 -
> Comparing
> > > the
> > > > > >>> equality of the same (boxed) numeric values returns false ->
> > > @Caizhi
> > > > > Weng
> > > > > >>> any update or thoughts on this?
> > > > > 
> > > > >  Best regards,
> > > > > 
> > > > >  Martijn
> > > > > 
> > > > >  [1]
> > > https://lists.apache.org/thread/r0xhs9x01k8hnm0hyq2kk4ptrhkzgdw9
> > > > >  [2]
> > > > > 

Re: [DISCUSS] Deprecate MapR FS

2022-01-09 Thread Yun Tang
+1 for dropping the MapR Fs.

Best
Yun Tang

From: Till Rohrmann 
Sent: Wednesday, January 5, 2022 18:33
To: Martijn Visser 
Cc: David Morávek ; dev ; Seth Wiesman 
; User 
Subject: Re: [DISCUSS] Deprecate MapR FS

+1 for dropping the MapR FS.

Cheers,
Till

On Wed, Jan 5, 2022 at 10:11 AM Martijn Visser 
mailto:mart...@ververica.com>> wrote:
Hi everyone,

Thanks for your input. I've checked the MapR implementation and it has no 
annotation at all. Given the circumstances that we thought that MapR was 
already dropped, I would propose to immediately remove MapR in Flink 1.15 
instead of first marking it as deprecated and removing it in Flink 1.16.

Please let me know what you think.

Best regards,

Martijn

On Thu, 9 Dec 2021 at 17:27, David Morávek 
mailto:d...@apache.org>> wrote:
+1, agreed with Seth's reasoning. There has been no real activity in MapR FS 
module for years [1], so the eventual users should be good with using the jars 
from the older Flink versions for quite some time

[1] 
https://github.com/apache/flink/commits/master/flink-filesystems/flink-mapr-fs

Best,
D.

On Thu, Dec 9, 2021 at 4:28 PM Konstantin Knauf 
mailto:kna...@apache.org>> wrote:
+1 (what Seth said)

On Thu, Dec 9, 2021 at 4:15 PM Seth Wiesman 
mailto:sjwies...@gmail.com>> wrote:

> +1
>
> I actually thought we had already dropped this FS. If anyone is still
> relying on it in production, the file system abstraction in Flink has been
> incredibly stable over the years. They should be able to use the 1.14 MapR
> FS with later versions of Flink.
>
> Seth
>
> On Wed, Dec 8, 2021 at 10:03 AM Martijn Visser 
> mailto:mart...@ververica.com>>
> wrote:
>
>> Hi all,
>>
>> Flink supports multiple file systems [1] which includes MapR FS. MapR as
>> a company doesn't exist anymore since 2019, the technology and intellectual
>> property has been sold to Hewlett Packard.
>>
>> I don't think that there's anyone who's using MapR anymore and therefore
>> I think it would be good to deprecate this for Flink 1.15 and then remove
>> it in Flink 1.16. Removing this from Flink will slightly shrink the
>> codebase and CI runtime.
>>
>> I'm also cross posting this to the User mailing list, in case there's
>> still anyone who's using MapR.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
>>
>

--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


Re: [DISCUSS] Add the ability of evolution RowData in sql/table job

2022-01-09 Thread Aitozi
Hi wenlong
  Thanks for your helpful suggestions, I will prepare the design doc
these days to outline the implementation plan for further discussion.

Best,
Aitozi

wenlong.lwl wrote on Mon, Jan 10, 2022 at 10:07:

> Hi, Aitozi,
> thanks for bringing up the discussion on the state evolution of flink sql.
> It would be a great improvement on flink sql.
> I think it would be better if you could prepare a document providing more
> details about the solution,
> it would be a big story and huge change and we need to discuss it
> comprehensively.
>
> Some important points:
> 1. how is the digest of each column generated?
> 2. how can we generate a serializer with multiple versions? do we need the
> previous SQL when compiling a new version?
> 3. what is the semantic when there are newly added aggregation columns, and
> when there are cascading aggregations?
> 4. can the design be extended in the future to support case 1?
>
> Best,
> Wenlong
>
>
> On Sun, 9 Jan 2022 at 17:38, Aitozi  wrote:
>
> > Hi all:
> >  When we use Flink SQL to develop jobs, we encounter a big problem:
> > the state may become incompatible after changing the SQL. It is mainly
> > caused by two cases:
> >
> > 1. The number of operators may change, so the state of an operator
> > cannot be mapped to the previous state.
> > 2. The format of the state value may change, e.g. caused by
> > adding/removing a column of an aggregation operator.
> >
> > In this discussion, I want to propose to solve case two by introducing
> > a mechanism of column digests for the RowData.
> >
> > 1. In the SQL job translation phase, we generate a digest for each
> > column of the RowData.
> > 2. We create a new serializer, maybe called MergeableRowDataSerializer,
> > which includes the column digests for the RowData.
> > 3. We generate an int version number for each serializer and add a
> > version header to the serialized data. We can rely on the version
> > number to choose the suitable serializer to read old data.
> > 4. We store multi-version serializers in the checkpoint during
> > evolution, so that we can support lazy deserialization, inspired by the
> > Avro framework. In this way, we can avoid a full transformation of old
> > data during restoring (which may cost much time).
> > 5. We can also drop old serializer versions after the TTL of the state.
> >
> > We have applied this implementation in our internal version (at Ant
> > Financial). We are looking forward to giving this back to the Flink
> > repo and to suggestions from the community.
> >
> > Best wishes
> > Aitozi
> >
>
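The version-header idea in point 3 of the quoted proposal can be sketched like this (plain Python, not Flink code; the wire format and helper names are invented for illustration):

```python
import json
import struct

def write_v1(row):
    return json.dumps([row["id"], row["cnt"]]).encode()

def read_v1(payload):
    vid, cnt = json.loads(payload)
    return {"id": vid, "cnt": cnt}

def write_v2(row):  # v2 of the SQL added an "avg" column
    return json.dumps([row["id"], row["cnt"], row["avg"]]).encode()

def read_v2(payload):
    vid, cnt, avg = json.loads(payload)
    return {"id": vid, "cnt": cnt, "avg": avg}

# Multi-version registry: old serializers are kept so old state stays readable.
SERIALIZERS = {1: (write_v1, read_v1), 2: (write_v2, read_v2)}

def serialize(row, version):
    # 4-byte big-endian version header, then the payload.
    return struct.pack(">i", version) + SERIALIZERS[version][0](row)

def deserialize(data):
    version, = struct.unpack(">i", data[:4])
    return version, SERIALIZERS[version][1](data[4:])

old = serialize({"id": 7, "cnt": 3}, version=1)   # written before the SQL change
new = serialize({"id": 7, "cnt": 3, "avg": 1.5}, version=2)
```

This also shows how lazy deserialization could work: each stored record is only re-read with its own version's serializer when it is accessed, rather than rewriting all state up front on restore.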


Re: [Ask for review]

2022-01-09 Thread Aitozi
Hi wenlong,
  Thanks for your help first, we can keep the discussion in jira.

Best,
Aitozi

wenlong.lwl wrote on Mon, Jan 10, 2022 at 09:43:

> hi, Aitozi,  thanks for the contribution first. I checked the issue, it
> seems that the discussion is blocked and hasn't finished yet. Anyway, I
> would like to join and help review the solution and pr.
>
> Best,
> Wenlong
>
>
> On Sun, 9 Jan 2022 at 16:53, Aitozi  wrote:
>
> > Hi community:
> > I have created a PR  to improve window performance in some scenarios,
> > but it seems to have lacked review for a long time. Could any SQL
> > folks help take a look and give me some feedback?
> > Thanks in advance :)
> >
> > Aitozi
> >
>


Re: [DISCUSS] Releasing Flink 1.14.3

2022-01-09 Thread Thomas Weise
Hi Martijn,

I started building the release artifacts. The Maven part is ready.
Currently blocked on the Azure build for the PyFlink wheel packages.

I had to submit an "Azure DevOps Parallelism Request" and that might
take a couple of days.

Does someone have the steps to build the wheels locally?
Alternatively, if someone can build them on their existing setup and
point me to the result, that would speed things up as well.

The release branch: https://github.com/apache/flink/tree/release-1.14.3-rc1

Thanks,
Thomas

On Thu, Jan 6, 2022 at 9:14 PM Martijn Visser  wrote:
>
> Hi Thomas,
>
> Thanks for volunteering! There was no volunteer yet, so would be great if
> you could help out.
>
> Best regards,
>
> Martijn
>
> On Fri, 7 Jan 2022 at 01:54, Thomas Weise  wrote:
>
> > Hi Martijn,
> >
> > Thanks for preparing the release. Did a volunteer check in with you?
> > If not, I would like to take this up.
> >
> > Thomas
> >
> > On Mon, Dec 27, 2021 at 7:11 AM Martijn Visser 
> > wrote:
> > >
> > > Thank you all! That means that there are currently no more blockers to
> > > start with the Flink 1.14.3 release.
> > >
> > > The only thing that's needed is a committer who's willing to follow the
> > > release process [1]. Any volunteers?
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> > >
> > > On Mon, 27 Dec 2021 at 03:17, Qingsheng Ren  wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > FLINK-25132 has been merged to master and release-1.14.
> > > >
> > > > Thanks for your work for releasing 1.14.3!
> > > >
> > > > Cheers,
> > > >
> > > > Qingsheng Ren
> > > >
> > > > > On Dec 26, 2021, at 3:46 PM, Konstantin Knauf 
> > wrote:
> > > > >
> > > > > Hi Martijn,
> > > > >
> > > > > FLINK-25375 is merged to release-1.14.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > On Wed, Dec 22, 2021 at 12:02 PM David Morávek 
> > wrote:
> > > > >
> > > > >> Hi Martijn, FLINK-25271 has been merged to 1.14 branch.
> > > > >>
> > > > >> Best,
> > > > >> D.
> > > > >>
> > > > >> On Wed, Dec 22, 2021 at 7:27 AM 任庆盛  wrote:
> > > > >>
> > > > >>> Hi Martijn,
> > > > >>>
> > > > >>> Thanks for the effort on Flink 1.14.3. FLINK-25132 has been merged
> > on
> > > > >>> master and is waiting for CI on release-1.14. I think it can be
> > closed
> > > > >>> today.
> > > > >>>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Qingsheng Ren
> > > > >>>
> > > >  On Dec 21, 2021, at 6:26 PM, Martijn Visser <
> > mart...@ververica.com>
> > > > >>> wrote:
> > > > 
> > > >  Hi everyone,
> > > > 
> > > >  I'm restarting this thread [1] with a new subject, given that
> > Flink
> > > > >>> 1.14.1 was a (cancelled) emergency release for the Log4j update and
> > > > we've
> > > > >>> released Flink 1.14.2 as an emergency release for Log4j updates
> > [2].
> > > > 
> > > >  To give an update, this is the current blocker for Flink 1.14.3:
> > > > 
> > > >  * https://issues.apache.org/jira/browse/FLINK-25132 - KafkaSource
> > > > >>> cannot work with object-reusing DeserializationSchema -> @
> > > > >>> renqs...@gmail.com can you provide an ETA for this ticket?
> > > > 
> > > >  There are two critical tickets open for Flink 1.14.3. That means
> > that
> > > > >> if
> > > > >>> the above ticket is resolved, these two will not block the
> > release. If
> > > > we
> > > > >>> can merge them in before the above ticket is completed, that's a
> > bonus.
> > > > 
> > > >  * https://issues.apache.org/jira/browse/FLINK-25199 - fromValues
> > does
> > > > >>> not emit final MAX watermark -> @Marios Trivyzas any update or
> > thoughts
> > > > >> on
> > > > >>> this?
> > > >  * https://issues.apache.org/jira/browse/FLINK-25227 - Comparing
> > the
> > > > >>> equality of the same (boxed) numeric values returns false ->
> > @Caizhi
> > > > Weng
> > > > >>> any update or thoughts on this?
> > > > 
> > > >  Best regards,
> > > > 
> > > >  Martijn
> > > > 
> > > >  [1]
> > https://lists.apache.org/thread/r0xhs9x01k8hnm0hyq2kk4ptrhkzgdw9
> > > >  [2]
> > > > https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html
> > > > 
> > > >  On Thu, 9 Dec 2021 at 17:21, David Morávek 
> > wrote:
> > > >  Hi Martijn, I've just opened a backport PR [1] for FLINK-23946
> > [2].
> > > > 
> > > >  [1] https://github.com/apache/flink/pull/18066
> > > >  [2] https://issues.apache.org/jira/browse/FLINK-23946
> > > > 
> > > >  Best,
> > > >  D.
> > > > 
> > > >  On Thu, Dec 9, 2021 at 4:59 PM Fabian Paul 
> > wrote:
> > > >  Actually I meant
> > https://issues.apache.org/jira/browse/FLINK-25126
> > > >  sorry for the confusion.
> > > > 
> > > >  On Thu, Dec 9, 2021 at 4:55 PM Fabian Paul 
> > wrote:
> > > > >
> > > > > Hi Martijn,
> > > > >
> > > > > I just opened the backport for
> > > > > 

Re: [VOTE] Create a separate sub project for FLIP-188: flink-store

2022-01-09 Thread Jingsong Li
Hi David, thanks for your suggestion.

I think we should re-use as many common components with connectors as
possible. I don't fully understand what you mean, but for this project
I prefer to use Maven rather than Gradle.

Best,
Jingsong

On Fri, Jan 7, 2022 at 11:59 PM David Morávek  wrote:
>
> +1 for the separate repository under the Flink umbrella
>
> as we've already started creating more repositories with connectors, would
> it be possible to re-use the same build infrastructure for this one? (eg.
> shared set of Gradle plugins that unify the build experience)?
>
> Best,
> D.
>
> On Fri, Jan 7, 2022 at 11:31 AM Jingsong Li  wrote:
>
> > For more references on `store` and `storage`:
> >
> > For example,
> >
> > Rocksdb is a library that provides an embeddable, persistent key-value
> > store for fast storage. [1]
> >
> > Apache HBase [1] is an open-source, distributed, versioned,
> > column-oriented store modeled after Google's Bigtable. [2]
> >
> > [1] https://github.com/facebook/rocksdb
> > [2] https://github.com/apache/hbase
> >
> > Best,
> > Jingsong
> >
> > On Fri, Jan 7, 2022 at 6:17 PM Jingsong Li  wrote:
> > >
> > > Thanks all,
> > >
> > > Combining everyone's comments, I recommend using `flink-table-store`:
> > >
> > > ## table
> > > Something to do with table storage (from Till): not only flink-table,
> > > but also user-oriented tables.
> > >
> > > ## store vs storage
> > > - First, I think "store" is easier to pronounce: "storage" has two
> > > syllables while "store" has one.
> > > - Yes, "store" can also mean a shop. But I find the English polysemy
> > > quite interesting: a store is a place to store various items, which
> > > also nicely conveys that we want to do data storage.
> > > - My impression is that "storage" suggests a physical object or
> > > abstract concept, while "store" suggests a software application or
> > > entity.
> > >
> > > So I prefer `flink-table-store`. What do you think?
> > >
> > > (@_@ Naming is too difficult)
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Fri, Jan 7, 2022 at 5:37 PM Konstantin Knauf 
> > wrote:
> > > >
> > > > +1 to a separate repository, assuming this repository will still be
> > > > part of Apache Flink (same PMC, committers). I am not aware that we
> > > > have something like "sub-projects" officially.
> > > >
> > > > I share Till and Timo's concerns regarding "store".
> > > >
> > > > On Fri, Jan 7, 2022 at 9:59 AM Till Rohrmann 
> > wrote:
> > > >
> > > > > +1 for the separate project.
> > > > >
> > > > > I would agree that flink-store is not the best name. flink-storage >
> > > > > flink-store but I would even more prefer a name that conveys that it
> > has
> > > > > something to do with table storage.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Jan 7, 2022 at 9:14 AM Timo Walther 
> > wrote:
> > > > >
> > > > > > +1 for the separate project
> > > > > >
> > > > > > But maybe use `flink-storage` instead of `flink-store`?
> > > > > >
> > > > > > I'm not a native speaker but store is defined as "A place where
> > items
> > > > > > may be purchased.". It almost sounds like the `flink-packages`
> > project.
> > > > > >
> > > > > > Regards,
> > > > > > Timo
> > > > > >
> > > > > >
> > > > > > On 07.01.22 08:37, Jingsong Li wrote:
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to start a vote to create a separate sub-project for
> > > > > > > FLIP-188 [1]: `flink-store`.
> > > > > > >
> > > > > > > - If you agree with the name `flink-store`, please just +1.
> > > > > > > - If you have a better suggestion, please reply with it; others
> > > > > > > can then +1 a name that has already been proposed.
> > > > > > > - If you do not want it to be a subproject of Flink, just -1.
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours unless there is an
> > > > > > > objection or not enough votes.
> > > > > > >
> > > > > > > [1]
> > > > > >
> > > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Konstantin Knauf
> > > >
> > > > https://twitter.com/snntrable
> > > >
> > > > https://github.com/knaufk
> > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> >
> >
> >
> > --
> > Best, Jingsong Lee
> >



-- 
Best, Jingsong Lee


[jira] [Created] (FLINK-25582) flink sql kafka source cannot specify custom parallelism

2022-01-09 Thread venn wu (Jira)
venn wu created FLINK-25582:
---

 Summary: flink sql kafka source cannot specify custom parallelism
 Key: FLINK-25582
 URL: https://issues.apache.org/jira/browse/FLINK-25582
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka, Table SQL / API
Affects Versions: 1.14.0, 1.13.0
Reporter: venn wu


Support custom parallelism for the Flink SQL Kafka source.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Add the ability of evolution RowData in sql/table job

2022-01-09 Thread wenlong.lwl
Hi Aitozi,
thanks for bringing up the discussion on the state evolution of Flink SQL.
It would be a great improvement to Flink SQL.
I think it would be better if you could prepare a document providing more
details about the solution; this would be a big story and a huge change,
and we need to discuss it comprehensively.

Some important points:
1. How is the digest of each column generated?
2. How can we generate a serializer with multiple versions? Do we need the
previous SQL when compiling a new version?
3. What are the semantics when there are newly added aggregation columns,
or when there are cascading aggregations?
4. Can the design be extended in the future to support case 1?

Best,
Wenlong


On Sun, 9 Jan 2022 at 17:38, Aitozi  wrote:

> Hi all:
>  When we use Flink SQL to develop jobs, we encounter a big problem: the
> state may become incompatible after changing the SQL. This is mainly
> caused by two cases:
>
> 1. The number of operators may change, so the state of an operator can no
> longer be mapped to its previous state.
> 2. The format of the state value may change, for example caused by
> adding/removing a column of an aggregation operator.
>
> In this discussion, I want to propose to solve case two by introducing a
> mechanism of column digests for RowData.
>
> 1. In the SQL job translation phase, we generate a digest for each column
> of the RowData.
> 2. We create a new serializer, maybe called MergeableRowDataSerializer,
> which includes the column digests for the RowData.
> 3. We generate an int version number for each serializer, and add a
> version header to the serialized data. We can rely on the version number
> to choose the suitable serializer to read the old data.
> 4. We store multi-version serializers in the checkpoint during evolution,
> so that we can support lazy deserialization, inspired by the Avro
> framework. In this way, we can avoid fully rewriting the old data during
> restore (which may cost much time).
> 5. We can also drop old serializer versions after the state TTL expires.
>
> We have applied this implementation in our internal version at Ant
> Financial; we are looking forward to giving this back to the Flink repo,
> and we look forward to suggestions from the community.
>
> Best wishes,
> Aitozi
>


Flink SQL Kafka connector

2022-01-09 Thread 聂荧屏
Hello,

Is there any plan to develop a batch mode for the Flink SQL Kafka connector?

I would like to use the Kafka connector for daily/hourly/minute-by-minute
statistics, but it currently only supports streaming mode, and the Kafka
options only allow setting a start offset, not an end offset.


This is based on Flink 1.14.2.


Thanks for reading and I look forward to your reply.

Thank you very much
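
[Context note, not part of the original mail: in Flink 1.14 the SQL Kafka
connector indeed only exposes startup options such as 'scan.startup.mode'.
Bounded reading was added to the SQL connector in a later release via a
'scan.bounded.mode' option; the sketch below assumes a connector version
that supports it, and option names may differ in your release.]

```sql
-- Sketch of a bounded Kafka table for hourly/daily batch statistics.
-- Assumes a connector version supporting 'scan.bounded.mode'
-- (Flink 1.14 only supports the startup side).
CREATE TABLE orders (
  order_id BIGINT,
  amount   DOUBLE,
  ts       TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'daily-stats',
  'format' = 'json',
  'scan.startup.mode' = 'timestamp',
  'scan.startup.timestamp-millis' = '1641686400000',  -- window start
  'scan.bounded.mode' = 'timestamp',
  'scan.bounded.timestamp-millis' = '1641772800000'   -- window end
);
```

With both a start and an end position configured, the source finishes once
it reaches the bound, so the query can run in batch execution mode.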


Re: [Ask for review]

2022-01-09 Thread wenlong.lwl
Hi Aitozi, thanks for the contribution. I checked the issue; it seems
that the discussion stalled and hasn't finished yet. Anyway, I would like
to join and help review the solution and the PR.

Best,
Wenlong


On Sun, 9 Jan 2022 at 16:53, Aitozi  wrote:

> Hi community:
> I have created a PR  to improve window performance in some scenarios, but
> it seems to have lacked review for a long time. Could any SQL folks help
> take a look and give me some feedback?
> Thanks in advance :)
>
> Aitozi
>


[jira] [Created] (FLINK-25581) Jobmanager does not archive completed jobs

2022-01-09 Thread Leonid Ilyevsky (Jira)
Leonid Ilyevsky created FLINK-25581:
---

 Summary: Jobmanager does not archive completed jobs
 Key: FLINK-25581
 URL: https://issues.apache.org/jira/browse/FLINK-25581
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.13.5
 Environment: RHEL 7
Reporter: Leonid Ilyevsky


Jobmanager does not archive completed jobs.

I configured the upload directory like this:

jobmanager.archive.fs.dir: file:///liquidnet/shared/flink/completed-jobs

 

After the job was completed, nothing appeared in that directory.

The job info was visible in the jobmanager console for one hour, then it
disappeared, and there were still no files in the configured directory.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[DISCUSS] Add the ability of evolution RowData in sql/table job

2022-01-09 Thread Aitozi
Hi all:
 When we use Flink SQL to develop jobs, we encounter a big problem: the
state may become incompatible after changing the SQL. This is mainly caused
by two cases:

1. The number of operators may change, so the state of an operator can no
longer be mapped to its previous state.
2. The format of the state value may change, for example caused by
adding/removing a column of an aggregation operator.

In this discussion, I want to propose to solve case two by introducing a
mechanism of column digests for RowData.

1. In the SQL job translation phase, we generate a digest for each column
of the RowData.
2. We create a new serializer, maybe called MergeableRowDataSerializer,
which includes the column digests for the RowData.
3. We generate an int version number for each serializer, and add a
version header to the serialized data. We can rely on the version number to
choose the suitable serializer to read the old data.
4. We store multi-version serializers in the checkpoint during evolution,
so that we can support lazy deserialization, inspired by the Avro
framework. In this way, we can avoid fully rewriting the old data during
restore (which may cost much time).
5. We can also drop old serializer versions after the state TTL expires.

We have applied this implementation in our internal version at Ant
Financial; we are looking forward to giving this back to the Flink repo,
and we look forward to suggestions from the community.

Best wishes,
Aitozi
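
[Context note, not part of the original mail: steps 3-4 of the proposal
can be sketched outside Flink's actual TypeSerializer API. The names below
(RowSerializer, SerializerRegistry) are illustrative assumptions, not
proposed Flink classes; the point is only the version-header framing and
version-dispatched reads.]

```python
import struct

class RowSerializer:
    """Toy row serializer: prefixes every record with a 4-byte version."""

    def __init__(self, version, column_digests):
        self.version = version
        # e.g. a digest per column derived from (name, type) at planning time
        self.column_digests = column_digests

    def serialize(self, row):
        payload = ",".join(str(v) for v in row).encode()
        return struct.pack(">I", self.version) + payload

    def deserialize(self, data):
        return data[4:].decode().split(",")

class SerializerRegistry:
    """Keeps every serializer version seen in checkpoints, so old
    records can be read lazily with the serializer that wrote them."""

    def __init__(self):
        self._by_version = {}

    def register(self, serializer):
        self._by_version[serializer.version] = serializer

    def read(self, data):
        # Dispatch on the version header to pick the matching reader.
        version, = struct.unpack(">I", data[:4])
        return self._by_version[version].deserialize(data)

registry = SerializerRegistry()
v1 = RowSerializer(1, ["id:INT", "cnt:BIGINT"])
v2 = RowSerializer(2, ["id:INT", "cnt:BIGINT", "avg:DOUBLE"])
registry.register(v1)
registry.register(v2)

old_bytes = v1.serialize([7, 42])        # written before the SQL change
new_bytes = v2.serialize([7, 42, 6.0])   # written after

print(registry.read(old_bytes))  # ['7', '42']
print(registry.read(new_bytes))  # ['7', '42', '6.0']
```

On restore, the registry dispatches on the header, so old entries are only
migrated when they are actually read -- the lazy deserialization mentioned
in step 4.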


[Ask for review]

2022-01-09 Thread Aitozi
Hi community:
I have created a PR  to improve window performance in some scenarios, but
it seems to have lacked review for a long time. Could any SQL folks help
take a look and give me some feedback?
Thanks in advance :)

Aitozi