[jira] [Created] (FLINK-31184) Failed to get python udf runner directory via running GET_RUNNER_DIR_SCRIPT

2023-02-22 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-31184:
-

 Summary: Failed to get python udf runner directory via running 
GET_RUNNER_DIR_SCRIPT 
 Key: FLINK-31184
 URL: https://issues.apache.org/jira/browse/FLINK-31184
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.16.1, 1.15.3, 1.17.0
Reporter: Wei Zhong


The following exception is thrown when using Python UDFs in user jobs:

 
{code:java}
Caused by: java.io.IOException: Cannot run program "ERROR: ld.so: object 
'/usr/lib64/libjemalloc.so.1' from LD_PRELOAD cannot be preloaded: ignored.
/mnt/ssd/0/yarn/nm-local-dir/usercache/flink/appcache/application_1670838323719_705777/python-dist-fe870981-4de7-4229-ad0b-f51881e80d90/python-archives/pipeline_venv_v5.tar.gz/lib/python3.7/site-packages/pyflink/bin/pyflink-udf-runner.sh":
 error=2, No such file or directory
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
  at 
org.apache.beam.runners.fnexecution.environment.ProcessManager.startProcess(ProcessManager.java:147)
  at 
org.apache.beam.runners.fnexecution.environment.ProcessManager.startProcess(ProcessManager.java:122)
  at 
org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:106)
  at 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:252)
  at 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:231)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
  at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
  ... 19 more
  Suppressed: java.lang.NullPointerException: Process for id does not exist: 1-1
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:895)
at 
org.apache.beam.runners.fnexecution.environment.ProcessManager.stopProcess(ProcessManager.java:172)
at 
org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:126)
... 29 more
Caused by: java.io.IOException: error=2, No such file or directory
  at java.lang.UNIXProcess.forkAndExec(Native Method)
  at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
  at java.lang.ProcessImpl.start(ProcessImpl.java:134)
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
  ... 32 more {code}
 

 

This is because SRE introduced an environment parameter:

 
{code:java}
LD_PRELOAD=/usr/lib64/libjemalloc.so.1 {code}
The Python process itself still executes normally, but an extra error message is 
printed, so the whole output looks like:
{code:java}
ERROR: ld.so: object '/usr/lib64/libjemalloc.so.1' from LD_PRELOAD cannot be 
preloaded: ignored.
/mnt/ssd/0/yarn/nm-local-dir/usercache/flink/appcache/application_1670838323719_705777/python-dist-fe870981-4de7-4229-ad0b-f51881e80d90/python-archives/pipeline_venv_v5.tar.gz/lib/python3.7/site-packages/pyflink/bin/{code}
The whole output is then treated as a command, which causes the exception above.

This shows that the script output is not reliable. We may need to find another way 
to transfer the runner directory, or to filter the output before using it, as sketched below.
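A minimal sketch of the filtering idea (a hypothetical helper, not the actual fix): keep 
only the last line of the script output that points to an existing path, so stray loader 
warnings are ignored.
{code:java}
import java.nio.file.Files;
import java.nio.file.Paths;

public class RunnerDirOutputFilter {

    // Returns the last non-empty line of the script output that refers to an
    // existing path, skipping e.g. LD_PRELOAD warnings printed by ld.so.
    public static String extractRunnerDir(String rawOutput) {
        String runnerDir = null;
        for (String line : rawOutput.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && Files.exists(Paths.get(trimmed))) {
                runnerDir = trimmed;
            }
        }
        if (runnerDir == null) {
            throw new IllegalStateException(
                    "No valid runner directory found in script output: " + rawOutput);
        }
        return runnerDir;
    }
}
{code}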

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Apache Flink Committer - Rui Fan

2023-02-20 Thread Wei Zhong
Congratulations, Rui!

Best,
Wei

> On Feb 21, 2023, at 1:52 PM, Shammon FY wrote:
> 
> Congratulations, Rui!
> 
> 
> Best,
> Shammon
> 
> On Tue, Feb 21, 2023 at 1:40 PM Sergey Nuyanzin  wrote:
> 
>> Congratulations, Rui!
>> 
>> On Tue, Feb 21, 2023 at 4:53 AM Weihua Hu  wrote:
>> 
>>> Congratulations, Rui!
>>> 
>>> Best,
>>> Weihua
>>> 
>>> 
>>> On Tue, Feb 21, 2023 at 11:28 AM Biao Geng  wrote:
>>> 
 Congrats, Rui!
 Best,
 Biao Geng
 
 weijie guo wrote on Tue, Feb 21, 2023 at 11:21:
 
> Congrats, Rui!
> 
> Best regards,
> 
> Weijie
> 
> 
> Leonard Xu wrote on Tue, Feb 21, 2023 at 11:03:
> 
>> Congratulations, Rui!
>> 
>> Best,
>> Leonard
>> 
>>> On Feb 21, 2023, at 9:50 AM, Matt Wang  wrote:
>>> 
>>> Congrats Rui
>>> 
>>> 
>>> --
>>> 
>>> Best,
>>> Matt Wang
>>> 
>>> 
>>>  Replied Message 
>>> | From | yuxia |
>>> | Date | 02/21/2023 09:22 |
>>> | To | dev |
>>> | Subject | Re: [ANNOUNCE] New Apache Flink Committer - Rui Fan |
>>> Congrats Rui
>>> 
>>> Best regards,
>>> Yuxia
>>> 
>>> - Original Message -
>>> From: "Samrat Deb"
>>> To: "dev"
>>> Sent: Tuesday, Feb 21, 2023, 1:09:25 AM
>>> Subject: Re: [ANNOUNCE] New Apache Flink Committer - Rui Fan
>>> 
>>> Congrats Rui
>>> 
>>> On Mon, 20 Feb 2023 at 10:28 PM, Anton Kalashnikov <kaa@yandex.com>
>>> wrote:
>>> 
>>> Congrats Rui!
>>> 
>>> --
>>> Best regards,
>>> Anton Kalashnikov
>>> 
>>> On 20.02.23 17:53, Matthias Pohl wrote:
>>> Congratulations, Rui :)
>>> 
>>> On Mon, Feb 20, 2023 at 5:10 PM Jing Ge
>>> wrote:
>>> 
>>> Congrats Rui!
>>> 
>>> On Mon, Feb 20, 2023 at 3:19 PM Piotr Nowojski <pnowoj...@apache.org>
>>> wrote:
>>> 
>>> Hi, everyone
>>> 
>>> On behalf of the PMC, I'm very happy to announce Rui Fan as a new
 Flink
>>> Committer.
>>> 
>>> Rui Fan has been active on a small scale since August 2019, and
 ramped
>>> up
>>> his contributions in the 2nd half of 2021. He was mostly involved
>>> in
>>> quite
>>> demanding performance related work around the network stack and
>>> checkpointing, like re-using TCP connections [1], and many
>> crucial
>>> improvements to the unaligned checkpoints. Among others:
>> FLIP-227:
>>> Support
>>> overdraft buffer [2], Merge small ChannelState file for Unaligned
>>> Checkpoint [3], Timeout aligned to unaligned checkpoint barrier
>> in
 the
>>> output buffers [4].
>>> 
>>> Please join me in congratulating Rui Fan for becoming a Flink
>>> Committer!
>>> 
>>> Best,
>>> Piotr Nowojski (on behalf of the Flink PMC)
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-22643
>>> [2]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
>>> [3] https://issues.apache.org/jira/browse/FLINK-26803
>>> [4] https://issues.apache.org/jira/browse/FLINK-27251
>> 
>> --
>> Best regards,
>> Sergey
>> 



[jira] [Created] (FLINK-30245) NPE thrown when filtering decimal(18, 4) values after calling DecimalDataUtils.subtract method

2022-11-29 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-30245:
-

 Summary: NPE thrown when filtering decimal(18, 4) values after 
calling DecimalDataUtils.subtract method
 Key: FLINK-30245
 URL: https://issues.apache.org/jira/browse/FLINK-30245
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.13.6, 1.17.0
Reporter: Wei Zhong
 Attachments: image-2022-11-30-15-11-03-706.png

Reproduce code:
{code:java}
TableEnvironment tableEnv = 
TableEnvironment.create(EnvironmentSettings.newInstance().build());

tableEnv.executeSql("create table datagen_source1 (disburse_amount int) 
with ('connector' = 'datagen')");

tableEnv.executeSql("create table print_sink (disburse_amount 
Decimal(18,4)) with ('connector' = 'print')");

tableEnv.executeSql("create view mid as select cast(disburse_amount as 
Decimal(18,4)) - cast(disburse_amount as Decimal(18,4)) as disburse_amount from 
datagen_source1");

tableEnv.executeSql("insert into print_sink select * from mid where 
disburse_amount > 0 ").await();
{code}
Exception:
{code:java}
Exception in thread "main" java.util.concurrent.ExecutionException: 
org.apache.flink.table.api.TableException: Failed to wait job finish
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.table.api.internal.TableResultImpl.awaitInternal(TableResultImpl.java:118)
at 
org.apache.flink.table.api.internal.TableResultImpl.await(TableResultImpl.java:81)
at com.shopee.flink.BugExample2.main(BugExample2.java:21)
Caused by: org.apache.flink.table.api.TableException: Failed to wait job finish
at 
org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:85)
at 
org.apache.flink.table.api.internal.InsertResultProvider.isFirstRowReady(InsertResultProvider.java:71)
at 
org.apache.flink.table.api.internal.TableResultImpl.lambda$awaitInternal$1(TableResultImpl.java:105)
at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:83)
... 6 more
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
at 
org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
at 
org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:267)
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
at 
org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1277)
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at 
java.util.concurrent.C

Re: [ANNOUNCE] New Apache Flink PMC Members - Godfrey He, Xingbo Huang

2022-11-22 Thread Wei Zhong
Congratulations, Godfrey and Xingbo!

Best,
Wei

> On Nov 23, 2022, at 3:29 PM, Shuiqiang Chen wrote:
> 
> Congratulations, Godfrey and Xingbo!
> 
> Best,
> Shuiqiang
> 
> Runker Hanna wrote on Wed, Nov 23, 2022 at 15:28:
> 
>> Congrats, Godfrey and Xingbo!
>> 
>> Best,
>> Runkang He
>> 
>> Lincoln Lee wrote on Wed, Nov 23, 2022 at 15:21:
>> 
>>> Congrats, Godfrey and Xingbo!
>>> 
>>> Best,
>>> Lincoln Lee
>>> 
>>> 
>>> Benchao Li wrote on Wed, Nov 23, 2022 at 14:17:
>>> 
 Congratulations Godfrey and Xingbo, well deserved!
 
 Zakelly Lan wrote on Wed, Nov 23, 2022 at 14:11:
 
> Congratulations, Godfrey and Xingbo!
> 
> Best,
> Zakelly
> 
> On Wed, Nov 23, 2022 at 2:07 PM yuxia 
 wrote:
>> 
>> Congrats,Godfrey and Xingbo!
>> 
>> Best regards,
>> Yuxia
>> 
>> - Original Message -
>> From: "Geng Biao"
>> To: "dev"
>> Sent: Wednesday, Nov 23, 2022, 12:32:52 PM
>> Subject: Re: [ANNOUNCE] New Apache Flink PMC Members - Godfrey He,
>> Xingbo
> Huang
>> 
>> Congrats,Godfrey and Xingbo!
>> Best,
>> Biao Geng
>> 
>> Get Outlook for iOS
>> 
>> From: Dian Fu
>> Sent: Wednesday, November 23, 2022 12:18:01 PM
>> To: dev
>> 主题: [ANNOUNCE] New Apache Flink PMC Members - Godfrey He, Xingbo
>>> Huang
>> 
>> Hi everyone,
>> 
>> On behalf of the Apache Flink PMC, I'm very happy to announce that
> Godfrey
>> He and Xingbo Huang have joined the Flink PMC!
>> 
>> Godfrey He became a Flink committer in Sep 2020. His contributions
>>> are
>> mainly focused on the Flink table module, covering almost all
>>> important
>> parts such as Client(SQL Client, SQL gateway, JDBC driver, etc),
>> API,
 SQL
>> parser, query optimization, query execution, etc. Especially in the
 query
>> optimization part, he built the query optimization framework and
>> the
 cost
>> model, improved the statistics information and made a lot of
> optimizations,
>> e.g. dynamic partition pruning, join hint, multiple input rewrite,
>>> etc.
> In
>> addition, he has done a lot of groundwork, such as refactoring the
>> ExecNode(which is the basis for further DAG optimizations) and SQL
>>> Plan
>> JSON serialization (which is a big step to support SQL job version
>> upgrade). Besides that, he's also helping the projects in other
>> ways,
> e.g.
>> managing releases, giving talks, etc.
>> 
>> Xingbo Huang became a Flink committer in Feb 2021. His
>> contributions
 are
>> mainly focused on the PyFlink module and he's the author of many
> important
>> features in PyFlink, e.g. Cython support, Python thread execution
>>> mode,
>> Python UDTF support, Python UDAF support in windowing, etc. He is
>>> also
> one
>> of the main contributors in the Flink ML project. Besides that,
>> he's
 also
>> helping to manage releases, taking care of the build stabilities,
>> etc.
>> 
>> Congratulations and welcome them as Apache Flink PMC!
>> 
>> Regards,
>> Dian
> 
 
 
 --
 
 Best,
 Benchao Li
 
>>> 
>> 



Re: [ANNOUNCE] Apache Flink 1.16.0 released

2022-10-28 Thread Wei Zhong
Congratulations!

Thank you very much! Thanks Xingbo for managing the release.

Best,
Wei


> On Oct 28, 2022, at 6:10 PM, yuxia wrote:
> 
> Congratulations!
> 
> Best regards,
> Yuxia
> 
> - Original Message -
> From: "Márton Balassi"
> To: "dev"
> Sent: Friday, Oct 28, 2022, 5:59:42 PM
> Subject: Re: [ANNOUNCE] Apache Flink 1.16.0 released
> 
> Awesome, thanks team! Thanks Xingbo for managing the release.
> 
> On Fri, Oct 28, 2022 at 11:04 AM Biao Liu  wrote:
> 
>> Congrats! Glad to hear that.
>> 
>> BTW, I just found the document link of 1.16 from https://flink.apache.org/
>> is not correct.
>> 
>> [image: 截屏2022-10-28 17.01.28.png]
>> 
>> Thanks,
>> Biao /'bɪ.aʊ/
>> 
>> 
>> 
>> On Fri, 28 Oct 2022 at 14:46, Xingbo Huang  wrote:
>> 
>>> The Apache Flink community is very happy to announce the release of Apache
>>> Flink 1.16.0, which is the first release for the Apache Flink 1.16 series.
>>> 
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data
>>> streaming
>>> applications.
>>> 
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>> 
>>> Please check out the release blog post for an overview of the
>>> improvements for this release:
>>> https://flink.apache.org/news/2022/10/28/1.16-announcement.html
>>> 
>>> The full release notes are available in Jira:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351275
>>> 
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>> 
>>> Regards,
>>> Chesnay, Martijn, Godfrey & Xingbo
>>> 
>> 



[jira] [Created] (FLINK-28764) Support more than 64 distinct aggregate function calls in one aggregate SQL query

2022-08-01 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-28764:
-

 Summary: Support more than 64 distinct aggregate function calls in 
one aggregate SQL query
 Key: FLINK-28764
 URL: https://issues.apache.org/jira/browse/FLINK-28764
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.15.1, 1.14.5, 1.13.6
Reporter: Wei Zhong


Currently Flink SQL does not support more than 64 distinct aggregate function 
calls in one aggregate SQL query. We encountered this problem while migrating 
batch jobs from Spark to Flink. The Spark job has 79 distinct aggregate 
function calls in one aggregate SQL query.

Reproduce code:
{code:java}
public class Test64Distinct {
    public static void main(String[] args) {
        TableEnvironment tableEnv = 
TableEnvironment.create(EnvironmentSettings.inBatchMode());
        tableEnv.executeSql("create table datagen_source(id BIGINT, val BIGINT) 
with " +
                "('connector'='datagen', 'number-of-rows'='1000')");
        tableEnv.executeSql("select " +
                "count(distinct val * 1), " +
                "count(distinct val * 2), " +
                "count(distinct val * 3), " +
                "count(distinct val * 4), " +
                "count(distinct val * 5), " +
                "count(distinct val * 6), " +
                "count(distinct val * 7), " +
                "count(distinct val * 8), " +
                "count(distinct val * 9), " +
                "count(distinct val * 10), " +
                "count(distinct val * 11), " +
                "count(distinct val * 12), " +
                "count(distinct val * 13), " +
                "count(distinct val * 14), " +
                "count(distinct val * 15), " +
                "count(distinct val * 16), " +
                "count(distinct val * 17), " +
                "count(distinct val * 18), " +
                "count(distinct val * 19), " +
                "count(distinct val * 20), " +
                "count(distinct val * 21), " +
                "count(distinct val * 22), " +
                "count(distinct val * 23), " +
                "count(distinct val * 24), " +
                "count(distinct val * 25), " +
                "count(distinct val * 26), " +
                "count(distinct val * 27), " +
                "count(distinct val * 28), " +
                "count(distinct val * 29), " +
                "count(distinct val * 30), " +
                "count(distinct val * 31), " +
                "count(distinct val * 32), " +
                "count(distinct val * 33), " +
                "count(distinct val * 34), " +
                "count(distinct val * 35), " +
                "count(distinct val * 36), " +
                "count(distinct val * 37), " +
                "count(distinct val * 38), " +
                "count(distinct val * 39), " +
                "count(distinct val * 40), " +
                "count(distinct val * 41), " +
                "count(distinct val * 42), " +
                "count(distinct val * 43), " +
                "count(distinct val * 44), " +
                "count(distinct val * 45), " +
                "count(distinct val * 46), " +
                "count(distinct val * 47), " +
                "count(distinct val * 48), " +
                "count(distinct val * 49), " +
                "count(distinct val * 50), " +
                "count(distinct val * 51), " +
                "count(distinct val * 52), " +
                "count(distinct val * 53), " +
                "count(distinct val * 54), " +
                "count(distinct val * 55), " +
                "count(distinct val * 56), " +
                "count(distinct val * 57), " +
                "count(distinct val * 58), " +
                "count(distinct val * 59), " +
                "count(distinct val * 60), " +
                "count(distinct val * 61), " +
                "count(distinct val * 62), " +
                "count(distinct val * 63), " +
                "count(distinct val * 64), " +
                "count(distinct val * 65) from datagen_source").print();
    }
} {code}
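As an aside, the 65 calls above can also be generated programmatically; a minimal, 
semantically equivalent sketch (it fails with the same exception):
{code:java}
// Build the same query with a loop instead of 65 hand-written calls.
StringBuilder sql = new StringBuilder("select ");
for (int i = 1; i <= 65; i++) {
    sql.append("count(distinct val * ").append(i).append(i < 65 ? "), " : ") ");
}
sql.append("from datagen_source");
tableEnv.executeSql(sql.toString()).print();
{code}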
Exception:
{code:java}
Exception in thread "main" org.apache.flink.table.api.TableExc

[jira] [Created] (FLINK-27563) Resource Providers - Yarn doc page has minor display error

2022-05-10 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-27563:
-

 Summary:  Resource Providers - Yarn doc page has minor display 
error
 Key: FLINK-27563
 URL: https://issues.apache.org/jira/browse/FLINK-27563
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.15.0
Reporter: Wei Zhong
 Attachments: image-2022-05-10-17-44-37-241.png

doc link: 
[https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/yarn/]

screen shot:

!image-2022-05-10-17-44-37-241.png|width=811,height=301!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27438) SQL validation failed when constructing a map array

2022-04-28 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-27438:
-

 Summary: SQL validation failed when constructing a map array
 Key: FLINK-27438
 URL: https://issues.apache.org/jira/browse/FLINK-27438
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: Wei Zhong


Exception: 
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. Unsupported type when convertTypeToSpec: MAP
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:185)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:110)
    at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:237)
    at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105)
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:695)
    at 
org.apache.flink.table.planner.plan.stream.sql.LegacyTableFactoryTest.<init>(LegacyTableFactoryTest.java:35)
    at 
org.apache.flink.table.planner.plan.stream.sql.LegacyTableFactoryTest.main(LegacyTableFactoryTest.java:49)
Caused by: java.lang.UnsupportedOperationException: Unsupported type when 
convertTypeToSpec: MAP
    at 
org.apache.calcite.sql.type.SqlTypeUtil.convertTypeToSpec(SqlTypeUtil.java:1059)
    at 
org.apache.calcite.sql.type.SqlTypeUtil.convertTypeToSpec(SqlTypeUtil.java:1081)
    at 
org.apache.flink.table.planner.functions.utils.SqlValidatorUtils.castTo(SqlValidatorUtils.java:82)
    at 
org.apache.flink.table.planner.functions.utils.SqlValidatorUtils.adjustTypeForMultisetConstructor(SqlValidatorUtils.java:74)
    at 
org.apache.flink.table.planner.functions.utils.SqlValidatorUtils.adjustTypeForArrayConstructor(SqlValidatorUtils.java:39)
    at 
org.apache.flink.table.planner.functions.sql.SqlArrayConstructor.inferReturnType(SqlArrayConstructor.java:44)
    at org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:449)
    at org.apache.calcite.sql.SqlOperator.deriveType(SqlOperator.java:531)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5716)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5703)
    at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1736)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1727)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:421)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelectList(SqlValidatorImpl.java:4061)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3347)
    at 
org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
    at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:997)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:975)
    at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:232)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:952)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:704)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:180)
    ... 6 more {code}
How to reproduce:
{code:java}
tableEnv.executeSql("select array[map['A', 'AA'], map['B', 'BB'], map['C', 
CAST(NULL AS STRING)]] from (VALUES ('a'))").print(); {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] FLIP-206: Support PyFlink Runtime Execution in Thread Mode

2022-01-12 Thread Wei Zhong
+1(binding)

Best,
Wei

> On Jan 12, 2022, at 5:58 PM, Xingbo Huang wrote:
> 
> Hi all,
> 
> I would like to start the vote for FLIP-206[1], which was discussed and
> reached a consensus in the discussion thread[2].
> 
> The vote will be open for at least 72h, unless there is an objection or not
> enough votes.
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-206%3A+Support+PyFlink+Runtime+Execution+in+Thread+Mode
> [2] https://lists.apache.org/thread/c7d2mt1vh8v11x2sl8slm4sy9j3t2pdg
> 
> Best,
> Xingbo



Re: [DISCUSS] FLIP-206: Support PyFlink Runtime Execution in Thread Mode

2021-12-30 Thread Wei Zhong
Hi Xingbo,

Thanks for creating this FLIP. Big +1 for it!

I have some questions about the Thread Mode:

1. It seems that we dynamically load an embedded Python interpreter and user 
dependencies in the TM process. Can they be unloaded cleanly after the task finishes? 
I.e. can we use the Thread Mode in session mode and the PyFlink shell?

2. Does one TM have only one embedded Python interpreter running at the same time? If 
all the Python operators in the TM share the same PVM, will there be a loss in 
performance?

3. How do we load the relevant C library if the python.executable is provided 
by users? Is there a risk of version conflicts?

Best,
Wei


> On Dec 29, 2021, at 11:56 AM, Xingbo Huang wrote:
> 
> Hi everyone,
> 
> I would like to start a discussion thread on "Support PyFlink Runtime
> Execution in Thread Mode"
> 
> We have provided PyFlink Runtime framework to support Python user-defined
> functions since Flink 1.10. The PyFlink Runtime framework is called Process
> Mode, which depends on an inter-process communication architecture based on
> the Apache Beam Portability framework. Although starting a dedicated
> process to execute Python user-defined functions could have better resource
> isolation, it will bring greater resource and performance overhead.
> 
> In order to overcome the resource and performance problems on Process Mode,
> we will propose a new execution mode which executes Python user-defined
> functions in the same thread instead of a separate process.
> 
> I have drafted the FLIP-206[1]. Please feel free to reply to this email
> thread. Looking forward to your feedback!
> 
> Best,
> Xingbo
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-206%3A+Support+PyFlink+Runtime+Execution+in+Thread+Mode



[jira] [Created] (FLINK-21969) PythonTimestampsAndWatermarksOperator emitted the Long.MAX_VALUE watermark before emitting all the data

2021-03-25 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21969:
-

 Summary: PythonTimestampsAndWatermarksOperator emitted the 
Long.MAX_VALUE watermark before emitting all the data
 Key: FLINK-21969
 URL: https://issues.apache.org/jira/browse/FLINK-21969
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.12.2, 1.13.0
Reporter: Wei Zhong
Assignee: Wei Zhong
 Attachments: image-2021-03-25-14-17-06-873.png

Currently the PythonTimestampsAndWatermarksOperator emits the Long.MAX_VALUE 
watermark before emitting all the data, which means some registered timers cannot 
be triggered on bounded streams. We need to fix this.

!image-2021-03-25-14-17-06-873.png|width=493,height=284!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21842) Support user defined WindowAssigner, Trigger and ProcessWindowFunction on Python DataStream API

2021-03-17 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21842:
-

 Summary: Support user defined WindowAssigner, Trigger and 
ProcessWindowFunction on Python DataStream API
 Key: FLINK-21842
 URL: https://issues.apache.org/jira/browse/FLINK-21842
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong
Assignee: Wei Zhong


Currently the Python DataStream API does not yet support user-defined windows. We 
need to support this first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21792) Add a PythonFunctionRunnerDescriptor and unify the creation of all the PythonFunctionRunner

2021-03-15 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21792:
-

 Summary: Add a PythonFunctionRunnerDescriptor and unify the 
creation of all the PythonFunctionRunner 
 Key: FLINK-21792
 URL: https://issues.apache.org/jira/browse/FLINK-21792
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently almost all the Python function operators implement the 
createPythonFunctionRunner() method by themselves and most of the logic is the 
same. We can add a PythonFunctionRunnerDescriptor and unify the creation of all 
the PythonFunctionRunners.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21679) Set output type for transformations from SourceProvider and DataStreamScanProvider in CommonExecTableSourceScan

2021-03-08 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21679:
-

 Summary: Set output type for transformations from SourceProvider 
and DataStreamScanProvider in CommonExecTableSourceScan
 Key: FLINK-21679
 URL: https://issues.apache.org/jira/browse/FLINK-21679
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: Wei Zhong
Assignee: Wei Zhong


Currently we only set output type for the transformations from 
SourceFunctionProvider and InputFormatProvider in CommonExecTableSourceScan:
{code:java}
@Override
protected Transformation<RowData> translateToPlanInternal(PlannerBase planner) {
    final StreamExecutionEnvironment env = planner.getExecEnv();
    final String operatorName = getDescription();
    final InternalTypeInfo<RowData> outputTypeInfo =
            InternalTypeInfo.of((RowType) getOutputType());
    final ScanTableSource tableSource = tableSourceSpec.getScanTableSource(planner);
    ScanTableSource.ScanRuntimeProvider provider =
            tableSource.getScanRuntimeProvider(ScanRuntimeProviderContext.INSTANCE);
    if (provider instanceof SourceFunctionProvider) {
        SourceFunction<RowData> sourceFunction =
                ((SourceFunctionProvider) provider).createSourceFunction();
        return env.addSource(sourceFunction, operatorName, outputTypeInfo)
                .getTransformation();
    } else if (provider instanceof InputFormatProvider) {
        InputFormat<RowData, ?> inputFormat =
                ((InputFormatProvider) provider).createInputFormat();
        return createInputFormatTransformation(env, inputFormat, outputTypeInfo, operatorName);
    } else if (provider instanceof SourceProvider) {
        // outputTypeInfo is not set here
        Source<RowData, ?, ?> source = ((SourceProvider) provider).createSource();
        return env.fromSource(source, WatermarkStrategy.noWatermarks(), operatorName)
                .getTransformation();
    } else if (provider instanceof DataStreamScanProvider) {
        // outputTypeInfo is not set here
        return ((DataStreamScanProvider) provider).produceDataStream(env).getTransformation();
    } else {
        throw new UnsupportedOperationException(
                provider.getClass().getSimpleName() + " is unsupported now.");
    }
}{code}
We can also set the output type for transformations from SourceProvider and 
DataStreamScanProvider in CommonExecTableSourceScan, so that users do not need 
to implement the ResultTypeQueryable interface when implementing the new Source 
interface from FLIP-27. A sketch of the idea is shown below.
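A sketch of one possible change (assuming the fromSource overload that accepts a 
TypeInformation, and Transformation#setOutputType; not the final implementation):
{code:java}
// Sketch: the two branches above, adjusted to carry outputTypeInfo.
} else if (provider instanceof SourceProvider) {
    Source<RowData, ?, ?> source = ((SourceProvider) provider).createSource();
    // pass the output type explicitly so the transformation carries it
    return env.fromSource(
                    source, WatermarkStrategy.noWatermarks(), operatorName, outputTypeInfo)
            .getTransformation();
} else if (provider instanceof DataStreamScanProvider) {
    Transformation<RowData> transformation =
            ((DataStreamScanProvider) provider).produceDataStream(env).getTransformation();
    // set the output type on the existing transformation
    transformation.setOutputType(outputTypeInfo);
    return transformation;
}
{code}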

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21509) "w.start" returns "1970-01-01" when used with Pandas UDAF after grouping by slide window with processing time

2021-02-25 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21509:
-

 Summary: "w.start" returns "1970-01-01" when used with Pandas UDAF 
after grouping by slide window with processing time
 Key: FLINK-21509
 URL: https://issues.apache.org/jira/browse/FLINK-21509
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.12.1, 1.13.0
        Reporter: Wei Zhong


"w.start" returns "1970-01-01" when used with Pandas UDAF after grouping by 
slide window with processing time. Reproduce code:
{code:java}
t_env.get_config().get_configuration().set_string("parallelism.default", "1")
from pyflink.table.window import Slide
t_env.register_function("mean_udaf", mean_udaf)

source_table = """
create table source_table(
a INT,
proctime as PROCTIME()
) with(
'connector' = 'datagen',
'rows-per-second' = '1',
'fields.a.kind' = 'sequence',
'fields.a.start' = '1',
'fields.a.end' = '10'
)
"""
t_env.execute_sql(source_table)
t = t_env.from_path("source_table")
iterator = t.select("a, proctime") \
    .window(Slide.over("1.seconds").every("1.seconds").on("proctime").alias("w")) \
    .group_by("a, w") \
    .select("mean_udaf(a) as b, w.start").execute().collect()
result = [i for i in iterator]
{code}
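The snippet above assumes a Pandas UDAF named mean_udaf; a minimal sketch of such a 
function:
{code:python}
from pyflink.table import DataTypes
from pyflink.table.udf import udaf


@udaf(result_type=DataTypes.FLOAT(), func_type="pandas")
def mean_udaf(v):
    # v is a pandas.Series containing the values of the group
    return v.mean()
{code}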



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and Xingbo Huang

2021-02-23 Thread Wei Zhong
Thank you all for your congratulations! It's really an honor for me.

Best,
Wei


Xingbo Huang wrote on Wed, Feb 24, 2021 at 9:54 AM:

> Thank you everyone! I really don't know what to say. I'm truly honoured.
>
> Best,
> Xingbo
>
> Guowei Ma wrote on Wed, Feb 24, 2021 at 9:29 AM:
>
> > Congratulations Wei and Xingbo!
> >
> > Best,
> > Guowei
> >
> >
> > On Wed, Feb 24, 2021 at 9:21 AM Zhu Zhu  wrote:
> >
> > > Congratulations Wei and Xingbo!
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Zhijiang wrote on Tue, Feb 23, 2021 at 10:59 AM:
> > >
> > > > Congratulations Wei and Xingbo!
> > > >
> > > >
> > > > Best,
> > > > Zhijiang
> > > >
> > > >
> > > > ----------
> > > > From:Yun Tang 
> > > > Send Time: Tue, Feb 23, 2021 10:58
> > > > To:Roman Khachatryan ; dev 
> > > > Subject:Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong
> and
> > > > Xingbo Huang
> > > >
> > > > Congratulations!
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Yun Gao 
> > > > Sent: Tuesday, February 23, 2021 10:56
> > > > To: Roman Khachatryan ; dev 
> > > > Subject: Re: Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong
> and
> > > > Xingbo Huang
> > > >
> > > > Congratulations Wei and Xingbo!
> > > >
> > > > Best,
> > > > Yun
> > > >
> > > >
> > > >  --Original Mail --
> > > > Sender:Roman Khachatryan 
> > > > Send Date:Tue Feb 23 00:59:22 2021
> > > > Recipients:dev 
> > > > Subject:Re: [ANNOUNCE] New Apache Flink Committers - Wei Zhong and
> > Xingbo
> > > > Huang
> > > > Congratulations!
> > > >
> > > > Regards,
> > > > Roman
> > > >
> > > >
> > > > On Mon, Feb 22, 2021 at 12:22 PM Yangze Guo 
> > wrote:
> > > >
> > > > > Congrats,  Well deserved!
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Mon, Feb 22, 2021 at 6:47 PM Yang Wang 
> > > wrote:
> > > > > >
> > > > > > Congratulations Wei & Xingbo!
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > Rui Li wrote on Mon, Feb 22, 2021 at 6:23 PM:
> > > > > >
> > > > > > > Congrats Wei & Xingbo!
> > > > > > >
> > > > > > > > On Mon, Feb 22, 2021 at 4:24 PM Yuan Mei <yuanmei.w...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations Wei & Xingbo!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yuan
> > > > > > > >
> > > > > > > > On Mon, Feb 22, 2021 at 4:04 PM Yu Li 
> > wrote:
> > > > > > > >
> > > > > > > > > Congratulations Wei and Xingbo!
> > > > > > > > >
> > > > > > > > > Best Regards,
> > > > > > > > > Yu
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Mon, 22 Feb 2021 at 15:56, Till Rohrmann <trohrm...@apache.org>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Congratulations Wei & Xingbo. Great to have you as
> > committers
> > > > in
> > > > > the
> > > > > > > > > > community now.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 22, 2021 at 5:08 AM Xintong Song <tonysong...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Congratulatio

[jira] [Created] (FLINK-21294) Support state access API for the map operation of Python ConnectedStreams

2021-02-05 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21294:
-

 Summary: Support state access API for the map operation of Python 
ConnectedStreams
 Key: FLINK-21294
 URL: https://issues.apache.org/jira/browse/FLINK-21294
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong
 Fix For: 1.13.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21242) Support state access API for map operation

2021-02-02 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21242:
-

 Summary: Support state access API for map operation
 Key: FLINK-21242
 URL: https://issues.apache.org/jira/browse/FLINK-21242
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong
 Fix For: 1.13.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21180) Move the state module from 'pyflink.common' to 'pyflink.datastream'

2021-01-27 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21180:
-

 Summary: Move the state module from 'pyflink.common' to 
'pyflink.datastream'
 Key: FLINK-21180
 URL: https://issues.apache.org/jira/browse/FLINK-21180
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong
 Fix For: 1.13.0


Currently we put all the DataStream functions in the 'pyflink.datastream.functions' 
module and all the State API in the 'pyflink.common.state' module. But 
ReducingState and AggregatingState depend on ReduceFunction and 
AggregateFunction, which means the 'state' module will depend on the 'functions' 
module. So we need to move the 'state' module to the 'pyflink.datastream' package 
to avoid circular dependencies between 'pyflink.datastream' and 
'pyflink.common', as sketched below.
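A sketch of the cycle that would otherwise arise (illustrative imports, not the exact 
files):
{code:python}
# pyflink/common/state.py would need the function base classes:
from pyflink.datastream.functions import ReduceFunction, AggregateFunction

# ...while modules under pyflink/datastream already import from pyflink.common, e.g.:
from pyflink.common.typeinfo import Types
{code}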



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21179) Split the base class of Python DataStream Function to 'Function' and 'RichFunction'

2021-01-27 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21179:
-

 Summary: Split the base class of Python DataStream Function to 
'Function' and 'RichFunction'
 Key: FLINK-21179
 URL: https://issues.apache.org/jira/browse/FLINK-21179
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong
 Fix For: 1.13.0


As ReducingState and AggregatingState only support non-rich functions, we 
need to split the base class of the DataStream functions into 'Function' and 
'RichFunction'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21115) Add AggregatingState and corresponding StateDescriptor for Python DataStream API

2021-01-24 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21115:
-

 Summary: Add AggregatingState and corresponding StateDescriptor 
for Python DataStream API
 Key: FLINK-21115
 URL: https://issues.apache.org/jira/browse/FLINK-21115
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21114) Support ReducingState in Python DataStream API

2021-01-24 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-21114:
-

 Summary: Support ReducingState in Python DataStream API
 Key: FLINK-21114
 URL: https://issues.apache.org/jira/browse/FLINK-21114
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Wei Zhong
Thanks Xintong for the great work!

Best,
Wei

> On Jan 19, 2021, at 18:00, Guowei Ma wrote:
> 
> Thanks Xintong's effort!
> Best,
> Guowei
> 
> 
> On Tue, Jan 19, 2021 at 5:37 PM Yangze Guo wrote:
> Thanks Xintong for the great work!
> 
> Best,
> Yangze Guo
> 
> > On Tue, Jan 19, 2021 at 4:47 PM Till Rohrmann wrote:
> >
> > Thanks a lot for driving this release Xintong. This was indeed a release 
> > with some obstacles to overcome and you did it very well!
> >
> > Cheers,
> > Till
> >
> >> On Tue, Jan 19, 2021 at 5:59 AM Xingbo Huang wrote:
> >>
> >> Thanks Xintong for the great work!
> >>
> >> Best,
> >> Xingbo
> >>
> >> Peter Huang wrote on Tue, Jan 19, 2021 at 12:51 PM:
> >>
> >> > Thanks for the great effort to make this happen. It paves us from using
> >> > 1.12 soon.
> >> >
> >> > Best Regards
> >> > Peter Huang
> >> >
> >> > > On Mon, Jan 18, 2021 at 8:16 PM Yang Wang wrote:
> >> >
> >> > > Thanks Xintong for the great work as our release manager!
> >> > >
> >> > >
> >> > > Best,
> >> > > Yang
> >> > >
> >> > > Xintong Song <xts...@apache.org> wrote on Tue, Jan 19, 2021 at 11:53 AM:
> >> > >
> >> > >> The Apache Flink community is very happy to announce the release of
> >> > >> Apache Flink 1.12.1, which is the first bugfix release for the Apache
> >> > Flink
> >> > >> 1.12 series.
> >> > >>
> >> > >> Apache Flink® is an open-source stream processing framework for
> >> > >> distributed, high-performing, always-available, and accurate data
> >> > streaming
> >> > >> applications.
> >> > >>
> >> > >> The release is available for download at:
> >> > >> https://flink.apache.org/downloads.html
> >> > >>
> >> > >> Please check out the release blog post for an overview of the
> >> > >> improvements for this bugfix release:
> >> > >> https://flink.apache.org/news/2021/01/19/release-1.12.1.html
> >> > >>
> >> > >> The full release notes are available in Jira:
> >> > >>
> >> > >>
> >> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
> >> > >>
> >> > >> We would like to thank all contributors of the Apache Flink community
> >> > who
> >> > >> made this release possible!
> >> > >>
> >> > >> Regards,
> >> > >> Xintong
> >> > >>
> >> > >
> >> >



Re: [DISCUSS] FLIP-157 Migrate Flink Documentation from Jekyll to Hugo

2021-01-14 Thread Wei Zhong
+1 for migrating to Hugo.

Currently we have developed many plugins based on Jekyll because the native 
features of Jekyll cannot meet our needs. It seems all of them can be supported 
via Hugo shortcodes and will become more concise.

Best,
Wei

> On Jan 14, 2021, at 18:21, Aljoscha Krettek wrote:
> 
> +1
> 
> The build times on Jekyll have just become too annoying for me. I realize that 
> that is also a function of how we structure our documentation, and especially 
> how we construct the nav sidebar, but I think overall moving to Hugo is still 
> a benefit.
> 
> Aljoscha
> 
> On 2021/01/13 10:14, Seth Wiesman wrote:
>> Hi All,
>> 
>> I would like to start a discussion for FLIP-157: Migrating the Flink docs
>> from Jekyll to Hugo.
>> 
>> This will allow us:
>> 
>>  - Proper internationalization
>>  - Working Search
>>  - Sub-second build time ;)
>> 
>> Please take a look and let me know what you think.
>> 
>> Seth
>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-157+Migrate+Flink+Documentation+from+Jekyll+to+Hugo



Re: [VOTE] Release 1.12.1, release candidate #2

2021-01-14 Thread Wei Zhong
+1 (non-binding)

- verified checksums and signatures
- build Flink from source code
- pip install PyFlink source distribution and run a python udf job on 
MacOS/Win10
- start a standalone cluster on MacOS and submit a python udf job
- the documentation pages built from source look good

Best,
Wei

> On Jan 14, 2021, at 19:24, Xingbo Huang wrote:
> 
> +1 (non-binding)
> 
> - verified checksums and signatures
> - build Flink from source code
> - pip install PyFlink in MacOS under py35,py36,py37,py38
> - start a standalone cluster and submit a python udf job.
> 
> Best,
> Xingbo
> 
> Jark Wu wrote on Thu, Jan 14, 2021 at 6:14 PM:
> 
>> +1
>> 
>> - checked/verified signatures and hashes
>> - reviewed the release pull request
>> - started cluster for Scala 2.11, ran examples, verified web ui and log
>> output, nothing unexpected
>> - started cluster and SQL CLI, run some SQL queries for kafka,
>> upsert-kafka, elasticsearch, and mysql connectors, nothing unexpected
>> 
>> Best,
>> Jark
>> 
>> On Thu, 14 Jan 2021 at 12:11, Yang Wang  wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> * Check the checksum and signatures
>>> * Build from source successfully
>>> * Deploy Flink on K8s natively with HA mode and check the related fixes
>>>  * FLINK-20650, flink binary could work with the updated
>>> docker-entrypoint.sh in flink-docker
>>>  * FLINK-20664, support service account for TaskManager pod
>>>  * FLINK-20648, restore job from savepoint when using Kubernetes based
>> HA
>>> services
>>> * Check the webUI and logs without abnormal information
>>> 
>>> Best,
>>> Yang
>>> 
>>> Xintong Song wrote on Thu, Jan 14, 2021 at 10:21 AM:
>>> 
 +1 (non-binding)
 
 - verified checksums and signatures
 - no binaries found in source archive
 - build from source
 - played with a couple of example jobs
 - played with various deployment modes
 - webui and logs look good
 
 On Thu, Jan 14, 2021 at 1:02 AM Chesnay Schepler 
 wrote:
 
> +1
> 
> - no dependencies have been changed requiring license updates
> - nothing seems to be missing from the maven repository
> - verified checksums/signatures
> 
> On 1/10/2021 2:23 AM, Xintong Song wrote:
>> Hi everyone,
>> 
>> Please review and vote on the release candidate #2 for the version
> 1.12.1,
>> as follows:
>> 
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific
>> comments)
>> 
>> The complete staging area is available for your review, which
>>> includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience
>> releases
>>> to
> be
>> deployed to dist.apache.org [2], which are signed with the key
>> with
>> fingerprint F8E419AA0B60C28879E876859DFF40967ABFC5A4 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.12.1-rc2" [5],
>> * website pull request listing the new release and adding
>>> announcement
> blog
>> post [6].
>> 
>> The vote will be open for at least 72 hours. It is adopted by
>>> majority
>> approval, with at least 3 PMC affirmative votes.
>> 
>> Thanks,
>> Xintong Song
>> 
>> [1]
>> 
> 
 
>>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349459
>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.12.1-rc2
>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [4]
> 
>> https://repository.apache.org/content/repositories/orgapacheflink-1411
>> [5]
>> https://github.com/apache/flink/releases/tag/release-1.12.1-rc2
>> [6] https://github.com/apache/flink-web/pull/405
>> 
> 
> 
 
>>> 
>> 



[jira] [Created] (FLINK-20963) Rewrite the example under 'flink-python/pyflink/table/examples'

2021-01-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20963:
-

 Summary: Rewrite the example under 
'flink-python/pyflink/table/examples'
 Key: FLINK-20963
 URL: https://issues.apache.org/jira/browse/FLINK-20963
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently the example under 'flink-python/pyflink/table/examples' still uses 
deprecated APIs. We need to rewrite it with the latest recommended API, e.g. in 
the style sketched below.
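For illustration, a minimal sketch in the recommended style (not the final example):
{code:python}
from pyflink.table import EnvironmentSettings, TableEnvironment

# Create the TableEnvironment via EnvironmentSettings instead of deprecated entry points.
env_settings = EnvironmentSettings.new_instance().in_streaming_mode().build()
t_env = TableEnvironment.create(env_settings)

# Define sinks with DDL instead of deprecated register_* calls.
t_env.execute_sql("create table print_sink (a INT) with ('connector' = 'print')")

t_env.from_elements([(1,), (2,)], ['a']).execute_insert('print_sink').wait()
{code}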



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-20962) Rewrite the example in 'flink-python/pyflink/shell.py'

2021-01-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20962:
-

 Summary: Rewrite the example in 'flink-python/pyflink/shell.py'
 Key: FLINK-20962
 URL: https://issues.apache.org/jira/browse/FLINK-20962
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently the pyflink example in 'flink-python/pyflink/shell.py' was added in 
version 1.9 and has not been updated since. We need to rewrite it with the 
latest recommended API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Welcome Danny Cranmer as a new Apache Flink Committer

2021-01-12 Thread Wei Zhong
Congratulations Danny!

Best,
Wei

> On Jan 12, 2021, at 19:09, hailongwang <18868816...@163.com> wrote:
> 
> Congratulations Danny!
> 
> Best,
> Hailong
> 
> At 2021-01-12 17:05:31, "Jark Wu" wrote:
>> Congratulations Danny!
>> 
>> Best,
>> Jark
>> 
>> On Tue, 12 Jan 2021 at 17:59, Yangze Guo  wrote:
>> 
>>> Congrats, Danny!
>>> 
>>> Best,
>>> Yangze Guo
>>> 
>>> On Tue, Jan 12, 2021 at 5:55 PM Xingbo Huang  wrote:
 
 Congratulations, Danny!
 
 Best,
 Xingbo
 
 Dian Fu wrote on Tue, Jan 12, 2021 at 5:48 PM:
 
> Congratulations, Danny!
> 
> Regards,
> Dian
> 
>> On Jan 12, 2021, at 5:40 PM, Till Rohrmann wrote:
>> 
>> Congrats and welcome Danny!
>> 
>> Cheers,
>> Till
>> 
>> On Tue, Jan 12, 2021 at 10:09 AM Dawid Wysakowicz <
> dwysakow...@apache.org>
>> wrote:
>> 
>>> Congratulations, Danny!
>>> 
>>> Best,
>>> 
>>> Dawid
>>> 
>>> On 12/01/2021 09:52, Paul Lam wrote:
 Congrats, Danny!
 
 Best,
 Paul Lam
 
> On Jan 12, 2021, at 16:48, Tzu-Li (Gordon) Tai wrote:
> 
> Hi everyone,
> 
> I'm very happy to announce that the Flink PMC has accepted Danny
>>> Cranmer to
> become a committer of the project.
> 
> Danny has been extremely helpful on the mailing lists with
>>> answering
>>> user
> questions on the AWS Kinesis connector, and has driven numerous
>>> new
> features and timely bug fixes for the connector as well.
> 
> Please join me in welcoming Danny :)
> 
> Cheers,
> Gordon
 
>>> 
>>> 
> 
> 
>>> 



Re: [VOTE] FLIP-153: Support state access in Python DataStream API

2021-01-05 Thread Wei Zhong
+1 (non-binding)

Best,
Wei

> On Jan 6, 2021, at 14:05, Xingbo Huang wrote:
> 
> +1 (non-binding)
> 
> Best,
> Xingbo
> 
> Dian Fu wrote on Wed, Jan 6, 2021 at 1:38 PM:
> 
>> +1 (binding)
>> 
>>> On Jan 6, 2021, at 1:12 PM, Shuiqiang Chen wrote:
>>> 
>>> Hi devs,
>>> 
>>> The discussion of the FLIP-153 [1] seems has reached a consensus through
>>> the mailing thread [2]. I would like to start a vote for it.
>>> 
>>> The vote will be open until January 11th (72h), unless there is an
>>> objection or not enough votes.
>>> 
>>> Best,
>>> Shuiqiang
>>> 
>>> [1]:
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-153%3A+Support+state+access+in+Python+DataStream+API
>>> 
>>> [2]:
>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-153-Support-state-access-in-Python-DataStream-API-td47127.html
>> 
>> 



Re: [DISCUSS] FLIP-155: Introduce a few convenient operations in Table API

2021-01-04 Thread Wei Zhong
Hi Dian,

Big +1 for making the Table API easier to use. Java users and Python users can 
both benefit from it. I think it would be better if we add some Python API 
examples. 

Best,
Wei


> On Jan 4, 2021, at 20:03, Dian Fu wrote:
> 
> Hi all,
> 
> I'd like to start a discussion about introducing a few convenient operations 
> in Table API from the perspective of ease of use. 
> 
> Currently some tasks are not easy to express in Table API e.g. deduplication, 
> topn, etc, or not easy to express when there are hundreds of columns in a 
> table, e.g. null data handling, etc.
> 
> I'd like to propose to introduce a few operations in Table API with the 
> following purposes:
> - Make Table API users to easily leverage the powerful features already in 
> SQL, e.g. deduplication, topn, etc
> - Provide some convenient operations, e.g. introducing a series of operations 
> for null data handling (it may become a problem when there are hundreds of 
> columns), data sampling and splitting (which is a very common use case in ML 
> which usually needs to split a table into multiple tables for training and 
> validation separately).
> 
> Please refer to FLIP-155 [1] for more details.
> 
> Looking forward to your feedback!
> 
> Regards,
> Dian
> 
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-155%3A+Introduce+a+few+convenient+operations+in+Table+API



[jira] [Created] (FLINK-20834) Add metrics for reporting the memory usage and CPU usage of the Python UDF workers

2021-01-03 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20834:
-

 Summary: Add metrics for reporting the memory usage and CPU usage 
of the Python UDF workers
 Key: FLINK-20834
 URL: https://issues.apache.org/jira/browse/FLINK-20834
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently there is no official approach to access the memory usage and CPU 
usage of the Python UDF workers. We need to add these metrics to monitor the 
running status of the Python processes. One possible approach is sketched below.
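A sketch assuming the psutil package is available in the worker's environment 
(illustrative, not the final design):
{code:python}
import psutil


def collect_worker_metrics():
    # Reports memory and CPU usage of the current Python UDF worker process.
    proc = psutil.Process()
    return {
        'memory_rss_bytes': proc.memory_info().rss,
        'cpu_percent': proc.cpu_percent(interval=None),
    }
{code}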



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-20756) PythonCalcSplitConditionRule is not working as expected

2020-12-23 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20756:
-

 Summary: PythonCalcSplitConditionRule is not working as expected
 Key: FLINK-20756
 URL: https://issues.apache.org/jira/browse/FLINK-20756
 Project: Flink
  Issue Type: Bug
Reporter: Wei Zhong


Currently, if users write SQL such as:

`SELECT pyFunc5(f0, f1) FROM (SELECT e.f0, e.f1 FROM (SELECT pyFunc5(a) as e 
FROM MyTable) where e.f0 is NULL)`

It will be optimized to:

`FlinkLogicalCalc(select=[pyFunc5(pyFunc5(a)) AS f0])
+- FlinkLogicalCalc(select=[a], where=[IS NULL(pyFunc5(a).f0)])
 +- FlinkLogicalLegacyTableSourceScan(table=[[default_catalog, 
default_database, MyTable, source: [TestTableSource(a, b, c, d)]]], fields=[a, 
b, c, d])`

The optimized plan is not runnable; we need to fix this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-153: Support state access in Python DataStream API

2020-12-17 Thread Wei Zhong
Hi Shuiqiang,

Thanks for driving this. +1 for this feature, just a minor comment on the 
design doc.

The interface of the `AppendingState` should be:

class AppendingState(State, Generic[IN, OUT]):

    @abstractmethod
    def get(self) -> OUT:
        pass

    @abstractmethod
    def add(self, value: IN) -> None:
        pass

The output type and the input type of the `AppendingState` may be different, and 
the definition of the child classes should be:

class MergingState(AppendingState[IN, OUT]):
    pass


class ListState(MergingState[T, Iterable[T]]):

    @abstractmethod
    def update(self, values: List[T]) -> None:
        pass

    @abstractmethod
    def add_all(self, values: List[T]) -> None:
        pass

    def __iter__(self) -> Iterator[T]:
        return iter(self.get())

Best,
Wei

> On Dec 17, 2020, at 21:06, Shuiqiang Chen wrote:
> 
> Hi Yun,
> 
> I highly appreciate your questions! I have the corresponding answers as 
> below:
> 
> Re 1: You are right that the state access occurs in an async thread. However, 
> all the state access will be synchronized in the Java operator, so there 
> will be no concurrent access to the state backend.
> 
> Re 2: I think it could be handled well in Python DataStream API. In this 
> case, there will be two operators and thus also two keyed state backends.
> 
> Re 3: Sure, you are right. We will store the current key which may be used by 
> the timer.
> 
> Re 4: Good point. State migration is still not covered in the current FLIP. 
> I'd like to cover it in a separate FLIP as it should be orthogonal to this 
> FLIP. I have updated the FLIP and added a clear description of this.
> 
> Re 5: Good point. We may need to introduce a Python queryable state client if 
> we want to support queryable state for Python operators. I'd like to cover it 
> in a separate FLIP. I have updated the FLIP and added it as future work.
> 
> Best,
> Shuiqiang
> 
>> On Dec 17, 2020, at 12:08 PM, Yun Tang wrote:
>> 
>> Hi Shuiqiang,
>> 
>> Thanks for driving this. I have several questions below:
>> 
>> 
>> 1.  Thread safety of state write-access. As you might know, state access is 
>> not thread-safe [1] in Flink; we depend on single-threaded task access. Since 
>> you change the state access to another async thread, can we still ensure 
>> this? It also means not allowing users to access state in their Java operators 
>> along with the bundled Python operator.
>> 2.  Number of keyed state backends per task. Flink would only have one keyed 
>> state backend per operator and would only have one keyed state backend per 
>> operator chain (in the head operator if possible). However, once we use 
>> experimental features such as reinterpretAsKeyedStream [2], we could have 
>> two keyed state backends in one operator chain within one task. Can the 
>> Python DataStream API handle this well?
>> 3.  Time to set the current key. As we still need the current key when 
>> registering timers [3], we need some place to hold the current key even if it 
>> is not registered in the keyed state backend.
>> 4.  State migration. Flink supports migrating state automatically if the newly 
>> provided serializer is compatible with the old serializer [4]. I'm afraid if 
>> the Python DataStream API wraps the user's serializer as 
>> BytePrimitiveArraySerializer, we will lose such functionality. Moreover, 
>> RocksDB will migrate state automatically on the Java side [5] and this will 
>> break if Python-related bytes are involved.
>> 5.  Queryable state client. Currently, we only have a Java-based queryable 
>> state client [6], and we would need another Python-based queryable state 
>> client if Python bytes are involved.
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-13072
>> [2] 
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream
>> [3] 
>> https://github.com/apache/flink/blob/58cc2a5fbd419d6a9e4f9c251ac01ecf59a8c5a2/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/InternalTimerServiceImpl.java#L203
>> [4] 
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/schema_evolution.html#evolving-state-schema
>> [5] 
>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/state/custom_serialization.html#off-heap-state-backends-eg-rocksdbstatebackend
>> [6] 
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html#example
>> 
>> Best
>> Yun Tang
>> 
>> 
>> 
>> From: Shuiqiang Chen 
>> Sent: Wednesday, December 16, 2020 17:32
>> To: dev@flink.apache.org 
>> Subject: Re: [DISCUSS] FLIP-153: Support state access in Python DataStream 
>> API
>> 
>> Hi Xingbo,
>> 
>> Thank you for your valuable suggestions.
>> 
>> Indeed, we need to provide clearer abstractions for StateDescriptor and 
>> State APIs, I have updated the FLIP accordingly. Looking forward to your 
>> feedbacks!
>> 
>> Best,
>> Shuiqiang
>> 
>>> On Dec 14, 2020, at 11:27 AM, Xingbo Huang wrote:
>>> 
>>> Thanks 

Re: [DISCUSS] Enforce common opinionated coding style using Spotless

2020-12-16 Thread Wei Zhong
+1 for the coding style automation. Thanks for driving this!

Best,
Wei

> On Dec 17, 2020, at 10:10, Xingbo Huang wrote:
> 
> +1 asap. Spotless can greatly help us save time on fixing checkstyle
> errors.
> 
> Best,
> Xingbo
> 
> On Thu, Dec 17, 2020 at 4:14 AM, Kostas Kloudas wrote:
> 
>> +1 asap from my side as well.
>> 
>> On Wed, Dec 16, 2020 at 8:04 PM Arvid Heise  wrote:
>>> 
>>> +1 asap.
>>> 
>>> On Wed, Dec 16, 2020 at 7:44 PM Robert Metzger 
>> wrote:
>>> 
 +1
 
 Thanks for driving this.
 
 On Wed, Dec 16, 2020 at 7:33 PM Chesnay Schepler 
 wrote:
 
> +1 to set this up ASAP. Now's a good time to (finally) finalize this
> effort with a new release cycle having started and
>> christmas/vacations
> being around the corner.
> 
> On 12/16/2020 7:20 PM, Aljoscha Krettek wrote:
>> Let's try and conclude this discussion! I've prepared a PoC that
>> uses
>> Spotless with google-java-format to do the formatting:
>> https://github.com/aljoscha/flink/commits/flink-xxx-spotless
>> 
>> To summarize from earlier discussion, the main benefits are:
>> - no more worrying about code style, both as reviewer and
>> implementer
>> - everyone can configure their IDE to auto-format, there will
>> never
>> be unrelated formatting changes
>> 
>> Also, this uses git's blame.ignoreRevsFile to add a file that tells
>> git blame/IntelliJ to ignore the refactoring commit. However, you
>> need
>> to manually configure your git for that using:
>> 
>> $ git config blame.ignoreRevsFile .git-blame-ignore-revs
>> 
>> You can check out the PoC, look at the code in an IDE, play around
>> with blame/annotations.
>> 
>> By the way, the coding style is not configurable, it’s the Google
>> Java
>> Style, which uses spaces for indentation. In an IDE or on GitHub the
>> code looks barely different from the previous formatting at a first
>> glance.
>> 
>> For IDE setup, I will add a guide in the README but it boils down
>> to:
>> 
>> - install the google-java-format plugin, enable it
>> - install the Save Actions plugin, enable "Optimize Imports" and
>> "Reformat File"
>> 
>> With this, every time you save, formatting will be applied
>> automatically. You will never see formatting complaints from CI
>> (except for cases where you break semantic checkstyle rules,
>> such as
>> using certain imports that we don't allow).
>> 
>> What do you think?
>> 
>> Best,
>> Aljoscha
>> 
>> On 19.10.20 12:36, Aljoscha Krettek wrote:
>>> I don't like checkstyle because it cannot be easily applied from
>> the
>>> commandline. I'm happy to learn otherwise, though. And I'd also be
>>> very happy about alternative suggestions that can do that.
>>> 
>>> Right now, I think Spotless is the most straightforward tool for
>>> that. Also I don't care so much about what style we choose in the
>>> end. (If we choose one). My main motivation is that we have a
>> common,
>>> strict style that can easily applied via tooling so that we no
>> longer
>>> need to comment on coding style in PRs.
>>> 
>>> Aljoscha
>>> 
>>> On 09.10.20 11:11, tison wrote:
 +1 on locking in one code style, and I think that is what the current
 checkstyle rules are serving.
 
 As for the automatic-application part, we suggest developing with IDEA;
 with the Checkstyle plugin in IDEA, applying checkstyle suggestions is
 also automatic.
 
 One shortcoming of Spotless is that we can hardly adjust the rules if the
 project has its own issues to overcome. IIRC only several pre-defined
 rules and a clumsy general regex substitution rule can be used.
 
 FYI my team started with Spotless but ended up with checkstyle with a few
 rules and the Checkstyle plugin for automation.
 
 Code style, from another perspective, is part of the cognition of the
 developers working in a project, not something we just apply before a
 pull request. No matter how much automation is introduced, most
 developers will converge on working with the configured code style.
 
 Best,
 tison.
 
 
 On Wed, Oct 7, 2020 at 6:37 PM, Kostas Kloudas wrote:
 
> Hi all,
> 
> +1 for enforcing "a" codestyle using "a" tool.
> 
> As the project grows both in terms of LOCs and contributors,
>> this
> becomes more and more important as it eliminates some potential
 points
> of friction without any additional effort.
> 
> From the discussion, I am leaning towards having a single
>> commit
 with
> all the codestyle-related changes. This will avoid sprinkling
> refactoring commits all over the place for the next year or
>> more.
 

Re: [VOTE] Release 1.11.3, release candidate #2

2020-12-16 Thread Wei Zhong
+1 (non-binding)

- verified checksums and signatures
- pip install pyflink on Windows python 3.7
- run a python job with udfs on Windows

Best,
Wei

> On Dec 16, 2020, at 15:51, Dian Fu wrote:
> 
> +1 (binding)
> 
> - Verified the checksum and signature
> - Checked the pom changes since 1.11.2 for the dependencies changes and it 
> looks good to me
> - Checked the website PR and left one minor comment
> 
> Thanks,
> Dian
> 
>> On Dec 16, 2020, at 3:00 PM, Xintong Song wrote:
>> 
>> +1 (non-binding)
>> 
>>  - Verified checksum and signature
>>  - Source archive contains no binaries
>>  - Built from source (scala 2.11, tests skipped)
>>  - Run example jobs with various deployments modes
>> - Example jobs
>>- examples/streaming/WordCount.jar
>>- examples/streaming/WindowJoin.jar
>>- examples/batch/WordCount.jar
>> - Deployments
>>- Local (MiniCluster)
>>- Standalone
>>- Native Kubernetes Session/Application
>>- Yarn Session/Per-Job
>> - Verified
>>- WebUI looks good
>>- Logs look good
>>- Changing memory configurations works well
>>- Job recovers fine after killing taskmanager pod/container
>>(Native Kuberentes / Yarn)
>> 
>> 
>> On Wed, Dec 16, 2020 at 2:14 PM Yang Wang  wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> - Build from source with scala_2.11/2.12
>>> - Test Flink deployment with Yarn and native K8s, verified the following
>>> critical bugs fixed
>>> - The provided lib could support non-qualified path, FLINK-20143
>>> - Application mode could terminate normally in attach mode, FLINK-19909
>>> - Application mode does not delete HA data in case of suspended
>>> ZooKeeper connection, FLINK-19154
>>> - Checking the logs and webUI with no suspicious information
>>> 
>>> 
>>> Best,
>>> Yang
>>> 
>>> On Wed, Dec 16, 2020 at 12:56 PM, Xingbo Huang wrote:
>>> 
 +1 (non-binding)
 
 - verified checksums and signatures
 - build Flink with Scala 2.11
 - pip install pyflink in MacOS under py37
 - test pyflink shell
 - test cover FLINK-18836 and FLINK-19675
 
 Best,
 Xingbo
 
 On Wed, Dec 16, 2020 at 12:29 PM, Tzu-Li (Gordon) Tai wrote:
 
> +1 (binding)
> 
> - Verified signature and hashes
> - Verified that source artifact contains no binaries
> - Built from source (tests included)
> - Tested against StateFun, with a test job that covers
> https://issues.apache.org/jira/browse/FLINK-19741
> 
> Thanks for creating the RC Xintong!
> 
> On Sun, Dec 13, 2020 at 9:37 AM Xintong Song  wrote:
> 
>> Hi everyone,
>> 
>> We have resolved FLIP-27 issues.
>> 
>> Please review and vote on the release candidate #2 for the version
> 1.11.3,
>> as follows:
>> 
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>> 
>> The complete staging area is available for your review, which
 includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience releases
 to
> be
>> deployed to dist.apache.org [2], which are signed with the key with
>> fingerprint F8E419AA0B60C28879E876859DFF40967ABFC5A4 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.11.3-rc2" [5],
>> * website pull request listing the new release and adding announcement
>> blog post [6].
>> 
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>> 
>> Thanks, Xintong Song
>> 
>> [1]
>> 
> 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348761
>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.3-rc2
>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1407
>> [5] https://github.com/apache/flink/releases/tag/release-1.11.3-rc2
>> [6] https://github.com/apache/flink-web/pull/399
>> 
> 
 
>>> 
> 



Re: [ANNOUNCE] Apache Flink 1.12.0 released

2020-12-10 Thread Wei Zhong
Congratulations! Thanks Dian and Robert for the great work!

Best,
Wei

> On Dec 10, 2020, at 20:26, Leonard Xu wrote:
> 
> 
> Thanks Dian and Robert for the great work as release managers! 
> And thanks everyone who made the release possible! 
> 
> 
> Best,
> Leonard
> 
>> On Dec 10, 2020, at 20:17, Robert Metzger wrote:
>> 
>> The Apache Flink community is very happy to announce the release of Apache
>> Flink 1.12.0, which is the latest major release.
>> 
>> Apache Flink® is an open-source stream processing framework for
>> distributed, high-performing, always-available, and accurate data streaming
>> applications.
>> 
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>> 
>> Please check out the release blog post for an overview of the improvements
>> for this release:
>> https://flink.apache.org/news/2020/12/10/release-1.12.0.html
>> 
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348263
>> 
>> We would like to thank all contributors of the Apache Flink community who
>> made this release possible!
>> 
>> Regards,
>> Dian & Robert
> 



Re: [DISCUSS] Make SQL docs Blink only

2020-12-09 Thread Wei Zhong
+1

This is a very good proposal. The Python Table API documentation also contains 
some code that still uses the old planner. We may need to do the same for the 
Python Table API documentation in the future.

Best,
Wei

> On Dec 9, 2020, at 12:35, Jark Wu wrote:
> 
> +1
> 
> This is a very good idea.
> 
> Best,
> Jark
> 
> On Wed, 9 Dec 2020 at 10:43, Xingbo Huang  wrote:
> 
>> +1
>> 
>> This is a very good proposal. In release-1.12, many newly added features are
>> only supported on the blink planner. For example, the newly added features
>> of PyFlink in FLIP-137[1] and FLIP-139[2] are only available on the blink
>> planner.
>> 
>> [1]
>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink
>> [2]
>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
>> 
>> Best,
>> Xingbo
>> 
>> On Wed, Dec 9, 2020 at 9:46 AM, Leonard Xu wrote:
>> 
>>> +1
>>> 
>>> Very good proposal. The blink planner has been the default planner since Flink
>>> 1.11.0, and many new features are not supported in the legacy planner.
>>> 
>>> Best,
>>> Leonard
>>> 
 On Dec 9, 2020, at 05:15, Arvid Heise wrote:
 
 +1, add a small info box about legacy planner and point to 1.11 doc
 (nothing should have changed)
 
 Ideally, "legacy" and "blink" does not appear anywhere in the doc
>> (except
 for that info box)
 
 On Tue, Dec 8, 2020 at 6:35 PM Marta Paes Moreira >> 
 wrote:
 
> +1, this is confusing (esp. for new users) and also creates more and
>>> more
> "annotation clutter" as new features are added.
> 
> On Tue, Dec 8, 2020 at 5:30 PM Aljoscha Krettek 
> wrote:
> 
>> +1
>> 
>> Yes, please!
>> 
>> On 08.12.20 16:52, David Anderson wrote:
>>> I agree -- I think separating out the legacy planner info should
>> make
>>> things clearer for everyone, and then some day we can simply drop
>> it.
>> Plus,
>>> doing it now will make it easier to make improvements to the docs
>>> going
>>> forward.
>>> 
>>> David
>>> 
>>> On Tue, Dec 8, 2020 at 4:38 PM Timo Walther 
> wrote:
>>> 
 Hi Seth,
 
 this is a very good idea. We might not be able to remove the legacy
 planner immediately but at least we can make the docs easier for
> current
 and future users of the Blink planner.
 
 Making the SQL docs Blink-only with a dedicated legacy planner page
 sounds good to me.
 
 Regards,
 Timo
 On 08.12.20 16:36, Seth Wiesman wrote:
> Hi Everyone,
> 
> I've been spending a lot of time recently working on the SQL
 documentation
> and I'm finding it very difficult to explain semantics as the two
> table
> planners continue to diverge. As Blink has been the default
>> planner
> for
> some time, and 1.12 now offers bounded data stream support, how
>> does
>> the
> community feel about making the documentation "blink only"?
> 
> We would update the documentation to assume users are always using
> the
> Blink planner. As the legacy planner still exists we would create
>> a
> dedicated legacy planner page for users that have not migrated for
 whatever
> reason - likely dataset interop. On this page, we would clearly
>> list
>> the
> features that are not supported by the legacy planner and any
> semantics
> that differ from the Blink planner.
> 
> Seth
> 
 
 
>>> 
>> 
>> 
> 
 
 
 --
 
 Arvid Heise | Senior Java Developer
 
 
 
 Follow us @VervericaData
 
 --
 
 Join Flink Forward  - The Apache Flink
 Conference
 
 Stream Processing | Event Driven | Real Time
 
 --
 
 Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
 
 --
 Ververica GmbH
 Registered at Amtsgericht Charlottenburg: HRB 158244 B
 Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
 (Toni) Cheng
>>> 
>>> 
>> 



Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-07 Thread Wei Zhong
Hi Deep,

(redirecting this to user mailing list as this is not a dev question)

You can try to set the line delimiter and field delimiter of the 
RowCsvInputFormat to a non-printing character (assuming there are no non-printing 
characters in the csv files). It will read all the content of a csv file into 
one Row, e.g.:

final StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();
String path = "test";
TypeInformation[] fieldTypes = new TypeInformation[]{
    BasicTypeInfo.STRING_TYPE_INFO};
RowCsvInputFormat csvFormat =
    new RowCsvInputFormat(new Path(path), fieldTypes);
csvFormat.setNestedFileEnumeration(true);
csvFormat.setDelimiter((char) 0);
csvFormat.setFieldDelimiter(String.valueOf((char) 0));
DataStream<Row> lines =
    env.readFile(csvFormat, path, FileProcessingMode.PROCESS_ONCE, -1);
lines.map(value -> value).print();
env.execute();

Then you can convert the content of the csv files to json manually.
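
For illustration, that manual conversion could look like the following pure
Python sketch (independent of the Flink job itself; the header and field
layout are taken from your sample file):

import json

def csv_content_to_json(content):
    pairs = {}
    for line in content.splitlines():
        if not line or line.startswith("field_id"):
            continue  # skip empty lines and the header row
        key, value = line.split(",", 1)
        pairs[key] = value.rstrip(",")
    return json.dumps(pairs)

# csv_content_to_json("field_id,data,\nA,1\nB,3\nC,4\nD,9")
# -> '{"A": "1", "B": "3", "C": "4", "D": "9"}'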

Best,
Wei


> On Dec 7, 2020, at 19:10, DEEP NARAYAN Singh wrote:
> 
> Hi  Guys,
> 
> Below is my code snippet, which reads all csv files under the given folder
> row by row, but my requirement is to read one csv file at a time and convert
> it to json, which will look like:
> {"A":"1","B":"3","C":"4","D":9}
> 
> Csv file data format:
> ---
> field_id,data,
> A,1
> B,3
> C,4
> D,9
> 
> Code snippet:
> --
> 
> final StreamExecutionEnvironment env =
>     StreamExecutionEnvironment.getExecutionEnvironment();
> String path = "s3://messages/data/test/dev/2020-12-07/67241306/";
> TypeInformation[] fieldTypes = new TypeInformation[]{
>     BasicTypeInfo.STRING_TYPE_INFO,
>     BasicTypeInfo.STRING_TYPE_INFO};
> RowCsvInputFormat csvFormat =
>     new RowCsvInputFormat(new Path(path), fieldTypes);
> csvFormat.setSkipFirstLineAsHeader(true);
> csvFormat.setNestedFileEnumeration(true);
> DataStream<Row> lines = env.readFile(csvFormat, path,
>     FileProcessingMode.PROCESS_ONCE, -1);
> lines.map(value -> value).print();
> 
> 
> Any help is highly appreciated.
> 
> Thanks,
> -Deep



Re: [VOTE] Release 1.12.0, release candidate #3

2020-12-04 Thread Wei Zhong
+1 (non-binding)

- verified checksums and signatures
- build Flink with Scala 2.11
- pip install pyflink on Windows python 3.7
- run a python job with udfs on Windows
- pyflink shell works well on local mode and remote mode

Best,
Wei

> On Dec 4, 2020, at 17:21, Yang Wang wrote:
> 
> +1 (non-binding)
> 
> * Build from source
> * Deploy Flink cluster in following deployments with HA enabled(ZooKeeper
> and K8s), including kill JobManager and check failover
>  * Native K8s Session
>  * Native K8s Application
>  * Yarn Session
>  * Yarn Per-Job
>  * Yarn Application
> * Check webUI and logs in different deployments especially via `kubectl
> logs` in K8s
> 
> Best,
> Yang
> 
> On Fri, Dec 4, 2020 at 3:00 PM, Xintong Song wrote:
> 
>> +1 (non-binding)
>> 
>>   - Verified checksums and signatures
>>   - No binaries found in source archive
>>   - Build from source
>>   - Tried a couple of example jobs in various deployment mode
>>  - Local
>>  - Standalone
>>  - Native Kubernetes Application
>>  - Native Kubernetes Session
>>  - Yarn Job
>>  - Yarn Session
>>   - Changing memory configurations, things work as expected
>>   - UI looks good
>>   - Logs look good
>> 
>> 
>> 
>> Thank you~
>> 
>> Xintong Song
>> 
>> 
>> 
>> On Thu, Dec 3, 2020 at 9:18 PM Rui Li  wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> Built from source and verified hive connector tests for different hive
>>> versions.
>>> Setup a cluster to connect to a real hive warehouse and run some queries
>>> successfully.
>>> 
>>> On Thu, Dec 3, 2020 at 8:44 PM Xingbo Huang  wrote:
>>> 
 +1 (non-binding)
 
 Checks:
 1. verified checksums and signatures
 2. build Flink with Scala 2.11
 3. pip install pyflink in MacOS/CentOS under py35,py36,py37,py38
 4. test Pandas UDAF/General UDAF/Python DataStream MapFunction
 5. start standalone cluster and submit a python udf job.
 6. verified NOTICE/LICENSE files of some regular modules
 
 I observed that the NOTICE file of flink-sql-connector-hbase-2.2 lists
>> 3
 dependencies that are not bundled in:
 commons-lang:commons-lang:2.6
 org.apache.hbase:hbase-hadoop-compat:2.2.3
 org.apache.hbase:hbase-hadoop2-compat:2.2.3
 
 I guess listing more dependencies than are bundled, all Apache-licensed,
 shouldn't be a blocker issue. I have opened a PR[1] to fix it.
 
 [1] https://github.com/apache/flink/pull/14299
 
 Best,
 Xingbo
 
 On Thu, Dec 3, 2020 at 5:36 PM, Robert Metzger wrote:
 
> There's now a pull request for the announcement blog post, please
>> help
> checking it: https://github.com/apache/flink-web/pull/397
> 
> On Thu, Dec 3, 2020 at 9:03 AM Robert Metzger 
 wrote:
> 
>> +1 (binding)
>> 
>> 
>> Checks:
>> - checksums seem correct
>> - source archive code compiles
>> - Compiled a test job against the staging repository
>> - launched a standalone cluster, ran some test jobs against it
>> - quickstart contains correct version
>> - regular jars contain correct NOTICE file
>> - Looked a bit over the output of
>> git diff release-1.11.2...release-1.12 --  "**/pom.xml"
>> 
>> 
>> 
>> I noticed that at least one more jar file contains an invalid
>> LICENSE
> file
>> in its root. This has already been the case with Flink 1.11, and
>>> from
> the
>> context (apache flink jar, all the other license and notice files
>>> talk
>> about this being an Apache project) it should be clear that the
>>> license
>> file is not meant for the whole jar file contents.
>> I will still extend the automated LicenseChecker to resolve this,
>>> but I
>> don't want to cancel the release because of this.
>> 
>> 
>> 
>> On Wed, Dec 2, 2020 at 11:19 AM Robert Metzger <
>> rmetz...@apache.org>
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> We have resolved the licensing issue Chesnay found.
>>> 
>>> Please review and vote on the release candidate #3 for the version
>>> 1.12.0, as follows:
>>> 
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific
>>> comments)
>>> 
>>> 
>>> The complete staging area is available for your review, which
 includes:
>>> * JIRA release notes [1a], and website release notes [1b]
>>> * the official Apache source release and binary convenience
>> releases
 to
>>> be deployed to dist.apache.org [2], which are signed with the key
 with
>>> fingerprint D9839159 [3],
>>> * all artifacts to be deployed to the Maven Central Repository
>> [4],
>>> * source code tag "release-1.12.0-rc3" [5]
>>> 
>>> We will soon publish the PR for the release announcement blog
>> post!
>>> 
>>> The vote will be open for at least 72 hours. It is adopted by
>>> majority
>>> approval, with at least 3 PMC affirmative votes.
>>> 
>>> Thanks,
>>> Dian & Robert
>>> 

[jira] [Created] (FLINK-20333) Flink standalone cluster throws metaspace OOM after submitting multiple PyFlink UDF jobs.

2020-11-24 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20333:
-

 Summary: Flink standalone cluster throws metaspace OOM after 
submitting multiple PyFlink UDF jobs.
 Key: FLINK-20333
 URL: https://issues.apache.org/jira/browse/FLINK-20333
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Wei Zhong


Currently the Flink standalone cluster throws a metaspace OOM after 
submitting multiple PyFlink UDF jobs. The root cause is that there are many 
soft references and Finalizers in memory, which prevent the garbage collection 
of the classloaders of finished PyFlink jobs.

Due to their existence, multiple full GCs are needed to reclaim the classloader 
of a completed job. If only one full GC is performed before the metaspace 
becomes insufficient, an OOM will occur.
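
One operational mitigation (a workaround assumption, not a fix for the root 
cause) is to enlarge the metaspace budget, e.g. via the 
taskmanager.memory.jvm-metaspace.size option, so that more full GC cycles can 
complete before the metaspace is exhausted. The actual fix, however, needs to 
address the lingering soft references and Finalizers described above.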

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-20207) Improve the error message printed when submitting the pyflink jobs via 'flink run'

2020-11-17 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-20207:
-

 Summary: Improve the error message printed when submitting the 
pyflink jobs via 'flink run'
 Key: FLINK-20207
 URL: https://issues.apache.org/jira/browse/FLINK-20207
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Sometimes the Java stack traces are swallowed when submitting PyFlink jobs 
via "flink run", e.g.:

File "/home/cdh272705/poc/T24_parse.py", line 179, in from_kafka_to_oracle_demo
   
main_table.execute_insert("sink")#.get_job_client().get_job_execution_result().result()
 File 
"/home/cdh272705/.local/lib/python3.6/site-packages/pyflink/table/table.py", 
line 783, in execute_insert
   return TableResult(self._j_table.executeInsert(table_path, overwrite))
 File 
"/home/cdh272705/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 
1286, in __call__
   answer, self.gateway_client, self.target_id, self.name)
 File 
"/home/cdh272705/.local/lib/python3.6/site-packages/pyflink/util/exceptions.py",
 line 154, in deco
   raise exception_mapping[exception](s.split(': ', 1)[1], stack_trace)
pyflink.util.exceptions.TableException: 'Failed to execute sql'

 

The Java stack trace under the TableException was swallowed, which makes 
troubleshooting difficult.

We need to improve the error reporting logic.
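
A minimal sketch of one possible direction, assuming the Py4J error is caught 
in a decorator like the `deco` shown in the traceback above (the wrapper below 
is illustrative, not the actual pyflink.util.exceptions code):
{code:python}
from py4j.protocol import Py4JJavaError


def deco(f):
    """Illustrative wrapper: keep the Java stack trace when translating
    a Py4J error into a Python-side exception."""
    def wrapper(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except Py4JJavaError as e:
            # str(e) on a Py4JJavaError includes the Java stack trace;
            # attach it instead of keeping only the top-level message.
            raise RuntimeError("Failed to execute sql\n%s" % str(e))
    return wrapper
{code}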



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19236) Optimize the performance of Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19236:
-

 Summary: Optimize the performance of Python UDAF
 Key: FLINK-19236
 URL: https://issues.apache.org/jira/browse/FLINK-19236
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19233) Support DISTINCT KeyWord for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19233:
-

 Summary: Support DISTINCT KeyWord for Python UDAF
 Key: FLINK-19233
 URL: https://issues.apache.org/jira/browse/FLINK-19233
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19234) Support FILTER KeyWord for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19234:
-

 Summary: Support FILTER KeyWord for Python UDAF
 Key: FLINK-19234
 URL: https://issues.apache.org/jira/browse/FLINK-19234
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19235) Support mixed use with built-in aggs for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19235:
-

 Summary: Support mixed use with built-in aggs for Python UDAF
 Key: FLINK-19235
 URL: https://issues.apache.org/jira/browse/FLINK-19235
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19232) Support MapState and MapView for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19232:
-

 Summary: Support MapState and MapView for Python UDAF
 Key: FLINK-19232
 URL: https://issues.apache.org/jira/browse/FLINK-19232
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19231) Support ListState and ListView for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19231:
-

 Summary: Support ListState and ListView for Python UDAF
 Key: FLINK-19231
 URL: https://issues.apache.org/jira/browse/FLINK-19231
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19230) Support Python UDAF on blink batch planner

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19230:
-

 Summary: Support Python UDAF on blink batch planner
 Key: FLINK-19230
 URL: https://issues.apache.org/jira/browse/FLINK-19230
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19229) Support ValueState and Python UDAF on blink stream planner

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19229:
-

 Summary: Support ValueState and Python UDAF on blink stream planner
 Key: FLINK-19229
 URL: https://issues.apache.org/jira/browse/FLINK-19229
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19185) Support General Python User-Defined Aggregate Function Support on Table API (FLIP-139)

2020-09-10 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19185:
-

 Summary: Support General Python User-Defined Aggregate Function 
Support on Table API (FLIP-139)
 Key: FLINK-19185
 URL: https://issues.apache.org/jira/browse/FLINK-19185
 Project: Flink
  Issue Type: Task
  Components: API / Python
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[RESULT][VOTE] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-09-10 Thread Wei Zhong
Hi all,

The voting time for FLIP-139 has passed. I'm closing the vote now.

There were 4 +1 votes, 3 of which were binding:
- Jincheng (binding)
- Dian (binding)
- Hequn (binding)
- Xingbo (non-binding)

There were no disapproving votes.

Thus, FLIP-139 has been accepted.

Thanks everyone for joining the discussion and giving feedback!

Best,
Wei



Re: [VOTE] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-09-10 Thread Wei Zhong
Thanks everyone for the votes!
I’ll summarize the voting result in a separate email.

Best,
Wei

> On Sep 10, 2020, at 10:04, Hequn Cheng wrote:
> 
> +1 (binding)
> 
> Best,
> Hequn
> 
> On Thu, Sep 10, 2020 at 10:03 AM Dian Fu  wrote:
> 
>> +1(binding)
>> 
>> Regards,
>> Dian
>> 
>>> On Sep 8, 2020, at 7:43 AM, jincheng sun wrote:
>>> 
>>> +1(binding)
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> On Mon, Sep 7, 2020 at 5:45 PM, Xingbo Huang wrote:
>>> 
>>>> Hi,
>>>> 
>>>> +1 (non-binding)
>>>> 
>>>> Best,
>>>> Xingbo
>>>> 
>>>>> On Mon, Sep 7, 2020 at 2:37 PM, Wei Zhong wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I would like to start the vote for FLIP-139[1] which is discussed and
>>>>> reached consensus in the discussion thread[2].
>>>>> 
>>>>> The vote will be open for at least 72 hours. I'll try to close it by
>>>>> 2020-09-10 07:00 UTC, unless there is an objection or not enough votes.
>>>>> 
>>>>> Best,
>>>>> Wei
>>>>> 
>>>>> [1]
>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
>>>>> [2]
>>>>> 
>>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-139-General-Python-User-Defined-Aggregate-Function-on-Table-API-td44139.html
>>>>> 
>>>>> 
>>>> 
>> 
>> 



Re: [VOTE] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-07 Thread Wei Zhong
+1 (non-binding)

> On Sep 7, 2020, at 10:00, Dian Fu wrote:
> 
> +1
> 
>> On Sep 4, 2020, at 11:12 AM, Xingbo Huang wrote:
>> 
>> Hi all,
>> I would like to start the vote for FLIP-137[1], which is discussed and
>> reached a consensus in the discussion thread[2].
>> 
>> The vote will be open for at least 72h, unless there is an objection or not
>> enough votes.
>> 
>> Best,
>> Xingbo
>> 
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink
>> [2]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-137-Support-Pandas-UDAF-in-PyFlink-tt44060.html
> 



[VOTE] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-09-07 Thread Wei Zhong
Hi all,

I would like to start the vote for FLIP-139 [1], which has been discussed and 
has reached consensus in the discussion thread [2].

The vote will be open for at least 72 hours. I'll try to close it by 2020-09-10 
07:00 UTC, unless there is an objection or not enough votes.

Best,
Wei

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
[2] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-139-General-Python-User-Defined-Aggregate-Function-on-Table-API-td44139.html



Re: [DISCUSS] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-09-03 Thread Wei Zhong
Hi everyone,

Are there more comments about this FLIP? If not, I would like to start the 
VOTE.

Best,
Wei

> On Sep 1, 2020, at 11:15, Wei Zhong wrote:
> 
> Hi Timo,
> 
> Thanks for your notification. I’ll remove it from the design doc.
> 
> Best,
> Wei
> 
>> On Aug 31, 2020, at 21:11, Timo Walther wrote:
>> 
>> Hi Wei,
>> 
>> is `reset_accumulator` still necessary? We dropped it recently in the Java 
>> API because it was not used anymore by the planner.
>> 
>> Regards,
>> Timo
>> 
>> On 31.08.20 15:00, Wei Zhong wrote:
>>> Hi Jincheng & Xingbo,
>>> Thanks for your suggestions.
>>> I agree that we should keep the user interface uniform. I'll adjust the 
>>> design to allow users to specify the result type and accumulator type via 
>>> @udaf.
>>> Best,
>>> Wei
>>>> On Aug 31, 2020, at 18:06, Xingbo Huang wrote:
>>>> 
>>>> Hi Wei,
>>>> 
>>>> Thanks a lot for the discussion.
>>>> 
>>>> Thanks a lot for Jincheng's suggestion of discussing FLIP-137 and FLIP-139
>>>> together.
>>>> 
>>>> One question is whether we can use @udaf which is introduced in FLIP-137[1]
>>>> to describe pandas udaf and general python udaf together. From the overall
>>>> view of Python User Defined Function, we use @udf to describe general
>>>> python udf and pandas udf, @udtf to describe python udtf, and @udaf to
>>>> describe general python udaf and pandas udaf, which is more unified.
>>>> 
>>>> [1]
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink#FLIP137:SupportPandasUDAFinPyFlink-Interfaces
>>>> 
>>>> Best,
>>>> Xingbo
>>>> 
>>>> On Mon, Aug 31, 2020 at 11:11 AM, jincheng sun wrote:
>>>> 
>>>>> Hi Wei,
>>>>> Thanks for the discussion! Overall, + 1 for this FLIP.
>>>>> 
>>>>> One question is, can the @udaf added by flip-137 be used in General Python
>>>>> UDAF?
>>>>> It would be great if we can consider its design in combination with FLIP-137.
>>>>> 
>>>>> What do you think?
>>>>> 
>>>>> Best,
>>>>> Jincheng
>>>>> 
>>>>> On Wed, Aug 26, 2020 at 11:28 AM, Wei Zhong wrote:
>>>>> 
>>>>>> Hi everyone,
>>>>>> 
>>>>>> I would like to start discussion about how to support General Python
>>>>>> User-Defined Aggregate Function on Table API.
>>>>>> 
>>>>>> FLIP-58[1] has already introduced the stateless Python UDF and has
>>>>> already
>>>>>> been supported in the previous releases. However the stateful Python UDF,
>>>>>> i.e. User-Defined Aggregate Function is not supported in PyFlink yet. We
>>>>>> want to introduce the general Python user-defined aggregate function for
>>>>>> PyFlink Table API.
>>>>>> 
>>>>>> Here is the design doc:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
>>>>>> 
>>>>>> Looking forward to your feedback!
>>>>>> 
>>>>>> Best,
>>>>>> Wei
>>>>>> 
>>>>>> [1]
>>>>>> 
>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
>>>>>> 
>>>>>> 
>>>>> 
>> 
> 



Re: [DISCUSS] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-08-31 Thread Wei Zhong
Hi Timo,

Thanks for your notification. I’ll remove it from the design doc.

Best,
Wei

> On Aug 31, 2020, at 21:11, Timo Walther wrote:
> 
> Hi Wei,
> 
> is `reset_accumulator` still necessary? We dropped it recently in the Java 
> API because it was not used anymore by the planner.
> 
> Regards,
> Timo
> 
> On 31.08.20 15:00, Wei Zhong wrote:
>> Hi Jincheng & Xingbo,
>> Thanks for your suggestions.
>> I agree that we should keep the user interface uniform. I'll adjust the 
>> design to allow users to specify the result type and accumulator type via 
>> @udaf.
>> Best,
>> Wei
>>> On Aug 31, 2020, at 18:06, Xingbo Huang wrote:
>>> 
>>> Hi Wei,
>>> 
>>> Thanks a lot for the discussion.
>>> 
>>> Thanks a lot for Jincheng's suggestion of discussing FLIP-137 and FLIP-139
>>> together.
>>> 
>>> One question is whether we can use @udaf which is introduced in FLIP-137[1]
>>> to describe pandas udaf and general python udaf together. From the overall
>>> view of Python User Defined Function, we use @udf to describe general
>>> python udf and pandas udf, @udtf to describe python udtf, and @udaf to
>>> describe general python udaf and pandas udaf, which is more unified.
>>> 
>>> [1]
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink#FLIP137:SupportPandasUDAFinPyFlink-Interfaces
>>> 
>>> Best,
>>> Xingbo
>>> 
>>>> On Mon, Aug 31, 2020 at 11:11 AM, jincheng sun wrote:
>>> 
>>>> Hi Wei,
>>>> Thanks for the discussion! Overall, + 1 for this FLIP.
>>>> 
>>>> One question is, can the @udaf added by flip-137 be used in General Python
>>>> UDAF?
>>>> It would be great if we can consider its design in combination with FLIP-137.
>>>> 
>>>> What do you think?
>>>> 
>>>> Best,
>>>> Jincheng
>>>> 
>>>> On Wed, Aug 26, 2020 at 11:28 AM, Wei Zhong wrote:
>>>> 
>>>>> Hi everyone,
>>>>> 
>>>>> I would like to start discussion about how to support General Python
>>>>> User-Defined Aggregate Function on Table API.
>>>>> 
>>>>> FLIP-58[1] has already introduced the stateless Python UDF and has
>>>> already
>>>>> been supported in the previous releases. However the stateful Python UDF,
>>>>> i.e. User-Defined Aggregate Function is not supported in PyFlink yet. We
>>>>> want to introduce the general Python user-defined aggregate function for
>>>>> PyFlink Table API.
>>>>> 
>>>>> Here is the design doc:
>>>>> 
>>>>> 
>>>>> 
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
>>>>> 
>>>>> Looking forward to your feedback!
>>>>> 
>>>>> Best,
>>>>> Wei
>>>>> 
>>>>> [1]
>>>>> 
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
>>>>> 
>>>>> 
>>>> 
> 



Re: [DISCUSS] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-08-31 Thread Wei Zhong
Hi Jincheng & Xingbo,

Thanks for your suggestions. 

I agree that we should keep the user interface uniform. I'll adjust the design 
to allow users to specify the result type and accumulator type via @udaf.

Best,
Wei


> On Aug 31, 2020, at 18:06, Xingbo Huang wrote:
> 
> Hi Wei,
> 
> Thanks a lot for the discussion.
> 
> Thanks a lot for Jincheng's suggestion of discussing FLIP-137 and FLIP-139
> together.
> 
> One question is whether we can use @udaf which is introduced in FLIP-137[1]
> to describe pandas udaf and general python udaf together. From the overall
> view of Python User Defined Function, we use @udf to describe general
> python udf and pandas udf, @udtf to describe python udtf, and @udaf to
> describe general python udaf and pandas udaf, which is more unified.
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink#FLIP137:SupportPandasUDAFinPyFlink-Interfaces
> 
> Best,
> Xingbo
> 
> On Mon, Aug 31, 2020 at 11:11 AM, jincheng sun wrote:
> 
>> Hi Wei,
>> Thanks for the discussion! Overall, + 1 for this FLIP.
>> 
>> One question is, can the @udaf added by flip-137 be used in General Python
>> UDAF?
>> It would be great if we can consider its design in combination with FLIP-137.
>> 
>> What do you think?
>> 
>> Best,
>> Jincheng
>> 
>> On Wed, Aug 26, 2020 at 11:28 AM, Wei Zhong wrote:
>> 
>>> Hi everyone,
>>> 
>>> I would like to start discussion about how to support General Python
>>> User-Defined Aggregate Function on Table API.
>>> 
>>> FLIP-58[1] has already introduced the stateless Python UDF and has
>> already
>>> been supported in the previous releases. However the stateful Python UDF,
>>> i.e. User-Defined Aggregate Function is not supported in PyFlink yet. We
>>> want to introduce the general Python user-defined aggregate function for
>>> PyFlink Table API.
>>> 
>>> Here is the design doc:
>>> 
>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
>>> 
>>> Looking forward to your feedback!
>>> 
>>> Best,
>>> Wei
>>> 
>>> [1]
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
>>> 
>>> 
>> 



Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-28 Thread Wei Zhong
Congratulations Dian!

> On Aug 28, 2020, at 14:29, Jingsong Li wrote:
> 
> Congratulations, Dian!
> 
> Best, Jingsong
> 
> On Fri, Aug 28, 2020 at 11:06 AM Walter Peng wrote:
> congrats!
> 
> Yun Tang wrote:
> > Congratulations, Dian!
> 
> 
> -- 
> Best, Jingsong Lee



[DISCUSS] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-08-25 Thread Wei Zhong
Hi everyone,

I would like to start a discussion about how to support General Python 
User-Defined Aggregate Functions on the Table API.

FLIP-58 [1] has already introduced stateless Python UDFs, which have been 
supported since previous releases. However, stateful Python UDFs, i.e. 
user-defined aggregate functions, are not supported in PyFlink yet. We want to 
introduce general Python user-defined aggregate functions for the PyFlink Table 
API.

Here is the design doc:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API
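
To make the proposal concrete, a general Python UDAF could hypothetically look
like the following (the class and method names follow the direction of the
design doc and are assumptions, not the final API):

class WeightedAvg(AggregateFunction):

    def create_accumulator(self):
        # accumulator layout: [weighted sum, total weight]
        return [0, 0]

    def accumulate(self, accumulator, value, weight):
        accumulator[0] += value * weight
        accumulator[1] += weight

    def get_value(self, accumulator):
        if accumulator[1] == 0:
            return None
        return accumulator[0] / accumulator[1]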

Looking forward to your feedback!

Best,
Wei

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table



[jira] [Created] (FLINK-18937) Add a "Environment Setup" section to the "Installation" document

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18937:
-

 Summary: Add a "Environment Setup" section to the "Installation" 
document
 Key: FLINK-18937
 URL: https://issues.apache.org/jira/browse/FLINK-18937
 Project: Flink
  Issue Type: Sub-task
        Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18932) Add a "Overview" document under the "Python API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18932:
-

 Summary: Add a "Overview" document under  the "Python API" section
 Key: FLINK-18932
 URL: https://issues.apache.org/jira/browse/FLINK-18932
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
        Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18933) Delete the old Python Table API document section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18933:
-

 Summary: Delete the old Python Table API document section
 Key: FLINK-18933
 URL: https://issues.apache.org/jira/browse/FLINK-18933
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18929) Add a "API Docs" link (linked to the generated sphinx docs) under the "Python API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18929:
-

 Summary: Add a "API Docs" link (linked to the generated sphinx 
docs) under the "Python API" section
 Key: FLINK-18929
 URL: https://issues.apache.org/jira/browse/FLINK-18929
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
        Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18928) Move the "Common Questions" document from the old Python Table API documentation to the "Python API" section with a new name "FAQ"

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18928:
-

 Summary: Move the "Common Questions" document from the old Python 
Table API documentation to the "Python API" section with a new name "FAQ"
 Key: FLINK-18928
 URL: https://issues.apache.org/jira/browse/FLINK-18928
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
    Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18927) Add a "Debugging" document under the "Python API" -> "User Guide" -> "Table API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18927:
-

 Summary: Add a "Debugging" document under  the "Python API" -> 
"User Guide" -> "Table API" section
 Key: FLINK-18927
 URL: https://issues.apache.org/jira/browse/FLINK-18927
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18926) Add a "Environment Variables" document under the "Python API" -> "User Guide" -> "Table API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18926:
-

 Summary: Add a "Environment Variables" document under  the "Python 
API" -> "User Guide" -> "Table API" section
 Key: FLINK-18926
 URL: https://issues.apache.org/jira/browse/FLINK-18926
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18925) Move the "Configuration" document from the old Python Table API documentation to the "Python API" -> "User Guide" -> "Table API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18925:
-

 Summary: Move the "Configuration" document from the old Python 
Table API documentation to the "Python API" -> "User Guide" -> "Table API" 
section
 Key: FLINK-18925
 URL: https://issues.apache.org/jira/browse/FLINK-18925
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18924) Move the "Metrics" document from the old Python Table API documentation to the "Python API" -> "User Guide" -> "Table API" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18924:
-

 Summary: Move the "Metrics" document from the old Python Table API 
documentation to the "Python API" -> "User Guide" -> "Table API" section
 Key: FLINK-18924
 URL: https://issues.apache.org/jira/browse/FLINK-18924
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18923) Add a "CEP" document under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18923:
-

 Summary: Add a "CEP" document under  the "Python API" -> "User 
Guide" section
 Key: FLINK-18923
 URL: https://issues.apache.org/jira/browse/FLINK-18923
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18922) Add a "Catalogs" link (linked to dev/table/catalogs.md) under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18922:
-

 Summary: Add a "Catalogs" link (linked to dev/table/catalogs.md) 
under the "Python API" -> "User Guide" section
 Key: FLINK-18922
 URL: https://issues.apache.org/jira/browse/FLINK-18922
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18921) Add a "SQL" link (linked to dev/table/sql/index.md) under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18921:
-

 Summary: Add a "SQL" link (linked to dev/table/sql/index.md) under 
the "Python API" -> "User Guide" section
 Key: FLINK-18921
 URL: https://issues.apache.org/jira/browse/FLINK-18921
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18920) Move the "Dependency Management" document from the old Python Table API documentation to the "Python API" -> "GettingStart" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18920:
-

 Summary: Move the "Dependency Management" document from the old 
Python Table API documentation to the "Python API" -> "GettingStart" section
 Key: FLINK-18920
 URL: https://issues.apache.org/jira/browse/FLINK-18920
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18919) Move the "User Defined Functions" document and "Vectorized User Defined Functions" document from the old Python Table API documentation to the "Python API" -> "User Guid

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18919:
-

 Summary: Move the "User Defined Functions" document and 
"Vectorized User Defined Functions" document from the old Python Table API 
documentation to the "Python API" -> "User Guide" section
 Key: FLINK-18919
 URL: https://issues.apache.org/jira/browse/FLINK-18919
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18918) Add a "Connectors" document under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18918:
-

 Summary: Add a "Connectors" document under  the "Python API" -> 
"User Guide" section
 Key: FLINK-18918
 URL: https://issues.apache.org/jira/browse/FLINK-18918
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18917) Add a "Built-in Functions" link (linked to dev/table/functions/systemFunctions.md) under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18917:
-

 Summary: Add a "Built-in Functions" link (linked to 
dev/table/functions/systemFunctions.md) under the "Python API" -> "User Guide" 
section
 Key: FLINK-18917
 URL: https://issues.apache.org/jira/browse/FLINK-18917
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18916) Add a "Operations" link(linked to dev/table/tableApi.md) under the "Python API" -> "User Guide" section

2020-08-13 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18916:
-

 Summary: Add a "Operations" link(linked to dev/table/tableApi.md) 
under the "Python API" -> "User Guide" section
 Key: FLINK-18916
 URL: https://issues.apache.org/jira/browse/FLINK-18916
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18914) Move the "Python DataTypes" document from the old Python Table API documentation to the "Python API" -> "GettingStart" section with a new name "DataTypes"

2020-08-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18914:
-

 Summary: Move the "Python DataTypes" document from the old Python 
Table API documentation to the "Python API" -> "GettingStart" section with a 
new name "DataTypes"
 Key: FLINK-18914
 URL: https://issues.apache.org/jira/browse/FLINK-18914
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18913) Add a "TableEnvironment" document under the "Python API" -> "User Guide" section

2020-08-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18913:
-

 Summary: Add a "TableEnvironment" document under  the "Python API" 
-> "User Guide" section
 Key: FLINK-18913
 URL: https://issues.apache.org/jira/browse/FLINK-18913
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18912) Add a Table API tutorial link under the "Python API" -> "GettingStart" -> "Tutorial" section

2020-08-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18912:
-

 Summary: Add a Table API tutorial link under  the "Python API" -> 
"GettingStart" -> "Tutorial" section
 Key: FLINK-18912
 URL: https://issues.apache.org/jira/browse/FLINK-18912
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18911) Move the "Installation" document from the old Python Table API documentation to the new Python API documentation

2020-08-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18911:
-

 Summary: Move the "Installation" document from the old Python 
Table API documentation to the new Python API documentation
 Key: FLINK-18911
 URL: https://issues.apache.org/jira/browse/FLINK-18911
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18910) Create the new document structure for Python documentation according to FLIP-133

2020-08-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18910:
-

 Summary: Create the new document structure for Python 
documentation according to FLIP-133
 Key: FLINK-18910
 URL: https://issues.apache.org/jira/browse/FLINK-18910
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Wei Zhong


Create the following section structure under the "Application Development" 
section:

Application Development
  - Python API
      - Getting Started
      - User Guide
          - Table API



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-133: Rework PyFlink Documentation

2020-08-11 Thread Wei Zhong
+1 (non-binding)

Best,
Wei

> On Aug 11, 2020, at 22:41, Seth Wiesman wrote:
> 
> +1 (binding)
> 
> On Sun, Aug 9, 2020 at 8:15 PM jincheng sun 
> wrote:
> 
>> Hi Seth,
>> 
>> Thank you for joining the discussion and voting. Could you please have a
>> look at the discussion thread?
>> It would be great to know whether the follow-up discussion allayed your
>> concerns, and any feedback is welcome.
>> 
>> Best,
>> Jincheng
>> 
>> 
>> 
>> On Mon, Aug 3, 2020 at 9:41 PM, Seth Wiesman wrote:
>> 
>>> -1
>>> 
>>> I'm sorry to be late to the discussion but I have some concerns I've
>>> brought up in the discussion thread.
>>> 
>>> Seth
>>> 
>>> On Mon, Aug 3, 2020 at 3:00 AM jincheng sun 
>>> wrote:
>>> 
 Hi everyone,
 
 I would like to start the vote for FLIP-133[1], which is discussed and
 reached a consensus through the discussion thread[2].
 
 The vote will be open until 6th August (72h), unless there is an
>>> objection
 or not enough votes.
 
 Best,
 Jincheng
 
 [1]:
 
 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-133%3A+Rework+PyFlink+Documentation
 [2]:
 
 
>>> 
>> https://lists.apache.org/thread.html/redebc9d1281edaa4a1fbf0d8c76a69fcff574b0496e78519840a5a61%40%3Cdev.flink.apache.org%3E
 
>>> 
>> 



[jira] [Created] (FLINK-18849) Improve the code tabs of the Flink documents

2020-08-07 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18849:
-

 Summary: Improve the code tabs of the Flink documents
 Key: FLINK-18849
 URL: https://issues.apache.org/jira/browse/FLINK-18849
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Wei Zhong


Currently there are some minor problems with the code tabs of the Flink documents:
 # There are some tab labels like `data-lang="Java/Scala"`, which cannot be 
changed synchronously with the labels `data-lang="Java"` and `data-lang="Scala"`.
 # Case sensitivity. If one code tab has the label `data-lang="java"` and another 
has the label `data-lang="Java"` on the same page, they will not change 
synchronously.
 # Duplicated content. Much of the content in the "Java" tab is the same as in 
the "Scala" tab.

I would like to improve the situation in the following way:

      1. When parsing a label like `data-lang="Java/Scala"`, we can clone the 
tab content, so that one copy gets the label `data-lang="Java"` and the other 
gets the label `data-lang="Scala"`.

      2. Then force the first character of the data-lang value to be upper 
case, e.g. if the label is `data-lang="java"`, it will be modified to 
`data-lang="Java"`.

This way we can remove the duplicated content by merging it into one element 
with a `data-lang="Java/Scala"` label, and all the tabs will change 
synchronously when they are clicked.
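
A language-neutral sketch of the normalization logic described above (the real 
change would live in the documentation's JavaScript; this Python snippet only 
illustrates the algorithm):
{code:python}
def normalize_tabs(tabs):
    """tabs: list of (data_lang, content) pairs parsed from one page."""
    result = []
    for lang, content in tabs:
        # step 1: clone combined labels like "Java/Scala", one per language
        for single in lang.split("/"):
            # step 2: force the first character to upper case: "java" -> "Java"
            normalized = single[:1].upper() + single[1:]
            result.append((normalized, content))
    return result


# normalize_tabs([("Java/Scala", "..."), ("java", "...")])
# -> [("Java", "..."), ("Scala", "..."), ("Java", "...")]
{code}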

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-133: Rework PyFlink Documentation

2020-08-05 Thread Wei Zhong
r the FLIP.
> >>
> >> I think this will bring big benefits for the PyFlink users. Currently,
> >> the Python TableAPI document is hidden deeply under the TableAPI tab
> >> which makes it quite unreadable. Also, the PyFlink documentation is mixed
> >> with Java/Scala documentation. It is hard for users to have an overview of
> >> all the PyFlink documents. As more and more functionalities are added into
> >> PyFlink, I think it's time for us to refactor the document.
> >>
> >> Best,
> >> Hequn
> >>
> >>
> >> On Fri, Jul 31, 2020 at 3:43 PM Marta Paes Moreira <ma...@ververica.com>
> >> wrote:
> >>
> >>> Hi, Jincheng!
> >>>
> >>> Thanks for creating this detailed FLIP, it will make a big difference in
> >>> the experience of Python developers using Flink. I'm interested in
> >>> contributing to this work, so I'll reach out to you offline!
> >>>
> >>> Also, thanks for sharing some information on the adoption of PyFlink,
> >>> it's
> >>> great to see that there are already production users.
> >>>
> >>> Marta
> >>>
> >>> On Fri, Jul 31, 2020 at 5:35 AM Xingbo Huang <hxbks...@gmail.com> wrote:
> >>>
> >>> > Hi Jincheng,
> >>> >
> >>> > Thanks a lot for bringing up this discussion and the proposal.
> >>> >
> >>> > Big +1 for improving the structure of PyFlink doc.
> >>> >
> >>> > It will be very friendly to give PyFlink users a unified entrance to
> >>> learn
> >>> > PyFlink documents.
> >>> >
> >>> > Best,
> >>> > Xingbo
> >>> >
> >>> > Dian Fu <dian0511...@gmail.com> wrote on Fri, Jul 31, 2020 at 11:00 AM:
> >>> >
> >>> >> Hi Jincheng,
> >>> >>
> >>> >> Thanks a lot for bringing up this discussion and the proposal. +1 to
> >>> >> improve the Python API doc.
> >>> >>
> >>> >> I have received much feedback from PyFlink beginners about
> >>> >> the PyFlink doc, e.g. the materials are too few, the Python doc is
> >>> >> mixed with the Java doc and it's not easy to find the docs they want.
> >>> >>
> >>> >> I think it would greatly improve the user experience if we could have
> >>> >> one place which includes most of the knowledge PyFlink users should know.
> >>> >>
> >>> >> Regards,
> >>> >> Dian
> >>> >>
> >>> >> On Jul 31, 2020, at 10:14 AM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>> >> Hi folks,
> >>> >>
> >>> >> Since the release of Flink 1.11, the number of PyFlink users has
> >>> >> continued to grow. As far as I know, many companies have used PyFlink
> >>> >> for data analysis, and operation and maintenance monitoring businesses
> >>> >> have been put into production (such as 聚美优品 [1] (Jumei), 浙江墨芷 [2]
> >>> >> (Mozhi), etc.). According to the feedback we received, the current
> >>> >> documentation is not very friendly to PyFlink users. There are two
> >>> >> shortcomings:
> >>> >>
> >>> >> - Python related content is mixed in the Java/Scala documentation,
> >>> which
> >>> >> makes it difficult for users who only focus on PyFlink to read.
> >>> >> - There is already a "Python Table API" section in the Table API
> >>> document
> >>> >> to store PyFlink documents, but the number of articles is small and
> >>> the
> >>> >> content is fragmented. It is difficult for beginners to learn from it.
> >>> >>
> >>> >> In addition, FLIP-130 introduced the Python DataStream API. Many
> >>> >> documents will be added for those new APIs. In order to increase the
> >>> >> readability and maintainability of the PyFlink documentation, Wei Zhong
> >>> >> and I have discussed offline and would like to rework it via this FLIP.
> >>> >>
> >>> >> We will rework the document around the following three objectives:
> >>> >>
> >>> >> - Add a separate section for Python API under the "Application
> >>> >> Development" section.
> >>> >> - Restructure the current Python documentation into a brand new
> >>> >> structure with complete content that is friendly to beginners.
> >>> >> - Improve the documents shared by Python/Java/Scala to make them more
> >>> >> friendly to Python users without affecting Java/Scala users.
> >>> >>
> >>> >> More details can be found in FLIP-133:
> >>> >>
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-133%3A+Rework+PyFlink+Documentation
> >>> >>
> >>> >> Best,
> >>> >> Jincheng
> >>> >>
> >>> >> [1] https://mp.weixin.qq.com/s/zVsBIs1ZEFe4atYUYtZpRg
> >>> >> [2] https://mp.weixin.qq.com/s/R4p_a2TWGpESBWr3pLtM2g
> >>> >>
> >>> >>
> >>> >>
> >>>
> >>



[jira] [Created] (FLINK-18801) Add the new document "10 minutes to PyFlink"

2020-08-03 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18801:
-

 Summary: Add the new document "10 minutes to PyFlink"
 Key: FLINK-18801
 URL: https://issues.apache.org/jira/browse/FLINK-18801
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18750) SqlValidatorException thrown when select from a view which contains a UDTF call

2020-07-29 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18750:
-

 Summary: SqlValidatorException thrown when select from a view 
which contains a UDTF call
 Key: FLINK-18750
 URL: https://issues.apache.org/jira/browse/FLINK-18750
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.11.1
Reporter: Wei Zhong


When executing the following code:

{code:java}
package com.example;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.table.functions.TableFunction;

public class TestUDTF {

   public static class UDTF extends TableFunction<String> {

      public void eval(String input) {
         collect(input);
      }
   }

   public static void main(String[] args) {

      StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
      StreamTableEnvironment tEnv = StreamTableEnvironment.create(
            env, EnvironmentSettings.newInstance().useBlinkPlanner().build());
      tEnv.createTemporarySystemFunction("udtf", new UDTF());
      tEnv.createTemporaryView("source", tEnv.fromValues("a", "b", "c").as("f0"));
      String udtfCall = "SELECT S.f0, T.f1 FROM source as S, LATERAL "
            + "TABLE(udtf(f0)) as T(f1)";
      System.out.println(tEnv.explainSql(udtfCall));
      String createViewCall = "CREATE VIEW tmp_view AS " + udtfCall;
      tEnv.executeSql(createViewCall);
      System.out.println(tEnv.from("tmp_view").explain());
   }
}
{code}
A SqlValidatorException like the following is thrown:

{code:java}
== Abstract Syntax Tree ==
LogicalProject(f0=[$0], f1=[$1])
+- LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
   :- LogicalProject(f0=[AS($0, _UTF-16LE'f0')])
   :  +- LogicalValues(tuples=[[{ _UTF-16LE'a' }, { _UTF-16LE'b' }, { _UTF-16LE'c' }]])
   +- LogicalTableFunctionScan(invocation=[udtf($cor0.f0)], rowType=[RecordType(VARCHAR(2147483647) EXPR$0)])

== Optimized Logical Plan ==
Correlate(invocation=[udtf($cor0.f0)], correlate=[table(udtf($cor0.f0))], select=[f0,EXPR$0], rowType=[RecordType(CHAR(1) f0, VARCHAR(2147483647) EXPR$0)], joinType=[INNER])
+- Calc(select=[f0])
   +- Values(type=[RecordType(CHAR(1) f0)], tuples=[[{ _UTF-16LE'a' }, { _UTF-16LE'b' }, { _UTF-16LE'c' }]])

== Physical Execution Plan ==
Stage 1 : Data Source
    content : Source: Values(tuples=[[{ _UTF-16LE'a' }, { _UTF-16LE'b' }, { _UTF-16LE'c' }]])

    Stage 2 : Operator
        content : Calc(select=[f0])
        ship_strategy : FORWARD

    Stage 3 : Operator
        content : Correlate(invocation=[udtf($cor0.f0)], correlate=[table(udtf($cor0.f0))], select=[f0,EXPR$0], rowType=[RecordType(CHAR(1) f0, VARCHAR(2147483647) EXPR$0)], joinType=[INNER])
        ship_strategy : FORWARD

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL validation failed. From line 4, column 14 to line 4, column 17: Column 'f0' not found in any table
  at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:146)
  at org.apache.flink.table.planner.calcite.FlinkPlannerImpl$ToRelContextImpl.expandView(FlinkPlannerImpl.scala:204)
  at org.apache.calcite.plan.ViewExpanders$1.expandView(ViewExpanders.java:52)
  at org.apache.flink.table.planner.catalog.SqlCatalogViewTable.convertToRel(SqlCatalogViewTable.java:58)
  at org.apache.flink.table.planner.plan.schema.ExpandingPreparingTable.expand(ExpandingPreparingTable.java:59)
  at org.apache.flink.table.planner.plan.schema.ExpandingPreparingTable.toRel(ExpandingPreparingTable.java:55)
  at org.apache.calcite.rel.core.RelFactories$TableScanFactoryImpl.createScan(RelFactories.java:533)
  at org.apache.calcite.tools.RelBuilder.scan(RelBuilder.java:1044)
  at org.apache.calcite.tools.RelBuilder.scan(RelBuilder.java:1068)
  at org.apache.flink.table.planner.plan.QueryOperationConverter$SingleRelVisitor.visit(QueryOperationConverter.java:349)
  at org.apache.flink.table.planner.plan.QueryOperationConverter$SingleRelVisitor.visit(QueryOperationConverter.java:152)
  at org.apache.flink.table.operations.CatalogQueryOperation.accept(CatalogQueryOperation.java:69)
  at org.apache.flink.table.planner.plan.QueryOperationConverter.defaultMethod(QueryOperationConverter.java:149)
  at org.apache.flink.table.planner.plan.QueryOperationConverter.defaultMethod(QueryOperationConverter.java:131)
  at org.apache.flink.table.operations.utils.QueryOperationDefaultVisitor.visit(QueryOperationDefaultVisitor.java:92)
  at org.apache.flink.table.operations.CatalogQueryOperation.accept(CatalogQueryOperation.java:69)
  at org.apache.flink.table.planner.calcite.FlinkRelBuilder.queryOperation(Flin

Re: flink 1.11 CREATE VIEW + LATERAL TABLE syntax validation issue

2020-07-29 Thread Wei Zhong
Hi XiaChang,

I think this is a bug. Others have encountered similar problems before.
I have created a JIRA for it: https://issues.apache.org/jira/browse/FLINK-18750 


Best,
Wei

> On Jul 29, 2020, at 15:00, XiaChang <13628620...@163.com> wrote:
> 
> Hello, I'd like to ask a question:
> Background:
> Runtime environment --> flink 1.11.1
> source_table --> Kafka input table; each row in Kafka is a JSON array ([{},{}])
> -- source
> CREATE TABLE source_table(
>__message STRING
> ) WITH(
> 'connector.type' = 'kafka',
> 'connector.version' = 'universal-xx',
> 'connector.properties.bootstrap.servers' = 'xxx',
> 'connector.topic' = 'xxx',
> 'connector.startup-mode' = 'latest-offset',
> 'connector.properties.group.id' = 'xxx',
> 'format.type' = 'json-xx',
> 'format.derive-schema' = 'true'
> );
> 
> 
> XX_JSON_TUPLE --> custom UDTF
> XX_ROW_VALUE --> custom UDF
> 
> 
> Runs correctly:
> 
> INSERT INTO sink_table
> SELECT
> cast(XX_ROW_VALUE(msg, 1) as bigint) AS bbb
> ,((cast(XX_ROW_VALUE(msg, 0) as bigint))/1000) AS ddd
> ,cast(XX_ROW_VALUE(msg, 0) as bigint) AS aaa
> ,XX_ROW_VALUE(msg, 8) AS ccc
> FROM
> source_table, LATERAL TABLE(XX_JSON_TUPLE(__message, 'aaa','bbb','ccc')) AS 
> T(msg)
> 
> 
> Does not run correctly:
> CREATE VIEW view_aaa AS
> SELECT
> cast(XX_ROW_VALUE(msg, 1) as bigint) AS bbb
> ,((cast(XX_ROW_VALUE(msg, 0) as bigint))/1000) AS ddd
> ,cast(XX_ROW_VALUE(msg, 0) as bigint) AS aaa
> ,XX_ROW_VALUE(msg, 8) AS ccc
> FROM
> source_table, LATERAL TABLE(XX_JSON_TUPLE(__message, 'aaa','bbb','ccc')) AS 
> T(msg);
> INSERT INTO sink_table
> SELECT * FROM view_aaa;
> 
> 
> 
> 
> Related error stack trace:
> Caused by: org.apache.flink.table.api.ValidationException: SQL validation 
> failed. From line 3, column 23 to line 3, column 33: Column '__message' not 
> found in any table
> at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:146)
> at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl$ToRelContextImpl.expandView(FlinkPlannerImpl.scala:204)
> at org.apache.calcite.plan.ViewExpanders$1.expandView(ViewExpanders.java:52)
> at 
> org.apache.flink.table.planner.catalog.SqlCatalogViewTable.convertToRel(SqlCatalogViewTable.java:58)
> at 
> org.apache.flink.table.planner.plan.schema.ExpandingPreparingTable.expand(ExpandingPreparingTable.java:59)
> at 
> org.apache.flink.table.planner.plan.schema.ExpandingPreparingTable.toRel(ExpandingPreparingTable.java:55)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.toRel(SqlToRelConverter.java:3492)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:2415)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2102)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2051)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:661)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:642)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3345)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:568)
> at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:164)
> at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:151)
> at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:774)
> at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:746)
> at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:236)
> at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:78)
> at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:664)
> at 
> com.douyu.ocean.xuanwu.flink.siddhi.SiddhiJob.lambda$callFrom$0(SiddhiJob.java:145)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at com.douyu.ocean.xuanwu.flink.siddhi.SiddhiJob.callFrom(SiddhiJob.java:141)
> ... 4 common frames omitted
> Caused by: org.apache.calcite.runtime.CalciteContextException: From line 3, 
> column 23 to line 3, column 33: Column '__message' not found in any table
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:457)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:839)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:824)
> at 
> 

[jira] [Created] (FLINK-18716) Remove the deprecated "execute" and "insert_into" calls in PyFlink Table API docs

2020-07-26 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18716:
-

 Summary: Remove the deprecated "execute" and "insert_into" calls 
in PyFlink Table API docs
 Key: FLINK-18716
 URL: https://issues.apache.org/jira/browse/FLINK-18716
 Project: Flink
  Issue Type: Improvement
  Components: API / Python, Documentation
Affects Versions: 1.11.0
        Reporter: Wei Zhong


Currently the TableEnvironment#execute and Table#insert_into methods are 
deprecated, but the PyFlink Table API docs still use them. We should replace 
them with the recommended API.
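For example (a minimal sketch, assuming an existing Table `table` and a sink 
table named "my_sink"; the names are illustrative):
{code:python}
# Deprecated pattern still shown in the docs:
#     table.insert_into("my_sink")
#     t_env.execute("my_job")

# Recommended replacement: execute_insert submits the job directly and
# returns a TableResult.
result = table.execute_insert("my_sink")
{code}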



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Wei Zhong
Congratulations! Thanks Dian for the great work!

Best,
Wei

> On Jul 22, 2020, at 15:09, Leonard Xu wrote:
> 
> Congratulations!
> 
> Thanks Dian Fu for the great work as release manager, and thanks everyone 
> involved!
> 
> Best
> Leonard Xu
> 
>> On Jul 22, 2020, at 14:52, Dian Fu wrote:
>> 
>> The Apache Flink community is very happy to announce the release of Apache 
>> Flink 1.11.1, which is the first bugfix release for the Apache Flink 1.11 
>> series.
>> 
>> Apache Flink® is an open-source stream processing framework for distributed, 
>> high-performing, always-available, and accurate data streaming applications.
>> 
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>> 
>> Please check out the release blog post for an overview of the improvements 
>> for this bugfix release:
>> https://flink.apache.org/news/2020/07/21/release-1.11.1.html
>> 
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12348323
>> 
>> We would like to thank all contributors of the Apache Flink community who 
>> made this release possible!
>> 
>> Regards,
>> Dian
> 



Re: [DISCUSS] FLIP-130: Support for Python DataStream API (Stateless Part)

2020-07-08 Thread Wei Zhong
Hi Shuiqiang,

Thanks for driving this. Big +1 for supporting DataStream API in PyFlink!

Best,
Wei


> On Jul 9, 2020, at 10:29, Hequn Cheng wrote:
> 
> +1 for adding the Python DataStream API and starting with the stateless
> part.
> There are already some users that expressed their wish to have the Python
> DataStream APIs. Once we have the APIs in PyFlink, we can cover more use
> cases for our users.
> 
> Best, Hequn
> 
> On Wed, Jul 8, 2020 at 11:45 AM Shuiqiang Chen  wrote:
> 
>> Sorry, the 3rd link is broken, please refer to this one: Support Python
>> DataStream API
>> <
>> https://docs.google.com/document/d/1H3hz8wuk22-8cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit
>>> 
>> 
>> Shuiqiang Chen wrote on Wed, Jul 8, 2020 at 11:13 AM:
>> 
>>> Hi everyone,
>>> 
>>> As we all know, Flink provides three layered APIs: the ProcessFunctions,
>>> the DataStream API and the SQL & Table API. Each API offers a different
>>> trade-off between conciseness and expressiveness and targets different
>> use
>>> cases[1].
>>> 
>>> Currently, the SQL & Table API has already been supported in PyFlink. The
>>> API provides relational operations as well as user-defined functions to
>>> provide convenience for users who are familiar with python and relational
>>> programming.
>>> 
>>> Meanwhile, the DataStream API and ProcessFunctions provide more generic
>>> APIs to implement stream processing applications. The ProcessFunctions
>>> expose time and state which are the fundamental building blocks for any
>>> kind of streaming application.
>>> To cover more use cases, we are planning to cover all these APIs in
>>> PyFlink.
>>> 
>>> In this discussion(FLIP-130), we propose to support the Python DataStream
>>> API for the stateless part. For more detail, please refer to the FLIP
>> wiki
>>> page here[2]. If interested in the stateful part, you can also take a
>>> look the design doc here[3] for which we are going to discuss in a
>> separate
>>> FLIP.
>>> 
>>> Any comments will be highly appreciated!
>>> 
>>> [1] https://flink.apache.org/flink-applications.html#layered-apis
>>> [2]
>>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298
>>> [3]
>>> 
>> https://docs.google.com/document/d/1H3hz8wuk228cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit?usp=sharing
>>> 
>>> Best,
>>> Shuiqiang
>>> 
>>> 
>>> 
>>> 
>> 



[jira] [Created] (FLINK-18463) Make the "input_types" parameter of the Python udf decorator optional

2020-07-02 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18463:
-

 Summary:  Make the "input_types" parameter of the Python udf 
decorator optional 
 Key: FLINK-18463
 URL: https://issues.apache.org/jira/browse/FLINK-18463
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


In the current design, it is not necessary for the input types of a Python UDF 
to be declared explicitly. For user convenience, we can make the "input_types" 
parameter of the Python UDF decorator optional.
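A minimal sketch of the current and the proposed usage, assuming the 1.11-era 
decorator signature:
{code:python}
from pyflink.table import DataTypes
from pyflink.table.udf import udf

# Current usage: input types must be declared explicitly.
@udf(input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()],
     result_type=DataTypes.BIGINT())
def add(i, j):
    return i + j

# Proposed usage: "input_types" becomes optional, so only the result
# type needs to be declared.
@udf(result_type=DataTypes.BIGINT())
def add_v2(i, j):
    return i + j
{code}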



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18460) If the Python UDFs are called before calling the Python Dependency Management API, the Python Dependency Management will not take effect.

2020-07-01 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18460:
-

 Summary: If the Python UDFs are called before calling the Python 
Dependency Management API, the Python Dependency Management will not take 
effect.
 Key: FLINK-18460
 URL: https://issues.apache.org/jira/browse/FLINK-18460
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.10.1, 1.11.0, 1.12.0
Reporter: Wei Zhong


When developing a PyFlink job, if the Python UDFs are called before specifying 
the Python dependencies, the Python dependencies will not take effect on the 
previously called Python UDFs.
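A minimal sketch of the problematic ordering, assuming an existing 
TableEnvironment `t_env` with a registered table "source" (the file path and 
module name are hypothetical):
{code:python}
from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(input_types=[DataTypes.STRING()], result_type=DataTypes.STRING())
def my_udf(s):
    import my_lib  # hypothetical module shipped via add_python_file
    return my_lib.transform(s)

t_env.register_function("my_udf", my_udf)
# Buggy order: the UDF is called before the dependency is declared, so
# my_lib.py does not take effect for this call.
t = t_env.from_path("source").select("my_udf(f0)")
t_env.add_python_file("/path/to/my_lib.py")  # too late for the call above

# Workaround: call add_python_file first, then register and call the UDFs.
{code}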



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18218) Add yarn perjob, java dependency management e2e tests for pyflink.

2020-06-09 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-18218:
-

 Summary: Add yarn perjob, java dependency management e2e tests for 
pyflink.
 Key: FLINK-18218
 URL: https://issues.apache.org/jira/browse/FLINK-18218
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently the PyFlink e2e tests only cover job submission in standalone session 
mode and do not test Java dependency management. We need to add e2e tests for 
YARN per-job submission and Java dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17881) Add documentation for PyFlink's Windows support

2020-05-22 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17881:
-

 Summary: Add documentation for PyFlink's Windows support
 Key: FLINK-17881
 URL: https://issues.apache.org/jira/browse/FLINK-17881
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently PyFlink already supports running on Windows in Flink 1.11. But as we 
dropped the bat scripts in Flink 1.11, submitting Python jobs on Windows is not 
supported. We should add documentation for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17876) Add documents for the Python UDF SQL DDL.

2020-05-22 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17876:
-

 Summary: Add documents for the Python UDF SQL DDL.
 Key: FLINK-17876
 URL: https://issues.apache.org/jira/browse/FLINK-17876
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently our docs only record how to create Java/Scala UDFs via SQL DDL. As 
creating Python UDFs via SQL DDL is supported in FLINK-106, we should add 
documentation for it.
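For example, the new document could contain a snippet roughly like the 
following (the module and function names are hypothetical; `t_env` is an 
existing TableEnvironment with a table "source"):
{code:python}
# Register a Python UDF via SQL DDL instead of the Table API.
t_env.execute_sql(
    "CREATE TEMPORARY FUNCTION py_upper AS 'my_udfs.upper_case' LANGUAGE PYTHON"
)
t_env.execute_sql("SELECT py_upper(f0) FROM source")
{code}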



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17866) InvalidPathException was thrown when running the test cases of PyFlink on Windows

2020-05-21 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17866:
-

 Summary: InvalidPathException was thrown when running the test 
cases of PyFlink on Windows
 Key: FLINK-17866
 URL: https://issues.apache.org/jira/browse/FLINK-17866
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Wei Zhong


When running test_dependency.py on Windows, the following exception was thrown:
{code:java}
Error
Traceback (most recent call last):
  File 
"C:\Users\zw144119\AppData\Local\Continuum\miniconda3\envs\py36\lib\unittest\case.py",
 line 59, in testPartExecutor
yield
  File 
"C:\Users\zw144119\AppData\Local\Continuum\miniconda3\envs\py36\lib\unittest\case.py",
 line 605, in run
testMethod()
  File "D:\flink\flink-python\pyflink\table\tests\test_dependency.py", line 55, 
in test_add_python_file
self.t_env.execute("test")
  File "D:\flink\flink-python\pyflink\table\table_environment.py", line 1049, 
in execute
return JobExecutionResult(self._j_tenv.execute(job_name))
  File 
"C:\Users\zw144119\AppData\Local\Continuum\miniconda3\envs\py36\lib\site-packages\py4j\java_gateway.py",
 line 1286, in __call__
answer, self.gateway_client, self.target_id, self.name)
  File "D:\flink\flink-python\pyflink\util\exceptions.py", line 147, in deco
return f(*a, **kw)
  File 
"C:\Users\zw144119\AppData\Local\Continuum\miniconda3\envs\py36\lib\site-packages\py4j\protocol.py",
 line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o4.execute.
: java.nio.file.InvalidPathException: Illegal char <:> at index 2: 
/C:/Users/zw144119/AppData/Local/Temp/tmp0x4273cg/python_file_dir_cfb9e8fe-2812-4a89-ae46-5dc3c844d62c/test_dependency_manage_lib.py
  at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
  at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
  at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
  at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
  at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
  at java.nio.file.Paths.get(Paths.java:84)
  at 
org.apache.flink.core.fs.local.LocalFileSystem.pathToFile(LocalFileSystem.java:314)
  at 
org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:110)
  at 
org.apache.flink.runtime.jobgraph.JobGraphUtils.addUserArtifactEntries(JobGraphUtils.java:52)
  at 
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:186)
  at 
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:109)
  at 
org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:850)
  at 
org.apache.flink.client.StreamGraphTranslator.translateToJobGraph(StreamGraphTranslator.java:52)
  at 
org.apache.flink.client.FlinkPipelineTranslationUtil.getJobGraph(FlinkPipelineTranslationUtil.java:43)
  at 
org.apache.flink.client.deployment.executors.PipelineExecutorUtils.getJobGraph(PipelineExecutorUtils.java:55)
  at 
org.apache.flink.client.deployment.executors.LocalExecutor.getJobGraph(LocalExecutor.java:98)
  at 
org.apache.flink.client.deployment.executors.LocalExecutor.execute(LocalExecutor.java:79)
  at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1786)
  at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1687)
  at 
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
  at 
org.apache.flink.table.planner.delegation.ExecutorBase.execute(ExecutorBase.java:52)
  at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1167)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
  at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
  at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
  at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
  at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
  at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
  at java.lang.Thread.run(Thread.java:748)
{code}
It seems the Windows-style path is not recognized by the "Paths.get()" method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17646) Reduce the python package size of PyFlink

2020-05-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17646:
-

 Summary: Reduce the python package size of PyFlink
 Key: FLINK-17646
 URL: https://issues.apache.org/jira/browse/FLINK-17646
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently the Python package size of PyFlink has increased to about 320MB, 
which exceeds the size limit of pypi.org (300MB). We need to remove unnecessary 
jars to reduce the package size.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.1, release candidate #3

2020-05-12 Thread Wei Zhong
+1 (non-binding)

- checked signatures
- the python package looks good
- the python shell looks good
- submitted a python job on local cluster

Best,
Wei

> On May 12, 2020, at 11:42, Zhu Zhu wrote:
> 
> +1 (non-binding)
> 
> - checked release notes
> - checked signatures
> - built from source
> - submitted an example job on yarn cluster
> - WebUI and logs look good
> 
> Thanks,
> Zhu Zhu
> 
> Leonard Xu wrote on Tue, May 12, 2020 at 11:10 AM:
> 
>> +1 (non-binding)
>> 
>> - checked/verified signatures and hashes
>> - checked the release note
>> - checked that there are no missing artifacts in staging area
>> - built from source using scala 2.11 succeeded
>> - started cluster and run some e2e test succeeded
>> - started a cluster, WebUI was accessible, submitted a wordcount job and
>> ran succeeded, no suspicious log output
>> - the web PR looks good
>> 
>> Best,
>> Leonard Xu
>> 
>>> On May 12, 2020, at 10:24, jincheng sun wrote:
>>> 
>>> +1(binding)
>>> 
>>> - built from source and run Streaming word count without unexpected
>>> information.
>>> - check the signatures, looks good!
>>> 
>>> BTW: I've added your PyPI account(carp84) as a maintainer role. Great job
>>> Yu!
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> Hequn Cheng wrote on Tue, May 12, 2020 at 9:49 AM:
>>> 
 +1 (binding)
 
 - Go through all commits from 1.10.0 to 1.10.1 and spot no new license
 problems.
 - Built from source archive successfully.
 - Travis e2e tests have passed.
 - Signatures and hash are correct.
 - Run SocketWindowWordCount on the local cluster and check web ui &
>> logs.
 - Install Python package and run Python WordCount example.
 - Website PR looks good.
 
 Best,
 Hequn
 
 On Mon, May 11, 2020 at 10:39 PM Ufuk Celebi  wrote:
 
> +1 (binding)
> 
> - checked release notes
> - verified sums and hashes
> - reviewed website PR
> - successfully built an internal Flink distribution based on the
 1.10.1-rc3
> commit
> - successfully built internal jobs against the staging repo and
>> deployed
> those jobs to a 1.10.1 job cluster on Kubernetes and tested
>> checkpointing
> 
> –  Ufuk
> 
> On Mon, May 11, 2020 at 11:47 AM Tzu-Li (Gordon) Tai <
 tzuli...@apache.org>
> wrote:
>> 
>> +1 (binding)
>> 
>> Legal checks:
>> - checksum & gpg OK
>> - changes to Kinesis connector NOTICE files looks good
>> - built from source
>> 
>> Downstream checks in flink-statefun:
>> - Built StateFun with Flink 1.10.1 + e2e tests enabled (mvn clean
 install
>> -Prun-e2e-tests), builds and passes
>> - Previous issue preventing successful failure recovery when using the
> new
>> scheduler, is now fixed with this RC
>> 
>> Cheers,
>> Gordon
>> 
>> On Mon, May 11, 2020 at 2:47 PM Congxian Qiu 
> wrote:
>> 
>>> +1 (no-binding)
>>> 
>>> - checksum & gpg ok
>>> - build from source OK
>>> - all pom files points to the same version OK
>>> - LICENSE OK
>>> - run demos OK
>>> Best,
>>> Congxian
>>> 
>>> 
>>> Dian Fu wrote on Sun, May 10, 2020 at 10:14 PM:
>>> 
 +1 (non-binding)
 
 - checked the dependency changes since 1.10.0
 1) kafka was bumped from 0.10.2.1 to 0.10.2.2 for
 flink-connector-kafka-0.10 and it has been reflected in the notice
> file
 2) amazon-kinesis-producer was bumped from 0.13.1 to 0.14.0 and
 it
> has
 been reflected in the notice file
 3) the dependencies com.carrotsearch:hppc,
 com.github.spullara.mustache.java,
> org.elasticsearch:elasticsearch-geo
>>> and
 org.elasticsearch.plugin:lang-mustache-client was bundled in the
 jar
> of
 flink-sql-connector-elasticsearch7 and it has been reflected in the
>>> notice
 file
 4) influxdb-java was bumped from 2.16 to 2.17 and it has been
> reflected
 in the notice file
 - verified the checksum and signature
 - checked that the PyFlink package could be pip installed
 - have left a few minor comments on the website PR
 
 Regards,
 Dian
 
 On Sat, May 9, 2020 at 12:02 PM Thomas Weise 
 wrote:
 
> Thanks for the RC!
> 
> +1 (binding)
> 
> - repeated benchmark runs
> 
> 
> On Fri, May 8, 2020 at 10:52 AM Robert Metzger <
> rmetz...@apache.org>
> wrote:
> 
>> Thanks a lot for creating another RC!
>> 
>> +1 (binding)
>> 
>> - checked diff to last RC:
>> 
>> 
> 
 
>>> 
> 
> 
 
>> https://github.com/apache/flink/compare/release-1.10.1-rc2...release-1.10.1-rc3
>> - kinesis dependency change is properly documented
>> - started Flink locally (on Java11 :) )
>>  - seems to be build off the specified commit

[jira] [Created] (FLINK-17628) Remove the unnecessary py4j log information

2020-05-12 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17628:
-

 Summary: Remove the unnecessary py4j log information
 Key: FLINK-17628
 URL: https://issues.apache.org/jira/browse/FLINK-17628
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently py4j prints INFO-level logging information to the console. It is 
unnecessary for users. We should set the level to WARN.
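A minimal sketch of the change, relying on py4j logging through the standard 
Python `logging` module under the "py4j" logger name:
{code:python}
import logging

# Raising the "py4j" logger level hides the INFO messages from the console.
logging.getLogger("py4j").setLevel(logging.WARNING)
{code}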



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17612) The Python command line options are ignored by the CliOptionsParser in SQL Client

2020-05-11 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-17612:
-

 Summary: The Python command line options are ignored by the 
CliOptionsParser in SQL Client
 Key: FLINK-17612
 URL: https://issues.apache.org/jira/browse/FLINK-17612
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Wei Zhong


It seems the CliOptionsParser in the SQL Client currently does not parse the 
Python command line options. We need to add parsing logic for the Python 
command line options.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

