Re: Question about the sql-client pyexec parameter not taking effect

2022-06-07 Thread Dian Fu
There are two parameters that specify the Python interpreter:
1) -pyexec specifies the path of the Python interpreter used to run Python UDFs during job execution.
2) -pyclientexec specifies the path of the Python interpreter used by the client when compiling the job. For details, see:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/dependency_management/#python-interpreter-of-client

You could try adding the -pyclientexec parameter as well.
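For example, a minimal sketch based on the startup command from your report (same paths as in your setup; pointing both flags at the same virtualenv interpreter):

./bin/sql-client.sh -pyfs file:///xxx/p.py \
    -pyexec /xxx/bin/python3 \
    -pyclientexec /xxx/bin/python3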

On Tue, Jun 7, 2022 at 11:24 AM RS  wrote:

> Hi,
>
>
> Environment:
> - flink-1.14.3, standalone single-machine cluster
> - The server defaults to python2; python3.6.8 is also installed
> - /xxx/bin/python3 is a virtual environment created from python3
>
>
> I am testing a PyFlink UDF with sql-client; a custom function f1 is defined in /xxx/p.py
> Startup command:
> ./bin/sql-client.sh -pyfs file:///xxx/p.py -pyexec /xxx/bin/python3
> The pyexec option specifies python3 as the Python interpreter to use
>
>
> Executing a statement fails with the following error:
> Flink SQL> create temporary function fun1 as 'p.f1' language python;
> [INFO] Execute statement succeed.
> Flink SQL> select fun1('a',1,'s');
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 151, in _run_module_as_main
> mod_name, loader, code, fname = _get_module_details(mod_name)
>   File "/usr/lib64/python2.7/runpy.py", line 101, in _get_module_details
> loader = get_loader(mod_name)
>   File "/usr/lib64/python2.7/pkgutil.py", line 464, in get_loader
> return find_loader(fullname)
>   File "/usr/lib64/python2.7/pkgutil.py", line 474, in find_loader
> for importer in iter_importers(fullname):
>   File "/usr/lib64/python2.7/pkgutil.py", line 430, in iter_importers
> __import__(pkg)
>   File "/home/flink-1.14.3/opt/python/pyflink.zip/pyflink/__init__.py",
> line 26, in 
> RuntimeError: Python versions prior to 3.6 are not supported for PyFlink
> [sys.version_info(major=2, minor=7, micro=5, releaselevel='final',
> serial=0)].
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.IllegalStateException: Instantiating python function 'p.f1'
> failed.
>
>
> The error shows that python2 is being used, not the python3 configured via the parameter. How can I make pyexec take effect?
>
>
> Thx


Re: [DISCUSS] Update Scala 2.12.7 to 2.12.15

2022-06-07 Thread Ran Tao
Hi, Martijn. I have asked the Scala community and will reply to you if there
is any response. However, I think you can vote on or discuss 2.12.15.
2.12.16 just backports some functionality from higher Scala versions, such as
the jvm-11 and jvm-17 targets, which Flink may need.
But as Chesnay says, we do not require the Flink Java & Scala target bytecode
versions to be consistent.

Martijn Visser wrote on Tue, Jun 7, 2022 at 17:04:

> Hi all,
>
> @Ran When I created this discussion thread, the latest available version of
> Scala 2.12 was 2.12.15. I don't see 2.12.16 being released yet, only
> discussed. I did read that they were planning it in the last couple of
> weeks but haven't seen any progress since. Do you know more about the
> timeline?
>
> I would also like to make a final call towards Scala users to provide their
> input in the next 72 hours. Else, I'll open up a voting thread to make the
> upgrade.
>
> Best regards,
>
> Martijn
>
> On Fri, May 20, 2022 at 14:10, Ran Tao wrote:
>
> > Got it. But I think the runtime java environment, e.g. a jdk11 env, may not
> > be able to optimize this lower-version Scala bytecode very well. However,
> > currently no direct report shows this problem. hah~
> >
> > Chesnay Schepler wrote on Fri, May 20, 2022 at 19:53:
> >
> > > It's not necessarily required that the scala byte code matches the
> > > version of the java byte code.
> > >
> > > By and large such inconsistencies are inevitable w.r.t. external
> > > libraries.
> > >
> > > On 20/05/2022 12:23, Ran Tao wrote:
> > > > Hi, Martijn. Even if we upgrade Scala to 2.12.15 to support Java 17, that
> > > > only fixes the compilation issue of FLINK-25000. There is another
> > > > problem [1]: Scala <= 2.13 can't generate jvm-11 or higher bytecode
> > > > versions, so the target class bytecode version in the Flink project is
> > > > inconsistent between the Scala and Java sources.
> > > > If we decide to upgrade the Scala version, how about Scala 2.12.16? As
> > > > far as I know from the Scala community, 2.12.16 will backport 2.13
> > > > functionality like the jvm-11 and jvm-17 target JVM class support.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-27549
> > > >
> > > > Martijn Visser wrote on Fri, May 20, 2022 at 16:37:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> I would like to get some opinions from our Scala users, therefore I'm also
> > > >> looping in the user mailing list.
> > > >>
> > > >> Flink currently is tied to Scala 2.12.7. As outlined in FLINK-12461 [1]
> > > >> there is a binary incompatibility introduced by Scala 2.12.8, which
> > > >> currently limits Flink from upgrading to a later Scala 2.12 version.
> > > >> According to the Scala 2.12.8 release notes, "The fix is not binary
> > > >> compatible: the 2.12.8 compiler omits certain methods that are generated by
> > > >> earlier 2.12 compilers. However, we believe that these methods are never
> > > >> used and existing compiled code will continue to work"
> > > >>
> > > >> We could still consider upgrading to a later Scala 2.12 version, the latest
> > > >> one currently being 2.12.15. Next to any benefits that are introduced in
> > > >> the newer Scala versions, it would also resolve a blocker for Flink to add
> > > >> support for Java 17 [2].
> > > >>
> > > >> My question to Scala users of Flink and others who have an opinion on this:
> > > >> * Has any of you already manually compiled Flink with Scala 2.12.8 or
> > > >> later?
> > > >> * If so, have you experienced any problems with checkpoint and/or savepoint
> > > >> incompatibility?
> > > >> * Would you prefer Flink breaking binary compatibility by upgrading to a
> > > >> later Scala 2.12 version or would you prefer Flink to stick with Scala
> > > >> 2.12.7?
> > > >>
> > > >> Note: I know there are also questions about Scala 2.13 and Scala 3 support
> > > >> in Flink; I think that deserves its own discussion thread.
> > > >>
> > > >> Best regards,
> > > >>
> > > >> Martijn Visser
> > > >> https://twitter.com/MartijnVisser82
> > > >> https://github.com/MartijnVisser
> > > >>
> > > >> [1] https://issues.apache.org/jira/browse/FLINK-12461
> > > >> [2] https://issues.apache.org/jira/browse/FLINK-25000
> > > >>
> > > >
> > >
> > > --
> > Best,
> > Ran Tao
> >
>


-- 
Best,
Ran Tao


Re: Flink config driven tool ?

2022-06-07 Thread Shengkai Fang
Hi.

I am not sure whether Flink SQL satisfies your requirement or not. You can
just write the SQL in a file and use the SQL Client to submit it to your
cluster. There is a quick start for Flink CDC that you can give a try [1].

Best,
Shengkai

[1]
https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html
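For illustration, a minimal sketch of that file-based style (the table schemas are made up; datagen and print are Flink's built-in test connectors):

-- job.sql: source --> transformation --> sink, declared in one file
CREATE TABLE src (id BIGINT, name STRING) WITH ('connector' = 'datagen');
CREATE TABLE dst (id BIGINT, name STRING) WITH ('connector' = 'print');
INSERT INTO dst SELECT id, UPPER(name) FROM src;

which can then be submitted with ./bin/sql-client.sh -f job.sql.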

sri hari kali charan Tummala wrote on Wed, Jun 8, 2022 at 04:49:

> Thanks, I'll check it out.
>
> On Tue, Jun 7, 2022, 1:31 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
>> They support Flink as well. Looks like they even support the new Flink
>> k8s operator.[1]
>>
>> Austin
>>
>> [1]:
>> https://seatunnel.apache.org/docs/2.1.1/start/kubernetes#deploying-the-operator
>>
>> On Tue, Jun 7, 2022 at 3:11 PM sri hari kali charan Tummala <
>> kali.tumm...@gmail.com> wrote:
>>
>>> thanks, but that looks like a Spark tool. Is there something similar in Flink?
>>>
>>> Thanks
>>> Sri
>>>
>>> On Tue, Jun 7, 2022 at 12:07 PM Austin Cawley-Edwards <
>>> austin.caw...@gmail.com> wrote:
>>>
 Hey there,

 No idea if it's any good, but just saw Apache SeaTunnel[1] today which
 seems to fit your requirements.

 Best,
 Austin

 [1]: https://seatunnel.apache.org/

 On Tue, Jun 7, 2022 at 2:19 PM sri hari kali charan Tummala <
 kali.tumm...@gmail.com> wrote:

> Hi Flink Community,
>
> can someone point me to GitHub repos of a good config-driven Flink data
> movement tool? Imagine I build my ETL DAG connecting source -->
> transformations --> target just using a config file.
>
> below are a few spark examples:-
> https://github.com/mvrpl/big-shipper
> https://github.com/BitwiseInc/Hydrograph
>
> Thanks & Regards
> Sri Tummala
>
>
>>>
>>> --
>>> Thanks & Regards
>>> Sri Tummala
>>>
>>>


Re: Re: Problems encountered while implementing the SupportsFilterPushDown interface

2022-06-07 Thread Shengkai Fang
Hi.

An issue about the Optional documentation was opened a while ago; you can claim
that issue directly [1]. Also, for errors in the Flink documentation, you can
refer to this PR and make the corresponding fix [2].

Best,
Shengkai

[1] https://issues.apache.org/jira/browse/FLINK-20760
[2] https://github.com/apache/flink/pull/19498
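For readers following the thread, here is a minimal sketch of the pattern discussed in the quoted advice below (the class name and method bodies are illustrative, not Flink's actual JdbcDynamicTableSource; a real source must return a working ScanRuntimeProvider instead of throwing):

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.table.connector.ChangelogMode;
import org.apache.flink.table.connector.source.DynamicTableSource;
import org.apache.flink.table.connector.source.ScanTableSource;
import org.apache.flink.table.connector.source.abilities.SupportsFilterPushDown;
import org.apache.flink.table.expressions.ResolvedExpression;

public class FilterAwareSource implements ScanTableSource, SupportsFilterPushDown {

    // Filled in by the planner during optimization; empty until applyFilters runs.
    private List<ResolvedExpression> pushedFilters = new ArrayList<>();

    @Override
    public Result applyFilters(List<ResolvedExpression> filters) {
        this.pushedFilters = new ArrayList<>(filters);
        // Report every filter as remaining so the runtime still applies them;
        // the source-side predicate is then a best-effort optimization.
        return Result.of(new ArrayList<>(filters), new ArrayList<>(filters));
    }

    @Override
    public ScanRuntimeProvider getScanRuntimeProvider(ScanContext context) {
        // Called again after pushdown (final physical-to-exec translation),
        // so the WHERE clause is only appended once filters are present.
        String where = pushedFilters.isEmpty() ? "" : buildWhereClause(pushedFilters);
        throw new UnsupportedOperationException("sketch only; build a provider using: " + where);
    }

    @Override
    public DynamicTableSource copy() {
        FilterAwareSource copy = new FilterAwareSource();
        copy.pushedFilters = this.pushedFilters; // copy() must carry the pushed filters over
        return copy;
    }

    @Override
    public ChangelogMode getChangelogMode() {
        return ChangelogMode.insertOnly();
    }

    @Override
    public String asSummaryString() {
        return "FilterAwareSource";
    }

    private String buildWhereClause(List<ResolvedExpression> filters) {
        return "<translate " + filters.size() + " expression(s) to SQL here>";
    }
}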

Xuyang wrote on Tue, Jun 7, 2022 at 23:46:

> Hi, you are very welcome to take part in building the community.
>
> The community has a complete contribution process; the documentation [1] gives an overview.
>
> In general it breaks down into:
>
> 1. Find a problem and file an issue in Jira [2]
>
> 2. Open a PR, stating in it the related issue number + module name + a short description; you can refer to other PRs for the format
>
> 3. Helpful community members will review it for you; you can also ping me (xuyang) under the issue you created
>
>
>
>
> [1] https://flink.apache.org/contributing/how-to-contribute.html
>
> [2]
> https://issues.apache.org/jira/projects/FLINK/issues/FLINK-27084?filter=allopenissues
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> On 2022-06-07 18:08:10, "朱育锋" wrote:
> >Hi
> >
> >Sorry for the late reply.
> >
> >1. Hi Xuyang, it is indeed as you said: after applyFilters is called, the
> >copy method is invoked to create a new JdbcDynamicTableSource object, and
> >getScanRuntimeProvider is called again on that new object. I have now
> >successfully implemented predicate pushdown. Many thanks!
> >2. Hi Shengkai, I would be very glad to contribute to the community. I have
> >never taken part in a GitHub open-source project before, so this is both an
> >opportunity and a challenge for me, and I am willing to challenge myself.
> >To prepare, I read through the contribution guide on the Flink website. When
> >I reached this section [1] and clicked the link in the text [2], it showed
> >"The requested URL was not found on this server"; the cause is a misspelled
> >link URL in the md document. I opened a hotfix [3], which is also a good
> >chance to get familiar with the contribution workflow. Could you please
> >help review it?
> >
> >[1] https://flink.apache.org/contributing/code-style-and-quality-common.html#nullability-of-the-mutable-parts
> >[2] "usage of Java Optional" https://flink.apache.org/contributing/code-style-and-quality-java.md#java-optional
> >[3] https://github.com/apache/flink-web/pull/544
> >
> >Best regards
> >YuFeng
> >
> >> On Jun 2, 2022, at 10:28, Shengkai Fang wrote:
> >>
> >> Hi.
> >>
> >> I recall that the JDBC connector implements ProjectionPushDown; you can use it as a reference.
> >>
> >> Xuyang is right: getScanRuntimeProvider happens after push down, so the
> >> problem you describe should not occur. Also, you could consider contributing this to the community [1]; we can help review it and help solve your problem.
> >>
> >> Best,
> >> Shengkai
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-19651
> >>
> >> Xuyang wrote on Wed, Jun 1, 2022 at 23:47:
> >>
> >>>
> >>>
> >>> Hi, at the end of SQL optimization, during the conversion of
> >>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
> >>> getScanRuntimeProvider is executed once, and at that point it is
> >>> guaranteed to be triggered after the applyFilters method.
> >>>
> >>> You can try recording filterFields in the JdbcDynamicTableSource class:
> >>> if the value is empty, there is no need to append it in
> >>> getScanRuntimeProvider (it is always empty before applyFilters has run);
> >>> when the value is non-empty, append it in getScanRuntimeProvider (by the
> >>> time the last physical node is converted to an exec node, applyFilters is
> >>> guaranteed to have been executed).
> >>>
> >>> [1]
> >>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>>Best!
> >>>Xuyang
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 2022-06-01 20:03:58, "朱育锋" wrote:
> >>>
> >>> Hi, at the end of SQL optimization, during the conversion of
> >>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
> >>> getScanRuntimeProvider is executed once, and at that point it is
> >>> guaranteed to be triggered after the applyFilters method. You can try
> >>> recording filterFields in the JdbcDynamicTableSource class: if the value
> >>> is empty, there is no need to append it in getScanRuntimeProvider (it is
> >>> always empty before applyFilters has run); when the value is non-empty,
> >>> append it in getScanRuntimeProvider (by the time the last physical node
> >>> is converted to an exec node, applyFilters is guaranteed to have been
> >>> executed). [1]
> >>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83
>


Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-07 Thread 陳昌倬
On Tue, Jun 07, 2022 at 03:26:43PM +0800, czc...@gmail.com wrote:
> On Mon, Jun 06, 2022 at 10:42:08AM +0800, Shengkai Fang wrote:
> > Hi. In my experience, the step to debug classloading problems are as
> > follows:
> 
> Thanks for the help. We get the following log when using
> `-verbose:class`:
> 
>   [2.074s][info][class,load] org.apache.hadoop.fs.FileSystem source: 
> file:/opt/flink/opt/flink-s3-fs-hadoop-1.15.0.jar
>   ...
>   [8.094s][info][class,load] 
> com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem source: 
> file:/opt/flink/opt/flink-gs-fs-hadoop-1.15.0.jar
> 
> 
> It looks like the application uses hadoop.fs.FileSystem from
> flink-s3-fs-hadoop-1.15.0.jar and GoogleHadoopFileSystem from
> flink-gs-fs-hadoop-1.15.0.jar, and they are incompatible. Since we run
> Flink in both AWS and GCP, our base image contains both plugins at the
> same time. Any idea how to work around it?
> 
> We also tried to set `classloader.resolve-order: parent-first`. However,
> we got another error caused by a library conflict between Flink and our
> application:
> 
>   Caused by: com.fasterxml.jackson.databind.JsonMappingException: Scala 
> module 2.11.3 requires Jackson Databind version >= 2.11.0 and < 2.12.0

We solved the problem by moving the plugins into the correct plugins directory.
Thanks for the help from Slack.
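For anyone hitting the same symptom: each filesystem plugin needs its own subdirectory under plugins/ so it gets an isolated plugin classloader, rather than the jars sitting together in opt/ or lib/. Roughly (directory names are illustrative):

mkdir -p /opt/flink/plugins/s3-fs-hadoop /opt/flink/plugins/gs-fs-hadoop
cp /opt/flink/opt/flink-s3-fs-hadoop-1.15.0.jar /opt/flink/plugins/s3-fs-hadoop/
cp /opt/flink/opt/flink-gs-fs-hadoop-1.15.0.jar /opt/flink/plugins/gs-fs-hadoop/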



-- 
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B




RE: Apache Flink - Rest API for num of records in/out

2022-06-07 Thread Hailu, Andreas
Hi M,

We had a similar requirement – we were able to solve for this by:

1.   Supply the operators we’re interested in acquiring metrics for through 
the various name() methods

2.   Use the jobid API [1] and find the operator we’ve named in the 
“vertices” array

[1] 
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/rest_api/#jobs-jobid
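A rough sketch of step 2, assuming an operator named "enrich-records" via name() (the jq paths follow the /jobs/<jobid> response, whose per-vertex "metrics" object carries read-records/write-records counters; verify the shape against your Flink version):

curl -s http://jobmanager:8081/jobs/<jobid> \
  | jq '.vertices[] | select(.name == "enrich-records") | {id, metrics}'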

ah

From: M Singh 
Sent: Tuesday, June 7, 2022 4:51 PM
To: User-Flink 
Subject: Apache Flink - Rest API for num of records in/out

Hi Folks:

I am trying to find out if I can get the number of records for an operator using
Flink's REST API. I've checked the docs at
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/rest_api/.

I did see some APIs that use a vertex id, but could not find how to get that info
without having the vertex ids.

I am using flink 1.14.4.

Can you please let me know how to get that ?

Thanks





Apache Flink - Rest API for num of records in/out

2022-06-07 Thread M Singh
Hi Folks:
I am trying to find out if I can get the number of records for an operator using
Flink's REST API. I've checked the docs at
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/rest_api/.

I did see some APIs that use a vertex id, but could not find how to get that info
without having the vertex ids.
I am using flink 1.14.4.
Can you please let me know how to get that?
Thanks

Re: Flink config driven tool ?

2022-06-07 Thread sri hari kali charan Tummala
Thanks, I'll check it out.

On Tue, Jun 7, 2022, 1:31 PM Austin Cawley-Edwards 
wrote:

> They support Flink as well. Looks like they even support the new Flink k8s
> operator.[1]
>
> Austin
>
> [1]:
> https://seatunnel.apache.org/docs/2.1.1/start/kubernetes#deploying-the-operator
>
> On Tue, Jun 7, 2022 at 3:11 PM sri hari kali charan Tummala <
> kali.tumm...@gmail.com> wrote:
>
>> thanks, but that looks like a Spark tool. Is there something similar in Flink?
>>
>> Thanks
>> Sri
>>
>> On Tue, Jun 7, 2022 at 12:07 PM Austin Cawley-Edwards <
>> austin.caw...@gmail.com> wrote:
>>
>>> Hey there,
>>>
>>> No idea if it's any good, but just saw Apache SeaTunnel[1] today which
>>> seems to fit your requirements.
>>>
>>> Best,
>>> Austin
>>>
>>> [1]: https://seatunnel.apache.org/
>>>
>>> On Tue, Jun 7, 2022 at 2:19 PM sri hari kali charan Tummala <
>>> kali.tumm...@gmail.com> wrote:
>>>
 Hi Flink Community,

 can someone point me to GitHub repos of a good config-driven Flink data
 movement tool? Imagine I build my ETL DAG connecting source -->
 transformations --> target just using a config file.

 below are a few spark examples:-
 https://github.com/mvrpl/big-shipper
 https://github.com/BitwiseInc/Hydrograph

 Thanks & Regards
 Sri Tummala


>>
>> --
>> Thanks & Regards
>> Sri Tummala
>>
>>


Re: Flink config driven tool ?

2022-06-07 Thread Austin Cawley-Edwards
They support Flink as well. Looks like they even support the new Flink k8s
operator.[1]

Austin

[1]:
https://seatunnel.apache.org/docs/2.1.1/start/kubernetes#deploying-the-operator

On Tue, Jun 7, 2022 at 3:11 PM sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:

> thanks, but that looks like a Spark tool. Is there something similar in Flink?
>
> Thanks
> Sri
>
> On Tue, Jun 7, 2022 at 12:07 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
>> Hey there,
>>
>> No idea if it's any good, but just saw Apache SeaTunnel[1] today which
>> seems to fit your requirements.
>>
>> Best,
>> Austin
>>
>> [1]: https://seatunnel.apache.org/
>>
>> On Tue, Jun 7, 2022 at 2:19 PM sri hari kali charan Tummala <
>> kali.tumm...@gmail.com> wrote:
>>
>>> Hi Flink Community,
>>>
>>> can someone point me to GitHub repos of a good config-driven Flink data
>>> movement tool? Imagine I build my ETL DAG connecting source -->
>>> transformations --> target just using a config file.
>>>
>>> below are a few spark examples:-
>>> https://github.com/mvrpl/big-shipper
>>> https://github.com/BitwiseInc/Hydrograph
>>>
>>> Thanks & Regards
>>> Sri Tummala
>>>
>>>
>
> --
> Thanks & Regards
> Sri Tummala
>
>


Re: Flink config driven tool ?

2022-06-07 Thread sri hari kali charan Tummala
thanks, but that looks like a Spark tool. Is there something similar in Flink?

Thanks
Sri

On Tue, Jun 7, 2022 at 12:07 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Hey there,
>
> No idea if it's any good, but just saw Apache SeaTunnel[1] today which
> seems to fit your requirements.
>
> Best,
> Austin
>
> [1]: https://seatunnel.apache.org/
>
> On Tue, Jun 7, 2022 at 2:19 PM sri hari kali charan Tummala <
> kali.tumm...@gmail.com> wrote:
>
>> Hi Flink Community,
>>
>> can someone point me to GitHub repos of a good config-driven Flink data
>> movement tool? Imagine I build my ETL DAG connecting source -->
>> transformations --> target just using a config file.
>>
>> below are a few spark examples:-
>> https://github.com/mvrpl/big-shipper
>> https://github.com/BitwiseInc/Hydrograph
>>
>> Thanks & Regards
>> Sri Tummala
>>
>>

-- 
Thanks & Regards
Sri Tummala


Re: Flink config driven tool ?

2022-06-07 Thread Austin Cawley-Edwards
Hey there,

No idea if it's any good, but just saw Apache SeaTunnel[1] today which
seems to fit your requirements.

Best,
Austin

[1]: https://seatunnel.apache.org/

On Tue, Jun 7, 2022 at 2:19 PM sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:

> Hi Flink Community,
>
> can someone point me to GitHub repos of a good config-driven Flink data
> movement tool? Imagine I build my ETL DAG connecting source -->
> transformations --> target just using a config file.
>
> below are a few spark examples:-
> https://github.com/mvrpl/big-shipper
> https://github.com/BitwiseInc/Hydrograph
>
> Thanks & Regards
> Sri Tummala
>
>


Flink config driven tool ?

2022-06-07 Thread sri hari kali charan Tummala
Hi Flink Community,

can someone point me to GitHub repos of a good config-driven Flink data
movement tool? Imagine I build my ETL DAG connecting source -->
transformations --> target just using a config file.

below are a few spark examples:-
https://github.com/mvrpl/big-shipper
https://github.com/BitwiseInc/Hydrograph

Thanks & Regards
Sri Tummala


Re: Re: Problems encountered while implementing the SupportsFilterPushDown interface

2022-06-07 Thread Xuyang
Hi, you are very welcome to take part in building the community.

The community has a complete contribution process; the documentation [1] gives an overview.

In general it breaks down into:

1. Find a problem and file an issue in Jira [2]

2. Open a PR, stating in it the related issue number + module name + a short description; you can refer to other PRs for the format

3. Helpful community members will review it for you; you can also ping me (xuyang) under the issue you created




[1] https://flink.apache.org/contributing/how-to-contribute.html

[2]
https://issues.apache.org/jira/projects/FLINK/issues/FLINK-27084?filter=allopenissues




--

Best!
Xuyang





On 2022-06-07 18:08:10, "朱育锋" wrote:
>Hi
>
>Sorry for the late reply.
>
>1. Hi Xuyang, it is indeed as you said: after applyFilters is called, the copy
>method is invoked to create a new JdbcDynamicTableSource object, and
>getScanRuntimeProvider is called again on that new object. I have now
>successfully implemented predicate pushdown. Many thanks!
>2. Hi Shengkai, I would be very glad to contribute to the community. I have
>never taken part in a GitHub open-source project before, so this is both an
>opportunity and a challenge for me, and I am willing to challenge myself.
>To prepare, I read through the contribution guide on the Flink website. When I
>reached this section [1] and clicked the link in the text [2], it showed "The
>requested URL was not found on this server"; the cause is a misspelled link
>URL in the md document. I opened a hotfix [3], which is also a good chance to
>get familiar with the contribution workflow. Could you please help review it?
>
>[1] https://flink.apache.org/contributing/code-style-and-quality-common.html#nullability-of-the-mutable-parts
>[2] "usage of Java Optional" https://flink.apache.org/contributing/code-style-and-quality-java.md#java-optional
>[3] https://github.com/apache/flink-web/pull/544
>
>Best regards
>YuFeng
>
> On Jun 2, 2022, at 10:28, Shengkai Fang wrote:
> 
> Hi.
> 
> I recall that the JDBC connector implements ProjectionPushDown; you can use it as a reference.
> 
> Xuyang is right: getScanRuntimeProvider happens after push down, so the
> problem you describe should not occur. Also, you could consider contributing this to the community [1]; we can help review it and help solve your problem.
>> 
>> Best,
>> Shengkai
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-19651
>> 
>> Xuyang wrote on Wed, Jun 1, 2022 at 23:47:
>> 
>>> 
>>> Hi, at the end of SQL optimization, during the conversion of
>>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
>>> getScanRuntimeProvider is executed once, and at that point it is guaranteed
>>> to be triggered after the applyFilters method.
>>> 
>>> You can try recording filterFields in the JdbcDynamicTableSource class: if
>>> the value is empty, there is no need to append it in getScanRuntimeProvider
>>> (it is always empty before applyFilters has run); when the value is
>>> non-empty, append it in getScanRuntimeProvider (by the time the last
>>> physical node is converted to an exec node, applyFilters is guaranteed to
>>> have been executed).
>>> 
>>> [1]
>>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83
>>> 
>>> 
>>> 
>>> 
>>> --
>>> 
>>>Best!
>>>Xuyang
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 2022-06-01 20:03:58, "朱育锋" wrote:
>>> 
>>> Hi, at the end of SQL optimization, during the conversion of
>>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
>>> getScanRuntimeProvider is executed once, and at that point it is guaranteed
>>> to be triggered after the applyFilters method. You can try recording
>>> filterFields in the JdbcDynamicTableSource class: if the value is empty,
>>> there is no need to append it in getScanRuntimeProvider (it is always empty
>>> before applyFilters has run); when the value is non-empty, append it in
>>> getScanRuntimeProvider (by the time the last physical node is converted to
>>> an exec node, applyFilters is guaranteed to have been executed). [1]
>>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83


Re: Needed help with skipping savepoint state (unsure how to set --allowNonRestoredState in Docker)

2022-06-07 Thread Chesnay Schepler
You are on the right path with using the --allowNonRestoredState flag; 
we'll just have to find the right place to put it w.r.t. your setup.


Which docker images are you using (flink/statefun/something custom), and 
how do you submit the job?
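For reference, depending on how the job is started, the flag typically lands in one of two places (a sketch; the paths and class name are placeholders):

# CLI submission:
flink run -s <savepoint-path> --allowNonRestoredState my-job.jar

# Standalone/application-mode docker images, as arguments to the entrypoint:
standalone-job --job-classname com.example.MyJob \
    --fromSavepoint <savepoint-path> --allowNonRestoredState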


On 03/06/2022 01:17, Bhavani Balasubramanyam wrote:

Hi,

I am Bhavani, a Software Engineer at Reddit. I'm trying to upgrade the
Flink version in my application from 3.0.0 to version 3.2.0, and in the
process I see the below error, where the operator has been removed,
and the checkpoint is unable to recover:


- Jun 2 14:32:03
snooron-worker-perspective-flink-staging-statefun-master-5f8hzd master
ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal
error occurred in the cluster entrypoint.
org.apache.flink.util.FlinkException: JobMaster for job
 failed. at

org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:913)
~[flink-dist_2.12-1.14.3.jar:1.14.3] at

org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:473)
~[flink-dist_2.12-1.14.3.jar:1.14.3] at

org.apache.flink.runtime.dispatcher.Dispatcher.handleJobManagerRunnerResult(Dispatcher.java:450)
~[flink-dist_2.12-1.14.3.jar:1.14.3] at

org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$3(Dispatcher.java:427)
~[flink-dist_2.12-1.14.3.jar:1.14.3] at
java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) ~[?:?] at
java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
~[?:?] at java.util.concurrent.CompletableFuture$Completion.run(Unknown
Source) ~[?:?] at

org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:455)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at

org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at

org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:455)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at

org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at

org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at

org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)
~[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.actor.Actor.aroundReceive(Actor.scala:537)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.actor.Actor.aroundReceive$(Actor.scala:535)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.actor.ActorCell.invoke(ActorCell.scala:548)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.dispatch.Mailbox.run(Mailbox.scala:231)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
akka.dispatch.Mailbox.exec(Mailbox.scala:243)
[flink-rpc-akka_0833457a-e96a-488d-b0fd-a4aafb20352d.jar:1.14.3] at
java.util.concurrent.ForkJoinTask.doExec(Unknown Source) [?:?] at

TaskManager exits after Flink has been running for a while, with OutOfMemoryError: Metaspace

2022-06-07 Thread weishishuo...@163.com
The versions I am using:
flink: 1.13.2
flink cdc: flink-connector-jdbc_2.11-1.13.2.jar
flink-sql-connector-mysql-cdc-2.2.0.jar
flink-sql-connector-postgres-cdc-2.2.0.jar

The jobs are fairly simple: they just sync data from MySQL/PG to PG/MySQL through the SQL API. Has anyone run into this problem?

2022-06-07 18:13:59,393 ERROR 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner  [] - Fatal error 
occurred while executing the TaskManager. Shutting it down...
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error has 
occurred. This can mean two things: either the job requires a larger size of 
JVM metaspace to load classes or there is a class loading leak. In the first 
case 'taskmanager.memory.jvm-metaspace.size' configuration option should be 
increased. If the error persists (usually in cluster after several job 
(re-)submissions) then there is probably a class loading leak in user code or 
some of its dependencies which has to be investigated and fixed. The task 
executor has to be shutdown...
at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0_112]
at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[?:1.8.0_112]
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) 
~[?:1.8.0_112]
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) 
~[?:1.8.0_112]
at java.net.URLClassLoader.access$100(URLClassLoader.java:73) ~[?:1.8.0_112]
at java.net.URLClassLoader$1.run(URLClassLoader.java:368) ~[?:1.8.0_112]
at java.net.URLClassLoader$1.run(URLClassLoader.java:362) ~[?:1.8.0_112]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
at java.net.URLClassLoader.findClass(URLClassLoader.java:361) ~[?:1.8.0_112]
at 
org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:71)
 ~[flink-dist_2.11-1.13.2.jar:1.13.2]
at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:48)
 [flink-dist_2.11-1.13.2.jar:1.13.2]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) [?:1.8.0_112]
at io.debezium.relational.Column.editor(Column.java:31) 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at 
io.debezium.connector.postgresql.connection.PostgresConnection.readTableColumn(PostgresConnection.java:464)
 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at 
io.debezium.jdbc.JdbcConnection.getColumnsDetails(JdbcConnection.java:1226) 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at io.debezium.jdbc.JdbcConnection.readSchema(JdbcConnection.java:1182) 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at 
io.debezium.connector.postgresql.PostgresSchema.refresh(PostgresSchema.java:100)
 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at 
io.debezium.connector.postgresql.PostgresSnapshotChangeEventSource.connectionCreated(PostgresSnapshotChangeEventSource.java:95)
 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
at 
io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:103)
 
[blob_p-343be43b7874de49f6ce1d8bcb6a90a384203530-2e5afb6f8bf4834164c1bb92aaf97a00:2.2.0]
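(The error text above names the relevant knob: if the metaspace is simply undersized rather than leaking, a first step is to raise the limit in flink-conf.yaml, e.g. with an illustrative value:

taskmanager.memory.jvm-metaspace.size: 512m

If the error returns after several job re-submissions, that points to the class-loading-leak case instead.)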




weishishuo...@163.com


Re: Problems encountered while implementing the SupportsFilterPushDown interface

2022-06-07 Thread 朱育锋
Hi

Sorry for the late reply.

1. Hi Xuyang, it is indeed as you said: after applyFilters is called, the copy
method is invoked to create a new JdbcDynamicTableSource object, and
getScanRuntimeProvider is called again on that new object. I have now
successfully implemented predicate pushdown. Many thanks!
2. Hi Shengkai, I would be very glad to contribute to the community. I have
never taken part in a GitHub open-source project before, so this is both an
opportunity and a challenge for me, and I am willing to challenge myself.
To prepare, I read through the contribution guide on the Flink website. When I
reached this section [1] and clicked the link in the text [2], it showed "The
requested URL was not found on this server"; the cause is a misspelled link URL
in the md document. I opened a hotfix [3], which is also a good chance to get
familiar with the contribution workflow. Could you please help review it?

[1] https://flink.apache.org/contributing/code-style-and-quality-common.html#nullability-of-the-mutable-parts
[2] "usage of Java Optional" https://flink.apache.org/contributing/code-style-and-quality-java.md#java-optional
[3] https://github.com/apache/flink-web/pull/544

Best regards
YuFeng

> On Jun 2, 2022, at 10:28, Shengkai Fang wrote:
> 
> Hi.
> 
> I recall that the JDBC connector implements ProjectionPushDown; you can use it as a reference.
> 
> Xuyang is right: getScanRuntimeProvider happens after push down, so the
> problem you describe should not occur. Also, you could consider contributing this to the community [1]; we can help review it and help solve your problem.
> 
> Best,
> Shengkai
> 
> [1] https://issues.apache.org/jira/browse/FLINK-19651
> 
> Xuyang wrote on Wed, Jun 1, 2022 at 23:47:
> 
>> 
>> Hi, at the end of SQL optimization, during the conversion of
>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
>> getScanRuntimeProvider is executed once, and at that point it is guaranteed
>> to be triggered after the applyFilters method.
>> 
>> You can try recording filterFields in the JdbcDynamicTableSource class: if
>> the value is empty, there is no need to append it in getScanRuntimeProvider
>> (it is always empty before applyFilters has run); when the value is
>> non-empty, append it in getScanRuntimeProvider (by the time the last
>> physical node is converted to an exec node, applyFilters is guaranteed to
>> have been executed).
>> 
>> [1]
>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83
>> 
>> 
>> 
>> 
>> --
>> 
>>Best!
>>Xuyang
>> 
>> 
>> 
>> 
>> 
>> On 2022-06-01 20:03:58, "朱育锋" wrote:
>> 
>> Hi, at the end of SQL optimization, during the conversion of
>> StreamPhysicalTableSourceScan into StreamExecTableSourceScan [1],
>> getScanRuntimeProvider is executed once, and at that point it is guaranteed
>> to be triggered after the applyFilters method. You can try recording
>> filterFields in the JdbcDynamicTableSource class: if the value is empty,
>> there is no need to append it in getScanRuntimeProvider (it is always empty
>> before applyFilters has run); when the value is non-empty, append it in
>> getScanRuntimeProvider (by the time the last physical node is converted to
>> an exec node, applyFilters is guaranteed to have been executed). [1]
>> https://github.com/apache/flink/blob/98997ea37ba08eae0f9aa6dd34823238097d8e0d/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java#L83



Re: Flink source Code Explanation

2022-06-07 Thread Martijn Visser
Hi,

The documentation links are working as expected.

Best regards,

Martijn

On Mon, Jun 6, 2022 at 04:32, sri hari kali charan Tummala <
kali.tumm...@gmail.com>:

> I am getting a connection timed out error in Firefox and Google Chrome. Can
> you double-check whether the web link is working or not?
>
> Thanks
> Sri
>
>
>
>
> On Sun, Jun 5, 2022 at 7:01 PM Jing Ge  wrote:
>
>> Hi Sri,
>>
>> Flink is very well documented. You can find it under e.g.
>> https://nightlies.apache.org/flink/flink-docs-master/
>>
>> Best regards,
>> Jing
>>
>> On Mon, Jun 6, 2022 at 3:39 AM sri hari kali charan Tummala <
>> kali.tumm...@gmail.com> wrote:
>>
>>> Hi Flink Community,
>>>
>>> I want to go through the Flink source code in my free time. Is there a
>>> document I can read that explains where to start? Other than the Javadoc,
>>> is there anything else to start my reverse engineering?
>>>
>>> Thanks & Regards
>>> Sri Tummala
>>>
>>>
>
> --
> Thanks & Regards
> Sri Tummala
>
>


Re: Add me to slack

2022-06-07 Thread Martijn Visser
Hi everyone,

The link at https://flink.apache.org/community.html#slack is usually your
best way to get in; it's updated whenever the 100 sign-up limit has been
reached.

Best regards,

Martijn

On Mon, Jun 6, 2022 at 07:39, Chengxuan Wang wrote:

> Hi Jing,
> Could you also send the invite to me? wcxz...@gmail.com
>
> Thanks,
> Chengxuan
>
> Zain Haider Nemati wrote on Sun, Jun 5, 2022 at 20:03:
>
>> Hi Jing,
>> Could you also send the invite to me?
>> zain.hai...@retailo.co
>>
>>
>> On Mon, 6 Jun 2022 at 7:04 AM Jing Ge  wrote:
>>
>>> Hi Xiao,
>>>
>>> Just done, please check. Thanks!
>>>
>>> Best regards,
>>> Jing
>>>
>>>
>>> On Mon, Jun 6, 2022 at 3:59 AM Xiao Ma  wrote:
>>>
 Hi Jing,

 Could you please add me to the slack channel also?

 Thank you.


 Best,
 Mark Ma

 On Sun, Jun 5, 2022 at 9:57 PM Jing Ge  wrote:

> Hi Raghunadh,
>
> Just did, please check your email. Thanks!
>
> Best regards,
> Jing
>
> On Mon, Jun 6, 2022 at 3:51 AM Raghunadh Nittala <
> raghunitt...@gmail.com> wrote:
>
>> Team, Kindly add me to the slack channel.
>>
>> Best Regards.
>>
> --
 Xiao Ma
 Geotab
 Software Developer, Data Engineering | B.Sc, M.Sc
 Direct +1 (416) 836 - 3541
 Toll-free  +1 (877) 436 - 8221
 Visit   www.geotab.com
 Twitter | Facebook | YouTube | LinkedIn

>>>


Re: [DISCUSS] Update Scala 2.12.7 to 2.12.15

2022-06-07 Thread Martijn Visser
Hi all,

@Ran When I created this discussion thread, the latest available version of
Scala 2.12 was 2.12.15. I don't see 2.12.16 being released yet, only
discussed. I did read that they were planning it in the last couple of
weeks but haven't seen any progress since. Do you know more about the
timeline?

I would also like to make a final call towards Scala users to provide their
input in the next 72 hours. Else, I'll open up a voting thread to make the
upgrade.

Best regards,

Martijn

On Fri, May 20, 2022 at 14:10, Ran Tao wrote:

> Got it. But I think the runtime java environment, e.g. a jdk11 env, may not be
> able to optimize this lower-version Scala bytecode very well. However,
> currently no direct report shows this problem. hah~
>
> Chesnay Schepler wrote on Fri, May 20, 2022 at 19:53:
>
> > It's not necessarily required that the scala byte code matches the
> > version of the java byte code.
> >
> > By and large such inconsistencies are inevitable w.r.t. external
> libraries.
> >
> > On 20/05/2022 12:23, Ran Tao wrote:
> > > Hi, Martijn. Even if we upgrade Scala to 2.12.15 to support Java 17, that
> > > only fixes the compilation issue of FLINK-25000. There is another
> > > problem [1]: Scala <= 2.13 can't generate jvm-11 or higher bytecode
> > > versions, so the target class bytecode version in the Flink project is
> > > inconsistent between the Scala and Java sources.
> > > If we decide to upgrade the Scala version, how about Scala 2.12.16? As far
> > > as I know from the Scala community, 2.12.16 will backport 2.13
> > > functionality like the jvm-11 and jvm-17 target JVM class support.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-27549
> > >
> > > Martijn Visser wrote on Fri, May 20, 2022 at 16:37:
> > >
> > >> Hi everyone,
> > >>
> > >> I would like to get some opinions from our Scala users, therefore I'm also
> > >> looping in the user mailing list.
> > >>
> > >> Flink currently is tied to Scala 2.12.7. As outlined in FLINK-12461 [1]
> > >> there is a binary incompatibility introduced by Scala 2.12.8, which
> > >> currently limits Flink from upgrading to a later Scala 2.12 version.
> > >> According to the Scala 2.12.8 release notes, "The fix is not binary
> > >> compatible: the 2.12.8 compiler omits certain methods that are generated by
> > >> earlier 2.12 compilers. However, we believe that these methods are never
> > >> used and existing compiled code will continue to work"
> > >>
> > >> We could still consider upgrading to a later Scala 2.12 version, the latest
> > >> one currently being 2.12.15. Next to any benefits that are introduced in
> > >> the newer Scala versions, it would also resolve a blocker for Flink to add
> > >> support for Java 17 [2].
> > >>
> > >> My question to Scala users of Flink and others who have an opinion on this:
> > >> * Has any of you already manually compiled Flink with Scala 2.12.8 or
> > >> later?
> > >> * If so, have you experienced any problems with checkpoint and/or savepoint
> > >> incompatibility?
> > >> * Would you prefer Flink breaking binary compatibility by upgrading to a
> > >> later Scala 2.12 version or would you prefer Flink to stick with Scala
> > >> 2.12.7?
> > >>
> > >> Note: I know there are also questions about Scala 2.13 and Scala 3 support
> > >> in Flink; I think that deserves its own discussion thread.
> > >>
> > >> Best regards,
> > >>
> > >> Martijn Visser
> > >> https://twitter.com/MartijnVisser82
> > >> https://github.com/MartijnVisser
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-12461
> > >> [2] https://issues.apache.org/jira/browse/FLINK-25000
> > >>
> > >
> >
> > --
> Best,
> Ran Tao
>


Re: Unable to retrieve savepoint status from non-leader/standby in HA with Flink 1.15

2022-06-07 Thread Chesnay Schepler

I think your analysis is correct; I'll file a ticket.

On 03/06/2022 15:28, Nick Birnberg wrote:

Hello everyone!

Our current setup has us running Flink on Kubernetes in HA mode 
(Zookeeper) with multiple JobManagers. This appears to be a regression 
from 1.14.


We can use the flink CLI to communicate with the REST API to reproduce
this. We directly target a standby JobManager (by using `kubectl
port-forward $STANDBY_JM 8081`) and then run `flink savepoint -m
localhost:8081 $JOB_ID`. This command triggers the savepoint via the
REST API and polls for it using the triggerId.


Relevant stack trace:

org.apache.flink.runtime.rest.util.RestClientException: 
[org.apache.flink.runtime.rest.handler.RestHandlerException: Internal 
server error while retrieving status of savepoint operation with 
triggerId=10e6bb05749f572cf4ee5eee9b4959c7 for job 
488f4846310e2763dd1c338d7d7f55bb.
at 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.createInternalServerError(SavepointHandlers.java:352)
at 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers.access$000(SavepointHandlers.java:115)
at 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers$SavepointStatusHandler.lambda$null$0(SavepointHandlers.java:311)

...
Caused by: 
org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed 
to serialize the result for RPC call : getTriggeredSavepointStatus.
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:405)

...
Caused by: java.io.NotSerializableException: 
org.apache.flink.runtime.rest.handler.async.OperationResult
at 
java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
at 
java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
at 
org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:388)

... 30 more
]
at 
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:532)


The savepoint itself is successful and this is not a problem if we 
target the leader JobManager. This seems similar to 
https://issues.apache.org/jira/browse/FLINK-26779 and I would think 
that the solution would be to 
have org.apache.flink.runtime.rest.handler.async.OperationResult 
implement Serializable, but I wanted a quick sanity check to make sure 
this is reproducible outside of our environment before moving forward.


Thank you!





Re: Add me to slack

2022-06-07 Thread Márton Balassi
Seems you have been added or found the link since then. :-)

https://flink.apache.org/community.html#slack

On Tue, Jun 7, 2022 at 12:28 AM Sahil Aulakh 
wrote:

> Hi
>
> Please add me to the slack group.
>
> Thanks
> Sahil Aulakh
>


Re: Slow tests on Flink 1.15

2022-06-07 Thread Chesnay Schepler
Can you give us a more complete stacktrace so we can see what call in 
Flink is waiting for something?


Does this happen to all of your tests?
Can you provide us with an example that we can try ourselves? If not, 
can you describe the test structure (e.g., is it using a 
MiniClusterResource)?
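(For context, the structure being asked about typically looks like the sketch below, using the JUnit 4 rule from Flink's flink-test-utils; the class name and numbers are illustrative:)

import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;

public class MyJobIT {
    // One shared local Flink mini cluster for every test in this class.
    @ClassRule
    public static final MiniClusterWithClientResource FLINK_CLUSTER =
            new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());
}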


On 02/06/2022 11:48, Lasse Nedergaard wrote:

Hi.

Just tried to upgrade from 1.14.2 to 1.15.0. It went well and our jobs run as
expected.

We have a number of tests that exercise the entire job, so we mock input and
output and start our job with mocked data. After upgrading, a simple test now
takes minutes where it previously took less than a minute.
If I run a test in debug, data are processed right away, but the job is stuck in
the park method of the jdk.internal.misc.Unsafe object in Java 11, called from
StreamExecutionEnvironment.execute (job client mini cluster).

Any idea why this happens and what I’m missing?

Med venlig hilsen / Best regards
Lasse Nedergaard





Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-07 Thread czchen
On Mon, Jun 06, 2022 at 10:42:08AM +0800, Shengkai Fang wrote:
> Hi. In my experience, the step to debug classloading problems are as
> follows:

Thanks for the help. We get the following log when using
`-verbose:class`:

  [2.074s][info][class,load] org.apache.hadoop.fs.FileSystem source: 
file:/opt/flink/opt/flink-s3-fs-hadoop-1.15.0.jar
  ...
  [8.094s][info][class,load] 
com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem source: 
file:/opt/flink/opt/flink-gs-fs-hadoop-1.15.0.jar


It looks like the application uses hadoop.fs.FileSystem from
flink-s3-fs-hadoop-1.15.0.jar and GoogleHadoopFileSystem from
flink-gs-fs-hadoop-1.15.0.jar, and they are incompatible. Since we run
Flink in both AWS and GCP, our base image contains both plugins at the
same time. Any idea how to work around it?

We also tried to set `classloader.resolve-order: parent-first`. However,
we got another error caused by a library conflict between Flink and our
application:

  Caused by: com.fasterxml.jackson.databind.JsonMappingException: Scala module 
2.11.3 requires Jackson Databind version >= 2.11.0 and < 2.12.0

-- 
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B

