Re: Reducing parallelism when restoring from a savepoint

2019-10-09, by Congxian Qiu
Hi, since Flink 1.2 the parallelism can be changed when restoring; see the official documentation [1]. Note, however, that the max parallelism must not change, and that if the max parallelism is not specified explicitly it is computed from the parallelism [2].

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/savepoints.html#what-happens-when-i-change-the-parallelism-of-my-program-when-restoring
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/production_ready.html#set-an-explicit-max-parallelism
Best,
Congxian


陈赋赟 wrote on Thu, Oct 10, 2019 at 11:43 AM:

>
> Due to resource constraints, I want to take a savepoint of a running job and then restart it from that savepoint with a lower parallelism. I searched the documentation but found no description of reducing the parallelism, so I would like to ask whether this is feasible.
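
For illustration, the workflow discussed above can be sketched with the Flink CLI (the job ID, savepoint path, and jar name are placeholders):

```
# Trigger a savepoint for the running job.
flink savepoint <jobId>

# Cancel the job (alternatively, "flink cancel -s <jobId>" takes a
# savepoint and cancels in one step).
flink cancel <jobId>

# Resume from the savepoint with a lower parallelism via -p.
# The max parallelism must stay the same as in the original job.
flink run -s <savepointPath> -p 2 your-job.jar
```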


Reducing parallelism when restoring from a savepoint

2019-10-09, by 陈赋赟
Due to resource constraints, I want to take a savepoint of a running job and then restart it from that savepoint with a lower parallelism. I searched the documentation but found no description of reducing the parallelism, so I would like to ask whether this is feasible.

Re: [DISCUSS] Drop Python 2 support for 1.10

2019-10-09, by Dian Fu
Thanks everyone for your replies.

So far all the replies tend toward option 1 (dropping Python 2 support in 1.10);
I will continue to listen for any other opinions.

@Jincheng @Hequn, you are right, things become more complicated if dropping
Python 2 support is performed after Python UDFs have been supported. Users would
have to migrate their Python UDFs if they have used features which are only
supported in Python 2.

Thanks @Yu for your suggestion. It makes a lot of sense to me and I will do that.
Also CC'ing the @user and @user-zh MLs in case any users are concerned about this.

Thanks,
Dian

> On Oct 9, 2019, at 1:14 PM, Yu Li wrote:
> 
> Thanks for bringing this up Dian.
> 
> Since Python 2.7 support was added in 1.9.0 and it would reach EOL near the
> planned release time for 1.10, I can see a good reason to take option 1.
> 
> Please remember to add an explicit release note, and it would be better to send
> a notification to the user ML about the plan to drop it, just in case some
> 1.9.0 users are already using Python 2.7 in their production env.
> 
> Best Regards,
> Yu
> 
> 
> On Wed, 9 Oct 2019 at 11:13, Jeff Zhang  wrote:
> 
>> +1
>> 
>> Hequn Cheng wrote on Wed, Oct 9, 2019 at 11:07 AM:
>> 
>>> Hi Dian,
>>> 
>>> +1 to drop Python 2 directly.
>>> 
>>> Just as @jincheng said, things would be more complicated if we are going
>>> to support Python UDFs.
>>> The Python UDFs will introduce a lot of Python dependencies, such as Beam,
>>> pandas, and pyarrow, which are also dropping support for Python 2.
>>> Given this, and that Python 2 will reach EOL on Jan 1, 2020, I think we
>>> can drop Python 2 in Flink as well.
>>> 
>>> As for the two options, I think we can drop it directly in 1.10. The
>>> flink-python module was only introduced in 1.9, so I think it's safe to
>>> drop it now.
>>> And we can also benefit from it when we add support for Python UDFs.
>>> 
>>> Best, Hequn
>>> 
>>> 
>>> On Wed, Oct 9, 2019 at 8:40 AM jincheng sun wrote:
>>> 
Hi Dian,

Thanks for bringing up this discussion!

In Flink 1.9 we only added the Python Table API mapping to the Java Table API
(without Python UDFs), so there were no special requirements on the Python
version and we added Python 2.7 support. But in Flink 1.10 we add Python UDF
support, i.e., users will write more Python code in their Flink jobs and rely
more on the features of the Python language. So I think it's better to follow
the rhythm of the official Python releases.

Option 2 is the most conservative and correct approach, but in the current
situation we cooperate with the Beam community and use Beam's portability
framework for UDF support, so we prefer option 1.

Best,
Jincheng

Dian Fu wrote on Tue, Oct 8, 2019 at 10:34 PM:
 
> Hi everyone,
> 
> I would like to propose to drop Python 2 support (currently Python 2.7,
> 3.5, 3.6, and 3.7 are all supported in Flink) as it is coming to an end on
> Jan 1, 2020 [1]. A lot of projects [2][3][4] have already stated that they
> are dropping or are planning to drop Python 2 support.
> 
> The benefits of dropping Python 2 support are:
> 1. Maintaining Python 2/3 compatibility is a burden and it makes the code
> complicated, as Python 2 and Python 3 are not compatible.
> 2. There are many features which are only available in Python 3.x, such as
> Type Hints [5]. We can only make use of these kinds of features after
> dropping Python 2 support.
> 3. flink-python depends on third-party projects such as Apache Beam (and
> may add more dependencies such as pandas in the near future); it will not
> be possible to upgrade them to their latest versions once they drop
> Python 2 support.
> 
> Here are the options we have:
> 1. Drop Python 2 support in 1.10:
> As flink-python is a new module added in 1.9.0, dropping Python 2 support
> at this early stage seems a good choice for us.
> 2. Deprecate Python 2 in 1.10 and drop its support in 1.11:
> As 1.10 is planned to be released around the beginning of 2020, this is
> also aligned with the official Python 2 timeline.
> 
> Personally I prefer option 1, as flink-python is a new module and there
> are not many historical reasons to consider.
> 
> Looking forward to your feedback!
> 
> Regards,
> Dian
> 
> [1] https://pythonclock.org/
> [2] https://python3statement.org/
> [3] https://spark.apache.org/news/plan-for-dropping-python-2-support.html
> [4] https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
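
For illustration, the Type Hints feature mentioned in the proposal above is Python 3-only; a minimal sketch of what it enables (the function is hypothetical, not from Flink):

```python
from typing import List, Optional

# Python 3-only: parameter and return annotations with typing generics.
# Python 2 cannot parse this syntax at all.
def find_first(values: List[int], target: int) -> Optional[int]:
    """Return the index of target in values, or None if absent."""
    for i, v in enumerate(values):
        if v == target:
            return i
    return None

print(find_first([3, 5, 7], 5))  # → 1
print(find_first([3, 5, 7], 9))  # → None
```

Static checkers such as mypy can use these annotations to catch type errors before runtime, which is one motivation for moving to Python 3-only code.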

Re: How to write stream data to other Hadoop Cluster by StreamingFileSink

2019-10-09, by Jun Zhang
Hi Yang,
Thank you very much for your reply.


I had added the configuration on my Hadoop cluster client; both hdfs-site.xml
and core-site.xml are configured, and the client can read mycluster1 and
mycluster2. But when I submit the Flink job to the YARN cluster, the Hadoop
client configuration is not picked up; reading the source code, I found that
priority is given to the configuration of the Hadoop cluster itself.
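
For reference, when one client configuration must reach a second HDFS cluster, hdfs-site.xml typically declares both HA nameservices along the lines of the following sketch (the namenode hostnames, ports, and the second cluster's layout are assumptions for illustration):

```xml
<!-- hdfs-site.xml: declare both nameservices so a single client
     configuration can resolve paths on either cluster.
     Hostnames below are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster1,mycluster2</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster2.nn1</name>
  <value>namenode1.cluster2.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster2.nn2</name>
  <value>namenode2.cluster2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster2</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```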

On 10/9/2019 10:57, Yang Wang wrote: