yarn application -kill application_1565677682535_0431: will the sink be closed?

2019-09-08 Thread 646208563
A Flink job running on YARN writes to HDFS through a sink. If the job is stopped with
yarn application -kill application_1565677682535_0431, will the sink's close() method
still be called?

Kafka and exactly-once

2019-09-08 Thread Jimmy Wong
Hi all,
A question: my checkpoint interval is set to 5 minutes. If a task fails within that
5-minute window and is restarted, am I right that the Kafka offsets restored from the
checkpoint are those from 5 minutes ago, while the messages consumed during those 5
minutes have already flowed downstream? After the restart the source replays, so will
that 5 minutes of data be consumed again? And if so, how is exactly-once guaranteed?
Jimmy
wangzmk...@163.com
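For context on the question above, a minimal sketch of how checkpointing ties Kafka offsets to job state (class names match the Flink 1.9 Kafka connector; the topic, broker address, group id and sink are placeholders, not taken from the original thread):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class CheckpointedKafkaJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kafka offsets are stored in Flink checkpoints every 5 minutes. On failure the
        // source rewinds to the offsets of the last completed checkpoint, so records
        // read since that checkpoint are indeed re-processed after the restart.
        env.enableCheckpointing(5 * 60 * 1000, CheckpointingMode.EXACTLY_ONCE);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        FlinkKafkaConsumer<String> source =
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);

        // Exactly-once here refers to Flink state: operator state rolls back to the same
        // checkpoint as the offsets, so replays do not double-count inside Flink.
        // End-to-end exactly-once additionally needs a transactional or idempotent sink.
        env.addSource(source).print();

        env.execute("checkpointed-kafka-job");
    }
}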



Re: FLINK WEEKLY 2019/36

2019-09-08 Thread Wesley Peng




on 2019/9/9 11:23, Zili Chen wrote:

FLINK WEEKLY 2019/36

Glad to share the progress of the FLINK community over the past week. More FLINK 1.10
features were proposed and discussed, including new FLIPs and the connector contribution
from the Apache Pulsar community. A thread dedicated to which features FLINK 1.10 will
ship is also under way.
User questions


Nice work, thank you.

Regards.


FLINK WEEKLY 2019/36

2019-09-08 Thread Zili Chen
FLINK WEEKLY 2019/36 

Glad to share the progress of the FLINK community over the past week. More FLINK 1.10
features were proposed and discussed, including new FLIPs and the connector contribution
from the Apache Pulsar community. A thread dedicated to which features FLINK 1.10 will
ship is also under way.
User questions

Questions about the Streaming File Sink


The cluster running the FLINK job and the HDFS cluster receiving the results are different; how to configure nameservices so that the target cluster is addressed correctly

Questions about Flink SQL DISTINCT


The implementation logic of FLINK SQL DISTINCT deduplication within a window

DDL support in Flink 1.9.0


FLINK 1.9.0 only supports CREATE VIEW through the CLI

How to optimize Flink memory usage?


A specific workload using sliding windows consumes a lot of memory; community members shared the solutions or workarounds they use in their own scenarios

Blink SQL in Flink 1.9 reports an error for TIMESTAMP types in user-defined functions


The BLINK planner handles TIMESTAMP differently from the FLINK planner; this was confirmed as a defect and will be fixed in 1.10

Making broadcast state queryable?


A community member's suggestions for improving queryable state. At the moment not enough
committers are available to join the discussion of the existing improvement proposals for
queryable state; if more users express a need for it, the community may re-prioritize this feature

Post-processing batch JobExecutionResult


A request to post-process job results after env.execute in batch scenarios. Due to how FLINK's
job submission logic is implemented, this is currently only possible when submitting to a
session cluster. The ongoing Client API rework discussion should help improve this

Flink SQL client support for running in Flink cluster


The FLINK SQL Client is a fairly basic implementation that only supports interacting with a
pre-deployed standalone session cluster. It is limited by the same Client API shortcomings
and is expected to improve together with the Client API rework
Development progress

FLINK-13954 Clean up ExecutionEnvironment / JobSubmission code paths


Part of the Client API rework; refactoring ExecutionEnvironment and the legacy job submission code paths is in progress

FLINK-13958 Job class loader may not be reused after batch job recovery


Under the new region-based restart recovery mode for batch jobs, a native library may be
loaded repeatedly by the job ClassLoader

[DISCUSS] Support JSON functions in Flink SQL


Xu Forward started a discussion on supporting JSON functions in FLINK SQL

[DISCUSS] Reducing build times


The discussion started earlier by Chesnay Schepler on reducing FLINK CI build times has made
new progress; the community is now discussing whether to move FLINK's CI to a system other
than Travis so that e2e tests can also run on every pull request

[DISCUSS] Contribute Pulsar Flink connector back to Flink


Yijie Shen from the Apache Pulsar community started a discussion on contributing a connector
for FLINK 1.9.0 and Pulsar 2.4.0 to the FLINK community. Since the same request was made for
the Pulsar connector before and the resulting pull request has been shelved ever since, the
FLINK and Pulsar communities are discussing a suitable way to contribute and maintain the connector

[DISCUSS] FLIP-61 Simplify Flink's cluster level RestartStrategy
configuration


Till Rohrmann's FLIP on simplifying FLINK's cluster-level restart strategy configuration has been accepted and is being implemented

[DISCUSS] FLIP-62: Set default restart delay for FixedDelay- and
FailureRateRestartStrategy to 1s


Till Rohrmann's FLIP to change the default restart delay to a non-zero value, which helps
avoid overwhelming systems outside FLINK during restart loops. The FLIP has been accepted
and is being implemented

FLIP-63: Rework table partition support


FLIP-63, proposed by Jingsong Lee, aims to rework partition support for Tables

[DISCUSS] FLIP-64: Support for Temporary Objects in Table module


FLIP-64, proposed by Dawid Wysakowicz, aims to support temporary objects in the Table module, rounding out the Catalog API

[DISCUSS] FLIP-66: Support time attribute in SQL DDL


FLIP-66, proposed by Jark Wu, aims to support time attributes in SQL DDL, which will let
users apply window operations to Tables created via DDL
Community news

[DISCUSS] Features for Apache Flink 1.10


Gary Yao started a discussion on features for FLINK 1.10, aiming at an initial picture of
which features and improvements will land in 1.10. Gary also proposed himself and Yu Li as
release managers for 1.10

[ANNOUNCE] Kostas Kloudas joins the Flink PMC

Re: When keyBy is applied multiple times, does only the last one take effect?

2019-09-08 Thread gaofeilong198...@163.com
This effectively performs two shuffles; for the downstream operators only the last shuffle
takes effect, and the earlier one just wastes performance. If you want to group by the
combination of name and age, key by a composite of the two fields, e.g. keyBy(name + age)
(see the sketch below).
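A minimal sketch of keying by both fields at once, assuming a PojoClass with getName/getAge as in the snippets above (the KeySelector shown is just one way to build the composite key):

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;

public class CompositeKeyExample {

    // Key by (name, age) together instead of chaining keyBy(name).keyBy(age),
    // where only the second keyBy would determine the partitioning.
    public static KeyedStream<PojoClass, Tuple2<String, Integer>> keyByNameAndAge(
            DataStream<PojoClass> dataStream) {
        return dataStream.keyBy(new KeySelector<PojoClass, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> getKey(PojoClass value) {
                return Tuple2.of(value.getName(), value.getAge());
            }
        });
    }

    // Assumed POJO shape, matching the getters used in the original snippet.
    public static class PojoClass {
        private String name;
        private int age;

        public String getName() { return name; }
        public int getAge() { return age; }
    }
}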




gaofeilong198...@163.com
 
From: but...@163.com
Sent: 2019-09-09 09:33
To: user-zh
Subject: When keyBy is applied multiple times, does only the last one take effect?
Hi everyone,
I have the following code:

dataStream.keyBy(PojoClass::getName).keyBy(PojoClass::getAge)...

With this code I noticed that only the last keyBy takes effect and the earlier one does not.
Is this by design, or is it a bug?
 


Re: How to add login authentication to the Flink Web UI?

2019-09-08 Thread Bruce Bian
Good morning!
+1.
If you put an nginx reverse proxy in front of the Flink web UI,
you can use nginx's auth_basic module to implement simple username and password
authentication.

Thanks.



Wesley Peng wrote on Thu, Sep 5, 2019 at 10:48 AM:

> Hi
>
> on 2019/9/5 10:46, wanghongquan.sh wrote:
> > I could not find any login authentication settings in the Flink web console. How do I add login authentication to this web UI?
>
> Flink does not directly support authenticating access to the web UI, but
> you can always put something like Apache's basic_auth in front of it.
>
>


When keyBy is applied multiple times, does only the last one take effect?

2019-09-08 Thread but...@163.com
Hi everyone,
I have the following code:

dataStream.keyBy(PojoClass::getName).keyBy(PojoClass::getAge)...

With this code I noticed that only the last keyBy takes effect and the earlier one does not.
Is this by design, or is it a bug?



[ANNOUNCE] Weekly Community Update 2019/33-36

2019-09-08 Thread Konstantin Knauf
Dear Community,

happy to share this "week's" community update, back after a three-week
summer break. It's been a very busy time in the Flink community as a lot of
FLIP discussions and votes for Apache Flink 1.10 are on their way. I will
try to cover a good part of it in this update, along with bugs in Flink
1.9.0 and more...

Flink Development
==

* [roadmap] There are currently two great resources to get an overview
of *Flink's
Roadmap* for 1.10 and beyond. The first one is the recently updated roadmap
on the Project website [1] and the other one is a discussion thread
launched by Gary on the features for Flink 1.10 [2]. Gary and Yu Li stepped
up as release managers for Flink 1.10 and proposed a feature freeze around the
end of November 2019 and a release at the beginning of January 2020. Most of the
FLIP discussions covered in this update are mentioned on these roadmaps.

* [releases] The vote for *Apache Flink 1.8.2* RC1 [3] is currently
ongoing. Checkout the corresponding discussion thread [4] for a list of
fixes.

* [development] Following up on the repository split discussion, the
community is now looking into other ways to *reduce the build time* of
Apache Flink. Chesnay has proposed several options, some of which are being
investigated in more detail as of writing. Among these are sharing JVMs
between tests for more modules, moving to Gradle as a build system (better
incremental builds) and moving to a different CI system (Azure Pipelines?).
[5]

* [state] Yu Li proposes to add a new state backend to Flink, the
*SpillableHeapStatebackend.* [6] State will primarily live on the Java heap,
but the coldest state will be spilled to disk if memory becomes scarce. The
vote has already passed. [7]

* [python] Jincheng has started a discussion on adding support for
*user-defined
functions* in the Python Table API. The high-level architecture follows the
approach of Beam's portability framework of executing user-defined
functions in a separate language specific environment. The first FLIP
(FLIP-58) will only deal with stateless user-defined functions and will lay
the groundwork. [8]

* [sql] Xu Forward has started a discussion on adding functions to *construct
and query JSON* objects in Flink SQL. The proposal has generally been
well-received, but there is no FLIP yet. [9]

* [sql] Bowen has started a discussion on reworking the *function catalog*,
which among other goals aims to support external built-in functions (Hive),
to revisit the resolution order of function names and to support fully
qualified function names. [10]

* [connectors] Yijie Shen proposes to contribute the *Apache Pulsar
connector* (currently in Apache Pulsar) back to Apache Flink. While
everyone agrees that a strong Apache Pulsar connector is a valuable
contribution to the project, there are concerns about build time,
maintainability in the long-run and dependencies on FLIP-27 (New Source
Interface). The discussion is ongoing. [11]

* [connectors] From Apache Flink 1.10 onwards the *Kinesis Connector* will
be part of the Apache Flink release. In the past this was blocked by the
license of its dependencies, which have recently been changed to Apache
2.0. [12]

* [recovery] Till has published two small FLIPs on *Flink's restart
strategies*. The first one, FLIP-61, proposes to ignore the configuration
properties of a restart strategy unless that strategy has been selected via
"restart-strategy". The other one, FLIP-62, proposes to change the default
restart delay for all strategies from 0s to 1s. The vote has passed for
both of them [13, 14].
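To make the FLIP-62 change concrete, a minimal sketch using the existing per-job API (the attempt count is an arbitrary placeholder; FLIP-62 itself only changes the cluster-level default):

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartDelayExample {

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // FLIP-62 only changes the default delay between restart attempts from 0s to 1s;
        // jobs that want a specific delay can already set it explicitly like this.
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
                3,                  // number of restart attempts (placeholder)
                Time.seconds(1)));  // delay between attempts
    }
}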

* [resource management] Following up on FLIP-49, Xintong Song has started a
discussion on FLIP-53 to add *fine grained operator resource management* to
Flink [15]. If I understand it correctly, the feature will only be
available via the Blink Planner of the Table API at first, and might later
be extended to the DataStream API. The DataSet API will not be affected.
The vote [16] is currently ongoing.

* [configuration] Dawid introduced a FLIP that adds support for configuring
ExecutionConfig (and similar classes) from a file or, more generally, from
layers above the StreamExecutionEnvironment, which you currently need
access to in order to change these configurations. [17]

* [development] Stephan proposed to switch to *Java's Duration class* instead
of Flink's Time class for non-API parts of Flink (the API maybe in Flink 2.0).
[18]
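For illustration, a small sketch of the swap (both classes exist today; which internal call sites would change is up to the proposal):

import java.time.Duration;

import org.apache.flink.api.common.time.Time;

public class DurationVsFlinkTime {

    public static void main(String[] args) {
        // Flink's own time class, widely used in non-API internals today.
        Time flinkTimeout = Time.seconds(30);

        // The JDK class proposed as its replacement for non-API parts of Flink.
        Duration jdkTimeout = Duration.ofSeconds(30);

        // Both represent the same span of time.
        System.out.println(flinkTimeout.toMilliseconds() == jdkTimeout.toMillis());
    }
}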

* [development] Gyula started a discussion to unify the implementation of
the *Builder pattern in Flink*. Following the discussion he will add some
guidelines to the code style guide. [19]

* [releases] *Apache Flink-shaded 8.0* has been released. [20]

[1] https://flink.apache.org/roadmap.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-10-tp32824p32844.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-2-release-candidate-1-tp32808.html
[4]

Re: [flink-1.9] how to read local json file through Flink SQL

2019-09-08 Thread Anyang Hu
Hi Wesley,

This is not the way I want it; I want to read local JSON data in Flink SQL by
defining a DDL.

Best regards,
Anyang

Wesley Peng wrote on Sun, Sep 8, 2019 at 6:14 PM:

> On 2019/9/8 5:40 PM, Anyang Hu wrote:
> > In flink1.9, is there a way to read local json file in Flink SQL like
> > the reading of csv file?
>
> hi,
>
> might this thread help you?
>
> http://mail-archives.apache.org/mod_mbox/flink-dev/201604.mbox/%3cCAK+0a_o5=c1_p3sylrhtznqbhplexpb7jg_oq-sptre2neo...@mail.gmail.com%3e
>
> regards.
>


Re: [flink-1.9] how to read local json file through Flink SQL

2019-09-08 Thread Wesley Peng

On 2019/9/8 5:40 PM, Anyang Hu wrote:
In flink1.9, is there a way to read local json file in Flink SQL like 
the reading of csv file?


hi,

might this thread help you?
http://mail-archives.apache.org/mod_mbox/flink-dev/201604.mbox/%3cCAK+0a_o5=c1_p3sylrhtznqbhplexpb7jg_oq-sptre2neo...@mail.gmail.com%3e

regards.


[flink-1.9] how to read local json file through Flink SQL

2019-09-08 Thread Anyang Hu
Hi guys,

In flink1.9, is there a way to read local json file in Flink SQL like the
reading of csv file?

Right now we can read a local CSV file like the following; simply replacing 'csv'
with 'json' does not work:
create table source (
  first varchar,
  id int
) with (
  'connector.type' = 'filesystem',
  'connector.path' = '/path/to/csv',
  'format.type' = 'csv'
)

Best regards,
Anyang


Re: suggestion of FLINK-10868

2019-09-08 Thread Anyang Hu
Hi Peter,

For our online batch jobs, there is a scenario where the number of failed containers
reaches MAXIMUM_WORKERS_FAILURE_RATE but the client does not exit immediately (the
probability of losing the JM rises sharply when thousands of containers have to be
started). We found that a JM disconnection (the reason for the JM loss is unknown)
causes notifyAllocationFailure to have no effect.

After the introduction of FLINK-13184, which starts containers with multiple threads,
the JM disconnection problem has been alleviated. To make the client exit immediately
in a reliable way, we use the following code to decide whether to call onFatalError
when MaximumFailedTaskManagerExceedingException occurs:

@Override
public void notifyAllocationFailure(JobID jobId, AllocationID allocationId, Exception cause) {
    validateRunsInMainThread();

    JobManagerRegistration jobManagerRegistration = jobManagerRegistrations.get(jobId);
    if (jobManagerRegistration != null) {
        jobManagerRegistration.getJobManagerGateway().notifyAllocationFailure(allocationId, cause);
    } else {
        if (exitProcessOnJobManagerTimedout) {
            ResourceManagerException exception = new ResourceManagerException(
                "Job Manager is lost, can not notify allocation failure.");
            onFatalError(exception);
        }
    }
}


Best regards,

Anyang


Re: suggestion of FLINK-10868

2019-09-08 Thread Peter Huang
Hi Till,

1) From Anyang's request, I think it is reasonable to use two parameters for the
rate, since a batch job runs for a while and the failure rate over a small interval
is meaningless. I think they need a failure count from the beginning of the job as
the failure condition.

@Anyang Hu 
2) In the current implementation, MaximumFailedTaskManagerExceedingException is a
SuppressRestartsException, so it will exit immediately.


Best Regards
Peter Huang




On Sun, Sep 8, 2019 at 1:27 AM Anyang Hu  wrote:

> Hi Till,
> Thank you for the reply.
>
> 1. The batch processing may be customized according to the usage scenario.
> For our online batch jobs, we set the interval parameter to 8h.
> 2. For our usage scenario, we need the client to exit immediately when the
> failed Container reaches MAXIMUM_WORKERS_FAILURE_RATE.
>
> Best Regards,
> Anyang
>
> Till Rohrmann wrote on Fri, Sep 6, 2019 at 9:33 PM:
>
>> Hi Anyang,
>>
>> thanks for your suggestions.
>>
>> 1) I guess one needs to make this interval configurable. A session
>> cluster could theoretically execute batch as well as streaming tasks and,
>> hence, I doubt that there is an optimal value. Maybe the default could be a
>> bit longer than 1 min, though.
>>
>> 2) Which component do you want to terminate immediately?
>>
>> I think we can consider your input while reviewing the PR. If it would be
>> a bigger change, then it would be best to create a follow up issue once
>> FLINK-10868 has been merged.
>>
>> Cheers,
>> Till
>>
>> On Fri, Sep 6, 2019 at 11:42 AM Anyang Hu  wrote:
>>
>>> Thank you for the reply and look forward to the advice of Till.
>>>
>>> Anyang
>>>
>>> Peter Huang wrote on Thu, Sep 5, 2019 at 11:53 PM:
>>>
 Hi Anyang,

 Thanks for raising it up. I think it is reasonable as what you
 requested is needed for batch. Let's wait for Till to give some more input.



 Best Regards
 Peter Huang

 On Thu, Sep 5, 2019 at 7:02 AM Anyang Hu 
 wrote:

> Hi Peter:
>
> As commented in the issue, we have introduced the FLINK-10868 patch online
> (mainly for batch jobs). What do you think of the following two suggestions:
>
> 1) A parameter to control the time interval. At present, the default
> interval of 1 min is used, which is too short for batch jobs;
>
> 2) A parameter to control whether onFatalError is performed when the number of
> failed containers reaches MAXIMUM_WORKERS_FAILURE_RATE and the JM disconnects,
> so that batch jobs can exit as soon as possible.
>
> Best regards,
> Anyang
>



Re: suggestion of FLINK-10868

2019-09-08 Thread Anyang Hu
Hi Till,
Thank you for the reply.

1. The batch processing may be customized according to the usage scenario.
For our online batch jobs, we set the interval parameter to 8h.
2. For our usage scenario, we need the client to exit immediately when the
failed Container reaches MAXIMUM_WORKERS_FAILURE_RATE.

Best Regards,
Anyang

Till Rohrmann wrote on Fri, Sep 6, 2019 at 9:33 PM:

> Hi Anyang,
>
> thanks for your suggestions.
>
> 1) I guess one needs to make this interval configurable. A session cluster
> could theoretically execute batch as well as streaming tasks and, hence, I
> doubt that there is an optimal value. Maybe the default could be a bit
> longer than 1 min, though.
>
> 2) Which component do you want to terminate immediately?
>
> I think we can consider your input while reviewing the PR. If it would be
> a bigger change, then it would be best to create a follow up issue once
> FLINK-10868 has been merged.
>
> Cheers,
> Till
>
> On Fri, Sep 6, 2019 at 11:42 AM Anyang Hu  wrote:
>
>> Thank you for the reply and look forward to the advice of Till.
>>
>> Anyang
>>
>> Peter Huang wrote on Thu, Sep 5, 2019 at 11:53 PM:
>>
>>> Hi Anyang,
>>>
>>> Thanks for raising it up. I think it is reasonable as what you requested
>>> is needed for batch. Let's wait for Till to give some more input.
>>>
>>>
>>>
>>> Best Regards
>>> Peter Huang
>>>
>>> On Thu, Sep 5, 2019 at 7:02 AM Anyang Hu  wrote:
>>>
 Hi Peter:

 As commented in the issue, we have introduced the FLINK-10868 patch online
 (mainly for batch jobs). What do you think of the following two suggestions:

 1) A parameter to control the time interval. At present, the default
 interval of 1 min is used, which is too short for batch jobs;

 2) A parameter to control whether onFatalError is performed when the number of
 failed containers reaches MAXIMUM_WORKERS_FAILURE_RATE and the JM disconnects,
 so that batch jobs can exit as soon as possible.

 Best regards,
 Anyang

>>>


Re: HBase Connectors(sink)

2019-09-08 Thread Ni Yanchun
Hi Chesnay,

   I saw the HBase code here:
https://github.com/apache/flink/tree/master/flink-connectors/flink-hbase/src/main/java/org/apache/flink/addons/hbase,
 and it seems to include both input and output.

On Sep 6, 2019, at 17:45, Chesnay Schepler <ches...@apache.org> wrote:

Where did you see an HBase sink? On the current master, and all releases that I 
can remember, flink-hbase only contains input formats / sources.

On 06/09/2019 03:23, Ni Yanchun wrote:
Hi all,
I have found that Flink can use HBase as a sink in the Flink source code,
but it does not appear in the official documentation. Does that mean the HBase
sink is not ready for production use?