[jira] [Created] (FLINK-17399) CsvTableSink should also extend from OverwritableTableSink

2020-04-26 Thread godfrey he (Jira)
godfrey he created FLINK-17399:
--

 Summary: CsvTableSink should also extend from OverwritableTableSink
 Key: FLINK-17399
 URL: https://issues.apache.org/jira/browse/FLINK-17399
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: godfrey he
 Fix For: 1.11.0


{{CsvTableSink}} already supports a {{writeMode}}, which can be {{OVERWRITE}} or 
{{NO_OVERWRITE}}. When we execute "INSERT OVERWRITE csv_table_sink xx", the 
planners check whether the table sink is an {{OverwritableTableSink}}. Since 
{{CsvTableSink}} currently does not extend {{OverwritableTableSink}}, the 
statement above cannot be executed. 
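The planner-side check and the proposed fix can be sketched with simplified stand-in interfaces (the real {{TableSink}} and {{OverwritableTableSink}} live in flink-table-common; {{CsvLikeSink}} and its methods below are illustrative, not the actual {{CsvTableSink}} code):

```java
// Simplified stand-ins for the real interfaces in flink-table-common.
interface TableSink {}

interface OverwritableTableSink extends TableSink {
    // Called by the planner when the statement is INSERT OVERWRITE.
    void setOverwrite(boolean overwrite);
}

// Illustrative CSV-like sink: it already tracks a write mode, so wiring it
// up to OverwritableTableSink is a small change.
class CsvLikeSink implements OverwritableTableSink {
    enum WriteMode { NO_OVERWRITE, OVERWRITE }

    private WriteMode writeMode = WriteMode.NO_OVERWRITE;

    @Override
    public void setOverwrite(boolean overwrite) {
        this.writeMode = overwrite ? WriteMode.OVERWRITE : WriteMode.NO_OVERWRITE;
    }

    WriteMode getWriteMode() {
        return writeMode;
    }
}

class OverwriteCheck {
    // The instanceof check the planners perform, as described above.
    static boolean supportsOverwrite(TableSink sink) {
        return sink instanceof OverwritableTableSink;
    }
}
```

Once {{CsvTableSink}} implements the interface, the planner's instanceof check passes and INSERT OVERWRITE maps directly onto the existing {{writeMode}}.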




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Flink 1.9.2 why always checkpoint expired

2020-04-26 Thread qq
Hi all,

Why do my Flink checkpoints always expire? I use RocksDB for checkpointing,
and I can't find any useful messages explaining this. Could you help me? Thanks very
much.





[jira] [Created] (FLINK-17398) Support flexible path reading

2020-04-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17398:


 Summary: Support flexible path reading
 Key: FLINK-17398
 URL: https://issues.apache.org/jira/browse/FLINK-17398
 Project: Flink
  Issue Type: Sub-task
Reporter: Jingsong Lee


For example:
 * single file reading
 * wildcard path reading





[jira] [Created] (FLINK-17397) Filesystem support lookup table source

2020-04-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17397:


 Summary: Filesystem support lookup table source
 Key: FLINK-17397
 URL: https://issues.apache.org/jira/browse/FLINK-17397
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Jingsong Lee








[jira] [Created] (FLINK-17396) Support SHOW MODULES in Flink SQL

2020-04-26 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-17396:
---

 Summary: Support SHOW MODULES in Flink SQL
 Key: FLINK-17396
 URL: https://issues.apache.org/jira/browse/FLINK-17396
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Caizhi Weng


We currently have the {{module}} concept, but SQL users cannot list all 
available modules. The SQL client does support this, but the statement is 
parsed by its own parser.





[jira] [Created] (FLINK-17395) Add the sign and sha logic for PyFlink wheel packages

2020-04-26 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-17395:


 Summary: Add the sign and sha logic for PyFlink wheel packages
 Key: FLINK-17395
 URL: https://issues.apache.org/jira/browse/FLINK-17395
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Huang Xingbo
 Fix For: 1.11.0


Add the sign and sha logic for PyFlink wheel packages





Re: [DISCUSS] Add custom labels on AbstractPrometheusReporter like PrometheusPushGatewayReporter's groupingKey

2020-04-26 Thread Yang Wang
Hi jinhai,

I think it is a good improvement to also add a groupingKey to the
PrometheusReporter. And keeping the PrometheusPushGatewayReporter and the
PrometheusReporter sharing the same logic is important. AFAIK, they should
not differ except for the metrics exposing mechanism (pulling vs. pushing).


Best,
Yang

jinhai wang wrote on Fri, Apr 24, 2020 at 6:24 PM:

> Hi all
>
> I'd like to start the discussion about Prometheus labelNames.
>
> We need to add some custom labels on Prometheus, so we can query by them.
> We can add groupingKey to PrometheusPushGatewayReporter in version 1.10,
> but not in PrometheusReporter. Can we add abstract addLabels method on
> AbstractPrometheusReporter  to support this?
>
>
> Best Regards
>
> jinhai...@gmail.com
>
>
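A rough shape for the hook proposed above, sketched with stand-in classes (the real {{AbstractPrometheusReporter}} renders metrics through the Prometheus Java client; {{customLabels()}}, {{AbstractReporterSketch}}, and {{LabeledReporter}} below are illustrative only, not Flink API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch of the idea: subclasses contribute custom label
// names/values that are attached to every metric the reporter exposes.
abstract class AbstractReporterSketch {
    /** Custom labels to attach to all exported metrics; empty by default. */
    protected Map<String, String> customLabels() {
        return new LinkedHashMap<>();
    }

    /** Renders one sample in the Prometheus text exposition style. */
    String render(String metricName, double value) {
        StringBuilder sb = new StringBuilder(metricName).append('{');
        customLabels().forEach(
            (name, val) -> sb.append(name).append("=\"").append(val).append("\","));
        return sb.append("} ").append(value).toString();
    }
}

// A reporter that adds one custom label; values here are made up.
class LabeledReporter extends AbstractReporterSketch {
    @Override
    protected Map<String, String> customLabels() {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("cluster", "prod-1");
        return labels;
    }
}
```

With such a hook, both the pull-based and the push-based reporters could share the labeling logic, which matches the "same logic except for the exposing mechanism" point above.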


[jira] [Created] (FLINK-17394) Add RemoteEnvironment and RemoteStreamEnvironment in Python

2020-04-26 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-17394:


 Summary: Add RemoteEnvironment and RemoteStreamEnvironment in 
Python
 Key: FLINK-17394
 URL: https://issues.apache.org/jira/browse/FLINK-17394
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Huang Xingbo
 Fix For: 1.11.0


Add RemoteEnvironment and RemoteStreamEnvironment in Python





Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-26 Thread Jingsong Li
Thanks Dian for managing this release!

Best,
Jingsong Lee

On Sun, Apr 26, 2020 at 7:17 PM Jark Wu  wrote:

> Thanks Dian for being the release manager and thanks all who make this
> possible.
>
> Best,
> Jark
>
> On Sun, 26 Apr 2020 at 18:06, Leonard Xu  wrote:
>
> > Thanks Dian for the release and being the release manager !
> >
> > Best,
> > Leonard Xu
> >
> >
> > On Apr 26, 2020, at 17:58, Benchao Li wrote:
> >
> > Thanks Dian for the effort, and all who make this release possible. Great
> > work!
> >
> > Konstantin Knauf wrote on Sun, Apr 26, 2020 at 5:21 PM:
> >
> >> Thanks for managing this release!
> >>
> >> On Sun, Apr 26, 2020 at 3:58 AM jincheng sun 
> >> wrote:
> >>
> >>> Thanks for your great job, Dian!
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>>
> >>> Hequn Cheng wrote on Sat, Apr 25, 2020 at 8:30 PM:
> >>>
>  @Dian, thanks a lot for the release and for being the release manager.
>  Also thanks to everyone who made this release possible!
> 
>  Best,
>  Hequn
> 
>  On Sat, Apr 25, 2020 at 7:57 PM Dian Fu  wrote:
> 
> > Hi everyone,
> >
> > The Apache Flink community is very happy to announce the release of
> > Apache Flink 1.9.3, which is the third bugfix release for the Apache
> Flink
> > 1.9 series.
> >
> > Apache Flink® is an open-source stream processing framework for
> > distributed, high-performing, always-available, and accurate data
> streaming
> > applications.
> >
> > The release is available for download at:
> > https://flink.apache.org/downloads.html
> >
> > Please check out the release blog post for an overview of the
> > improvements for this bugfix release:
> > https://flink.apache.org/news/2020/04/24/release-1.9.3.html
> >
> > The full release notes are available in Jira:
> > https://issues.apache.org/jira/projects/FLINK/versions/12346867
> >
> > We would like to thank all contributors of the Apache Flink community
> > who made this release possible!
> > Also great thanks to @Jincheng for helping finalize this release.
> >
> > Regards,
> > Dian
> >
> 
> >>
> >> --
> >> Konstantin Knauf | Head of Product
> >> +49 160 91394525
> >>
> >> Follow us @VervericaData Ververica 
> >>
> >> --
> >> Join Flink Forward  - The Apache Flink
> >> Conference
> >> Stream Processing | Event Driven | Real Time
> >> --
> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >> --
> >> Ververica GmbH
> >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> >> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> >> (Tony) Cheng
> >>
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >
> >
> >
>


-- 
Best, Jingsong Lee


[jira] [Created] (FLINK-17393) Improve the `FutureCompletingBlockingQueue` to wakeup blocking put() more elegantly.

2020-04-26 Thread Jiangjie Qin (Jira)
Jiangjie Qin created FLINK-17393:


 Summary: Improve the `FutureCompletingBlockingQueue` to wakeup 
blocking put() more elegantly.
 Key: FLINK-17393
 URL: https://issues.apache.org/jira/browse/FLINK-17393
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Common
Reporter: Jiangjie Qin


Currently, if a {{FetchTask}} is blocked on 
{{FutureCompletingBlockingQueue.put()}}, interrupt() has to be called to wake it 
up, which results in an {{InterruptedException}}. We can avoid the 
interruption by providing our own implementation of {{BlockingQueue.put()}}.
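One way to implement such a wake-up without interruption is a condition variable plus a wake-up flag. This is only a sketch of the idea under discussion ({{WakeableQueue}} and {{wakeUpPut()}} are illustrative names, not the actual Flink implementation):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a bounded queue whose blocking put() can be woken up gracefully,
// returning false instead of throwing InterruptedException.
class WakeableQueue<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Queue<E> elements = new ArrayDeque<>();
    private final int capacity;
    private boolean wakeUpFlag;

    WakeableQueue(int capacity) {
        this.capacity = capacity;
    }

    /** Returns true if the element was inserted, false if the put was woken up. */
    boolean put(E element) {
        lock.lock();
        try {
            while (elements.size() >= capacity) {
                if (wakeUpFlag) {
                    wakeUpFlag = false;
                    return false; // woken up; no InterruptedException involved
                }
                notFull.awaitUninterruptibly();
            }
            elements.add(element);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Wakes up a put() that is blocked on a full queue. */
    void wakeUpPut() {
        lock.lock();
        try {
            wakeUpFlag = true;
            notFull.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

The consumer side (take/poll, which would signal {{notFull}} after removing an element) is omitted; the point is only that the producer can be unblocked by a flag plus a signal rather than by thread interruption.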





[jira] [Created] (FLINK-17392) enable configuring minicluster in Flink SQL in IDE

2020-04-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-17392:


 Summary: enable configuring minicluster in Flink SQL in IDE
 Key: FLINK-17392
 URL: https://issues.apache.org/jira/browse/FLINK-17392
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.11.0
Reporter: Bowen Li
Assignee: Kurt Young
 Fix For: 1.11.0


It's a very common case that users who want to learn and test Flink SQL try 
to run a SQL job in an IDE like IntelliJ with the Flink minicluster. Currently 
this works for a simple job requiring only one task slot, which is the default 
resource configuration of the minicluster.

However, users cannot run even a slightly more complicated job, for example a 
single-parallelism job that requires a shuffle, since they cannot configure the 
minicluster's task slots through Flink SQL. This limitation has been very 
frustrating for new users.

There are two solutions to this problem:
- in the minicluster, if it is a single-parallelism job, chain all operators 
together
- enable configuring the minicluster in Flink SQL in the IDE.

The latter feels more appropriate.

Expected: users can configure minicluster resources via either SQL ("set 
...=...") or the TableEnvironment ("tEnv.setMiniclusterResources(..., ...)"). 

[~jark] [~lzljs3620320]






Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop

2020-04-26 Thread Stephan Ewen
Indeed, that would be the assumption: that Hadoop does not expose its
transitive libraries on its public API surface.

From vague memory, I think that has been pretty much true so far. I only remember
Kinesis and Calcite as counter-examples, which exposed Guava classes as part
of their public API.
But that is definitely the "weak spot" of this approach. Plus, as with all
custom class loaders, there is the fact that the Thread Context ClassLoader does
not really work well any more.
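The prefix-based fallback discussed in this thread could be sketched roughly like this (a hypothetical illustration, not Flink code; the loader that sees the HADOOP_CLASSPATH is simply passed in):

```java
// Sketch: a user-code classloader that, when a class is not found in the
// user jar (the parent), falls back to a dedicated loader for classes whose
// name starts with a given prefix, e.g. "org.apache.hadoop.".
class PrefixFallbackClassLoader extends ClassLoader {
    private final ClassLoader fallback; // loader that sees the HADOOP_CLASSPATH
    private final String prefix;

    PrefixFallbackClassLoader(ClassLoader userCodeLoader, ClassLoader fallback, String prefix) {
        super(userCodeLoader);
        this.fallback = fallback;
        this.prefix = prefix;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Invoked only after the parent (user-code) loader failed to find the class.
        if (name.startsWith(prefix)) {
            return fallback.loadClass(name);
        }
        throw new ClassNotFoundException(name);
    }
}
```

This would keep Hadoop's dependency footprint off the main classpath while still letting user code resolve org.apache.hadoop classes, with the caveat discussed in the replies: it breaks as soon as a Hadoop API leaks a transitive non-hadoop dependency type.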

On Thu, Apr 23, 2020 at 11:50 AM Chesnay Schepler 
wrote:

> This would only work so long as all Hadoop APIs do not directly expose
> any transitive non-hadoop dependency.
> Otherwise the user code classloader might search for this transitive
> dependency in lib instead of the hadoop classpath (and possibly not find
> it).
>
> On 23/04/2020 11:34, Stephan Ewen wrote:
> > True, connectors built on Hadoop make this a bit more complex. That is
> also
> > the reason why Hadoop is on the "parent first" patterns.
> >
> > Maybe this is a bit of a wild thought, but what would happen if we had a
> > "first class" notion of a Hadoop Classloader in the system, and the user
> > code classloader would explicitly fall back to that one whenever a class
> > whose name starts with "org.apache.hadoop" is not found? We could also
> > generalize this by associating plugin loaders with class name prefixes.
> >
> > Then it would try to load from the user code jar, and if the class was
> not
> > found, load it from the hadoop classpath.
> >
> > On Thu, Apr 23, 2020 at 10:56 AM Chesnay Schepler 
> > wrote:
> >
> >> although, if you can load the HADOOP_CLASSPATH as a plugin, then you can
> >> also load it in the user-code classloader.
> >>
> >> On 23/04/2020 10:50, Chesnay Schepler wrote:
> >>> @Stephan I'm not aware of anyone having tried that; possibly since we
> >>> have various connectors that require hadoop (hadoop-compat, hive,
> >>> orc/parquet/hbase, hadoop inputformats). This would require connectors
> >>> to be loaded as plugins (or having access to the plugin classloader)
> >>> to be feasible.
> >>>
> >>> On 23/04/2020 09:59, Stephan Ewen wrote:
>  Hi all!
> 
>  +1 for the simplification of dropping hadoop-shaded
> 
> 
>  Have we ever investigated how much work it would be to load the
>  HADOOP_CLASSPATH through the plugin loader? Then Hadoop's crazy
>  dependency
>  footprint would not spoil the main classpath.
> 
>  - HDFS might be very simple, because file systems are already
>  Plugin aware
>  - Yarn would need some extra work. In essence, we would need to
>  discover
>  executors also through plugins
>  - Kerberos is the other remaining bit. We would need to switch
>  security
>  modules to ServiceLoaders (which we should do anyways) and also pull
>  them
>  from plugins.
> 
>  Best,
>  Stephan
> 
> 
> 
>  On Thu, Apr 23, 2020 at 4:05 AM Xintong Song 
>  wrote:
> 
> > +1 for supporting Hadoop 3.
> >
> > I'm not familiar with the shading efforts, thus no comment on
> > dropping the
> > flink-shaded-hadoop.
> >
> >
> > Correct me if I'm wrong. Although the default Hadoop version for
> > compiling Flink is currently 2.4.1, I think this does not mean Flink should
> > support only Hadoop 2.4+. So no matter which Hadoop version we use for
> > compiling by default, we need to use reflection for the Hadoop
> > features/APIs that are not supported in all versions anyway.
> >
> >
> > There're already many such reflections in `YarnClusterDescriptor` and
> > `YarnResourceManager`, and might be more in future. I'm wondering
> > whether
> > we should have a unified mechanism (an interface / abstract class or
> > so)
> > that handles all these kind of Hadoop API reflections at one place.
> Not
> > necessarily in the scope to this discussion though.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Apr 22, 2020 at 8:32 PM Chesnay Schepler  >
> > wrote:
> >
> >> 1) Likely not, as this again introduces a hard-dependency on
> >> flink-shaded-hadoop.
> >> 2) Indeed; this will be something the user/cloud providers have to
> >> deal
> >> with now.
> >> 3) Yes.
> >>
> >> As a small note, we can still keep the hadoop-2 version of
> >> flink-shaded
> >> around for existing users.
> >> What I suggested was to just not release hadoop-3 versions.
> >>
> >> On 22/04/2020 14:19, Yang Wang wrote:
> >>> Thanks Robert for starting this significant discussion.
> >>>
> >>> Since hadoop3 has been released for a long time, many companies have
> >>> already put it in production. No matter whether you are using
> >>> flink-shaded-hadoop2 or not, Flink can currently already run on
> >>> yarn3 (not sure about HDFS). Since 

Re: [DISCUSS] FLIP-124: Add open/close and Collector to (De)SerializationSchema

2020-04-26 Thread Stephan Ewen
That makes sense, thanks for clarifying.

Best,
Stephan


On Fri, Apr 24, 2020 at 2:15 PM Dawid Wysakowicz 
wrote:

> Hi Stephan,
>
> I fully agree with what you said. Also as far as I can tell what was
> suggested in the FLIP-124 does not contradict with what you are saying. Let
> me clarify it a bit if it is not clear in the document.
>
> Current implementations of Kafka and Kinesis do the deserialization
> outside of the checkpoint lock in threads separate from the main processing
> thread already. The approach described as option 1, which had the most
> supporters is to keep that behavior. The way I would like to support
> emitting multiple results in this setup is to let the DeserializationSchema
> deserialize records into a list (via collector) that will be emitted
> atomically all at once.
>
> Currently the behavior can be modelled as:
> T record = deserializationSchema.deserialize(...)
> synchronized(checkpointLock) {
>sourceContext.collect(record)
>updateSourceState(...)
> }
>
> and I was suggesting to change it to:
> Collector<T> out = new Collector<>();
> deserializationSchema.deserialize(..., out);
> List<T> deserializedRecords = out.getRecords();
> synchronized(checkpointLock) {
>for (T record: deserializedRecords) {
> sourceContext.collect(record)
>}
>updateSourceState(...)
>
> }
>
> I think that is aligned with your comment to Seth's comment that the
> "batch" of records originating from a source record is atomically emitted.
>
> Best,
>
> Dawid
>
>
>
> On 23/04/2020 14:55, Stephan Ewen wrote:
>
> Hi!
>
> Sorry for being a bit late to the party.
>
> One very important thing to consider for "serialization under checkpoint
> lock or not" is:
>   - If you do it under checkpoint lock, things are automatically correct.
> Checkpoint barriers go between original records that correspond to offsets
> in the source.
>   - If you deserialize outside the checkpoint lock, then you read a record
> from the source but only partially emit it. In that case you need to store
> the difference (not emitted part) in the checkpoint.
>
> ==> I would advise against trying to emit partial records, i.e. doing
> things outside the checkpoint lock. FLIP-27 will by default also not do
> partial emission of unnested events. Also, it is questionable whether
> optimizing this in the source makes sense when no other operator supports
> that (flatMap, etc.).
>
> Regarding Seth's comment about performance:
>   - For that it does probably makes not so much difference whether this is
> under lock or not, but more whether this can be pushed to another thread
> (source's I/O thread), so that it does not add load to the main task
> processing thread.
>
> ==> This means that the I/O thread deserialized that "batch" that it hands
> over.
> ==> Still, it is important that all records coming from one original
> source record are emitted atomically, otherwise we have the same issue as
> above.
>
> Best,
> Stephan
>
>
> On Tue, Apr 14, 2020 at 10:35 AM Dawid Wysakowicz 
> wrote:
>
>> Hi Xiaogang,
>>
>> I very much agree with Jark's and Aljoscha's responses.
>>
>>
>> On 10/04/2020 17:35, Jark Wu wrote:
>> > Hi Xiaogang,
>> >
>> > I think this proposal doesn't conflict with your use case, you can still
>> > chain a ProcessFunction after a source which emits raw data.
>> > But I'm not in favor of chaining ProcessFunction after source, and we
>> > should avoid that, because:
>> >
>> > 1) For correctness, it is necessary to perform the watermark generation
>> as
>> > early as possible in order to be close to the actual data
>> >  generation within a source's data partition. This is also the purpose
>> of
>> > per-partition watermark and event-time alignment.
> >> > Many ongoing FLIPs (e.g. FLIP-27, FLIP-95) work a lot on this effort.
> >> > Deserializing records and generating watermarks in a chained
> >> > ProcessFunction makes it difficult to do per-partition watermarks in the
> >> > future.
>> > 2) In Flink SQL, a source should emit the deserialized row instead of
>> raw
>> > data. Otherwise, users have to define raw byte[] as the
>> >  single column of the defined table, and parse them in queries, which is
>> > very inconvenient.
>> >
>> > Best,
>> > Jark
>> >
>> > On Fri, 10 Apr 2020 at 09:18, SHI Xiaogang 
>> wrote:
>> >
>> >> Hi,
>> >>
>> >> I don't think the proposal is a good solution to the problems. I am in
>> >> favour of using a ProcessFunction chained to the source/sink function
>> to
>> >> serialize/deserialize the records, instead of embedding
>> (de)serialization
>> >> schema in source/sink function.
>> >>
>> >> Message packing is heavily used in our production environment to allow
>> >> compression and improve throughput. As buffered messages have to be
>> >> delivered when the time exceeds the limit, timers are also required in
>> our
>> >> cases. I think it's also a common need for other users.
>> >>
> >> In this proposal, with more components added into the context, in the
> >> end we will find the 

[jira] [Created] (FLINK-17391) sink.rolling-policy.time.interval default value should be bigger

2020-04-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17391:


 Summary: sink.rolling-policy.time.interval default value should be 
bigger
 Key: FLINK-17391
 URL: https://issues.apache.org/jira/browse/FLINK-17391
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Jingsong Lee
 Fix For: 1.11.0


Otherwise, a lot of small files will be produced.





[jira] [Created] (FLINK-17390) Container resource cannot be mapped on Hadoop 2.10+

2020-04-26 Thread Xintong Song (Jira)
Xintong Song created FLINK-17390:


 Summary: Container resource cannot be mapped on Hadoop 2.10+
 Key: FLINK-17390
 URL: https://issues.apache.org/jira/browse/FLINK-17390
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.11.0
Reporter: Xintong Song
 Fix For: 1.11.0


In FLINK-16438, we introduced {{WorkerSpecContainerResourceAdapter}} for 
mapping Yarn container {{Resource}} to Flink {{WorkerResourceSpec}}. Inside 
this class, we use {{Resource}} for hash map keys and set elements, assuming 
that {{Resource}} instances that describe the same set of resources have the 
same hash code.

This assumption is not always true. {{Resource}} is an abstract class and may 
have different implementations. In Hadoop 2.10+, {{LightWeightResource}}, a new 
implementation of {{Resource}}, is introduced for {{Resource}} instances generated 
by {{Resource.newInstance}} on the AM side, and it overrides the {{hashCode}} 
method. That means a {{Resource}} generated on the AM may have a different hash 
code than an equal {{Resource}} returned from Yarn.

To solve this problem, we may introduce an {{InternalResource}} as an inner 
class of {{WorkerSpecContainerResourceAdapter}}, whose {{hashCode}} method 
depends only on the fields needed by Flink (at the moment, memory and vcores). 
{{WorkerSpecContainerResourceAdapter}} should only use {{InternalResource}} for 
internal state management, and convert to and from {{Resource}} at its 
boundaries.
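The mismatch and the proposed fix can be illustrated with a simplified stand-in for Yarn's {{Resource}} (the classes and fields below are illustrative; the real {{Resource}} is a Yarn API class with implementation-dependent {{hashCode}}):

```java
import java.util.Objects;

// Simplified stand-in for Yarn's Resource. In Hadoop 2.10+, different
// implementations of the real abstract class may hash differently.
class Resource {
    final long memory;  // MB
    final int vcores;
    final String extra; // stands in for fields Flink does not care about

    Resource(long memory, int vcores, String extra) {
        this.memory = memory;
        this.vcores = vcores;
        this.extra = extra;
    }
}

// Sketch of the proposed InternalResource: equals/hashCode depend only on
// the fields Flink needs, so equal resources always map to the same key.
final class InternalResource {
    final long memory;
    final int vcores;

    InternalResource(Resource resource) {
        this.memory = resource.memory;
        this.vcores = resource.vcores;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof InternalResource
                && ((InternalResource) other).memory == memory
                && ((InternalResource) other).vcores == vcores;
    }

    @Override
    public int hashCode() {
        return Objects.hash(memory, vcores);
    }
}
```

The adapter would wrap every incoming {{Resource}} into an {{InternalResource}} before using it as a map key, so the hash-code difference between AM-generated and Yarn-returned resources can no longer split equal keys.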





[ANNOUNCE] Weekly Community Update 2020/17

2020-04-26 Thread Konstantin Knauf
Dear community,

happy to share this week's community digest with an update on Flink 1.11,
Flink 1.10.1 and Flink 1.9.3, the revival of FLIP-36 to support interactive
programming, a new FLIP to unify (& separate) TimestampAssigners and a bit
more.


Flink Development
==

Releases
^

* [releases] Apache Flink 1.9.3 was released. [1,2]

* [releases] Stephan has proposed 15th of May as the feature freeze date
for Flink 1.11 [3]. Subsequently, Piotr also published a status update on
the development progress for the upcoming release. Check it out to get an
overview of all the features, which are still or not anymore planned for
this release. [4]

* [releases] The first release candidate for Flink 1.10.1 is out. [5]

FLIPs
^^

* [table api] Xuannan has revived the discussion on FLIP-36 to support
interactive programming in the Table API. In essence, the FLIP proposes to
support caching (intermediate) results of one (batch) job, so that they can
be used by subsequent (batch) jobs in the same TableEnvironment. [6]

* [time] Aljoscha proposes to a) unify punctuated and periodic watermark
assigners and b) to separate watermarks assigners and timestamp extractors.
[7]

More
^

* [configuration] Yangze started a discussion to unify the way max/min are
used in the config options. Currently, there is a mix of different patterns
(**.max, max-**, and more). [8]

* [connectors] Karim Mansour proposes a change to the current RabbitMQ
connector in apache Flink to make message deduplication more flexible. [9]

* [metrics] Jinhai would like to add additional labels to the metrics
reported by the PrometheusReporter. [10]

* [datastream api] Stephan proposes to remove a couple of
deprecated methods for state access in Flink 1.11. [11]


[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-9-3-released-tp40730.html
[2] https://flink.apache.org/news/2020/04/24/release-1.9.3.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Exact-feature-freeze-date-tp40624.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Development-progress-of-Apache-Flink-1-11-tp40718.html
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-1-release-candidate-1-tp40724.html
[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-36-Support-Interactive-Programming-in-Flink-Table-API-td40592.html
[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-126-Unify-and-separate-Watermark-Assigners-tp40525p40565.html
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Should-max-min-be-part-of-the-hierarchy-of-config-option-tp40578.html
[9]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-flink-connector-rabbitmq-api-changes-tp40704.html
[10]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Add-custom-labels-on-AbstractPrometheusReporter-like-PrometheusPushGatewayReporter-s-groupiny-tp40708.html
[11]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Removing-deprecated-state-methods-in-1-11-tp40651.html

Notable Bugs
==

* [FLINK-17350] [1.10.0] [1.9.3] [1.8.3]  Since Flink 1.5 users can choose
not to fail their Flink job on checkpoint errors. For the synchronous part
of a checkpoint the implementation of this feature was incorrect, leaving
operators in an inconsistent state following such a failure. Piotr proposes
to always fail tasks on failures in the synchronous part of a checkpoint
going forward. [12]

* [FLINK-17351] [1.10.0] [1.9.3] CheckpointFailureManager ignores
checkpoint timeouts when checking against the maximally tolerable number of
checkpoints failures. So, checkpoint failures are not discovered when they
only surface in the CheckpointFailureManager as a checkpoint timeout
instead of an exception. [13]

Background: both issues were discovered based on a bug report by Jun Qin
[14].

[12] https://issues.apache.org/jira/browse/FLINK-17350
[13] https://issues.apache.org/jira/browse/FLINK-17351
[14] https://issues.apache.org/jira/browse/FLINK-17327

Events, Blog Posts, Misc
===

* Andrey recaps the changes and simplifications to Flink's memory
management (released in Flink 1.10) on the Apache Flink blog. [15] Closely
related, there is also a small tool to test different memory configurations
on flink-packages.org. [16]

[15]
https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html
[16] https://flink-packages.org/packages/flink-memory-calculator

Cheers,

Konstantin

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-26 Thread Jark Wu
Thanks Dian for being the release manager and thanks all who make this
possible.

Best,
Jark

On Sun, 26 Apr 2020 at 18:06, Leonard Xu  wrote:

> Thanks Dian for the release and being the release manager !
>
> Best,
> Leonard Xu
>
>
> On Apr 26, 2020, at 17:58, Benchao Li wrote:
>
> Thanks Dian for the effort, and all who make this release possible. Great
> work!
>
> Konstantin Knauf wrote on Sun, Apr 26, 2020 at 5:21 PM:
>
>> Thanks for managing this release!
>>
>> On Sun, Apr 26, 2020 at 3:58 AM jincheng sun 
>> wrote:
>>
>>> Thanks for your great job, Dian!
>>>
>>> Best,
>>> Jincheng
>>>
>>>
> >>> Hequn Cheng wrote on Sat, Apr 25, 2020 at 8:30 PM:
>>>
 @Dian, thanks a lot for the release and for being the release manager.
 Also thanks to everyone who made this release possible!

 Best,
 Hequn

 On Sat, Apr 25, 2020 at 7:57 PM Dian Fu  wrote:

> Hi everyone,
>
> The Apache Flink community is very happy to announce the release of
> Apache Flink 1.9.3, which is the third bugfix release for the Apache Flink
> 1.9 series.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data 
> streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the
> improvements for this bugfix release:
> https://flink.apache.org/news/2020/04/24/release-1.9.3.html
>
> The full release notes are available in Jira:
> https://issues.apache.org/jira/projects/FLINK/versions/12346867
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
> Also great thanks to @Jincheng for helping finalize this release.
>
> Regards,
> Dian
>

>>
>> --
>> Konstantin Knauf | Head of Product
>> +49 160 91394525
>>
>> Follow us @VervericaData Ververica 
>>
>> --
>> Join Flink Forward  - The Apache Flink
>> Conference
>> Stream Processing | Event Driven | Real Time
>> --
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
>> (Tony) Cheng
>>
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>
>
>


Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-26 Thread Leonard Xu
Thanks Dian for the release and being the release manager !

Best,
Leonard Xu


> On Apr 26, 2020, at 17:58, Benchao Li wrote:
> 
> Thanks Dian for the effort, and all who make this release possible. Great 
> work!
> 
> Konstantin Knauf wrote on Sun, Apr 26, 2020 at 5:21 PM:
> Thanks for managing this release!
> 
> On Sun, Apr 26, 2020 at 3:58 AM jincheng sun wrote:
> Thanks for your great job, Dian! 
> 
> Best,
> Jincheng
> 
> 
> Hequn Cheng wrote on Sat, Apr 25, 2020 at 8:30 PM:
> @Dian, thanks a lot for the release and for being the release manager.
> Also thanks to everyone who made this release possible!
> 
> Best,
> Hequn 
> 
> On Sat, Apr 25, 2020 at 7:57 PM Dian Fu wrote:
> Hi everyone,
> 
> The Apache Flink community is very happy to announce the release of Apache 
> Flink 1.9.3, which is the third bugfix release for the Apache Flink 1.9 
> series.
>  
> Apache Flink® is an open-source stream processing framework for distributed, 
> high-performing, always-available, and accurate data streaming applications.
>  
> The release is available for download at:
> https://flink.apache.org/downloads.html 
> 
>  
> Please check out the release blog post for an overview of the improvements 
> for this bugfix release:
> https://flink.apache.org/news/2020/04/24/release-1.9.3.html 
> 
>  
> The full release notes are available in Jira:
> https://issues.apache.org/jira/projects/FLINK/versions/12346867 
> 
>  
> We would like to thank all contributors of the Apache Flink community who 
> made this release possible!
> Also great thanks to @Jincheng for helping finalize this release.
>  
> Regards,
> Dian
> 
> 
> -- 
> Konstantin Knauf | Head of Product
> +49 160 91394525
> 
> Follow us @VervericaData Ververica 
> 
> --
> Join Flink Forward  - The Apache Flink Conference
> Stream Processing | Event Driven | Real Time
> --
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji 
> (Tony) Cheng
> 
> 
> -- 
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com ; libenc...@pku.edu.cn 
> 


Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-26 Thread Benchao Li
Thanks Dian for the effort, and all who make this release possible. Great
work!

Konstantin Knauf wrote on Sun, Apr 26, 2020 at 5:21 PM:

> Thanks for managing this release!
>
> On Sun, Apr 26, 2020 at 3:58 AM jincheng sun 
> wrote:
>
>> Thanks for your great job, Dian!
>>
>> Best,
>> Jincheng
>>
>>
>> Hequn Cheng wrote on Sat, Apr 25, 2020 at 8:30 PM:
>>
>>> @Dian, thanks a lot for the release and for being the release manager.
>>> Also thanks to everyone who made this release possible!
>>>
>>> Best,
>>> Hequn
>>>
>>> On Sat, Apr 25, 2020 at 7:57 PM Dian Fu  wrote:
>>>
 Hi everyone,

 The Apache Flink community is very happy to announce the release of
 Apache Flink 1.9.3, which is the third bugfix release for the Apache Flink
 1.9 series.

 Apache Flink® is an open-source stream processing framework for
 distributed, high-performing, always-available, and accurate data streaming
 applications.

 The release is available for download at:
 https://flink.apache.org/downloads.html

 Please check out the release blog post for an overview of the
 improvements for this bugfix release:
 https://flink.apache.org/news/2020/04/24/release-1.9.3.html

 The full release notes are available in Jira:
 https://issues.apache.org/jira/projects/FLINK/versions/12346867

 We would like to thank all contributors of the Apache Flink community
 who made this release possible!
 Also great thanks to @Jincheng for helping finalize this release.

 Regards,
 Dian

>>>
>
> --
>
> Konstantin Knauf | Head of Product
>
> +49 160 91394525
>
>
> Follow us @VervericaData Ververica 
>
>
> --
>
> Join Flink Forward  - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> (Tony) Cheng
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


Re: [ANNOUNCE] Apache Flink 1.9.3 released

2020-04-26 Thread Konstantin Knauf
Thanks for managing this release!

On Sun, Apr 26, 2020 at 3:58 AM jincheng sun 
wrote:

> Thanks for your great job, Dian!
>
> Best,
> Jincheng
>
>
> Hequn Cheng  于2020年4月25日周六 下午8:30写道:
>
>> @Dian, thanks a lot for the release and for being the release manager.
>> Also thanks to everyone who made this release possible!
>>
>> Best,
>> Hequn
>>
>> On Sat, Apr 25, 2020 at 7:57 PM Dian Fu  wrote:
>>
>>> Hi everyone,
>>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.9.3, which is the third bugfix release for the Apache Flink
>>> 1.9 series.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data streaming
>>> applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements for this bugfix release:
>>> https://flink.apache.org/news/2020/04/24/release-1.9.3.html
>>>
>>> The full release notes are available in Jira:
>>> https://issues.apache.org/jira/projects/FLINK/versions/12346867
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>> Also great thanks to @Jincheng for helping finalize this release.
>>>
>>> Regards,
>>> Dian
>>>
>>

-- 

Konstantin Knauf | Head of Product

+49 160 91394525


Follow us @VervericaData Ververica 


--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Tony) Cheng


[jira] [Created] (FLINK-17389) LocalExecutorITCase.testBatchQueryCancel asserts error

2020-04-26 Thread Zhijiang (Jira)
Zhijiang created FLINK-17389:


 Summary: LocalExecutorITCase.testBatchQueryCancel asserts error
 Key: FLINK-17389
 URL: https://issues.apache.org/jira/browse/FLINK-17389
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Zhijiang
 Fix For: 1.11.0


CI [https://api.travis-ci.org/v3/job/679144612/log.txt]
{code:java}
19:28:13.121 [INFO] ---
19:28:13.121 [INFO]  T E S T S
19:28:13.121 [INFO] ---
19:28:17.231 [INFO] Running 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase
19:32:06.049 [ERROR] Tests run: 70, Failures: 1, Errors: 0, Skipped: 5, Time 
elapsed: 228.813 s <<< FAILURE! - in 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase
19:32:06.051 [ERROR] testBatchQueryCancel[Planner: 
old](org.apache.flink.table.client.gateway.local.LocalExecutorITCase)  Time 
elapsed: 32.767 s  <<< FAILURE!
java.lang.AssertionError: expected: but was:
at 
org.apache.flink.table.client.gateway.local.LocalExecutorITCase.testBatchQueryCancel(LocalExecutorITCase.java:738)

19:32:06.440 [INFO] 
19:32:06.440 [INFO] Results:
19:32:06.440 [INFO] 
19:32:06.440 [ERROR] Failures: 
19:32:06.440 [ERROR]   LocalExecutorITCase.testBatchQueryCancel:738 
expected: but was:
19:32:06.440 [INFO] 
19:32:06.440 [ERROR] Tests run: 70, Failures: 1, Errors: 0, Skipped: 5{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17388) flink sql The custom function in ROW type is executed multiple times

2020-04-26 Thread cloudzhao (Jira)
cloudzhao created FLINK-17388:
-

 Summary: flink sql  The custom function in ROW type is executed 
multiple times
 Key: FLINK-17388
 URL: https://issues.apache.org/jira/browse/FLINK-17388
 Project: Flink
  Issue Type: Bug
Reporter: cloudzhao


val tableA = tableEnv.sqlQuery("select custom_func(a) as a, custom_func(b) as b 
from tableS")

tableEnv.registerTable("tableA", tableA)

val tableB = tableEnv.sqlQuery("select ROW(a, b) as body from tableA")

tableEnv.registerTable("tableB", tableB)

val tableC = tableEnv.sqlQuery("select body.a, body.b from tableB")

In this logic, the custom_func is executed four times

i.e. tableC is equivalent to: select Row(custom_func(a) as a, custom_func(b) as b).a, 
Row(custom_func(a) as a, custom_func(b) as b).b from tableS
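The call-count blow-up described above can be sketched outside Flink with a plain-Java stand-in for the UDF (a hedged illustration only: customFunc and the array "row" here are hypothetical, modeling the planner's inlining of the ROW constructor, not the Table API itself):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RowInliningSketch {
    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical stand-in for the custom_func UDF in the report;
    // the counter makes each evaluation observable.
    static String customFunc(String v) {
        calls.incrementAndGet();
        return v.toUpperCase();
    }

    public static void main(String[] args) {
        // Expected plan: evaluate custom_func once per column, then read
        // the materialized row fields -> 2 calls in total.
        String[] row = {customFunc("a"), customFunc("b")};
        String x = row[0];
        String y = row[1];
        System.out.println("materialized: " + calls.get() + " calls");

        calls.set(0);
        // Reported plan: the ROW constructor is inlined at each field
        // access, so every access re-runs both expressions -> 4 calls.
        String x2 = new String[]{customFunc("a"), customFunc("b")}[0];
        String y2 = new String[]{customFunc("a"), customFunc("b")}[1];
        System.out.println("inlined: " + calls.get() + " calls");
    }
}
```

If custom_func is expensive or non-deterministic, the inlined plan changes both cost and results, which is what makes the reported expansion a correctness concern and not just a performance one.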



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17387) Implement LookupableTableSource for Hive connector

2020-04-26 Thread Rui Li (Jira)
Rui Li created FLINK-17387:
--

 Summary: Implement LookupableTableSource for Hive connector
 Key: FLINK-17387
 URL: https://issues.apache.org/jira/browse/FLINK-17387
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Hive
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Integration of DataSketches into Flink

2020-04-26 Thread leerho
Hello All,

I am a committer on DataSketches.apache.org and just learning about Flink. Since
Flink is designed for stateful stream processing I would think it would
make sense to have the DataSketches library integrated into its core so all
users of Flink could take advantage of these advanced streaming
algorithms.  If there is interest in the Flink community for this
capability, please contact us at d...@datasketches.apache.org or on our
datasketches-dev Slack channel.
Cheers,
Lee.