We would like to start a discussion thread on "FLIP-53: Fine Grained
Resource Management", where we propose how to improve Flink resource
management and scheduling.
This FLIP mainly discusses the following issues.
- How to support tasks with fine grained resource
Robert Metzger created FLINK-13738:
Summary: NegativeArraySizeException in LongHybridHashTable
@Thomas just to double check:
- parallelism and configuration changes should be entirely possible on
- changes in state types and JobGraph structure would be tricky, and
changing the on-the-wire types would not be possible.
On Wed, Aug 14, 2019 at 7:48 PM Thomas Weise
improving our build times is a hot topic at the moment, so let's discuss
the different ways in which they could be reduced.
First up, let's look at some numbers:
1 full build currently consumes 5h of build time total ("total time"),
and in the ideal case
Thanks Kurt for checking that.
The problem mentioned with table-examples is that, when working on
FLINK-13558, I forgot to add a dependency on flink-examples-table to
flink-dist. So this module is not built if only flink-dist with its
dependencies is built (this happens in the release scripts:
Great, then I have no other comments on legal check.
On Thu, Aug 15, 2019 at 9:56 PM Chesnay Schepler wrote:
> The licensing items aren't a problem; we don't care about Flink modules
> in NOTICE files, and we don't have to update the source-release
> licensing since we don't have a
+1 for this.
On 15.08.19 at 15:57, JingsongLee wrote:
Hi Flink devs,
I would like to start the voting for FLIP-51 Rework of the Expression
regarding the LegacyTypeInformation esp. for decimals. I don't have a
clear answer yet, but I think it should not limit us. If possible it
should travel through the type inference and we only need some special
cases at some locations e.g. when computing the leastRestrictive. E.g.
On Thu, Aug 15, 2019 at 3:30 PM Fabian Hueske wrote:
> Congrats Andrey!
> On Thu., Aug 15, 2019 at 07:58, Gary Yao wrote:
> > Congratulations Andrey, well deserved!
> > Best,
> > Gary
> > On Thu, Aug 15, 2019 at 7:50 AM Bowen Li wrote:
> > >
Dawid Wysakowicz created FLINK-13737:
Summary: flink-dist should add provided dependency on
The licensing items aren't a problem; we don't care about Flink modules
in NOTICE files, and we don't have to update the source-release
licensing since we don't have a pre-built version of the WebUI in the
On 15/08/2019 15:22, Kurt Young wrote:
After going through the licenses, I
I remember an issue regarding the watermark fetch request from the WebUI
exceeding some HTTP size limit, since it tries to fetch all watermarks
at once, and the format of this request isn't exactly efficient.
Querying metrics for individual operators still works since the request
After going through the licenses, I found 2 suspicious items, but I'm not sure
whether they are valid.
1. flink-state-processing-api is packaged into the flink-dist jar, but is not in the
NOTICE-binary file (the one under the root directory) like other modules.
2. flink-runtime-web distributed some
Thanks for starting this discussion.
I'd like to also add my 2 cents:
+1 for #2, differential build scripts.
I've worked on this approach, and with it I think it's possible to reduce the
total build time with relatively low effort, without enforcing any new
build tool, and with low maintenance
Hi Flink devs,
I would like to start the voting for FLIP-51 Rework of the Expression
+1 for this feature. I think this will be appreciated by users, as a way to
use the HeapStateBackend with a safety-net against OOM errors.
And having had major production exposure is great.
From the implementation plan, it looks like this exists purely in a new
module and does not require any
I am trying the DDL feature in the 1.9-release branch. I am stuck creating a
table from Kafka with the nested JSON format. Is it possible to specify a "Row" type
for columns to derive the nested JSON schema?
String sql = "create table kafka_stream(\n" +
" a varchar, \n" +
Thanks a lot for the quick response!
I will consider the FLIP accepted and will start working on it.
On Thu, Aug 15, 2019 at 5:29 AM SHI Xiaogang wrote:
> Glad that programming with flink becomes simpler and easier.
> Aljoscha Krettek
Thanks Kostas for pushing this.
On Thu, 15 Aug 2019 at 16:03, Kostas Kloudas wrote:
> Thanks a lot for the quick response!
> I will consider the FLIP accepted and will start working on it.
> On Thu, Aug 15, 2019 at 5:29 AM SHI Xiaogang
Hi @Timo Walther @Dawid Wysakowicz:
Now, flink-planner has some legacy DataTypes,
such as the legacy decimal and the legacy basic array type info.
And if the new type inference infers a Decimal/VarChar with precision, it
will fail in TypeConversions.
The better we do on DataType, the more
Kurt Young created FLINK-13735:
Summary: Support session window with blink planner in batch mode
Kurt Young created FLINK-13736:
Summary: Support count window with blink planner in batch mode
Jepsen test suite passed 10 times consecutively
On Wed, Aug 14, 2019 at 5:31 PM Aljoscha Krettek
> I did some testing on a Google Cloud Dataproc cluster (it gives you a
> managed YARN and Google Cloud Storage (GCS)):
> - tried both YARN session mode and YARN
On Thu., Aug 15, 2019 at 07:58, Gary Yao wrote:
> Congratulations Andrey, well deserved!
> On Thu, Aug 15, 2019 at 7:50 AM Bowen Li wrote:
> > Congratulations Andrey!
> > On Wed, Aug 14, 2019 at 10:18 PM Rong Rong wrote:
> >> Congratulations
I'll add a chapter listing the blink planner's extended functions.
Send Time: Thu, Aug 15, 2019 05:12
Subject:Re: [DISCUSS] FLIP-51: Rework of the Expression Design
Thanks Jingsong for
I agree that this is a serious bug. However, I would not block the
release because of this. As you said, there is a workaround and the
`execute()` works in the most common case of a single execution. We can
fix this in a minor release shortly after.
What do others think?
With the same argument as before, given that it is mentioned in the release
announcement that it is a preview feature, I would not block this release
because of it.
Nevertheless, it would be important to mention this explicitly in the
release notes.
Till Rohrmann created FLINK-13733:
Summary: FlinkKafkaInternalProducerITCase.testHappyPath fails on
Many thanks for the great points!
For prioritizing inputs, from another point of view, I think it might
not cause other bad effects, since we do not need to totally block the channels
that have seen barriers after the operator has taken its snapshot. After the
snapshotting, if the
Thanks Chesnay for bringing up this discussion and sharing those thoughts
to speed up the building process.
I'd +1 for option 2 and 3.
We can benefit a lot from Option 2. Developing the table, connector,
library, and docs modules would result in far fewer tests (1/3 to 1/10s) for
PRs to those
Thanks for starting this discussion Chesnay. I think it has become obvious
to the Flink community that with the existing build setup we cannot really
deliver fast build times which are essential for fast iteration cycles and
high developer productivity. The reasons for this situation are manifold
We just found a serious bug in the blink planner:
when a user reuses the table environment instance and calls the `execute` method
multiple times for
different SQL, the later call will trigger the earlier ones to be
It's a serious bug
Tested in AWS EMR Yarn: 1 master and 4 worker nodes (m5.xlarge: 4 vCore, 16
EMR runs only on Java 8. Fine-grained recovery is enabled by default.
Modified E2E test scripts can be found here (asserting output):
Does "flink run -j jarpath ..." work for you?
If that jar is deployed to the same path on each worker machine, you can
try "flink run -C classpath ..." as well.
刘建刚 wrote on Thu, Aug 15, 2019 at 5:31 PM:
> We are using per-job to load udf jar when start job. Our jar file
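For reference, the two invocation styles mentioned above might look like the
following sketch. All paths and the jar name are hypothetical placeholders, and
note that -C expects a URL that is resolvable on every node:

```
# Submit the job jar directly (the jar is shipped to the cluster):
flink run -j /path/to/job.jar

# If the UDF jar is deployed at the same path on every worker machine,
# add it to the user classpath instead of bundling it:
flink run -C file:///opt/udfs/my-udf.jar /path/to/job.jar
```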
Thomas, thanks for confirming this. I have noticed that in 1.9 the
WebUI has been reworked a lot; does anyone know if this is still an
issue? I currently cannot easily try 1.9, so I cannot confirm or
On 8/14/19 6:25 PM, Thomas Weise wrote:
I have also noticed this
Jark Wu created FLINK-13734:
Summary: Support DDL in SQL CLI
Issue Type: New Feature
Hi Gordon & Timo,
Thanks for the feedback, and I agree with it. I will document this in the
On Thu, Aug 15, 2019 at 6:14 PM Tzu-Li (Gordon) Tai
> Hi Kurt,
> With the same argument as before, given that it is mentioned in the release
> announcement that it
Yes, Flink 1.9 supports deriving nested JSON. You should declare the ROW
type with the nested schema explicitly. I tested a similar DDL against 1.9.0
RC2 and it worked well.
CREATE TABLE kafka_json_source (
) WITH (
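A complete version of such a DDL might look like the sketch below. The column
names, topic, and the exact WITH property keys are hypothetical placeholders in
the style of the 1.9-era connector properties, not values taken from this
thread:

```sql
CREATE TABLE kafka_json_source (
  id VARCHAR,
  -- a nested JSON object maps to a ROW type with an explicit schema:
  payload ROW<a VARCHAR, b INT>
) WITH (
  'connector.type' = 'kafka',
  'connector.topic' = 'my_topic',
  'format.type' = 'json',
  'format.derive-schema' = 'true'
);
```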
We are using per-job mode to load a UDF jar when starting the job. Our jar file is
in another path, not Flink's lib path. In the main function, we use a
classLoader to load the jar file by the jar path. But it reports the
following error when the job starts running.
If the jar file is in lib,
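A minimal sketch of the dynamic-loading approach described above. The jar path
and the idea of loading through a child URLClassLoader are illustrative; the
actual job code from the thread is not shown, and "udf.jar" is a hypothetical
placeholder:

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class UdfJarLoader {

    // Builds a classloader that sees one extra jar in addition to
    // everything the current context classloader already sees.
    public static Class<?> loadFromJar(String jarPath, String className) throws Exception {
        URL jarUrl = new File(jarPath).toURI().toURL();
        URLClassLoader loader = new URLClassLoader(
                new URL[]{jarUrl},
                Thread.currentThread().getContextClassLoader());
        // loadClass delegates to the parent first, then searches the jar.
        return loader.loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        // A real job would pass the UDF class name here; java.lang.String
        // resolves via parent delegation, so the jar need not contain it.
        Class<?> c = loadFromJar("udf.jar", "java.lang.String");
        System.out.println(c.getName());
    }
}
```

Classes loaded this way are visible only to this child loader, which is one
reason putting the jar in lib/ (or shipping it with -C) can behave differently
on a cluster.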
Big +1 for this feature.
Our customers, including me, have met the dilemma where we have to use windows
to aggregate events in applications like real-time monitoring. The larger the
timer and window state, the poorer the performance of RocksDB. However, switching to
FsStateBackend would always
Till Rohrmann created FLINK-13740:
Summary: TableAggregateITCase.testNonkeyedFlatAggregate failed on
Sorry for the late response. So many FLIPs these days.
I am a bit unsure about the motivation here, and that this need to be a
part of Flink. It sounds like this can be perfectly built around Flink as a
minimal library on top of it, without any change in the core APIs or
The proposal to
+1 to start a VOTE for this FLIP.
Given the properties of this new state backend and that it will exist as a
new module without touching the original heap backend, I don't see a harm
in including this.
Regarding design of the feature, I've already mentioned my comments in the
Robert Metzger created FLINK-13739:
Summary: BinaryRowTest.testWriteString() fails in some environments
Thanks for all the test efforts, verifications and votes so far.
So far, things are looking good, but we still require one more PMC binding
vote for this RC to be the official release, so I would like to extend the
vote time for 1 more day, until *Aug. 16th 17:00 CET*.
In the meantime, the
Thanks all for the reviews and comments!
bq. From the implementation plan, it looks like this exists purely in a new
module and does not require any changes in other parts of Flink's code. Can
you confirm that?
On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon)
-1 for RC2.
I found a bug, https://issues.apache.org/jira/browse/FLINK-13741, and I
think it's a blocker. The bug means that currently, users who call
`tEnv.listUserDefinedFunctions()` in the Table API or `show functions;` through
SQL will not be able to see Flink's built-in functions.
I'm preparing a fix
Bowen Li created FLINK-13741:
Summary: FunctionCatalog.getUserDefinedFunctions() does not return
Flink built-in functions' names
Big +1 for this feature.
This FLIP can help improve at least the following two scenarios:
- Temporary data peaks when using the Heap StateBackend
- The Heap State Backend has better performance than RocksDBStateBackend,
especially on SATA disks. Some people have told me that they
Thanks for reporting this.
However, I don't think this is an issue. IMO, it is by design.
The `tEnv.listUserDefinedFunctions()` in Table API and `show functions;` in
SQL CLI are intended to return only the registered UDFs, not including
This is also the behavior in
Jark Wu created FLINK-13742:
Summary: Fix code generation when aggregation contains both
distinct aggregate with and without filter
Thanks Chesnay for starting this discussion.
+1 for #1, it might be the easiest way to get a significant speedup.
If the only reason is isolation, I think we can fix the static fields
or global state used in Flink, if possible.
+1 for #2, and thanks Aleksey for the prototype. I think it's a
Thanks for letting me know that it's been like this in previous releases.
Though I don't think that's the right behavior, it can be discussed for a
later release. Thus, I retract my -1 for RC2.
On Thu, Aug 15, 2019 at 7:49 PM Jark Wu wrote:
> Hi Bowen,
> Thanks for reporting