Review of pull request

2019-08-18 Thread Rishindra Kumar
Hi,

I created pull request with the change I proposed in the comment. Could
someone please review it?
https://github.com/apache/flink/pull/9468

-- 
*Maddila Rishindra Kumar*
*Software Engineer*
*Walmartlabs India*
*Contact No: +919967379528 | Alternate E-mail
ID: rishindra.madd...@walmartlabs.com *


[jira] [Created] (FLINK-13768) Update documentation regarding `path style access` for S3 filesystem implementations

2019-08-18 Thread Achyuth Narayan Samudrala (JIRA)
Achyuth Narayan Samudrala created FLINK-13768:
-

 Summary: Update documentation regarding `path style access` for S3 
filesystem implementations
 Key: FLINK-13768
 URL: https://issues.apache.org/jira/browse/FLINK-13768
 Project: Flink
  Issue Type: New Feature
  Components: Documentation
Reporter: Achyuth Narayan Samudrala


The documentation for the various properties that can be provided for the S3 
sink is not very informative. According to the code in 
flink-s3-fs-base/flink-s3-fs-hadoop, any property specified as `s3.<key>` is 
transformed to `fs.s3a.<key>`.

 

When interacting with S3-compatible file systems such as Ceph or MinIO, the 
default configuration properties might not be sufficient. One such property is 
`fs.s3a.path.style.access`, which enables different modes of access to S3 
buckets. If this property is not set, virtual-host-style addressing is used by 
default. The documentation should mention how this property can be 
passed as a Flink configuration property.
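As a hedged illustration of what the updated documentation could show — the exact keys should be verified against the S3 filesystem docs; `s3.path.style.access` and `s3.endpoint` are assumptions here, and per the `s3.<key>` to `fs.s3a.<key>` mapping described above they would be forwarded to the Hadoop S3A filesystem:

```yaml
# flink-conf.yaml (illustrative only): enable path-style access for an
# S3-compatible store such as Ceph or MinIO. Flink forwards s3.<key>
# properties to the Hadoop S3A filesystem as fs.s3a.<key>.
s3.endpoint: http://minio.example.com:9000
s3.path.style.access: true
```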



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [VOTE] Flink Project Bylaws

2019-08-18 Thread Thomas Weise
+0 (binding)

I don't think committers should be allowed to approve their own changes. I
would prefer if non-committer contributors can approve committer PRs as
that would encourage more participation in code review and ability to
contribute.


On Fri, Aug 16, 2019 at 9:02 PM Shaoxuan Wang  wrote:

> +1 (binding)
>
> On Fri, Aug 16, 2019 at 7:48 PM Chesnay Schepler 
> wrote:
>
> > +1 (binding)
> >
> > Although I think it would be a good idea to always cc
> > priv...@flink.apache.org when modifying bylaws, if anything to speed up
> > the voting process.
> >
> > On 16/08/2019 11:26, Ufuk Celebi wrote:
> > > +1 (binding)
> > >
> > > – Ufuk
> > >
> > >
> > > On Wed, Aug 14, 2019 at 4:50 AM Biao Liu  wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> Thanks for pushing this!
> > >>
> > >> Thanks,
> > >> Biao /'bɪ.aʊ/
> > >>
> > >>
> > >>
> > >> On Wed, 14 Aug 2019 at 09:37, Jark Wu  wrote:
> > >>
> > >>> +1 (non-binding)
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>> On Wed, 14 Aug 2019 at 09:22, Kurt Young  wrote:
> > >>>
> >  +1 (binding)
> > 
> >  Best,
> >  Kurt
> > 
> > 
> >  On Wed, Aug 14, 2019 at 1:34 AM Yun Tang  wrote:
> > 
> > > +1 (non-binding)
> > >
> > > But I have a minor question about "code change" action, for those
> > > "[hotfix]" github pull requests [1], the dev mailing list would not
> > >> be
> > > notified currently. I think we should change the description of
> this
> >  action.
> > >
> > > [1]
> > >
> > >>
> >
> https://flink.apache.org/contributing/contribute-code.html#code-contribution-process
> > > Best
> > > Yun Tang
> > > 
> > > From: JingsongLee 
> > > Sent: Tuesday, August 13, 2019 23:56
> > > To: dev 
> > > Subject: Re: [VOTE] Flink Project Bylaws
> > >
> > > +1 (non-binding)
> > > Thanks Becket.
> > > I've learned a lot from current bylaws.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > >
> > > --
> > > From:Yu Li 
> > > Send Time:2019年8月13日(星期二) 17:48
> > > To:dev 
> > > Subject:Re: [VOTE] Flink Project Bylaws
> > >
> > > +1 (non-binding)
> > >
> > > Thanks for the efforts Becket!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Tue, 13 Aug 2019 at 16:09, Xintong Song 
> >  wrote:
> > >> +1 (non-binding)
> > >>
> > >> Thank you~
> > >>
> > >> Xintong Song
> > >>
> > >>
> > >>
> > >> On Tue, Aug 13, 2019 at 1:48 PM Robert Metzger <
> > >> rmetz...@apache.org>
> > >> wrote:
> > >>
> > >>> +1 (binding)
> > >>>
> > >>> On Tue, Aug 13, 2019 at 1:47 PM Becket Qin  > > wrote:
> >  Thanks everyone for voting.
> > 
> >  For those who have already voted, just want to bring this up to
> >  your
> >  attention that there is a minor clarification to the bylaws
> > >> wiki
> >  this
> >  morning. The change is in bold format below:
> > 
> >  one +1 from a committer followed by a Lazy approval (not
> > >> counting
> >  the
> > >>> vote
> > > of the contributor), moving to lazy majority if a -1 is
> > >>> received.
> > 
> >  Note that this implies that committers can +1 their own commits
> > >>> and
> > >> merge
> > > right away. *However, the committers should use their best
> > >> judgement
> > >>> to
> > > respect the components' expertise and ongoing development
> > >> plan.*
> > 
> >  This addition does not really change anything the bylaws meant
> > >> to
> > > set.
> > >> It
> >  is simply a clarification. If anyone who has cast the vote
> > > objects,
> >  please feel free to withdraw the vote.
> > 
> >  Thanks,
> > 
> >  Jiangjie (Becket) Qin
> > 
> > 
> >  On Tue, Aug 13, 2019 at 1:29 PM Piotr Nowojski <
> >  pi...@ververica.com>
> >  wrote:
> > 
> > > +1
> > >
> > >> On 13 Aug 2019, at 13:22, vino yang  > > wrote:
> > >> +1
> > >>
> > >> Tzu-Li (Gordon) Tai  于2019年8月13日周二
> > > 下午6:32写道:
> > >>> +1
> > >>>
> > >>> On Tue, Aug 13, 2019, 12:31 PM Hequn Cheng <
> > > chenghe...@gmail.com>
> > > wrote:
> >  +1 (non-binding)
> > 
> >  Thanks a lot for driving this! Good job. @Becket Qin <
> > >>> becket@gmail.com
> >  Best, Hequn
> > 
> >  On Tue, Aug 13, 2019 at 6:26 PM Stephan Ewen <
> >  se...@apache.org
> >  wrote:
> > > +1
> > >
> > > On Tue, Aug 13, 2019 at 12:22 PM Maximilian Michels <
> > >>> m...@apache.org
> > > wrote:
> > >
> > 

Cwiki edit access

2019-08-18 Thread Thomas Weise
Hi,

I would like to be able to edit pages in the Confluence Flink space. Can
someone give me access please?

Thanks


Re: [DISCUSS] FLIP-39: Flink ML pipeline and ML libs

2019-08-18 Thread Shuiqiang Chen
Hi Robert,

Thank you for your reminding! I have added the wiki page[1] for this FLIP.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs

Robert Metzger  于2019年8月14日周三 下午5:56写道:

> It seems that this FLIP doesn't have a Wiki page yet [1], even though it is
> already partially implemented [2]
> We should try to stick more to the FLIP process to manage the project more
> efficiently.
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> [2] https://issues.apache.org/jira/browse/FLINK-12470
>
> On Mon, Jun 17, 2019 at 12:27 PM Gen Luo  wrote:
>
> > Hi all,
> >
> > In the review of PR for FLINK-12473, there were a few comments regarding
> > pipeline exportation. We would like to start a follow up discussions to
> > address some related comments.
> >
> > Currently, FLIP-39 proposal gives a way for users to persist a pipeline
> in
> > JSON format. But it does not specify how users can export a pipeline for
> > serving purpose. We summarized some thoughts on this in the following
> doc.
> >
> >
> >
> https://docs.google.com/document/d/1B84b-1CvOXtwWQ6_tQyiaHwnSeiRqh-V96Or8uHqCp8/edit?usp=sharing
> >
> > After we reach consensus on the pipeline exportation, we will add a
> > corresponding section in FLIP-39.
> >
> >
> > Shaoxuan Wang  于2019年6月5日周三 上午8:47写道:
> >
> > > Stavros,
> > > They have the similar logic concept, but the implementation details are
> > > quite different. It is hard to migrate the interface with different
> > > implementations. The built-in algorithms are useful legacy that we will
> > > consider migrating to the new API (but still with different
> > implementations).
> > > BTW, the new API has already been merged via FLINK-12473.
> > >
> > > Thanks,
> > > Shaoxuan
> > >
> > >
> > >
> > > On Mon, Jun 3, 2019 at 6:08 PM Stavros Kontopoulos <
> > > st.kontopou...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Some portion of the code could be migrated to the new Table API no?
> > > > I am saying that because the new API design is based on scikit-learn
> > and
> > > > the old one was also inspired by it.
> > > >
> > > > Best,
> > > > Stavros
> > > > On Wed, May 22, 2019 at 1:24 PM Shaoxuan Wang 
> > > wrote:
> > > >
> > > > > Another consensus (from the offline discussion) is that we will
> > > > > delete/deprecate flink-libraries/flink-ml. I have started a survey
> > and
> > > > > discussion [1] in dev/user-ml to collect the feedback. Depending on
> > the
> > > > > replies, we will decide if we shall delete it in Flink1.9 or
> > > > > deprecate & delete in the next release after 1.9.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Usage-of-flink-ml-and-DISCUSS-Delete-flink-ml-td29057.html
> > > > >
> > > > > Regards,
> > > > > Shaoxuan
> > > > >
> > > > >
> > > > > On Tue, May 21, 2019 at 9:22 PM Gen Luo 
> wrote:
> > > > >
> > > > > > Yes, this is our conclusion. I'd like to add only one point that
> > > > > > registering user defined aggregator is also needed which is
> > currently
> > > > > > provided by 'bridge' and finally will be merged into Table API.
> > It's
> > > > same
> > > > > > with collect().
> > > > > >
> > > > > > I will add a TableEnvironment argument in Estimator.fit() and
> > > > > > Transformer.transform() to get rid of the dependency on
> > > > > > flink-table-planner. This will be committed soon.
> > > > > >
> > > > > > Aljoscha Krettek  于2019年5月21日周二 下午7:31写道:
> > > > > >
> > > > > > > We discussed this in private and came to the conclusion that we
> > > > should
> > > > > > > (for now) have the dependency on flink-table-api-xxx-bridge
> > because
> > > > we
> > > > > > need
> > > > > > > access to the collect() method, which is not yet available in
> the
> > > > Table
> > > > > > > API. Once that is available the code can be refactored but for
> > now
> > > we
> > > > > > want
> > > > > > > to unblock work on this new module.
> > > > > > >
> > > > > > > We also agreed that we don’t need a direct dependency on
> > > > > > > flink-table-planner.
> > > > > > >
> > > > > > > I hope I summarised our discussion correctly.
> > > > > > >
> > > > > > > > On 17. May 2019, at 12:20, Gen Luo 
> > wrote:
> > > > > > > >
> > > > > > > > Thanks for your reply.
> > > > > > > >
> > > > > > > > For the first question, it's not strictly necessary. But I
> > perfer
> > > > not
> > > > > > to
> > > > > > > > have a TableEnvironment argument in Estimator.fit() or
> > > > > > > > Transformer.transform(), which is not part of machine
> learning
> > > > > concept,
> > > > > > > and
> > > > > > > > may make our API not as clean and pretty as other systems
> do. I
> > > > would
> > > > > > > like
> > > > > > > > another way other than introducing flink-table-planner to do
> > > this.
> > > > If
> > > > > > > it's
> > > > > > > > impossible or severely opposed, I may make the concession to
> > add
> > > > the
> > > > > > > > a
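The interface change discussed in this thread — passing a TableEnvironment explicitly into Estimator.fit() and Transformer.transform() to avoid depending on flink-table-planner — can be sketched roughly as follows. These are hypothetical, simplified shapes for illustration, not the actual FLIP-39 API:

```java
// Hypothetical, simplified sketch of the FLIP-39 interfaces after the change
// discussed above: fit()/transform() take a TableEnvironment explicitly, so
// the library never has to discover one internally via the planner.
public class Main {
    interface TableEnvironment { } // stand-in for Flink's TableEnvironment
    interface Table { }            // stand-in for Flink's Table

    interface Transformer {
        Table transform(TableEnvironment tEnv, Table input);
    }

    interface Estimator<M extends Transformer> {
        // The TableEnvironment is passed in rather than resolved internally.
        M fit(TableEnvironment tEnv, Table input);
    }

    // A trivial estimator whose fitted "model" passes the table through.
    static class IdentityEstimator implements Estimator<Transformer> {
        public Transformer fit(TableEnvironment tEnv, Table input) {
            return (env, t) -> t;
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv = new TableEnvironment() { };
        Table data = new Table() { };
        Transformer model = new IdentityEstimator().fit(tEnv, data);
        System.out.println(model.transform(tEnv, data) == data); // prints true
    }
}
```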

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-18 Thread Zili Chen
We should investigate the performance regression, but regardless of the
regression I vote +1

I have verified the following:

- Jobs running on YARN (Session & Per Job) with high-availability enabled.
- Simulate JM and TM failures.
- Simulate temporary network partition.

Best,
tison.


Stephan Ewen  于2019年8月18日周日 下午10:12写道:

> For reference, this is the JIRA issue about the regression in question:
>
> https://issues.apache.org/jira/browse/FLINK-13752
>
>
> On Fri, Aug 16, 2019 at 10:57 AM Guowei Ma  wrote:
>
> > Hi, till
> > I can send the job to you offline.
> > It is just a datastream job and does not use
> TwoInputSelectableStreamTask.
> > A->B
> >     \
> >      C
> >     /
> > D->E
> > Best,
> > Guowei
> >
> >
> > Till Rohrmann  于2019年8月16日周五 下午4:34写道:
> >
> > > Thanks for reporting this issue Guowei. Could you share a bit more
> > details
> > > what the job exactly does and which operators it uses? Does the job
> uses
> > > the new `TwoInputSelectableStreamTask` which might cause the
> performance
> > > regression?
> > >
> > > I think it is important to understand where the problem comes from
> before
> > > we proceed with the release.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Fri, Aug 16, 2019 at 10:27 AM Guowei Ma 
> wrote:
> > >
> > > > Hi,
> > > > -1
> > > > We have a benchmark job, which includes a two-input operator.
> > > > This job has a big performance regression using 1.9 compared to 1.8.
> > > > It's still not very clear why this regression happens.
> > > >
> > > > Best,
> > > > Guowei
> > > >
> > > >
> > > > Yu Li  于2019年8月16日周五 下午3:27写道:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - checked release notes: OK
> > > > > - checked sums and signatures: OK
> > > > > - source release
> > > > >  - contains no binaries: OK
> > > > >  - contains no 1.9-SNAPSHOT references: OK
> > > > >  - build from source: OK (8u102)
> > > > >  - mvn clean verify: OK (8u102)
> > > > > - binary release
> > > > >  - no examples appear to be missing
> > > > >  - started a cluster; WebUI reachable, example ran successfully
> > > > > - repository appears to contain all expected artifacts
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Fri, 16 Aug 2019 at 06:06, Bowen Li 
> wrote:
> > > > >
> > > > > > Hi Jark,
> > > > > >
> > > > > > Thanks for letting me know that it's been like this in previous
> > > > releases.
> > > > > > Though I don't think that's the right behavior, it can be
> discussed
> > > for
> > > > > > later release. Thus I retract my -1 for RC2.
> > > > > >
> > > > > > Bowen
> > > > > >
> > > > > >
> > > > > > On Thu, Aug 15, 2019 at 7:49 PM Jark Wu 
> wrote:
> > > > > >
> > > > > > > Hi Bowen,
> > > > > > >
> > > > > > > Thanks for reporting this.
> > > > > > > However, I don't think this is an issue. IMO, it is by design.
> > > > > > > The `tEnv.listUserDefinedFunctions()` in Table API and `show
> > > > > functions;`
> > > > > > in
> > > > > > > SQL CLI are intended to return only the registered UDFs, not
> > > > including
> > > > > > > built-in functions.
> > > > > > > This is also the behavior in previous versions.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > On Fri, 16 Aug 2019 at 06:52, Bowen Li 
> > > wrote:
> > > > > > >
> > > > > > > > -1 for RC2.
> > > > > > > >
> > > > > > > > I found a bug
> > https://issues.apache.org/jira/browse/FLINK-13741,
> > > > > and I
> > > > > > > > think it's a blocker.  The bug means currently if users call
> > > > > > > > `tEnv.listUserDefinedFunctions()` in Table API or `show
> > > functions;`
> > > > > > thru
> > > > > > > > SQL would not be able to see Flink's built-in functions.
> > > > > > > >
> > > > > > > > I'm preparing a fix right now.
> > > > > > > >
> > > > > > > > Bowen
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Aug 15, 2019 at 8:55 AM Tzu-Li (Gordon) Tai <
> > > > > > tzuli...@apache.org
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for all the test efforts, verifications and votes so
> > > far.
> > > > > > > > >
> > > > > > > > > So far, things are looking good, but we still require one
> > more
> > > > PMC
> > > > > > > > binding
> > > > > > > > > vote for this RC to be the official release, so I would
> like
> > to
> > > > > > extend
> > > > > > > > the
> > > > > > > > > vote time for 1 more day, until *Aug. 16th 17:00 CET*.
> > > > > > > > >
> > > > > > > > > In the meantime, the release notes for 1.9.0 had only just
> > been
> > > > > > > finalized
> > > > > > > > > [1], and could use a few more eyes before closing the vote.
> > > > > > > > > Any help with checking if anything else should be mentioned
> > > there
> > > > > > > > regarding
> > > > > > > > > breaking changes / known shortcomings would be appreciated.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Gordon
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/flink/pull/9438
> > > > > > > >
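The distinction debated above — `listUserDefinedFunctions()` returning only registered UDFs versus a listing that also includes built-in functions — can be sketched as follows. The catalog class and function names here are invented for illustration and do not reflect Flink's actual FunctionCatalog API:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class Main {
    // Hypothetical fixed set of built-in functions for the sketch.
    static final Set<String> BUILT_IN = Set.of("concat", "substring", "abs");

    static class FunctionCatalog {
        private final Set<String> userDefined = new HashSet<>();

        void registerFunction(String name) { userDefined.add(name); }

        // Behavior Jark describes as by-design: only registered UDFs.
        Set<String> listUserDefinedFunctions() {
            return new TreeSet<>(userDefined);
        }

        // Behavior Bowen expected: union of UDFs and built-in functions.
        Set<String> listFunctions() {
            Set<String> all = new TreeSet<>(BUILT_IN);
            all.addAll(userDefined);
            return all;
        }
    }

    public static void main(String[] args) {
        FunctionCatalog catalog = new FunctionCatalog();
        catalog.registerFunction("myUdf");
        System.out.println(catalog.listFunctions()); // prints [abs, concat, myUdf, substring]
    }
}
```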

[jira] [Created] (FLINK-13767) Migrate isFinished method from AvailabilityListener to AsyncDataInput

2019-08-18 Thread zhijiang (JIRA)
zhijiang created FLINK-13767:


 Summary: Migrate isFinished method from AvailabilityListener to 
AsyncDataInput
 Key: FLINK-13767
 URL: https://issues.apache.org/jira/browse/FLINK-13767
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network, Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


AvailabilityListener is used in both AsyncDataInput and StreamTaskInput. We 
already introduced InputStatus for StreamTaskInput#emitNext, and 
InputStatus#END_OF_INPUT has the same semantics as 
AvailabilityListener#isFinished.

But in the case of AsyncDataInput, which is mainly used by the InputGate layer, 
the isFinished() method is still needed at the moment. So we migrate this 
method from AvailabilityListener to AsyncDataInput, and refactor the 
StreamInputProcessor implementations to use InputStatus to determine the 
finished state.
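A hedged sketch of the refactoring described above, using hypothetical, heavily simplified interfaces (names mirror the Flink classes, but the signatures are illustrative only): isFinished() moves to the gate-style AsyncDataInput, while stream task inputs signal completion via InputStatus#END_OF_INPUT.

```java
import java.util.concurrent.CompletableFuture;

public class Main {
    enum InputStatus { MORE_AVAILABLE, END_OF_INPUT }

    // Base interface keeps only the availability notification.
    interface AvailabilityListener {
        CompletableFuture<?> getAvailableFuture();
    }

    // isFinished() now lives only here (the InputGate layer still needs it).
    interface AsyncDataInput extends AvailabilityListener {
        boolean isFinished();
    }

    // Task inputs judge "finished" purely via the returned InputStatus.
    interface StreamTaskInput extends AvailabilityListener {
        InputStatus emitNext();
    }

    // Helper producing an input that emits n records and then ends.
    static StreamTaskInput inputWithRecords(int n) {
        return new StreamTaskInput() {
            int remaining = n;
            public CompletableFuture<?> getAvailableFuture() {
                return CompletableFuture.completedFuture(null);
            }
            public InputStatus emitNext() {
                return remaining-- > 0 ? InputStatus.MORE_AVAILABLE
                                       : InputStatus.END_OF_INPUT;
            }
        };
    }

    // A processor loop in the style of StreamInputProcessor after the change.
    static int process(StreamTaskInput input) {
        int records = 0;
        while (input.emitNext() != InputStatus.END_OF_INPUT) {
            records++;
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(process(inputWithRecords(3))); // prints 3
    }
}
```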





[jira] [Created] (FLINK-13766) Refactor the implementation of StreamInputProcessor based on StreamTaskInput#emitNext

2019-08-18 Thread zhijiang (JIRA)
zhijiang created FLINK-13766:


 Summary: Refactor the implementation of StreamInputProcessor based 
on StreamTaskInput#emitNext
 Key: FLINK-13766
 URL: https://issues.apache.org/jira/browse/FLINK-13766
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


The current processing in the task input processor is pull-based via pollNext. 
In order to unify the processing model with the new source operator, we 
introduce the new StreamTaskInput#emitNext(Output) to replace the current 
pollNext. We then need to adjust the existing implementations of 
StreamOneInputProcessor/StreamTwoInputSelectableProcessor to the new emit-based 
approach.

This allows all task inputs (network/source) to be integrated into a unified 
processing path on the runtime side.
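A hedged sketch contrasting the pull-based pollNext style with the push-based emitNext(Output) style described above. The interfaces are hypothetical and heavily simplified; they only illustrate the shape of the refactoring, not Flink's actual runtime classes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

public class Main {
    // Pull style (old): the processor polls records out of the input.
    interface PollingInput { Optional<String> pollNext(); }

    // Push style (new): the input emits into an Output; the return value
    // tells the processor whether more data may follow.
    interface Output { void collect(String record); }
    enum InputStatus { MORE_AVAILABLE, END_OF_INPUT }
    interface EmittingInput { InputStatus emitNext(Output output); }

    static PollingInput fromList(List<String> records) {
        Iterator<String> it = records.iterator();
        return () -> it.hasNext() ? Optional.of(it.next()) : Optional.empty();
    }

    // A polling input can be wrapped into the emit style, which is roughly
    // what unifying network and source inputs at runtime requires.
    static EmittingInput adapt(PollingInput polling) {
        return output -> {
            Optional<String> record = polling.pollNext();
            if (record.isEmpty()) {
                return InputStatus.END_OF_INPUT;
            }
            output.collect(record.get());
            return InputStatus.MORE_AVAILABLE;
        };
    }

    // Processor loop driven purely by the emitNext return value.
    static List<String> drain(EmittingInput input) {
        List<String> out = new ArrayList<>();
        while (input.emitNext(out::add) != InputStatus.END_OF_INPUT) { }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(adapt(fromList(List.of("a", "b"))))); // prints [a, b]
    }
}
```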





[jira] [Created] (FLINK-13765) Introduce the InputSelectionHandler for selecting next input in StreamTwoInputSelectableProcessor

2019-08-18 Thread zhijiang (JIRA)
zhijiang created FLINK-13765:


 Summary: Introduce the InputSelectionHandler for selecting next 
input in StreamTwoInputSelectableProcessor
 Key: FLINK-13765
 URL: https://issues.apache.org/jira/browse/FLINK-13765
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


In StreamTwoInputSelectableProcessor there are three fields {InputSelectable, 
InputSelection, availableInputsMask} that are used together to select the next 
available input index. This brings two problems:
 * From a design perspective, these fields should be abstracted into a separate 
component and passed into StreamTwoInputSelectableProcessor.
 * inputSelector.nextSelection() is called while processing elements in 
StreamTwoInputSelectableProcessor, so it blocks the later integration of task 
input/output for both StreamOneInputProcessor and 
StreamTwoInputSelectableProcessor.
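The proposed component might look roughly like the following hedged sketch: the operator's selection and the availability mask of the two inputs are bundled into one handler that yields the next input index. All names and the bitmask encoding are assumptions for illustration, not Flink's actual implementation:

```java
public class Main {
    // Hypothetical stand-in for Flink's InputSelectable: the operator
    // returns a bitmask of the inputs it wants to read next.
    interface InputSelectable { long nextSelection(); }

    // Sketch of an InputSelectionHandler: combines the operator's selection
    // with the mask of currently available inputs and yields the next input
    // index to process (1 or 2), or -1 if no selected input is available.
    static class InputSelectionHandler {
        private final InputSelectable selectable;
        private long availableInputsMask = 0b11; // both inputs available

        InputSelectionHandler(InputSelectable selectable) {
            this.selectable = selectable;
        }

        void setUnavailable(int inputIndex) {
            availableInputsMask &= ~(1L << (inputIndex - 1));
        }

        int selectNextInputIndex() {
            long candidates = selectable.nextSelection() & availableInputsMask;
            if (candidates == 0) {
                return -1;
            }
            return Long.numberOfTrailingZeros(candidates) + 1;
        }
    }

    public static void main(String[] args) {
        // An operator that always wants the second input.
        InputSelectionHandler handler = new InputSelectionHandler(() -> 0b10);
        System.out.println(handler.selectNextInputIndex()); // prints 2
        handler.setUnavailable(2);
        System.out.println(handler.selectNextInputIndex()); // prints -1
    }
}
```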





Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-18 Thread Stephan Ewen
A "List Type" sounds like a good direction to me.

The comment on the type system was a bit brief, I agree. The idea is to see
if something like that can ease validation. Especially the correlation
system seems quite complex (proxies to work around order of initialization).

For example, let's assume we don't think primarily about "java types" but
would define types as one of the following (just examples, haven't thought
all the details through):

  (a) category type: implies string, and a fixed set of possible values.
Those would be parsed and naturally make it into the docs and validation.
Maps to a String or Enum in Java.

  (b) numeric integer type: implies long (or optionally integer, if we want
to automatically check overflow / underflow). Would take typical domain
validators, like non-negative, etc.

  (c) numeric real type: same as above (double or float)

  (d) numeric interval type: either defined as an interval, or references
other parameter by key. validation by valid interval.

  (e) quantity: a measure and a unit. separately parsable. The measure's
type could be any of the numeric types above, with same validation rules.

With a system like the above, would we still need correlation validators? Are
there still cases that we need to catch early (at config loading), or are the
remaining cases sufficiently rare and runtime- or setup-specific that it is
fine to handle them in component initialization?
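To make the validation idea concrete, here is a hedged sketch of categories (a), (b) and (e) from the list above. All names are invented for illustration and are not proposed API:

```java
import java.util.Set;

public class Main {
    // (a) category type: a string restricted to a fixed set of values.
    static String validateCategory(String value, Set<String> allowed) {
        if (!allowed.contains(value)) {
            throw new IllegalArgumentException(
                "expected one of " + allowed + ", got: " + value);
        }
        return value;
    }

    // (b) numeric integer type with a non-negative domain validator.
    static long validateNonNegative(String raw) {
        long value = Long.parseLong(raw.trim());
        if (value < 0) {
            throw new IllegalArgumentException("must be non-negative: " + value);
        }
        return value;
    }

    // (e) quantity: a measure and a unit, parsed separately; the measure
    // reuses the numeric validation above.
    static long parseMemoryBytes(String raw) {
        String s = raw.trim().toLowerCase();
        long factor = 1;
        if (s.endsWith("kb")) { factor = 1024L; s = s.substring(0, s.length() - 2); }
        else if (s.endsWith("mb")) { factor = 1024L * 1024L; s = s.substring(0, s.length() - 2); }
        return validateNonNegative(s) * factor;
    }

    public static void main(String[] args) {
        System.out.println(validateCategory("exactly-once",
                Set.of("exactly-once", "at-least-once"))); // prints exactly-once
        System.out.println(parseMemoryBytes("64mb")); // prints 67108864
    }
}
```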


On Sun, Aug 18, 2019 at 6:36 PM Dawid Wysakowicz 
wrote:

> Hi Stephan,
>
> Thank you for your opinion.
>
> Actually, list/composite types are the topic we spent most of the
> time on. I understand that, from the perspective of a full-blown type
> system, a field like isList may look weird. Let me elaborate a bit more
> on the reason behind it; maybe we weren't clear enough about it
> in the FLIP. The key feature of all the config options is that they must
> have a string representation, as they might come from a configuration
> file. Moreover, it must be a human-readable format so that the values
> can be adjusted manually. With that in mind, we did not want to add
> support for arbitrary nesting, and we decided to allow only lists
> (and flat objects - though I think there is a mistake in the current
> design around the Configurable interface). Still, I think you have a
> point here, and it would be better to have a ListConfigOption instead of
> this field. Does that make sense to you?
>
> As for the second part of your message, I am not sure I understood
> it. The validators work with parsed/deserialized values from
> Configuration, which means they can be bound to the generic parameter of
> Configuration. You can have a RangeValidator<T extends
> Comparable/Number>. I don't think the type hierarchy in the ConfigOption
> has anything to do with the validation logic. Could you elaborate a bit
> more on what you meant?
>
> Best,
>
> Dawid
>
> On 18/08/2019 16:42, Stephan Ewen wrote:
> > I like the idea of enhancing the configuration and to do early
> validation.
> >
> > I feel that some of the ideas in the FLIP seem a bit ad hoc, though. For
> > example, having a boolean "isList" is a clear indication of not having
> > thought through the type/category system.
> > Also, having a more clear category system makes validation simpler.
> >
> > For example, I have seen systems distinguishing between numeric
> parameters
> > (valid ranges), category parameters (set of possible values), quantities
> > like duration and memory size (need measure and unit), which results in
> an
> > elegant system for validation.
> >
> >
> > On Fri, Aug 16, 2019 at 5:22 PM JingsongLee  .invalid>
> > wrote:
> >
> >> +1 to this, thanks Timo and Dawid for the design.
> >> This allows the currently cluttered configuration of various
> >>  modules to be unified.
> >> This is also first step of one of the keys to making new unified
> >> TableEnvironment available for production.
> >>
> >> Previously, we did encounter complex configurations, such as
> >> specifying the skewed values of column in DDL. The skew may
> >>  be a single field or a combination of multiple fields. So the
> >>  configuration is very troublesome. We used JSON string to
> >>  configure it.
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >>
> >>
> >> --
> >> From:Jark Wu 
> >> Send Time:2019年8月16日(星期五) 16:44
> >> To:dev 
> >> Subject:Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration
> >>
> >> Thanks for starting this design Timo and Dawid,
> >>
> >> Improving ConfigOption has been hovering in my mind for a long time.
> >> We have seen the benefit when developing blink configurations and
> connector
> >> properties in 1.9 release.
> >> Thanks for bringing it up and make such a detailed design.
> >> I will leave my thoughts and comments there.
> >>
> >> Cheers,
> >> Jark
> >>
> >>
> >> On Fri, 16 Aug 2019 at 22:30, Zili Chen  wrote:
> >>
> >>> Hi Timo,
> >>>
> >>> It looks interesting. Thanks for p

[jira] [Created] (FLINK-13764) Pass the counter of numRecordsIn into the constructors of StreamOne/TwoInputProcessor

2019-08-18 Thread zhijiang (JIRA)
zhijiang created FLINK-13764:


 Summary: Pass the counter of numRecordsIn into the constructors of 
StreamOne/TwoInputProcessor
 Key: FLINK-13764
 URL: https://issues.apache.org/jira/browse/FLINK-13764
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


Currently the numRecordsIn counter is set up while processing input in the 
processor. In order to integrate the processing logic based on 
StreamTaskInput#emitNext(Output) later, we need to pass the counter into the 
output functions.

This refactoring is a precondition for the following work, and it brings 
additional benefits: we can make the counter a final field in 
StreamInputProcessor, and we can reuse the counter setup logic for both 
StreamOne/TwoInputProcessors.

There should be no side effects from setting up the counter a bit earlier than 
before.
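A hedged sketch of the constructor-injection change described above, with a minimal stand-in for a metrics Counter (the real Flink classes have richer signatures; names here are illustrative):

```java
public class Main {
    // Minimal stand-in for a metrics Counter.
    static class Counter {
        private long count;
        void inc() { count++; }
        long getCount() { return count; }
    }

    // Sketch: the counter is handed in at construction time, so it can be a
    // final field and later be shared with whatever output function wraps it.
    static class StreamOneInputProcessor {
        private final Counter numRecordsIn;

        StreamOneInputProcessor(Counter numRecordsIn) {
            this.numRecordsIn = numRecordsIn;
        }

        void processRecord(String record) {
            numRecordsIn.inc();
            // ... emit the record downstream ...
        }
    }

    public static void main(String[] args) {
        Counter numRecordsIn = new Counter();
        StreamOneInputProcessor processor = new StreamOneInputProcessor(numRecordsIn);
        processor.processRecord("a");
        processor.processRecord("b");
        System.out.println(numRecordsIn.getCount()); // prints 2
    }
}
```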





[jira] [Created] (FLINK-13763) Master build is broken because of wrong Maven version

2019-08-18 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-13763:
-

 Summary: Master build is broken because of wrong Maven version
 Key: FLINK-13763
 URL: https://issues.apache.org/jira/browse/FLINK-13763
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.10.0
Reporter: Till Rohrmann
 Fix For: 1.10.0


Currently, all master builds fail on Travis because Maven {{3.6.0}} is being 
used instead of Maven {{3.2.5}} (FLINK-3158). Strangely, this only seems to 
happen for the master branch.

{code}
/home/travis/maven_cache/apache-maven-3.2.5
/home/travis/maven_cache/apache-maven-3.2.5/bin:/home/travis/.rvm/gems/ruby-2.5.3/bin:/home/travis/.rvm/gems/ruby-2.5.3@global/bin:/home/travis/.rvm/rubies/ruby-2.5.3/bin:/home/travis/.rvm/bin:/usr/lib/jvm/java-1.8.0-openjdk-amd64/bin:/home/travis/bin:/home/travis/.local/bin:/usr/local/lib/jvm/openjdk11/bin:/opt/pyenv/shims:/home/travis/.phpenv/shims:/home/travis/perl5/perlbrew/bin:/home/travis/.nvm/versions/node/v8.12.0/bin:/home/travis/gopath/bin:/home/travis/.gimme/versions/go1.11.1.linux.amd64/bin:/usr/local/maven-3.6.0/bin:/usr/local/cmake-3.12.4/bin:/usr/local/clang-7.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/home/travis/.phpenv/bin:/opt/pyenv/bin:/home/travis/.yarn/bin
-Dorg.slf4j.simpleLogger.showDateTime=true 
-Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss.SSS
Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 
2018-10-24T18:41:47Z)
Maven home: /usr/local/maven-3.6.0
{code}

https://api.travis-ci.org/v3/job/573429209/log.txt
https://api.travis-ci.org/v3/job/573427149/log.txt
https://api.travis-ci.org/v3/job/573405515/log.txt





Re: [DISCUSS] Reducing build times

2019-08-18 Thread Robert Metzger
Hi all,

I wanted to understand the impact of the hardware we are using for running
our tests. Each Travis worker has 2 virtual cores and 7.5 GB of memory [1].
They are using Google Cloud Compute Engine *n1-standard-2* instances.
Running a full "mvn clean verify" takes *03:32 h* on such a machine type.

Running the same workload on a machine with 32 virtual cores and 64 GB of
memory takes *1:21 h*.

What is interesting are the per-module build time differences.
Modules which are parallelizing tests well greatly benefit from the
additional cores:
"flink-tests" 36:51 min vs 4:33 min
"flink-runtime" 23:41 min vs 3:47 min
"flink-table-planner" 15:54 min vs 3:13 min

On the other hand, we have modules which are not parallel at all:
"flink-connector-kafka": 16:32 min vs 15:19 min
"flink-connector-kafka-0.11": 9:52 min vs 7:46 min
Also, the checkstyle plugin is not scaling at all.

Chesnay reported some significant speedups by reusing forks.
I don't know how much effort it would be to make the Kafka tests
parallelizable. In total, they currently use 30 minutes on the big machine
(while 31 CPUs are idling :) )

Let me know what you think about these results. If the community is
generally interested in further investigating into that direction, I could
look into software to orchestrate this, as well as sponsors for such an
infrastructure.

[1] https://docs.travis-ci.com/user/reference/overview/
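For reference, the fork reuse and per-module test parallelism discussed here are Maven Surefire settings. A sketch of the relevant configuration — values illustrative, not Flink's actual pom:

```xml
<!-- pom.xml sketch: spawn one test JVM per available CPU core and reuse
     each fork across test classes to save JVM startup time. Values are
     illustrative, not Flink's actual configuration. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <forkCount>1C</forkCount>      <!-- "1C" = one fork per core -->
    <reuseForks>true</reuseForks>  <!-- keep forks alive between classes -->
  </configuration>
</plugin>
```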


On Fri, Aug 16, 2019 at 3:27 PM Chesnay Schepler  wrote:

> @Aljoscha Shading takes a few minutes for a full build; you can see this
> quite easily by looking at the compile step in the misc profile; all
> modules that take longer than a fraction of a second are usually caused
> by shading lots of classes. Note that I cannot tell you how much of this
> is spent on relocations, and how much on writing the jar.
>
> Personally, I'd very much like us to move all shading to flink-shaded;
> this would finally allows us to use newer maven versions without needing
> cumbersome workarounds for flink-dist. However, this isn't a trivial
> affair in some cases; IIRC calcite could be difficult to handle.
>
> On another note, this would also simplify switching the main repo to
> another build system, since you would no longer had to deal with
> relocations, just packaging + merging NOTICE files.
>
> @BowenLi I disagree, flink-shaded does not include any tests,  API
> compatibility checks, checkstyle, layered shading (e.g., flink-runtime
> and flink-dist, where both relocate dependencies and one is bundled by
> the other), and, most importantly, CI (and really, without CI being
> covered in a PoC there's nothing to discuss).
>
> On 16/08/2019 15:13, Aljoscha Krettek wrote:
> > Speaking of flink-shaded, do we have any idea what the impact of shading
> is on the build time? We could get rid of shading completely in the Flink
> main repository by moving everything that we shade to flink-shaded.
> >
> > Aljoscha
> >
> >> On 16. Aug 2019, at 14:58, Bowen Li  wrote:
> >>
> >> +1 to Till's points on #2 and #5, especially the potential
> non-disruptive,
> >> gradual migration approach if we decide to go that route.
> >>
> >> To add on, I want to point it out that we can actually start with
> >> flink-shaded project [1] which is a perfect candidate for PoC. It's of
> much
> >> smaller size, totally isolated from and not interfering with the flink
> project
> >> [2], and it actually covers most of our practical feature requirements
> for
> >> a build tool - all making it an ideal experimental field.
> >>
> >> [1] https://github.com/apache/flink-shaded
> >> [2] https://github.com/apache/flink
> >>
> >>
> >> On Fri, Aug 16, 2019 at 4:52 AM Till Rohrmann 
> wrote:
> >>
> >>> For the sake of keeping the discussion focused and not cluttering the
> >>> discussion thread I would suggest to split the detailed reporting for
> >>> reusing JVMs to a separate thread and cross linking it from here.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Fri, Aug 16, 2019 at 1:36 PM Chesnay Schepler 
> >>> wrote:
> >>>
>  Update:
> 
>  TL;DR: table-planner is a good candidate for enabling fork reuse right
>  away, while flink-tests has the potential for huge savings, but we
> have
>  to figure out some issues first.
> 
> 
>  Build link: https://travis-ci.org/zentol/flink/builds/572659220
> 
>  4/8 profiles failed.
> 
>  No speedup in libraries, python, blink_planner, 7 minutes saved in
>  libraries (table-planner).
> 
>  The kafka and connectors profiles both fail in kafka tests due to
>  producer leaks, and no speed up could be confirmed so far:
> 
>  java.lang.AssertionError: Detected producer leak. Thread name:
>  kafka-producer-network-thread | producer-239
>  at org.junit.Assert.fail(Assert.java:88)
>  at
> 
> >>>
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011ITCase.checkProducerLeak(FlinkKafkaProducer011ITCase.java:
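The producer-leak assertion quoted above essentially scans the live JVM threads for a known thread-name prefix. A minimal stdlib sketch of such a check — illustrative only, not Flink's actual `checkProducerLeak` implementation:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ThreadLeakCheck {

    /** Returns the names of all live threads whose name starts with the given prefix. */
    static List<String> findLeakedThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .filter(name -> name.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Simulate a leaked producer network thread for demonstration.
        Thread leaked = new Thread(() -> {
            try { Thread.sleep(10_000); } catch (InterruptedException ignored) {}
        }, "kafka-producer-network-thread | producer-239");
        leaked.setDaemon(true);
        leaked.start();

        List<String> leaks = findLeakedThreads("kafka-producer-network-thread");
        if (!leaks.isEmpty()) {
            System.out.println("Detected producer leak. Thread name: " + leaks.get(0));
        }
        leaked.interrupt();
    }
}
```

Reusing forks makes such checks more fragile, since threads leaked by an earlier test class are still visible to later ones in the same JVM.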

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-18 Thread Dawid Wysakowicz
Hi Stephan,

Thank you for your opinion.

Actually, list/composite types are the topics we spent most of the time
on. I understand that from the perspective of a full-blown type system,
a field like isList may look weird. Please let me elaborate a bit more
on the reasoning behind it; maybe we weren't clear enough about it in
the FLIP. The key property of all the config options is that they must
have a string representation, as they might come from a configuration
file. Moreover, it must be a human-readable format, so that the values
can be adjusted manually. With that in mind, we did not want to add
support for arbitrary nesting, and we decided to allow only lists (and
flat objects - I think, though, that in the current design there is a
mistake around the Configurable interface). You have a point here,
though, and it would be better to have a ListConfigOption instead of
this field. Does that make sense to you?
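To illustrate the idea with a sketch (hypothetical class names, not the API proposed in the FLIP): a dedicated list option type knows how to parse its human-readable string form by itself, which is what an isList flag was trying to express.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative only: a typed option that parses a semicolon-separated string into a list. */
public class ListOption<T> {
    private final String key;
    private final Function<String, T> elementParser;

    public ListOption(String key, Function<String, T> elementParser) {
        this.key = key;
        this.elementParser = elementParser;
    }

    public String key() {
        return key;
    }

    /** Parses the raw string value, e.g. "1;2;3" -> [1, 2, 3]. */
    public List<T> parse(String raw) {
        return Arrays.stream(raw.split(";"))
                .map(String::trim)
                .map(elementParser)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ListOption<Integer> ports = new ListOption<>("server.ports", Integer::valueOf);
        System.out.println(ports.parse("8080; 8081;8082")); // prints [8080, 8081, 8082]
    }
}
```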

As for the second part of your message, I am not sure I understood it.
The validators work with the parsed/deserialized values from the
Configuration, which means they can be bound to the generic parameter
of the ConfigOption. You can have a RangeValidator, for example. I
don't think the type hierarchy in the ConfigOption has anything to do
with the validation logic. Could you elaborate a bit more on what you
meant?
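For clarity, a RangeValidator bound to the parsed type could look roughly like this — a sketch with illustrative names, not the API in the FLIP:

```java
import java.util.function.Predicate;

/** Illustrative only: validates that a parsed, typed value lies within [min, max]. */
public class RangeValidator<T extends Comparable<T>> implements Predicate<T> {
    private final T min;
    private final T max;

    public RangeValidator(T min, T max) {
        this.min = min;
        this.max = max;
    }

    @Override
    public boolean test(T value) {
        // Inclusive on both ends.
        return value.compareTo(min) >= 0 && value.compareTo(max) <= 0;
    }

    public static void main(String[] args) {
        RangeValidator<Integer> parallelism = new RangeValidator<>(1, 128);
        System.out.println(parallelism.test(64));  // prints true
        System.out.println(parallelism.test(256)); // prints false
    }
}
```

The validator only sees the already-parsed value, so it works the same way whether the string came from a file or was set programmatically.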

Best,

Dawid

On 18/08/2019 16:42, Stephan Ewen wrote:
> I like the idea of enhancing the configuration and to do early validation.
>
> I feel that some of the ideas in the FLIP seem a bit ad hoc, though. For
> example, having a boolean "isList" is a clear indication of not having
> thought through the type/category system.
> Also, having a more clear category system makes validation simpler.
>
> For example, I have seen systems distinguishing between numeric parameters
> (valid ranges), category parameters (set of possible values), quantities
> like duration and memory size (need measure and unit), which results in an
> elegant system for validation.
>
>
> On Fri, Aug 16, 2019 at 5:22 PM JingsongLee 
> wrote:
>
>> +1 to this, thanks Timo and Dawid for the design.
>> This allows the currently cluttered configuration of various
>>  modules to be unified.
>> This is also first step of one of the keys to making new unified
>> TableEnvironment available for production.
>>
>> Previously, we did encounter complex configurations, such as
>> specifying the skewed values of column in DDL. The skew may
>>  be a single field or a combination of multiple fields. So the
>>  configuration is very troublesome. We used JSON string to
>>  configure it.
>>
>> Best,
>> Jingsong Lee
>>
>>
>>
>> --
>> From:Jark Wu 
>> Send Time:2019年8月16日(星期五) 16:44
>> To:dev 
>> Subject:Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration
>>
>> Thanks for starting this design Timo and Dawid,
>>
>> Improving ConfigOption has been hovering in my mind for a long time.
>> We have seen the benefit when developing blink configurations and connector
>> properties in 1.9 release.
>> Thanks for bringing it up and make such a detailed design.
>> I will leave my thoughts and comments there.
>>
>> Cheers,
>> Jark
>>
>>
>> On Fri, 16 Aug 2019 at 22:30, Zili Chen  wrote:
>>
>>> Hi Timo,
>>>
>>> It looks interesting. Thanks for preparing this FLIP!
>>>
>>> Client API enhancement benefit from this evolution which
>>> hopefully provides a better view of configuration of Flink.
>>> In client API enhancement, we likely make the deployment
>>> of cluster and submission of job totally defined by configuration.
>>>
>>> Will take a look at the document in days.
>>>
>>> Best,
>>> tison.
>>>
>>>
>>> Timo Walther  于2019年8月16日周五 下午10:12写道:
>>>
 Hi everyone,

 Dawid and I are working on making parts of ExecutionConfig and
 TableConfig configurable via config options. This is necessary to make
 all properties also available in SQL. Additionally, with the new SQL
>> DDL
 based on properties as well as more connectors and formats coming up,
 unified configuration becomes more important.

 We need more features around string-based configuration in the future,
 which is why Dawid and I would like to propose FLIP-54 for evolving the
 ConfigOption and Configuration classes:



>> https://docs.google.com/document/d/1IQ7nwXqmhCy900t2vQLEL3N2HIdMg-JO8vTzo1BtyKU/edit
 In summary it adds:
 - documented types and validation
 - more common types such as memory size, duration, list
 - simple non-nested object types

 Looking forward to your feedback,
 Timo


>>





[jira] [Created] (FLINK-13762) Integrate the implementation of ForwardingValveOutputHandler for StreamOne/TwoInputProcessor

2019-08-18 Thread zhijiang (JIRA)
zhijiang created FLINK-13762:


 Summary: Integrate the implementation of 
ForwardingValveOutputHandler for StreamOne/TwoInputProcessor
 Key: FLINK-13762
 URL: https://issues.apache.org/jira/browse/FLINK-13762
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


Currently StreamOneInputProcessor and StreamTwoInputSelectableProcessor have 
separate implementations of ForwardingValveOutputHandler. In particular, the 
implementation in StreamTwoInputSelectableProcessor couples in the internal 
input-index logic, which would be a blocker for the following unification of 
StreamTaskInput/Output.

We could implement a unified ForwardingValveOutputHandler for both 
StreamOneInputProcessor and StreamTwoInputSelectableProcessor that always 
consumes the StreamStatus without considering the different inputs. Then we 
refactor the implementation of StreamStatusMaintainer so that it judges the 
status of the different inputs internally before actually emitting the 
StreamStatus.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13761) `SplitStream` should be deprecated because `SplitJavaStream` is deprecated

2019-08-18 Thread zhihao zhang (JIRA)
zhihao zhang created FLINK-13761:


 Summary: `SplitStream` should be deprecated because 
`SplitJavaStream` is deprecated
 Key: FLINK-13761
 URL: https://issues.apache.org/jira/browse/FLINK-13761
 Project: Flink
  Issue Type: Bug
  Components: API / Scala
Affects Versions: 1.8.1
Reporter: zhihao zhang


`SplitStream` should be deprecated because `SplitJavaStream` is deprecated.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] Release flink-shaded 8.0

2019-08-18 Thread Stephan Ewen
Are we fine with the current Netty version, or would we want to bump it?

On Fri, Aug 16, 2019 at 10:30 AM Chesnay Schepler 
wrote:

> Hello,
>
> I would like to kick off the next flink-shaded release next week. There
> are 2 ongoing efforts that are blocked on this release:
>
>   * [FLINK-13467] Java 11 support requires a bump to ASM to correctly
> handle Java 11 bytecode
>   * [FLINK-11767] Reworking the typeSerializerSnapshotMigrationTestBase
> requires asm-commons to be added to flink-shaded-asm
>
> Are there any other changes on anyone's radar that we will have to make
> for 1.10? (will bumping calcite require anything, for example)
>
>
>


Re: [DISCUSS] Update our Roadmap

2019-08-18 Thread Stephan Ewen
I could help with that.

On Fri, Aug 16, 2019 at 2:36 PM Robert Metzger  wrote:

> Flink 1.9 is feature freezed and almost released.
> I guess it makes sense to update the roadmap on the website again.
>
> Who feels like having a good overview of what's coming up?
>
> On Tue, May 7, 2019 at 4:33 PM Fabian Hueske  wrote:
>
> > Yes, that's a very good proposal Jark.
> > +1
> >
> > Best, Fabian
> >
> > Am Mo., 6. Mai 2019 um 16:33 Uhr schrieb Till Rohrmann <
> > trohrm...@apache.org
> > >:
> >
> > > I think this is a good idea Jark. Putting the last update date on the
> > > roadmap would also force us to regularly update it.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, May 6, 2019 at 4:14 AM Jark Wu  wrote:
> > >
> > > > Hi,
> > > >
> > > > One suggestion for the roadmap:
> > > >
> > > > Shall we add a `latest-update-time` to the top of Roadmap page? So
> that
> > > > users can know this is a up-to-date Roadmap.
> > > >
> > > > On Thu, 2 May 2019 at 04:49, Bowen Li  wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Mon, Apr 29, 2019 at 11:41 PM jincheng sun <
> > > sunjincheng...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Jeff&Fabian,
> > > > > >
> > > > > > I have open the PR about add Python Table API section to the
> > > roadmap. I
> > > > > > appreciate if you have time to look at it. :)
> > > > > >
> > > > > > https://github.com/apache/flink-web/pull/204
> > > > > >
> > > > > > Regards,
> > > > > > Jincheng
> > > > > >
> > > > > > jincheng sun  于2019年4月29日周一 下午11:12写道:
> > > > > >
> > > > > > > Sure, I will do it!I think the python table api info should in
> > the
> > > > > > >  roadmap! Thank you @Jeff @Fabian
> > > > > > >
> > > > > > > Fabian Hueske 于2019年4月29日 周一23:05写道:
> > > > > > >
> > > > > > >> Great, thanks Jeff and Timo!
> > > > > > >>
> > > > > > >> @Jincheng do you want to write a paragraph about the Python
> > effort
> > > > and
> > > > > > >> open a PR for it?
> > > > > > >>
> > > > > > >> I'll remove the issue about Hadoop convenience builds
> > > (FLINK-11266).
> > > > > > >>
> > > > > > >> Best, Fabian
> > > > > > >>
> > > > > > >> Am Mo., 29. Apr. 2019 um 16:37 Uhr schrieb Jeff Zhang <
> > > > > zjf...@gmail.com
> > > > > > >:
> > > > > > >>
> > > > > > >>> jincheng(cc) is driving the python effort, I think he can
> help
> > to
> > > > > > >>> prepare it.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Fabian Hueske  于2019年4月29日周一 下午10:15写道:
> > > > > > >>>
> > > > > >  Hi everyone,
> > > > > > 
> > > > > >  Since we had no more comments on this thread, I think we
> > proceed
> > > > to
> > > > > >  update the roadmap.
> > > > > > 
> > > > > >  @Jeff Zhang  I agree, we should add the
> > > Python
> > > > > >  efforts to the roadmap.
> > > > > >  Do you want to prepare a short paragraph that we can add to
> > the
> > > > > >  document?
> > > > > > 
> > > > > >  Best, Fabian
> > > > > > 
> > > > > >  Am Mi., 17. Apr. 2019 um 15:04 Uhr schrieb Jeff Zhang <
> > > > > > zjf...@gmail.com
> > > > > >  >:
> > > > > > 
> > > > > > > Hi Fabian,
> > > > > > >
> > > > > > > One thing missing is python api and python udf, we already
> > > > > discussed
> > > > > > > it in
> > > > > > > community, and it is very close to reach consensus.
> > > > > > >
> > > > > > >
> > > > > > > Fabian Hueske  于2019年4月17日周三 下午7:51写道:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > We recently added a roadmap to our project website [1]
> and
> > > > > decided
> > > > > > to
> > > > > > > > update it after every release. Flink 1.8.0 was released a
> > few
> > > > > days
> > > > > > > ago, so
> > > > > > > > I think it we should check and remove from the roadmap
> what
> > > was
> > > > > > > achieved so
> > > > > > > > far and add features / improvements that we plan for the
> > > > future.
> > > > > > > >
> > > > > > > > I had a look at the roadmap and found that
> > > > > > > >
> > > > > > > > > We are changing the build setup to not bundle Hadoop by
> > > > > default,
> > > > > > > but
> > > > > > > > rather offer pre-packaged
> > > > > > > > > Hadoop libraries for the use with Yarn, HDFS, etc. as
> > > > > convenience
> > > > > > > > downloads FLINK-11266 <
> > > > > > > https://issues.apache.org/jira/browse/FLINK-11266>.
> > > > > > > >
> > > > > > > > was implemented for 1.8.0 and should be removed from the
> > > > roadmap.
> > > > > > > > All other issues are still ongoing efforts.
> > > > > > > >
> > > > > > > > Are there any other efforts that we want to put on the
> > > roadmap?
> > > > > > > >
> > > > > > > > Best, Fabian
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best Regards
> > > > > > >
> > > > > > > Jeff Zhang
> > > > > > >
> > > > > > 
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>> B

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-18 Thread Stephan Ewen
I like the idea of enhancing the configuration and to do early validation.

I feel that some of the ideas in the FLIP seem a bit ad hoc, though. For
example, having a boolean "isList" is a clear indication of not having
thought through the type/category system.
Also, having a more clear category system makes validation simpler.

For example, I have seen systems distinguishing between numeric parameters
(valid ranges), category parameters (set of possible values), quantities
like duration and memory size (need measure and unit), which results in an
elegant system for validation.
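The "category parameters (set of possible values)" case described above can be sketched as follows — illustrative names only, not an actual Flink API:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: validates that a value belongs to a fixed set of allowed values. */
public class AllowedValuesValidator<T> {
    private final Set<T> allowed;

    @SafeVarargs
    public AllowedValuesValidator(T... values) {
        this.allowed = new HashSet<>(Arrays.asList(values));
    }

    public boolean isValid(T value) {
        return allowed.contains(value);
    }

    public static void main(String[] args) {
        // Hypothetical category parameter: which state backend to use.
        AllowedValuesValidator<String> backend =
                new AllowedValuesValidator<>("heap", "rocksdb");
        System.out.println(backend.isValid("rocksdb")); // prints true
        System.out.println(backend.isValid("mysql"));   // prints false
    }
}
```

With explicit categories like this, validation logic follows directly from the option's declared type rather than from ad-hoc flags.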


On Fri, Aug 16, 2019 at 5:22 PM JingsongLee 
wrote:

> +1 to this, thanks Timo and Dawid for the design.
> This allows the currently cluttered configuration of various
>  modules to be unified.
> This is also first step of one of the keys to making new unified
> TableEnvironment available for production.
>
> Previously, we did encounter complex configurations, such as
> specifying the skewed values of column in DDL. The skew may
>  be a single field or a combination of multiple fields. So the
>  configuration is very troublesome. We used JSON string to
>  configure it.
>
> Best,
> Jingsong Lee
>
>
>
> --
> From:Jark Wu 
> Send Time:2019年8月16日(星期五) 16:44
> To:dev 
> Subject:Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration
>
> Thanks for starting this design Timo and Dawid,
>
> Improving ConfigOption has been hovering in my mind for a long time.
> We have seen the benefit when developing blink configurations and connector
> properties in 1.9 release.
> Thanks for bringing it up and make such a detailed design.
> I will leave my thoughts and comments there.
>
> Cheers,
> Jark
>
>
> On Fri, 16 Aug 2019 at 22:30, Zili Chen  wrote:
>
> > Hi Timo,
> >
> > It looks interesting. Thanks for preparing this FLIP!
> >
> > Client API enhancement benefit from this evolution which
> > hopefully provides a better view of configuration of Flink.
> > In client API enhancement, we likely make the deployment
> > of cluster and submission of job totally defined by configuration.
> >
> > Will take a look at the document in days.
> >
> > Best,
> > tison.
> >
> >
> > Timo Walther  于2019年8月16日周五 下午10:12写道:
> >
> > > Hi everyone,
> > >
> > > Dawid and I are working on making parts of ExecutionConfig and
> > > TableConfig configurable via config options. This is necessary to make
> > > all properties also available in SQL. Additionally, with the new SQL
> DDL
> > > based on properties as well as more connectors and formats coming up,
> > > unified configuration becomes more important.
> > >
> > > We need more features around string-based configuration in the future,
> > > which is why Dawid and I would like to propose FLIP-54 for evolving the
> > > ConfigOption and Configuration classes:
> > >
> > >
> > >
> >
> https://docs.google.com/document/d/1IQ7nwXqmhCy900t2vQLEL3N2HIdMg-JO8vTzo1BtyKU/edit
> > >
> > > In summary it adds:
> > > - documented types and validation
> > > - more common types such as memory size, duration, list
> > > - simple non-nested object types
> > >
> > > Looking forward to your feedback,
> > > Timo
> > >
> > >
> >
>
>
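The "memory size" option type mentioned in the FLIP summary above typically parses human-readable strings such as "128kb" or "1gb" into a byte count. A minimal stdlib sketch of that parsing — illustrative only; the actual MemorySize implementation may differ:

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: parses strings like "512", "128kb", "1gb" into a number of bytes. */
public class MemorySizeParser {

    private static final Map<String, Long> UNITS = new LinkedHashMap<>();
    static {
        // Longer suffixes first so "kb" is matched before "b".
        UNITS.put("tb", 1024L * 1024L * 1024L * 1024L);
        UNITS.put("gb", 1024L * 1024L * 1024L);
        UNITS.put("mb", 1024L * 1024L);
        UNITS.put("kb", 1024L);
        UNITS.put("b", 1L);
    }

    public static long parseBytes(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Long> unit : UNITS.entrySet()) {
            if (trimmed.endsWith(unit.getKey())) {
                String number =
                        trimmed.substring(0, trimmed.length() - unit.getKey().length()).trim();
                return Long.parseLong(number) * unit.getValue();
            }
        }
        return Long.parseLong(trimmed); // a plain number is taken as bytes
    }

    public static void main(String[] args) {
        System.out.println(parseBytes("128kb")); // prints 131072
        System.out.println(parseBytes("2mb"));   // prints 2097152
    }
}
```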


Re: [VOTE] FLIP-50: Spill-able Heap State Backend

2019-08-18 Thread Stephan Ewen
+1

On Sun, Aug 18, 2019 at 3:31 PM Till Rohrmann  wrote:

> +1
>
> On Fri, Aug 16, 2019 at 4:54 PM Yu Li  wrote:
>
> > Hi All,
> >
> > Since we have reached a consensus in the discussion thread [1], I'd like
> to
> > start the voting for FLIP-50 [2].
> >
> > This vote will be open for at least 72 hours. Unless objection I will try
> > to close it by end of Tuesday August 20, 2019 if we have sufficient
> votes.
> > Thanks.
> >
> > [1] https://s.apache.org/cq358
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> >
> > Best Regards,
> > Yu
> >
>


Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-18 Thread Stephan Ewen
For reference, this is the JIRA issue about the regression in question:

https://issues.apache.org/jira/browse/FLINK-13752


On Fri, Aug 16, 2019 at 10:57 AM Guowei Ma  wrote:

> Hi, till
> I can send the job to you offline.
> It is just a datastream job and does not use TwoInputSelectableStreamTask.
> A->B
>      \
>       C
>      /
> D->E
> Best,
> Guowei
>
>
> Till Rohrmann  于2019年8月16日周五 下午4:34写道:
>
> > Thanks for reporting this issue Guowei. Could you share a bit more
> details
> > what the job exactly does and which operators it uses? Does the job uses
> > the new `TwoInputSelectableStreamTask` which might cause the performance
> > regression?
> >
> > I think it is important to understand where the problem comes from before
> > we proceed with the release.
> >
> > Cheers,
> > Till
> >
> > On Fri, Aug 16, 2019 at 10:27 AM Guowei Ma  wrote:
> >
> > > Hi,
> > > -1
> > > We have a benchmark job, which includes a two-input operator.
> > > This job has a big performance regression using 1.9 compared to 1.8.
> > > It's still not very clear why this regression happens.
> > >
> > > Best,
> > > Guowei
> > >
> > >
> > > Yu Li  于2019年8月16日周五 下午3:27写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - checked release notes: OK
> > > > - checked sums and signatures: OK
> > > > - source release
> > > >  - contains no binaries: OK
> > > >  - contains no 1.9-SNAPSHOT references: OK
> > > >  - build from source: OK (8u102)
> > > >  - mvn clean verify: OK (8u102)
> > > > - binary release
> > > >  - no examples appear to be missing
> > > >  - started a cluster; WebUI reachable, example ran successfully
> > > > - repository appears to contain all expected artifacts
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Fri, 16 Aug 2019 at 06:06, Bowen Li  wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > Thanks for letting me know that it's been like this in previous
> > > releases.
> > > > > Though I don't think that's the right behavior, it can be discussed
> > for
> > > > > later release. Thus I retract my -1 for RC2.
> > > > >
> > > > > Bowen
> > > > >
> > > > >
> > > > > On Thu, Aug 15, 2019 at 7:49 PM Jark Wu  wrote:
> > > > >
> > > > > > Hi Bowen,
> > > > > >
> > > > > > Thanks for reporting this.
> > > > > > However, I don't think this is an issue. IMO, it is by design.
> > > > > > The `tEnv.listUserDefinedFunctions()` in Table API and `show
> > > > functions;`
> > > > > in
> > > > > > SQL CLI are intended to return only the registered UDFs, not
> > > including
> > > > > > built-in functions.
> > > > > > This is also the behavior in previous versions.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > On Fri, 16 Aug 2019 at 06:52, Bowen Li 
> > wrote:
> > > > > >
> > > > > > > -1 for RC2.
> > > > > > >
> > > > > > > I found a bug
> https://issues.apache.org/jira/browse/FLINK-13741,
> > > > and I
> > > > > > > think it's a blocker.  The bug means currently if users call
> > > > > > > `tEnv.listUserDefinedFunctions()` in Table API or `show
> > functions;`
> > > > > thru
> > > > > > > SQL would not be able to see Flink's built-in functions.
> > > > > > >
> > > > > > > I'm preparing a fix right now.
> > > > > > >
> > > > > > > Bowen
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Aug 15, 2019 at 8:55 AM Tzu-Li (Gordon) Tai <
> > > > > tzuli...@apache.org
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for all the test efforts, verifications and votes so
> > far.
> > > > > > > >
> > > > > > > > So far, things are looking good, but we still require one
> more
> > > PMC
> > > > > > > binding
> > > > > > > > vote for this RC to be the official release, so I would like
> to
> > > > > extend
> > > > > > > the
> > > > > > > > vote time for 1 more day, until *Aug. 16th 17:00 CET*.
> > > > > > > >
> > > > > > > > In the meantime, the release notes for 1.9.0 had only just
> been
> > > > > > finalized
> > > > > > > > [1], and could use a few more eyes before closing the vote.
> > > > > > > > Any help with checking if anything else should be mentioned
> > there
> > > > > > > regarding
> > > > > > > > breaking changes / known shortcomings would be appreciated.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Gordon
> > > > > > > >
> > > > > > > > [1] https://github.com/apache/flink/pull/9438
> > > > > > > >
> > > > > > > > On Thu, Aug 15, 2019 at 3:58 PM Kurt Young  >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Great, then I have no other comments on legal check.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Kurt
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Aug 15, 2019 at 9:56 PM Chesnay Schepler <
> > > > > ches...@apache.org
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > The licensing items aren't a problem; we don't care about
> > > Flink
> > > > > > > modules
> > > > > > > > > > in NOTICE files, and we don't have to update the
> > > source-release
> > > >

Re: [VOTE] FLIP-50: Spill-able Heap State Backend

2019-08-18 Thread Till Rohrmann
+1

On Fri, Aug 16, 2019 at 4:54 PM Yu Li  wrote:

> Hi All,
>
> Since we have reached a consensus in the discussion thread [1], I'd like to
> start the voting for FLIP-50 [2].
>
> This vote will be open for at least 72 hours. Unless objection I will try
> to close it by end of Tuesday August 20, 2019 if we have sufficient votes.
> Thanks.
>
> [1] https://s.apache.org/cq358
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
>
> Best Regards,
> Yu
>


[jira] [Created] (FLINK-13760) Fix hardcode Scala version dependency in hive connector

2019-08-18 Thread Jark Wu (JIRA)
Jark Wu created FLINK-13760:
---

 Summary: Fix hardcode Scala version dependency in hive connector
 Key: FLINK-13760
 URL: https://issues.apache.org/jira/browse/FLINK-13760
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.9.1


FLINK-13688 introduced a {{flink-test-utils}} dependency. However, the Scala 
version in the artifactId is hardcoded, which results in recent CRON jobs failing. 

Here is an instance: https://api.travis-ci.org/v3/job/573092374/log.txt


{code}
11:46:09.078 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-versions) @ flink-connector-hive_2.12 ---
11:46:09.134 [WARNING] Rule 0: 
org.apache.maven.plugins.enforcer.BannedDependencies failed with message:
Found Banned Dependency: com.typesafe.akka:akka-slf4j_2.11:jar:2.5.21
Found Banned Dependency: com.typesafe.akka:akka-actor_2.11:jar:2.5.21
Found Banned Dependency: com.typesafe:ssl-config-core_2.11:jar:0.3.7
Found Banned Dependency: 
org.scala-lang.modules:scala-java8-compat_2.11:jar:0.7.0
Found Banned Dependency: com.typesafe.akka:akka-protobuf_2.11:jar:2.5.21
Found Banned Dependency: org.apache.flink:flink-clients_2.11:jar:1.10-SNAPSHOT
Found Banned Dependency: 
org.apache.flink:flink-streaming-java_2.11:jar:1.10-SNAPSHOT
Found Banned Dependency: com.typesafe.akka:akka-stream_2.11:jar:2.5.21
Found Banned Dependency: com.github.scopt:scopt_2.11:jar:3.5.0
Found Banned Dependency: 
org.apache.flink:flink-test-utils_2.11:jar:1.10-SNAPSHOT
Found Banned Dependency: org.apache.flink:flink-runtime_2.11:jar:1.10-SNAPSHOT
Found Banned Dependency: 
org.apache.flink:flink-runtime_2.11:test-jar:tests:1.10-SNAPSHOT
Found Banned Dependency: 
org.scala-lang.modules:scala-parser-combinators_2.11:jar:1.1.1
Found Banned Dependency: com.twitter:chill_2.11:jar:0.7.6
Found Banned Dependency: org.clapper:grizzled-slf4j_2.11:jar:1.3.2
Found Banned Dependency: org.apache.flink:flink-optimizer_2.11:jar:1.10-SNAPSHOT
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
{code}
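The usual fix for this class of problem in Flink poms is to reference the Scala suffix through the {{scala.binary.version}} Maven property instead of hardcoding it. A sketch of what the dependency block might look like (the exact block in the hive connector pom may differ):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <!-- use the property instead of a hardcoded _2.11 suffix -->
    <artifactId>flink-test-utils_${scala.binary.version}</artifactId>
    <version>${project.version}</version>
    <scope>test</scope>
</dependency>
```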




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13759) All builds for master branch are failed during compile stage

2019-08-18 Thread Jark Wu (JIRA)
Jark Wu created FLINK-13759:
---

 Summary: All builds for master branch are failed during compile 
stage
 Key: FLINK-13759
 URL: https://issues.apache.org/jira/browse/FLINK-13759
 Project: Flink
  Issue Type: Bug
Reporter: Jark Wu


Here is an instance: https://api.travis-ci.org/v3/job/572950228/log.txt

There is an error in the log.

{code}
==
find: 
‘flink-connectors/flink-connector-elasticsearch/target/flink-connector-elasticsearch*.jar’:
 No such file or directory
==
Previous build failure detected, skipping cache setup.
==
{code}

The {{flink-connector-elasticsearch}} module does not exist, and recent commits 
didn't modify anything related to it.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)