No data issued by flink window after a few hours

2018-10-09 Thread ? ??
Hi all,
I used a Flink window, and when the job begins we get the results of the 
window. But no results are issued after a few hours.
I found the job is still running with no errors, and data that does not go 
through the window is still emitted.
By the way, I used Flink 1.3.2 and RAM to cache the distinct data for the 
sliding window.

Yours,
September


[jira] [Created] (FLINK-10520) Job save points REST API fails unless parameters are specified

2018-10-09 Thread Elias Levy (JIRA)
Elias Levy created FLINK-10520:
--

 Summary: Job save points REST API fails unless parameters are 
specified
 Key: FLINK-10520
 URL: https://issues.apache.org/jira/browse/FLINK-10520
 Project: Flink
  Issue Type: Bug
  Components: REST
Affects Versions: 1.6.1
Reporter: Elias Levy


The new REST API POST endpoint, {{/jobs/:jobid/savepoints}}, returns an error 
unless the request includes a body with all parameters ({{target-directory}} 
and {{cancel-job}}), even though the 
[documentation|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.html]
 suggests these are optional.

If a POST request with no data is made, the response is a 400 status code with 
the error message "Bad request received."

If the POST request submits an empty JSON object ( {} ), the response is a 400 
status code with the error message "Request did not match expected format 
SavepointTriggerRequestBody."  The same is true if only the 
{{target-directory}} or {{cancel-job}} parameter is included.

As the system is configured with a default savepoint location, there shouldn't 
be a need to include the parameter in the request.
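For reference, a minimal sketch of a trigger call that sends both fields, which is the only request shape the 1.6.1 endpoint accepts per this report. The base URL, job ID, and target directory below are placeholders, not values from this issue:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sketch of triggering a savepoint via the REST API; all endpoint values are placeholders. */
public class SavepointTrigger {

    /** Build the JSON body with both fields present, which 1.6.1 requires in practice. */
    public static String requestBody(String targetDirectory, boolean cancelJob) {
        return String.format(
                "{\"target-directory\": \"%s\", \"cancel-job\": %b}",
                targetDirectory, cancelJob);
    }

    /** POST the body to /jobs/:jobid/savepoints and return the HTTP status code. */
    public static int trigger(String baseUrl, String jobId, String body) throws Exception {
        URL url = new URL(baseUrl + "/jobs/" + jobId + "/savepoints");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }
}
```

For example, trigger("http://localhost:8081", jobId, requestBody("hdfs:///savepoints", false)) against a local cluster.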



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10519) flink-parent:1.6.1 artifact can't be found on maven central

2018-10-09 Thread Florian Schmidt (JIRA)
Florian Schmidt created FLINK-10519:
---

 Summary: flink-parent:1.6.1 artifact can't be found on maven 
central
 Key: FLINK-10519
 URL: https://issues.apache.org/jira/browse/FLINK-10519
 Project: Flink
  Issue Type: Bug
Reporter: Florian Schmidt


The flink-parent:1.6.1 artifact can't be found on maven central:

*Stacktrace from maven*
{code:java}
...
Caused by: org.eclipse.aether.transfer.ArtifactNotFoundException: Could not 
find artifact org.apache.flink:flink-parent:pom:1.6.1 in central 
(https://repo.maven.apache.org/maven2)
...
{code}
 

Also, when browsing the repository in a browser 
([https://repo.maven.apache.org/maven2/org/apache/flink/flink-parent/1.6.1/]) 
the flink-parent artifact shows up in the listing, but downloading it returns a 
404. This only seems to happen from some networks, as I was able to 
successfully run the following on a server that I ssh'd into, but not on my 
local device:
{code:java}
curl https://repo.maven.apache.org/maven2/org/apache/flink/flink-parent/1.6.1/flink-parent-1.6.1.pom
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Sharing state between subtasks

2018-10-09 Thread Elias Levy
On Tue, Oct 9, 2018 at 1:25 AM Aljoscha Krettek  wrote:

> @Elias Do you know if Kafka Consumers do this alignment across multiple
> consumers or only within one Consumer across the partitions that it reads
> from?
>

The behavior is part of Kafka Streams, not the Kafka consumer.  The alignment
does not occur across Kafka consumers, but that is because Kafka Streams,
unlike Flink, uses a single consumer to fetch records from multiple sources /
topics.  The alignment occurs within the stream task.  Stream tasks keep
queues per topic-partition (which may be from different topics), and select
the next record to process by picking the queue with the lowest timestamp.

The equivalent in Flink would be for the Kafka connector source to select
the message among partitions with the lowest timestamp to emit next, and
for multiple input stream operators to select the message among inputs with
the lowest timestamp to process.
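As an illustrative sketch (the class and method names below are invented for illustration, not Kafka Streams' actual API), the selection rule amounts to polling the non-empty queue whose head record carries the smallest timestamp:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Sketch of lowest-head-timestamp queue selection; names are invented for illustration. */
public class TimestampAlignment {

    /** A record with an event timestamp, as buffered per topic-partition. */
    public static class Rec {
        public final long timestamp;
        public final String value;
        public Rec(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Return the non-empty queue whose head has the smallest timestamp, or null if all are empty. */
    public static Deque<Rec> nextQueue(List<Deque<Rec>> partitionQueues) {
        Deque<Rec> best = null;
        for (Deque<Rec> q : partitionQueues) {
            if (q.isEmpty()) {
                continue;
            }
            if (best == null || q.peek().timestamp < best.peek().timestamp) {
                best = q;
            }
        }
        return best;
    }
}
```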


[DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-09 Thread Zhang, Xuefu
Hi all,

Along with the community's effort, inside Alibaba we have explored Flink's 
potential as an execution engine not just for stream processing but also for 
batch processing. We are encouraged by our findings and have initiated our 
effort to make Flink's SQL capabilities full-fledged. When comparing what's 
available in Flink to the offerings from competing data processing engines, 
we identified a major gap in Flink: good integration with the Hive ecosystem. 
This is crucial to the success of Flink SQL and batch due to the 
well-established data ecosystem around Hive. Therefore, we have done some 
initial work along this direction, but a lot of effort is still needed.

We have two strategies in mind. The first one is to make Flink SQL full-fledged 
and well-integrated with the Hive ecosystem. This is a similar approach to what 
Spark SQL adopted. The second strategy is to make Hive itself work with Flink, 
similar to the proposal in [1]. Each approach bears its pros and cons, but they 
don't need to be mutually exclusive, with each targeting different users and 
use cases. We believe that both will promote a much greater adoption of Flink 
beyond stream processing.

We have been focused on the first approach and would like to showcase Flink's 
batch and SQL capabilities with Flink SQL. However, we also plan to start 
strategy #2 as a follow-up effort.

I'm completely new to Flink (a short bio [2] is below), though many of my 
colleagues here at Alibaba are long-time contributors. Nevertheless, I'd like 
to share our thoughts and invite your early feedback. At the same time, I am 
working on a detailed proposal on Flink SQL's integration with the Hive 
ecosystem, which will also be shared when ready.

While the ideas are simple, each approach will demand significant effort, more 
than what we can afford alone. Thus, input and contributions from the community 
are greatly welcomed and appreciated.

Regards,


Xuefu

References:

[1] https://issues.apache.org/jira/browse/HIVE-10712
[2] Xuefu Zhang is a long-time open source veteran who has worked, or is 
working, on many projects under the Apache Foundation, of which he is also an 
honored member. About 10 years ago he worked in the Hadoop team at Yahoo when 
those projects were just getting started. Later he worked at Cloudera, 
initiating and leading the development of the Hive on Spark project in the 
communities and across many organizations. Prior to joining Alibaba, he worked 
at Uber, where he brought Hive on Spark to all of Uber's SQL-on-Hadoop workload 
and significantly improved Uber's cluster efficiency.




Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

2018-10-09 Thread Jin Sun
Great job! That would be very helpful for debugging.

I would suggest using small icons for the Job Managers/Task Managers when there 
are many instances (like a thousand).
Maybe we can also introduce locality, so that task managers belonging to the 
same rack are shown together?




Small icons can be like this:




> On Oct 9, 2018, at 8:49 PM, Till Rohrmann  wrote:
> 
> ...information on the front
> page. Your mock looks really promising to me since it shows some basic
> metrics and cluster information at a glance. Apart from the source
> input and sink output metrics, all other required information should be
> available for display in the dashboard. Thus, your proposal should only
> affect flink-runtime-web, which should make it easier to realize.
> 
> I'm in favour of adding this feature to Flink's dashboard to make it
> available to the whole community.



Re: [DISCUSS] [Contributing] (2) - Review Steps

2018-10-09 Thread Hequn Cheng
+1

On Tue, Oct 9, 2018 at 3:25 PM Till Rohrmann  wrote:

> +1
>
> On Tue, Oct 9, 2018 at 9:08 AM Zhijiang(wangzhijiang999)
>  wrote:
>
> > +1
> > --
> > From: vino yang 
> > Sent: Tuesday, October 9, 2018, 14:08
> > To: dev 
> > Subject: Re: [DISCUSS] [Contributing] (2) - Review Steps
> >
> > +1
> >
> > > Peter Huang  wrote on Tue, Oct 9, 2018 at 1:54 PM:
> >
> > > +1
> > >
> > > On Mon, Oct 8, 2018 at 7:47 PM Thomas Weise  wrote:
> > >
> > > > +1
> > > >
> > > >
> > > > > On Mon, Oct 8, 2018 at 7:36 PM Tzu-Li Chen  wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Jin Sun  wrote on Tue, Oct 9, 2018 at 2:10 AM:
> > > > >
> > > > > > +1, looking forward to seeing the change.
> > > > > >
> > > > > > > On Oct 9, 2018, at 12:07 AM, Fabian Hueske  wrote:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > Since we have addressed all comments (please raise your voice
> > > > > > > if not!), I would like to move forward and convert the proposal
> > > > > > > [1] into a page for Flink's website [2].
> > > > > > > I will create a pull request against the website repo [3].
> > > > > > >
> > > > > > > Once the page gets merged, we can start posting the review form
> > > > > > > on new pull requests.
> > > > > > >
> > > > > > > Best, Fabian
> > > > > > >
> > > > > > > [1] https://docs.google.com/document/d/1yaX2b9LNh-6LxrAmE23U3D2cRbocGlGKCYnvJd9lVhk
> > > > > > > [2] https://flink.apache.org
> > > > > > > [3] https://github.com/apache/flink-web
> > > > > > >
> > > > > > > On Tue, Sep 25, 2018 at 5:56 PM, Tzu-Li Chen <
> > > > > > > wander4...@gmail.com> wrote:
> > > > > > >
> > > > > > >> I agree with Chesnay that we don't guarantee (quick) review of
> > > > > > >> a PR at the project level. As the ASF states[1]:
> > > > > > >>
> > > > > > >>> Please show some patience with the developers if your patch is
> > > > > > >>> not applied as fast as you'd like or a developer asks you to
> > > > > > >>> make changes to the patch. If you do not receive any feedback
> > > > > > >>> in a reasonable amount of time (say a week or two), feel free
> > > > > > >>> to send a follow-up e-mail to the developer list. Open Source
> > > > > > >>> developers are all volunteers, often doing the development in
> > > > > > >>> their spare time.
> > > > > > >>
> > > > > > >> However, an open source community shows its friendliness to
> > > > > > >> contributors. Contributors trust that their contributions will
> > > > > > >> be taken care of, even if rejected with a reason, and that
> > > > > > >> project members will kindly help them through the process.
> > > > > > >>
> > > > > > >> As this thread shows, it is great to see the Flink community
> > > > > > >> trying its best to help its contributors and committers, and
> > > > > > >> thereby taking full advantage of "open source".
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> tison.
> > > > > > >>
> > > > > > >> [1] http://www.apache.org/dev/contributors#patches
> > > > > > >>
> > > > > > >>
> > > > > > >> Chesnay Schepler  wrote on Tue, Sep 25, 2018 at 11:21 PM:
> > > > > > >>
> > > > > > >>> There is no guarantee that a PR will be looked at nor is it
> > > > > > >>> possible to provide this in any way on the project level.
> > > > > > >>>
> > > > > > >>> As far as Apache is concerned all contributors/committers etc.
> > > > > > >>> work voluntarily, and as such assigning work (which includes
> > > > > > >>> ownership if it implies such) or similar is simply not
> > > > > > >>> feasible.
> > > > > > >>>
> > > > > > >>> On 25.09.2018 16:54, Thomas Weise wrote:
> > > > > >  I think that all discussion/coordination related to a
> > > > > >  contribution / PR should be handled through the official project
> > > > > >  channel.
> > > > > > 
> > > > > >  I would also prefer that there are no designated "owners" and
> > > > > >  "experts", for the reasons Fabian mentioned.
> > > > > > 
> > > > > >  Ideally there is no need to have "suggested reviewers" either,
> > > > > >  but then what will be the process to ensure that PRs will be
> > > > > >  looked at?
> > > > > > 
> > > > > >  Thanks,
> > > > > >  Thomas
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > >  On Tue, Sep 25, 2018 at 6:17 AM Tzu-Li Chen <
> > > > > >  wander4...@gmail.com> wrote:
> > > > > > 
> > > > > > > Hi Fabian,
> > > > > > >
> > > > > > > You convinced me. I had missed the advantage we can take from
> > > > > > > mailing lists.
> > > > > > >
> > > > > > > Now I am of the same opinion.
> > > > > > >
> > > > > > > Best,
> > > > > > > tison.
> > > > > > 

[ANNOUNCE] Weekly community update #41

2018-10-09 Thread Till Rohrmann
Dear community,

this is the weekly community update thread #41. Please post any news and
updates you want to share with the community to this thread.

# Feature freeze for Flink 1.7

The community has decided to freeze the feature development for Flink 1.7.0
on the 22nd of October [1].

# Flink operators for Kubernetes

Anand sent out a note that Lyft is building a K8s Operator for Flink [2].
If the community is interested, they would be willing to open source it.

# Scala 2.12 support requires breaking the API

Aljoscha, who works on enabling Scala 2.12 for Flink, found out that
bumping the Scala version requires breaking API changes [3]. The community
is currently discussing how to proceed, especially wrt the upcoming release.

# Flink web dashboard improvement proposal

Fabian started a discussion about improving Flink's web dashboard to
display more cluster information [4]. He also attached a quick mock of how
things could look.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Feature-freeze-for-Flink-1-7-22nd-of-October-tp24477.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Flink-operators-for-Kubernetes-tp24440.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Breaking-the-Scala-API-for-Scala-2-12-Support-tp24464.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Cluster-Overview-Dashboard-Improvement-Proposal-tp24531.html

Cheers,
Till


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

2018-10-09 Thread Till Rohrmann
Hi Fabian,

thanks for starting this discussion. I agree with you that Flink's web
dashboard lacks a bit of general cluster overview information on the front
page. Your mock looks really promising to me since it shows some basic
metrics and cluster information at a glance. Apart from the source
input and sink output metrics, all other required information should be
available for display in the dashboard. Thus, your proposal should only
affect flink-runtime-web, which should make it easier to realize.

I'm in favour of adding this feature to Flink's dashboard to make it
available to the whole community.

Cheers,
Till

On Tue, Oct 9, 2018 at 12:54 PM Fabian Wollert  wrote:

> argh, i think the screenshot is missing (at least nabble is not showing
> anything). here is a link to the mockup:
>
>
> https://drive.google.com/file/d/1p3wVP028_AFFLZ6fjPb41yAI8zUhgDTO/view?usp=sharing
>
> Cheers
>
> --
>
>
> *Fabian Wollert, Zalando SE*
>
> E-Mail: fab...@zalando.de
>
>
> On Tue, Oct 9, 2018 at 12:46 PM, Fabian Wollert <
> fab...@zalando.de> wrote:
>
>> Hi everyone,
>>
>> disclaimer: i read the contribution guide about improvement requests
>> (i.e. i should actually just start a jira ticket) but i thought it would
>> make sense to run this first through the mailing list here. after
>> collecting some input i would then create the jira ticket.
>>
>> When accessing the Flink Web Dashboard (which is basically what i do
>> almost every day to check some status of a job or so), I recently felt that
>> the actual information given in the top portion of the start page is highly
>> improvable. I created a first mock by moving html elements around and
>> wanted to share this one now:
>>
>> [image: image.png]
>>
>> With the exception of the metrics (see below) none of this information
>> should be new, but rather re-organized to speed up investigation and
>> monitoring:
>>
>>- complete overview on the cluster status and health, without
>>clicking through a lot of pages.
>>- Active and stand-by Job Managers. Also their health is depicted as
>>   a color (as a first suggestion: last heartbeat is inside 
>> heartbeat.timeout)
>>   - Current registered Task Managers
>>  - the little bar on the side indicates task slot usage. i did
>>  not color it since a fully utilised task manager is not necessarily
>>  something bad.
>>  - the color indicates the health of the task manager (as a
>>  first suggestion: last heartbeat is inside heartbeat.timeout)
>>   - overview on some cluster metrics
>>
>> Some points to notice:
>>
>>- All data you see on the screenshot is mock, no number relates to
>>another number at all. but colors should relate to the numbers already
>>which they indicate.
>>- All of this could also be done with other monitoring solutions
>>someone might have in his company, by reading out JMX metrics and then
>>plotting those in his monitoring solution (e.g. grafana). But this out of
>>the box solution would save everyone from doing it on their own and they
>>could trust the metrics shown here.
>>- Some of the metrics can only be done with FLINK-7286
>> being done. So i
>>would split the implementation of this into two parts (cluster overview 
>> and
>>metrics) and do them separately.
>>- This first mock up is targeted to what we here at Zalando would
>>like to see first glance, so it fits our use case very well. We mostly use
>>long-running session clusters.
>>- I'm more a Backend Guy with some Frontend expertise (but mostly in
>>React, no angular1 (Flink Web Dashboard is built with this currently)
>>experience) and not at all a designer.
>>
>> What do you think? I would be glad to have some feedback on this,
>> especially if this makes sense in the broad community. I would no matter
>> what implement this somehow, if not in the Flink Master branch, then as a
>> OS project which anyone can deploy next to their flink clusters. But i
>> first wanted to run it through here to see if this sparks any interest.
>>
>> Please also let me know if you see difficulties implementing this
>> already, maybe i have overlooked something.
>>
>> Can't wait for your input.
>>
>> Cheers
>>
>> --
>>
>>
>> *Fabian Wollert, Zalando SE*
>>
>> E-Mail: fab...@zalando.de
>>
>


[jira] [Created] (FLINK-10518) Inefficient design in ContinuousFileMonitoringFunction

2018-10-09 Thread Huyen Levan (JIRA)
Huyen Levan created FLINK-10518:
---

 Summary: Inefficient design in ContinuousFileMonitoringFunction
 Key: FLINK-10518
 URL: https://issues.apache.org/jira/browse/FLINK-10518
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Affects Versions: 1.5.2
Reporter: Huyen Levan


The ContinuousFileMonitoringFunction class keeps track of the latest file 
modification time to rule out all files it has processed in previous cycles. 
For a long-running job, the list of eligible files will be much smaller than 
the list of all files in the folder being monitored.
In the current implementation of the getInputSplitsSortedByModTime method, a 
list of all available splits is created first, and then every single split is 
checked against the list of eligible files.
{code:java}
for (FileInputSplit split: format.createInputSplits(readerParallelism)) {
 FileStatus fileStatus = eligibleFiles.get(split.getPath());
 if (fileStatus != null) {
{code}
The improvement could be done as follows:
 * Listing of all files should be done once, in 
_ContinuousFileMonitoringFunction.listEligibleFiles()_ (as of now it is done a 
second time in _FileInputFormat.createInputSplits()_).
 * The list of file splits should then be created from the list of paths in 
eligibleFiles.
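A rough sketch of the proposed shape (the types and method names are simplified stand-ins, not the actual Flink classes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified sketch: derive splits directly from the already-filtered file list. */
public class EligibleSplits {

    /** Stand-in for Flink's FileInputSplit. */
    public static class FileSplit {
        public final String path;
        public FileSplit(String path) {
            this.path = path;
        }
    }

    /**
     * Instead of enumerating every split in the monitored directory and then
     * filtering, create splits only for the files that already passed the
     * modification-time check (one split per file, for simplicity).
     */
    public static List<FileSplit> splitsFromEligibleFiles(Map<String, Long> eligibleFiles) {
        List<FileSplit> splits = new ArrayList<>();
        for (String path : eligibleFiles.keySet()) {
            splits.add(new FileSplit(path));
        }
        return splits;
    }
}
```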



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

2018-10-09 Thread Fabian Wollert
argh, i think the screenshot is missing (at least nabble is not showing
anything). here is a link to the mockup:

https://drive.google.com/file/d/1p3wVP028_AFFLZ6fjPb41yAI8zUhgDTO/view?usp=sharing

Cheers

--


*Fabian Wollert, Zalando SE*

E-Mail: fab...@zalando.de


On Tue, Oct 9, 2018 at 12:46 PM, Fabian Wollert wrote:

> Hi everyone,
>
> disclaimer: i read the contribution guide about improvement requests (i.e.
> i should actually just start a jira ticket) but i thought it would make
> sense to run this first through the mailing list here. after collecting
> some input i would then create the jira ticket.
>
> When accessing the Flink Web Dashboard (which is basically what i do
> almost every day to check some status of a job or so), I recently felt that
> the actual information given in the top portion of the start page is highly
> improvable. I created a first mock by moving html elements around and
> wanted to share this one now:
>
> [image: image.png]
>
> With the exception of the metrics (see below) none of this information
> should be new, but rather re-organized to speed up investigation and
> monitoring:
>
>- complete overview on the cluster status and health, without clicking
>through a lot of pages.
>- Active and stand-by Job Managers. Also their health is depicted as a
>   color (as a first suggestion: last heartbeat is inside 
> heartbeat.timeout)
>   - Current registered Task Managers
>  - the little bar on the side indicates task slot usage. i did
>  not color it since a fully utilised task manager is not necessarily
>  something bad.
>  - the color indicates the health of the task manager (as a first
>  suggestion: last heartbeat is inside heartbeat.timeout)
>   - overview on some cluster metrics
>
> Some points to notice:
>
>- All data you see on the screenshot is mock, no number relates to
>another number at all. but colors should relate to the numbers already
>which they indicate.
>- All of this could also be done with other monitoring solutions
>someone might have in his company, by reading out JMX metrics and then
>plotting those in his monitoring solution (e.g. grafana). But this out of
>the box solution would save everyone from doing it on their own and they
>could trust the metrics shown here.
>- Some of the metrics can only be done with FLINK-7286
> being done. So i
>would split the implementation of this into two parts (cluster overview and
>metrics) and do them separately.
>- This first mock up is targeted to what we here at Zalando would like
>to see first glance, so it fits our use case very well. We mostly use
>long-running session clusters.
>- I'm more a Backend Guy with some Frontend expertise (but mostly in
>React, no angular1 (Flink Web Dashboard is built with this currently)
>experience) and not at all a designer.
>
> What do you think? I would be glad to have some feedback on this,
> especially if this makes sense in the broad community. I would no matter
> what implement this somehow, if not in the Flink Master branch, then as a
> OS project which anyone can deploy next to their flink clusters. But i
> first wanted to run it through here to see if this sparks any interest.
>
> Please also let me know if you see difficulties implementing this already,
> maybe i have overlooked something.
>
> Can't wait for your input.
>
> Cheers
>
> --
>
>
> *Fabian Wollert, Zalando SE*
>
> E-Mail: fab...@zalando.de
>


[DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

2018-10-09 Thread Fabian Wollert
Hi everyone,

disclaimer: i read the contribution guide about improvement requests (i.e.
i should actually just start a jira ticket) but i thought it would make
sense to run this first through the mailing list here. after collecting
some input i would then create the jira ticket.

When accessing the Flink Web Dashboard (which is basically what i do almost
every day to check some status of a job or so), I recently felt that the
actual information given in the top portion of the start page is highly
improvable. I created a first mock by moving html elements around and
wanted to share this one now:

[image: image.png]

With the exception of the metrics (see below) none of this information
should be new, but rather re-organized to speed up investigation and
monitoring:

   - complete overview on the cluster status and health, without clicking
   through a lot of pages.
   - Active and stand-by Job Managers. Also their health is depicted as a
  color (as a first suggestion: last heartbeat is inside heartbeat.timeout)
  - Current registered Task Managers
  - the little bar on the side indicates task slot usage. i did not
  color it since a fully utilised task manager is not necessarily
  something bad.
 - the color indicates the health of the task manager (as a first
 suggestion: last heartbeat is inside heartbeat.timeout)
  - overview on some cluster metrics
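The health rule suggested above reduces to a single comparison against the configured timeout. A sketch (heartbeat.timeout is the existing Flink config key; the class and method names here are invented):

```java
/** Sketch of the proposed heartbeat-based health coloring; class and method names are invented. */
public class HeartbeatHealth {

    /** Healthy iff the last heartbeat arrived within the configured heartbeat.timeout. */
    public static boolean isHealthy(long nowMillis, long lastHeartbeatMillis, long timeoutMillis) {
        return nowMillis - lastHeartbeatMillis <= timeoutMillis;
    }

    /** Map health to the suggested dashboard color. */
    public static String color(boolean healthy) {
        return healthy ? "green" : "red";
    }
}
```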

Some points to notice:

   - All data you see on the screenshot is mock, no number relates to
   another number at all. but colors should relate to the numbers already
   which they indicate.
   - All of this could also be done with other monitoring solutions someone
   might have in his company, by reading out JMX metrics and then plotting
   those in his monitoring solution (e.g. grafana). But this out of the box
   solution would save everyone from doing it on their own and they could
   trust the metrics shown here.
   - Some of the metrics can only be done with FLINK-7286 being done. So i
   would split the implementation of this into two parts (cluster overview and
   metrics) and do them separately.
   - This first mock up is targeted at what we here at Zalando would like
   to see at first glance, so it fits our use case very well. We mostly use
   long-running session clusters.
   - I'm more of a Backend Guy with some Frontend expertise (mostly in
   React; i have no angular1 experience, which is what the Flink Web
   Dashboard is currently built with) and not at all a designer.

What do you think? I would be glad to have some feedback on this,
especially if this makes sense in the broad community. I would no matter
what implement this somehow, if not in the Flink master branch, then as an
OS project which anyone can deploy next to their flink clusters. But i
first wanted to run it through here to see if this sparks any interest.

Please also let me know if you see difficulties implementing this already,
maybe i have overlooked something.

Can't wait for your input.

Cheers

--


*Fabian Wollert, Zalando SE*

E-Mail: fab...@zalando.de


[jira] [Created] (FLINK-10517) Add stability test for the REST API

2018-10-09 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-10517:


 Summary: Add stability test for the REST API
 Key: FLINK-10517
 URL: https://issues.apache.org/jira/browse/FLINK-10517
 Project: Flink
  Issue Type: Improvement
  Components: REST
Affects Versions: 1.7.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.7.0


With the versioning scheme introduced in FLINK-7551, we should add a test that 
no API-breaking changes occur within a given version.
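One possible shape for such a test (a sketch under invented names, not Flink's actual test code) is to snapshot the API surface and diff the current one against it, allowing only additive changes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of a snapshot-style compatibility check; all names are invented. */
public class RestApiStabilityCheck {

    /**
     * Compare a committed snapshot of the API surface (endpoint -> spec) against
     * the current one. Any endpoint that disappeared or changed is breaking;
     * endpoints that only exist in the current surface are additive and allowed.
     */
    public static List<String> breakingChanges(Map<String, String> snapshot,
                                               Map<String, String> current) {
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            String now = current.get(e.getKey());
            if (now == null || !now.equals(e.getValue())) {
                broken.add(e.getKey());
            }
        }
        return broken;
    }
}
```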



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Sharing state between subtasks

2018-10-09 Thread Fabian Hueske
Hi,

I think watermark / event-time skew is a problem that many users are
struggling with.
A built-in primitive to align event-time would be a great feature!

However, there are also some cases where it would be useful for different
streams to have diverging event-time, such as an interval join [1]
(DataStream API) or a time-windowed join (SQL) that joins one stream with
events from another stream that happened between two hours and one hour
earlier. Granted, this is a very specific case and not the norm, but it might
make sense to have it in the back of our heads when designing this feature.

Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/joining.html#interval-join

On Tue, Oct 9, 2018 at 10:25 AM, Aljoscha Krettek <
aljos...@apache.org> wrote:

> Yes, I think this is the way to go.
>
> This would also go well with a redesign of the source interface that has
> been floated for a while now. I also created a prototype a while back:
> https://github.com/aljoscha/flink/tree/refactor-source-interface. Just
> as a refresher, the redesign aims at several things:
>
> 1. Make partitions/splits explicit in the interface. Currently, the fact
> that there are file splits or Kafka partitions or Kinesis shards is hidden
> in the source implementation while it would be beneficial for the system to
> know of these and to be able to track watermarks for them. Currently, there
> is a custom implementation for per-partition watermark tracking in the
> Kafka Consumer that this redesign would obviate.
>
> 2. Split split/partition/shard discovery from the reading part. This would
> allow rebalancing work and again makes the nature of sources more explicit
> in the interfaces.
>
> 3. Go away from the push model to a pull model. The problem with the
> current source interface is that the source controls the read-loop and has
> to get the checkpoint lock for emitting elements/updating state. If we get
> the loop out of the source this leaves more potential for Flink to be
> clever about reading from sources.
>
> The prototype posted above defines three new interfaces: Source,
> SplitEnumerator, and SplitReader, along with a naive example and a working
> Kafka Consumer (with checkpointing, actually).
>
> If we had this source interface, along with a service for propagating
> watermark information, the code that reads from the splits could
> de-prioritise certain splits and we would get the event-time alignment
> behaviour for all sources that are implemented using the new interface
> without requiring special code in each source implementation.
>
> @Elias Do you know if Kafka Consumers do this alignment across multiple
> consumers or only within one Consumer across the partitions that it reads
> from?
>
> > On 9. Oct 2018, at 00:55, Elias Levy 
> wrote:
> >
> > Kafka Streams handles this problem, time alignment, by processing records
> > from the partitions with the lowest timestamp on a best-effort basis.
> > See KIP-353 for the details.  The same could be done within the Kafka
> > source and multiple input stream operators.  I opened FLINK-4558
> >  a while ago regarding
> > this topic.
> >
> > On Mon, Oct 8, 2018 at 3:41 PM Jamie Grier 
> wrote:
> >
> >> I'd be very curious to hear others' thoughts on this..  I would expect
> many
> >> people to have run into similar issues.  I also wonder if anybody has
> >> already been working on similar issues.  It seems there is room for some
> >> core Flink changes to address this as well and I'm guessing people have
> >> already thought about it.
> >>
>
>


Re: Sharing state between subtasks

2018-10-09 Thread Aljoscha Krettek
Yes, I think this is the way to go.

This would also go well with a redesign of the source interface that has been 
floated for a while now. I also created a prototype a while back: 
https://github.com/aljoscha/flink/tree/refactor-source-interface. Just as a 
refresher, the redesign aims at several things:

1. Make partitions/splits explicit in the interface. Currently, the fact that 
there are file splits or Kafka partitions or Kinesis shards is hidden in the 
source implementation while it would be beneficial for the system to know of 
these and to be able to track watermarks for them. Currently, there is a custom 
implementation for per-partition watermark tracking in the Kafka Consumer that 
this redesign would obviate.

2. Split split/partition/shard discovery from the reading part. This would 
allow rebalancing work and again makes the nature of sources more explicit in 
the interfaces.

3. Go away from the push model to a pull model. The problem with the current 
source interface is that the source controls the read-loop and has to get the 
checkpoint lock for emitting elements/updating state. If we get the loop out of 
the source this leaves more potential for Flink to be clever about reading from 
sources.

The prototype posted above defines three new interfaces: Source, 
SplitEnumerator, and SplitReader, along with a naive example and a working 
Kafka Consumer (with checkpointing, actually).

If we had this source interface, along with a service for propagating watermark 
information, the code that reads from the splits could de-prioritise certain 
splits and we would get the event-time alignment behaviour for all sources that 
are implemented using the new interface without requiring special code in each 
source implementation.
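To make the shape of that interface concrete, here is a minimal, self-contained sketch of the pull model. The names SplitEnumerator and SplitReader follow the prototype, but the in-memory "source" and all method signatures here are illustrative assumptions, not the actual prototype code:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only -- the real prototype lives in the branch above.
// A SplitEnumerator discovers splits; a SplitReader pulls records from one
// split. The runtime, not the source, owns the read loop.
interface SplitEnumerator<SplitT> {
    List<SplitT> discoverSplits();
}

interface SplitReader<SplitT, T> {
    void open(SplitT split);
    boolean hasNext();
    T next();
}

// Toy source whose "splits" are plain string lists, standing in for Kafka
// partitions or file splits (an assumption for the sake of the example).
class ListEnumerator implements SplitEnumerator<List<String>> {
    private final List<List<String>> splits;
    ListEnumerator(List<List<String>> splits) { this.splits = splits; }
    @Override public List<List<String>> discoverSplits() { return splits; }
}

class ListReader implements SplitReader<List<String>, String> {
    private Iterator<String> it;
    @Override public void open(List<String> split) { it = split.iterator(); }
    @Override public boolean hasNext() { return it.hasNext(); }
    @Override public String next() { return it.next(); }
}

public class PullSourceSketch {
    // Because the loop lives here (outside the source), the runtime could
    // checkpoint, throttle, or de-prioritise splits between next() calls.
    static List<String> readAll(SplitEnumerator<List<String>> enumerator,
                                SplitReader<List<String>, String> reader) {
        List<String> out = new ArrayList<>();
        for (List<String> split : enumerator.discoverSplits()) {
            reader.open(split);
            while (reader.hasNext()) {
                out.add(reader.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> splits = List.of(List.of("a", "b"), List.of("c"));
        List<String> result = readAll(new ListEnumerator(splits), new ListReader());
        if (!result.equals(List.of("a", "b", "c"))) throw new AssertionError(result);
        System.out.println(result); // [a, b, c]
    }
}
```

The point of the sketch is only the inversion of control: the read loop sits in the runtime, so no checkpoint lock needs to be taken inside the source.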

@Elias Do you know if Kafka Consumers do this alignment across multiple 
consumers, or only within one Consumer across the partitions that it reads from?

> On 9. Oct 2018, at 00:55, Elias Levy  wrote:
> 
> Kafka Streams handles this problem, time alignment, by processing records
> from the partitions with the lowest timestamp in a best effort basis.
> See KIP-353 for the details.  The same could be done within the Kafka
> source and multiple input stream operators.  I opened FLINK-4558
>  a while ago regarding
> this topic.
> 
> On Mon, Oct 8, 2018 at 3:41 PM Jamie Grier  wrote:
> 
>> I'd be very curious to hear others' thoughts on this..  I would expect many
>> people to have run into similar issues.  I also wonder if anybody has
>> already been working on similar issues.  It seems there is room for some
>> core Flink changes to address this as well and I'm guessing people have
>> already thought about it.
>> 
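For readers who have not seen KIP-353, the best-effort alignment Elias describes boils down to always consuming from the input whose head record carries the lowest timestamp. A small self-contained sketch of that selection rule (the Record type and queue-backed partitions are assumptions for illustration, not Kafka or Flink code):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Best-effort time alignment: always consume from the partition whose head
// record has the lowest timestamp, so no partition races ahead in event time.
public class AlignmentSketch {
    static final class Record {
        final long timestamp;
        final String value;
        Record(long timestamp, String value) { this.timestamp = timestamp; this.value = value; }
    }

    // Index of the non-empty partition with the smallest head timestamp,
    // or -1 when every partition is drained.
    static int nextPartition(List<Deque<Record>> partitions) {
        int best = -1;
        long bestTs = Long.MAX_VALUE;
        for (int i = 0; i < partitions.size(); i++) {
            Record head = partitions.get(i).peekFirst();
            if (head != null && head.timestamp < bestTs) {
                bestTs = head.timestamp;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Deque<Record> p0 = new ArrayDeque<>(List.of(new Record(5, "a"), new Record(9, "c")));
        Deque<Record> p1 = new ArrayDeque<>(List.of(new Record(7, "b")));
        List<Deque<Record>> partitions = List.of(p0, p1);
        StringBuilder order = new StringBuilder();
        int i;
        while ((i = nextPartition(partitions)) != -1) {
            order.append(partitions.get(i).pollFirst().value);
        }
        if (!order.toString().equals("abc")) throw new AssertionError(order);
        System.out.println(order); // abc -- drained in global timestamp order
    }
}
```

In a real consumer this can only be best effort, since a temporarily empty partition cannot report a head timestamp; per-split watermark tracking is what would make that caveat manageable.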



Re: [DISCUSS] Dropping flink-storm?

2018-10-09 Thread Fabian Hueske
Yes, let's do it this way.
The wrapper classes are probably not too complex and can be easily tested.
We have the same for the Hadoop interfaces, although I think only the
Input- and OutputFormatWrappers are actually used.


On Tue., Oct 9, 2018 at 09:46, Chesnay Schepler <
ches...@apache.org> wrote:

> That sounds very good to me.
>
> On 08.10.2018 11:36, Till Rohrmann wrote:
> > Good point. The initial idea of this thread was to remove the storm
> > compatibility layer completely.
> >
> > During the discussion I realized that it might be useful for our users
> > to not completely remove it in one go. Instead for those who still
> > want to use some Bolt and Spout code in Flink, it could be nice to
> > keep the wrappers. At least, we could remove flink-storm in a more
> > graceful way by first removing the Topology and client parts and then
> > the wrappers. What do you think?
> >
> > Cheers,
> > Till
> >
> > On Mon, Oct 8, 2018 at 11:13 AM Chesnay Schepler wrote:
> >
> > I don't believe that to be the consensus. For starters it is
> > contradictory; we can't /drop /flink-storm yet still /keep //some
> > parts/.
> >
> > From my understanding we drop flink-storm completely, and put a
> > note in the docs that the bolt/spout wrappers of previous versions
> > will continue to work.
> >
> > On 08.10.2018 11:04, Till Rohrmann wrote:
> >> Thanks for opening the issue Chesnay. I think the overall
> >> consensus is to drop flink-storm and only keep the Bolt and Spout
> >> wrappers. Thanks for your feedback!
> >>
> >> Cheers,
> >> Till
> >>
> >> On Mon, Oct 8, 2018 at 9:37 AM Chesnay Schepler
> >> <ches...@apache.org> wrote:
> >>
> >> I've created
> >> https://issues.apache.org/jira/browse/FLINK-10509 for
> >> removing flink-storm.
> >>
> >> On 28.09.2018 15:22, Till Rohrmann wrote:
> >> > Hi everyone,
> >> >
> >> > I would like to discuss how to proceed with Flink's storm
> >> compatibility
> >> > layer flink-storm.
> >> >
> >> > While working on removing Flink's legacy mode, I noticed
> >> that some parts of
> >> > flink-storm rely on the legacy Flink client. In fact, at
> >> the moment
> >> > flink-storm does not work together with Flink's new
> distributed
> >> > architecture.
> >> >
> >> > I'm also wondering how many people are actually using
> >> Flink's Storm
> >> > compatibility layer and whether it would be worth porting it.
> >> >
> >> > I see two options how to proceed:
> >> >
> >> > 1) Commit to maintain flink-storm and port it to Flink's
> >> new architecture
> >> > 2) Drop flink-storm
> >> >
> >> > I doubt that we can contribute it to Apache Bahir [1],
> >> because once we
> >> > remove the legacy mode, this module will no longer work
> >> with all newer
> >> > Flink versions.
> >> >
> >> > Therefore, I would like to hear your opinion on this and in
> >> particular if
> >> > you are using or planning to use flink-storm in the future.
> >> >
> >> > [1] https://github.com/apache/bahir-flink
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >>
> >
>
>


Re: [DISCUSS] Dropping flink-storm?

2018-10-09 Thread Chesnay Schepler

That sounds very good to me.

On 08.10.2018 11:36, Till Rohrmann wrote:
Good point. The initial idea of this thread was to remove the storm 
compatibility layer completely.


During the discussion I realized that it might be useful for our users 
to not completely remove it in one go. Instead for those who still 
want to use some Bolt and Spout code in Flink, it could be nice to 
keep the wrappers. At least, we could remove flink-storm in a more 
graceful way by first removing the Topology and client parts and then 
the wrappers. What do you think?


Cheers,
Till

On Mon, Oct 8, 2018 at 11:13 AM Chesnay Schepler wrote:


I don't believe that to be the consensus. For starters it is
contradictory; we can't /drop /flink-storm yet still /keep //some
parts/.

From my understanding we drop flink-storm completely, and put a
note in the docs that the bolt/spout wrappers of previous versions
will continue to work.

On 08.10.2018 11:04, Till Rohrmann wrote:

Thanks for opening the issue Chesnay. I think the overall
consensus is to drop flink-storm and only keep the Bolt and Spout
wrappers. Thanks for your feedback!

Cheers,
Till

On Mon, Oct 8, 2018 at 9:37 AM Chesnay Schepler
<ches...@apache.org> wrote:

I've created
https://issues.apache.org/jira/browse/FLINK-10509 for
removing flink-storm.

On 28.09.2018 15:22, Till Rohrmann wrote:
> Hi everyone,
>
> I would like to discuss how to proceed with Flink's storm
compatibility
> layer flink-storm.
>
> While working on removing Flink's legacy mode, I noticed
that some parts of
> flink-storm rely on the legacy Flink client. In fact, at
the moment
> flink-storm does not work together with Flink's new distributed
> architecture.
>
> I'm also wondering how many people are actually using
Flink's Storm
> compatibility layer and whether it would be worth porting it.
>
> I see two options how to proceed:
>
> 1) Commit to maintain flink-storm and port it to Flink's
new architecture
> 2) Drop flink-storm
>
> I doubt that we can contribute it to Apache Bahir [1],
because once we
> remove the legacy mode, this module will no longer work
with all newer
> Flink versions.
>
> Therefore, I would like to hear your opinion on this and in
particular if
> you are using or planning to use flink-storm in the future.
>
> [1] https://github.com/apache/bahir-flink
>
> Cheers,
> Till
>







Re: [DISCUSS] [Contributing] (2) - Review Steps

2018-10-09 Thread Till Rohrmann
+1

On Tue, Oct 9, 2018 at 9:08 AM Zhijiang(wangzhijiang999) wrote:

> +1
> --
> From: vino yang 
> Sent: Tuesday, Oct 9, 2018, 14:08
> To: dev 
> Subject: Re: [DISCUSS] [Contributing] (2) - Review Steps
>
> +1
>
> Peter Huang  wrote on Tuesday, Oct 9, 2018, at 1:54 PM:
>
> > +1
> >
> > On Mon, Oct 8, 2018 at 7:47 PM Thomas Weise  wrote:
> >
> > > +1
> > >
> > >
> > > > On Mon, Oct 8, 2018 at 7:36 PM Tzu-Li Chen  wrote:
> > >
> > > > +1
> > > >
> > > > Jin Sun  wrote on Tuesday, Oct 9, 2018, at 2:10 AM:
> > > >
> > > > > +1, look forward to see the change.
> > > > >
> > > > > > On Oct 9, 2018, at 12:07 AM, Fabian Hueske 
> > > wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > Since we have addressed all comments (please raise your voice if
> > > > not!), I
> > > > > > would like to move forward and convert the proposal [1] into a
> page
> > > for
> > > > > > Flink's website [2].
> > > > > > I will create a pull request against the website repo [3].
> > > > > >
> > > > > > Once the page got merged, we can start posting the review form on
> > new
> > > > > pull
> > > > > > requests.
> > > > > >
> > > > > > Best, Fabian
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1yaX2b9LNh-6LxrAmE23U3D2cRbocGlGKCYnvJd9lVhk
> > > > > > [2] https://flink.apache.org
> > > > > > [3] https://github.com/apache/flink-web
> > > > > >
> > > > > > On Tue., Sep 25, 2018 at 17:56, Tzu-Li Chen <
> > > > > wander4...@gmail.com
> > > > > >> wrote:
> > > > > >
> > > > > >> I agree with Chesnay that we don't guarantee (quick) review of a
> > PR
> > > at
> > > > > the
> > > > > >> project level. As ASF statement[1]:
> > > > > >>
> > > > > >>> Please show some patience with the developers if your patch is
> > not
> > > > > >> applied as fast as you'd like or a developer asks you to make
> > > changes
> > > > to
> > > > > >> the patch. If you do not receive any feedback in a reasonable
> > amount
> > > > of
> > > > > >> time (say a week or two), feel free to send a follow-up e-mail
> to
> > > the
> > > > > >> developer list. Open Source developers are all volunteers, often
> > > doing
> > > > > the
> > > > > >> development in their spare time.
> > > > > >>
> > > > > >> However, an open source community shows its friendliness to
> > > > > contributors.
> > > > > >> Thus contributors believe their contribution would be taken care
> > > > > >> of, or even be rejected with a reason; project members are thought
> > > > > >> to be kind enough to provide help throughout the process.
> > > > > >>
> > > > > >> Just like this thread kicked off, it is glad to see that Flink
> > > > community
> > > > > >> try best to help its contributors and committers, then take
> > > advantage
> > > > of
> > > > > >> "open source".
> > > > > >>
> > > > > >> Best,
> > > > > >> tison.
> > > > > >>
> > > > > >> [1] http://www.apache.org/dev/contributors#patches
> > > > > >>
> > > > > >>
> > > > > >> Chesnay Schepler  wrote on Tuesday, Sep 25, 2018, at 11:21 PM:
> > > > > >>
> > > > > >>> There is no guarantee that a PR will be looked at nor is it
> > > possible
> > > > to
> > > > > >>> provide this in any way on the project level.
> > > > > >>>
> > > > > >>> As far as Apache is concerned all contributors/committers etc.
> > work
> > > > > >>> voluntarily, and
> > > > > >>> as such assigning work (which includes ownership if it implies
> > > such)
> > > > or
> > > > > >>> similar is simply not feasible.
> > > > > >>>
> > > > > >>> On 25.09.2018 16:54, Thomas Weise wrote:
> > > > >  I think that all discussion/coordination related to a
> > > contribution /
> > > > > PR
> > > > >  should be handled through the official project channel.
> > > > > 
> > > > >  I would also prefer that there are no designated "owners" and
> > > > > >> "experts",
> > > > >  for the reasons Fabian mentioned.
> > > > > 
> > > > >  Ideally there is no need to have "suggested reviewers" either,
> > but
> > > > > then
> > > > >  what will be the process to ensure that PRs will be looked at?
> > > > > 
> > > > >  Thanks,
> > > > >  Thomas
> > > > > 
> > > > > 
> > > > > 
> > > > >  On Tue, Sep 25, 2018 at 6:17 AM Tzu-Li Chen <
> > wander4...@gmail.com
> > > >
> > > > > >>> wrote:
> > > > > 
> > > > > > Hi Fabian,
> > > > > >
> > > > > > You convinced me. I miss the advantage we can take from
> mailing
> > > > > lists.
> > > > > >
> > > > > > Now I am of the same opinion.
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > >
> > > > > > Fabian Hueske  wrote on Tuesday, Sep 25, 2018, at 3:01 PM:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> I think questions about Flink should be posted on the public
> > > > mailing
> > > > > > lists
> > > > > >> instead of asking just a single expert.
> > > > > >>
> > > > > >> There's many reasons for that:
> > > > > >> * usually more than one person can answer the question (what if
> > > > > >> the expert is not available?)

[jira] [Created] (FLINK-10516) YarnApplicationMasterRunner fail to initialize FileSystem with correct Flink Configuration during setup

2018-10-09 Thread Shuyi Chen (JIRA)
Shuyi Chen created FLINK-10516:
--

 Summary: YarnApplicationMasterRunner fail to initialize FileSystem 
with correct Flink Configuration during setup
 Key: FLINK-10516
 URL: https://issues.apache.org/jira/browse/FLINK-10516
 Project: Flink
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.6.0, 1.5.0, 1.4.0, 1.7.0
Reporter: Shuyi Chen
Assignee: Shuyi Chen
 Fix For: 1.7.0


Will add a fix and refactor YarnApplicationMasterRunner, adding a unit test to 
prevent future regressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Handling burst I/O when using tumbling/sliding windows

2018-10-09 Thread Piotr Nowojski
Hi,

Sorry for getting back so late and thanks for the improved document :) I think 
now I got your idea.

Are you now trying (or have you already done it?) to implement a custom window 
assigner that would work as in Figure 3 of your document?


I think that indeed should be possible and relatively easy to do without the 
need for API changes.
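A minimal sketch of that idea as I understand it: the window-start arithmetic mirrors the usual tumbling-window formula, while `offsetForKey` is a purely hypothetical function (any deterministic spread of keys over the window length would do). This is not Flink's WindowAssigner API, just the underlying arithmetic:

```java
// Key-dependent window offsets: boundaries (and hence output bursts) for
// different keys are spread over the window length instead of firing at once.
public class OffsetWindowSketch {
    // Hypothetical offset function: derive a stable per-key shift in [0, size).
    static long offsetForKey(String key, long windowSize) {
        return Math.floorMod(key.hashCode(), windowSize);
    }

    // Tumbling-window start arithmetic, with the offset taken from the key
    // rather than from a fixed configuration value.
    static long windowStart(long timestamp, long windowSize, String key) {
        long offset = offsetForKey(key, windowSize);
        return timestamp - Math.floorMod(timestamp - offset, windowSize);
    }

    public static void main(String[] args) {
        long size = 60_000L; // 1-minute tumbling windows
        long t = 123_456L;
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            long start = windowStart(t, size, key);
            // Every key's window still contains the timestamp...
            if (start > t || t >= start + size) throw new AssertionError(key);
            // ...but its boundaries are shifted by that key's own offset.
            if (Math.floorMod(start, size) != offsetForKey(key, size)) throw new AssertionError(key);
        }
        System.out.println("per-key offsets keep t inside each shifted window");
    }
}
```

Note that record timestamps themselves are never modified; only the assignment of records to windows changes.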

Piotrek

> On 1 Oct 2018, at 17:48, Rong Rong  wrote:
> 
> Hi Piotrek,
> 
> Thanks for the quick response. To follow up with the questions:
> Re 1). Yes it is causing network I/O issues on Kafka itself.
> 
> Re 2a). Actually, I thought about it last weekend and I think there's a way
> for a workaround: we directly duplicated the key extraction logic in our
> window assigner. Since the element record is passed in, it should be OK to
> create a customized window assigner that handles key-based offsets by
> extracting the key from the record.
> This was the main part of my change: to let WindowAssignerContext
> provide the current key information extracted from the KeyedStateBackend.
> 
> Re 2b). Thanks for the explanation, we will try to profile it! We've seen
> some weird behaviors previously when loading up the network buffer in
> Flink, although they're very rare and inconsistent to reproduce.
> 
> Re 3) Regarding the event time offset. I think I might not have explained my
> idea clearly. I added some more details to the doc. Please kindly take a
> look.
> In a nutshell, window offsets do not change the event time of records at
> all. We simply change how the window assigner assigns records to windows
> with various different offsets.
> 
> --
> Rong
> 
> On Fri, Sep 28, 2018 at 8:03 AM Piotr Nowojski  wrote:
> 
>> Hi,
>> 
>> Thanks for the response again :)
>> 
>> Re 1). Do you mean that this extra burst external I/O network traffic is
>> causing disturbance with other systems reading/writing from Kafka? With
>> Kafka itself?
>> 
>> Re 2a) Yes, it should be relatively simple, however any new brick makes
>> the overall component more and more complicated, which has long term
>> consequences in maintenance/refactoring/adding new features/just making
>> reading the code more difficult etc.
>> 
>> Re 2b) With setup of:
>> 
>> WindowOperator -> RateLimitingOperator(maxSize = 0) -> Sink
>> 
>> RateLimitingOperator would just slow down data processing via standard
>> back pressure mechanism. Flink by default allocates 10% of the memory to
>> Network buffers we could partially relay on them to buffer some smaller
>> bursts, without blocking whole pipeline altogether. Essentially
>> RateLimitingOperator(maxSize = 0) would cause back pressure and slow down
>> record emission from the WindowOperator. So yes, there would be still batch
>> emission of the data in the WindowOperator itself, but it would be
>> prolonged/slowed down in terms of wall time because of down stream back
>> pressure caused by RateLimitingOperator.
>> 
>> Btw, with your proposal, with what event time do you want to emit the
>> delayed data? If the event time of the produced records changes based on
>> using/not using windows offsets, this can cause quite a lot of semantic
>> problems and side effects for the downstream operators.
>> 
>> Piotrek
>> 
>>> On 28 Sep 2018, at 15:18, Rong Rong  wrote:
>>> 
>>> Hi Piotrek,
>>> 
>>> Thanks for getting back to me so quickly. Let me explain.
>>> 
>>> Re 1). As I explained in the doc. we are using a basic Kafka-in Kafka-out
>>> system with same partition number on both side. It is causing degraded
>>> performance in external I/O network traffic.
>>> It is definitely possible to configure more resource (e.g. larger
>> partition
>>> count) for output to handle the burst but it can also be resolved through
>>> some sort of smoothing through internal (either through rate limiting as
>>> you suggested, or through the dynamic offset).
>>> 
>>> Re 2a). Yes I agree and I think I understand your concern. However it is
>>> one simple API addition with default fallbacks that are fully
>>> backward-compatible (or I think it can be made fully compatible if I missed
>> any
>>> corner cases).
>>> Re 2b). Yes. there could be many potential issues that causes data burst.
>>> However, putting aside the scenarios that was caused by the nature of the
>>> stream (data skew, bursts) that both affects input and output. We want to
>>> address specifically the case that a smooth input is *deterministically*
>>> resulting in burst output. What we are proposing here is kind of exactly
>>> like the case of users' custom operator. However we can't do so unless
>>> there's an API to control the offset.
>>> 
>>> Regarding the problem of rate limiting and skew. I think I missed one key
>>> point from you. I think you are right. If we introduce a *new rate
>> limiting
>>> operator *(with size > 0) it will
>>> - causes extra state usage within the container (moving all the
>>> components from the window operator and storing them in the rate limit
>>> buffer at window boundaries).
>>> - will not cause 
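To make the RateLimitingOperator discussion above concrete, here is a toy bounded buffer. The name, capacity, and `offer`/`drainOne` methods are made up for illustration; in Flink the failed `offer` would manifest as network back pressure on the upstream WindowOperator rather than an explicit return value:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy bounded buffer illustrating the rate-limiting idea: once the buffer is
// full, the producer must back off, which smooths a burst over wall time.
public class RateLimitSketch {
    private final int capacity;
    private final Deque<String> buffer = new ArrayDeque<>();

    RateLimitSketch(int capacity) { this.capacity = capacity; }

    // false = "back pressure": the caller has to wait before emitting more.
    boolean offer(String record) {
        if (buffer.size() >= capacity) {
            return false;
        }
        buffer.addLast(record);
        return true;
    }

    String drainOne() { return buffer.pollFirst(); }

    public static void main(String[] args) {
        RateLimitSketch limiter = new RateLimitSketch(2);
        if (!limiter.offer("a") || !limiter.offer("b")) throw new AssertionError();
        if (limiter.offer("c")) throw new AssertionError("expected back pressure");
        limiter.drainOne(); // downstream consumed one record
        if (!limiter.offer("c")) throw new AssertionError();
        System.out.println("burst smoothed by back pressure");
    }
}
```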

Re: [DISCUSS] [Contributing] (2) - Review Steps

2018-10-09 Thread Zhijiang(wangzhijiang999)
+1
--
From: vino yang 
Sent: Tuesday, Oct 9, 2018, 14:08
To: dev 
Subject: Re: [DISCUSS] [Contributing] (2) - Review Steps

+1

Peter Huang  wrote on Tuesday, Oct 9, 2018, at 1:54 PM:

> +1
>
> On Mon, Oct 8, 2018 at 7:47 PM Thomas Weise  wrote:
>
> > +1
> >
> >
> > On Mon, Oct 8, 2018 at 7:36 PM Tzu-Li Chen  wrote:
> >
> > > +1
> > >
> > > Jin Sun  wrote on Tuesday, Oct 9, 2018, at 2:10 AM:
> > >
> > > > +1, look forward to see the change.
> > > >
> > > > > On Oct 9, 2018, at 12:07 AM, Fabian Hueske 
> > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > Since we have addressed all comments (please raise your voice if
> > > not!), I
> > > > > would like to move forward and convert the proposal [1] into a page
> > for
> > > > > Flink's website [2].
> > > > > I will create a pull request against the website repo [3].
> > > > >
> > > > > Once the page got merged, we can start posting the review form on
> new
> > > > pull
> > > > > requests.
> > > > >
> > > > > Best, Fabian
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1yaX2b9LNh-6LxrAmE23U3D2cRbocGlGKCYnvJd9lVhk
> > > > > [2] https://flink.apache.org
> > > > > [3] https://github.com/apache/flink-web
> > > > >
> > > > > On Tue., Sep 25, 2018 at 17:56, Tzu-Li Chen <
> > > > wander4...@gmail.com
> > > > >> wrote:
> > > > >
> > > > >> I agree with Chesnay that we don't guarantee (quick) review of a
> PR
> > at
> > > > the
> > > > >> project level. As ASF statement[1]:
> > > > >>
> > > > >>> Please show some patience with the developers if your patch is
> not
> > > > >> applied as fast as you'd like or a developer asks you to make
> > changes
> > > to
> > > > >> the patch. If you do not receive any feedback in a reasonable
> amount
> > > of
> > > > >> time (say a week or two), feel free to send a follow-up e-mail to
> > the
> > > > >> developer list. Open Source developers are all volunteers, often
> > doing
> > > > the
> > > > >> development in their spare time.
> > > > >>
> > > > >> However, an open source community shows its friendliness to
> > > > contributors.
> > > > >> Thus contributors believe their contribution would be taken care
> > > > >> of, or even be rejected with a reason; project members are thought
> > > > >> to be kind enough to provide help throughout the process.
> > > > >>
> > > > >> Just like this thread kicked off, it is glad to see that Flink
> > > community
> > > > >> try best to help its contributors and committers, then take
> > advantage
> > > of
> > > > >> "open source".
> > > > >>
> > > > >> Best,
> > > > >> tison.
> > > > >>
> > > > >> [1] http://www.apache.org/dev/contributors#patches
> > > > >>
> > > > >>
> > > > >> Chesnay Schepler  wrote on Tuesday, Sep 25, 2018, at 11:21 PM:
> > > > >>
> > > > >>> There is no guarantee that a PR will be looked at nor is it
> > possible
> > > to
> > > > >>> provide this in any way on the project level.
> > > > >>>
> > > > >>> As far as Apache is concerned all contributors/committers etc.
> work
> > > > >>> voluntarily, and
> > > > >>> as such assigning work (which includes ownership if it implies
> > such)
> > > or
> > > > >>> similar is simply not feasible.
> > > > >>>
> > > > >>> On 25.09.2018 16:54, Thomas Weise wrote:
> > > >  I think that all discussion/coordination related to a
> > contribution /
> > > > PR
> > > >  should be handled through the official project channel.
> > > > 
> > > >  I would also prefer that there are no designated "owners" and
> > > > >> "experts",
> > > >  for the reasons Fabian mentioned.
> > > > 
> > > >  Ideally there is no need to have "suggested reviewers" either,
> but
> > > > then
> > > >  what will be the process to ensure that PRs will be looked at?
> > > > 
> > > >  Thanks,
> > > >  Thomas
> > > > 
> > > > 
> > > > 
> > > >  On Tue, Sep 25, 2018 at 6:17 AM Tzu-Li Chen <
> wander4...@gmail.com
> > >
> > > > >>> wrote:
> > > > 
> > > > > Hi Fabian,
> > > > >
> > > > > You convinced me. I miss the advantage we can take from mailing
> > > > lists.
> > > > >
> > > > > Now I am of the same opinion.
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > Fabian Hueske  wrote on Tuesday, Sep 25, 2018, at 3:01 PM:
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >> I think questions about Flink should be posted on the public
> > > mailing
> > > > > lists
> > > > >> instead of asking just a single expert.
> > > > >>
> > > > >> There's many reasons for that:
> > > > >> * usually more than one person can answer the question (what
> if
> > > the
> > > > > expert
> > > > >> is not available?)
> > > > >> * non-committers can join the discussion and contribute to the
> > > > >>> community
> > > > >> (how can they become experts otherwise?)
> > > > >> * the knowledge is shared on the mailing list (helps in cases
> > when
> > > > >> only
> > > > > one
> > > > >> person can answer the question)

Re: [DISCUSS] [Contributing] (2) - Review Steps

2018-10-09 Thread vino yang
+1

Peter Huang  wrote on Tuesday, Oct 9, 2018, at 1:54 PM:

> +1
>
> On Mon, Oct 8, 2018 at 7:47 PM Thomas Weise  wrote:
>
> > +1
> >
> >
> > On Mon, Oct 8, 2018 at 7:36 PM Tzu-Li Chen  wrote:
> >
> > > +1
> > >
> > > Jin Sun  wrote on Tuesday, Oct 9, 2018, at 2:10 AM:
> > >
> > > > +1, look forward to see the change.
> > > >
> > > > > On Oct 9, 2018, at 12:07 AM, Fabian Hueske 
> > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > Since we have addressed all comments (please raise your voice if
> > > not!), I
> > > > > would like to move forward and convert the proposal [1] into a page
> > for
> > > > > Flink's website [2].
> > > > > I will create a pull request against the website repo [3].
> > > > >
> > > > > Once the page got merged, we can start posting the review form on
> new
> > > > pull
> > > > > requests.
> > > > >
> > > > > Best, Fabian
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1yaX2b9LNh-6LxrAmE23U3D2cRbocGlGKCYnvJd9lVhk
> > > > > [2] https://flink.apache.org
> > > > > [3] https://github.com/apache/flink-web
> > > > >
> > > > > On Tue., Sep 25, 2018 at 17:56, Tzu-Li Chen <
> > > > wander4...@gmail.com
> > > > >> wrote:
> > > > >
> > > > >> I agree with Chesnay that we don't guarantee (quick) review of a
> PR
> > at
> > > > the
> > > > >> project level. As ASF statement[1]:
> > > > >>
> > > > >>> Please show some patience with the developers if your patch is
> not
> > > > >> applied as fast as you'd like or a developer asks you to make
> > changes
> > > to
> > > > >> the patch. If you do not receive any feedback in a reasonable
> amount
> > > of
> > > > >> time (say a week or two), feel free to send a follow-up e-mail to
> > the
> > > > >> developer list. Open Source developers are all volunteers, often
> > doing
> > > > the
> > > > >> development in their spare time.
> > > > >>
> > > > >> However, an open source community shows its friendliness to
> > > > contributors.
> > > > >> Thus contributors believe their contribution would be taken care
> > > > >> of, or even be rejected with a reason; project members are thought
> > > > >> to be kind enough to provide help throughout the process.
> > > > >>
> > > > >> Just like this thread kicked off, it is glad to see that Flink
> > > community
> > > > >> try best to help its contributors and committers, then take
> > advantage
> > > of
> > > > >> "open source".
> > > > >>
> > > > >> Best,
> > > > >> tison.
> > > > >>
> > > > >> [1] http://www.apache.org/dev/contributors#patches
> > > > >>
> > > > >>
> > > > >> Chesnay Schepler  wrote on Tuesday, Sep 25, 2018, at 11:21 PM:
> > > > >>
> > > > >>> There is no guarantee that a PR will be looked at nor is it
> > possible
> > > to
> > > > >>> provide this in any way on the project level.
> > > > >>>
> > > > >>> As far as Apache is concerned all contributors/committers etc.
> work
> > > > >>> voluntarily, and
> > > > >>> as such assigning work (which includes ownership if it implies
> > such)
> > > or
> > > > >>> similar is simply not feasible.
> > > > >>>
> > > > >>> On 25.09.2018 16:54, Thomas Weise wrote:
> > > >  I think that all discussion/coordination related to a
> > contribution /
> > > > PR
> > > >  should be handled through the official project channel.
> > > > 
> > > >  I would also prefer that there are no designated "owners" and
> > > > >> "experts",
> > > >  for the reasons Fabian mentioned.
> > > > 
> > > >  Ideally there is no need to have "suggested reviewers" either,
> but
> > > > then
> > > >  what will be the process to ensure that PRs will be looked at?
> > > > 
> > > >  Thanks,
> > > >  Thomas
> > > > 
> > > > 
> > > > 
> > > >  On Tue, Sep 25, 2018 at 6:17 AM Tzu-Li Chen <
> wander4...@gmail.com
> > >
> > > > >>> wrote:
> > > > 
> > > > > Hi Fabian,
> > > > >
> > > > > You convinced me. I miss the advantage we can take from mailing
> > > > lists.
> > > > >
> > > > > Now I am of the same opinion.
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > Fabian Hueske  wrote on Tuesday, Sep 25, 2018, at 3:01 PM:
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >> I think questions about Flink should be posted on the public
> > > mailing
> > > > > lists
> > > > >> instead of asking just a single expert.
> > > > >>
> > > > >> There's many reasons for that:
> > > > >> * usually more than one person can answer the question (what
> if
> > > the
> > > > > expert
> > > > >> is not available?)
> > > > >> * non-committers can join the discussion and contribute to the
> > > > >>> community
> > > > >> (how can they become experts otherwise?)
> > > > >> * the knowledge is shared on the mailing list (helps in cases
> > when
> > > > >> only
> > > > > one
> > > > >> person can answer the question)
> > > > >>
> > > > >> Last but not least, my concern is that committers for popular
> > > > > contribution
> > > > >> areas would