[jira] [Created] (FLINK-6259) Fix a small spelling error

2017-04-03 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6259:
--

 Summary: Fix a small spelling error
 Key: FLINK-6259
 URL: https://issues.apache.org/jira/browse/FLINK-6259
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Reporter: sunjincheng
Assignee: sunjincheng


flink-gelly-scala/pom.xml: {{har-with-dependencies}} -> {{jar-with-dependencies}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] FLIP-18: Code Generation for improving sorting performance

2017-04-03 Thread Pattarawat Chormai
Hi guys,

I have made an additional optimization[1] related to the megamorphic call issue
that Greg mentioned earlier. The optimization[2] improves execution time by
around 13%, while the original code from FLINK-5734 improves it by around 11%.

IMHO, the improvement from the megamorphic call optimization is very small
compared to the amount of code we would have to introduce. So, I think we can
just go with the PR that we currently have. What do you think?

[1] https://github.com/heytitle/flink/commit/8e38b4d738b9953337361c62a8d77e909327d28f
[2] https://docs.google.com/spreadsheets/d/1PcdCdFX4bGecO6Lb5dLI2nww2NoeaA8c3MgbEdsVmk0/edit#gid=598217386
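For readers following along: a call site is megamorphic when it dispatches to three or more concrete receiver types, at which point the JVM's JIT stops inlining it. The sketch below is purely illustrative (it is not the actual sorter code from FLINK-5734; all names are made up) and only shows the dispatch pattern that code generation avoids by emitting a monomorphic, type-specialized comparator instead:

```java
// Illustrative only: a single call site that sees three concrete
// implementations of KeyNormalizer, making it megamorphic. At that point
// the JIT can no longer inline normalize(), which is the dispatch overhead
// under discussion.
interface KeyNormalizer { int normalize(int key); }

class IdentityNormalizer implements KeyNormalizer { public int normalize(int k) { return k; } }
class NegatingNormalizer implements KeyNormalizer { public int normalize(int k) { return -k; } }
class AbsNormalizer implements KeyNormalizer { public int normalize(int k) { return Math.abs(k); } }

class Dispatch {
    // One shared call site: n.normalize(key) dispatches over all three types.
    static int sum(KeyNormalizer[] normalizers, int key) {
        int s = 0;
        for (KeyNormalizer n : normalizers) {
            s += n.normalize(key); // megamorphic call site
        }
        return s;
    }
}
```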

Best,
Pat



--
View this message in context: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-18-Code-Generation-for-improving-sorting-performance-tp16486p16923.html
Sent from the Apache Flink Mailing List archive. mailing list archive at 
Nabble.com.


Flink limitations under Beam

2017-04-03 Thread amir bahmanyari
Hi colleagues, it's been a long time. New project: which feature(s)/capabilities
of Flink would become unavailable or limited if the pipeline app is written with
the Beam SDK using the FlinkRunner?
Thanks+regards

[jira] [Created] (FLINK-6258) Deprecate ListCheckpointed interface for managed operator state

2017-04-03 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-6258:
--

 Summary: Deprecate ListCheckpointed interface for managed operator 
state
 Key: FLINK-6258
 URL: https://issues.apache.org/jira/browse/FLINK-6258
 Project: Flink
  Issue Type: Improvement
  Components: State Backends, Checkpointing
Reporter: Tzu-Li (Gordon) Tai


Per discussion in https://github.com/apache/flink/pull/3508, we consider 
deprecating the `ListCheckpointed` interface to discourage Java serialization 
shortcuts for state registrations (towards this, the Java serialization 
shortcuts provided by the `OperatorStateStore` interface have already been 
deprecated in https://github.com/apache/flink/pull/3508).

We should also remember to update 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/state.html#using-managed-keyed-state
 if we decide to do this.
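For context, the interface being deprecated has the following shape (mirroring the Flink 1.2 signature; reproduced here as a self-contained stand-in, and `CountingFunction` is an illustrative example, not Flink code). Its weakness is that the `T extends Serializable` contract pushes users toward plain Java serialization for their state:

```java
// Stand-in mirroring Flink's ListCheckpointed<T> interface (Flink 1.2
// signature), reproduced so the sketch compiles on its own. State elements
// must be Java-serializable, which is the shortcut being discouraged.
interface ListCheckpointed<T extends java.io.Serializable> {
    java.util.List<T> snapshotState(long checkpointId, long timestamp) throws Exception;
    void restoreState(java.util.List<T> state) throws Exception;
}

// Illustrative operator-style function: counts elements and checkpoints the
// count as a one-element list; on restore, partitions are summed back together.
class CountingFunction implements ListCheckpointed<Long> {
    private long count;

    void processElement() { count++; }
    long getCount() { return count; }

    @Override
    public java.util.List<Long> snapshotState(long checkpointId, long timestamp) {
        return java.util.Collections.singletonList(count);
    }

    @Override
    public void restoreState(java.util.List<Long> state) {
        count = 0;
        for (Long c : state) {
            count += c; // merge redistributed state partitions
        }
    }
}
```

The replacement path discussed in the PR is `CheckpointedFunction` with explicitly registered list state and serializers, which avoids this implicit Java-serialization contract.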



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Flink streaming job with iterations gets stuck waiting for network buffers

2017-04-03 Thread Gábor Hermann

Hi Andrey,

As Paris explained, this is a known issue and there are ongoing
efforts to solve it.


I can suggest a workaround: manually limit the number of messages sent into
the iteration. You can do this with, e.g., a Map operator that limits records
per second and simply forwards what it has received. You can check at every
incoming record whether the limit has been reached, and if so, Thread.sleep
until the next second. You could place this Map operator before the operator
that ingests data into the iteration (the operator with ID 9 in your dataflow
graph). This way you can avoid overloading the network inside the iteration,
and thus avoid the deadlock caused by backpressure.
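The throttling logic just described can be sketched in plain Java (kept self-contained so it runs outside Flink; in a real job it would sit inside a Map operator placed before the iteration head, and the class name and limit here are illustrative assumptions, not Flink API):

```java
// Sketch of per-second record throttling: pass every record through, but
// once the budget for the current one-second window is used up, sleep out
// the rest of the window before admitting more records.
class RecordThrottle {
    private final int maxPerSecond;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    RecordThrottle(int maxPerSecond) {
        this.maxPerSecond = maxPerSecond;
    }

    // Call once per incoming record before forwarding it downstream.
    void acquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 1000) { // a new one-second window began
            windowStart = now;
            count = 0;
        }
        count++;
        if (count > maxPerSecond) { // budget exhausted: wait out the window
            try {
                Thread.sleep(1000 - (now - windowStart));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            windowStart = System.currentTimeMillis();
            count = 1;
        }
    }
}
```

A Map operator would call `acquire()` in its map method and then return the input record unchanged.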


This approach is, of course, a bit hacky. Also, it does not eliminate the
possibility of a deadlock entirely. Another disadvantage is that you have to
tune the ingestion rate manually, and the right rate can depend on a lot of
things: the data load, the number of operator instances, the placement of
operator instances, etc. But I have used something like this as a temporary
workaround until we see more progress on FLIP-15.


Cheers,
Gabor


On 2017-04-03 13:33, Paris Carbone wrote:

Hi Andrey,

If I am not mistaken, this sounds like a known deadlock case that can be caused
by the combination of Flink's backpressure mechanism with iterations (more
likely when there is heavy feedback load).
Keep in mind that, currently, iterations are perhaps the only unstable feature.
The good news is that a complete redesign is planned (partly FLIP-15 [1]) that
aims to entirely address this pending flow control issue as well.

Increasing network buffers or feedback queue capacity to a really high number 
decreases the possibility of the deadlock but does not eliminate it.
I really cannot think of a quick solution to the problem that does not involve 
some deep changes.

I am CCing dev since this seems like a very relevant use case to revive the 
discussion for the loops redesign and also keep you in the loop (no pun 
intended) regarding this specific issue.
Will also update FLIP-15 with several interesting proposals under discussion 
from Stephan to tackle this issue.

cheers,
Paris

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-15+Scoped+Loops+and+Job+Termination


On 3 Apr 2017, at 12:54, Andrey Melentyev wrote:

Hi,

I have a Flink 1.2.0 streaming job using a number of stateful operators and an
iteration loop with a RichFlatMapFunction inside. At a high level, the app
reads some data, massages it, and feeds it into an iterative algorithm which
produces some output and feedback while keeping state. All stateful operators
are on KeyedStreams. Input is some data on the file system and output is
stdout.

The implementation passes functional tests, but when tested with noticeable
amounts of input data (tens of thousands of records, dozens of MB of raw data),
after a few seconds of good throughput backpressure kicks in and the
application essentially gets stuck: most of the threads are blocked waiting for
buffers, and an occasional message gets processed every few minutes. There is
nothing strange in the log files.

The behaviour is reproducible both in the local execution environment and in a
Flink standalone cluster (started using jobmanager.sh and taskmanager.sh).

The problematic part is likely in the iterations since the part of the job 
before iterations works fine with the same data.

I would appreciate pointers as to how to debug this.
taskmanager.network.numberOfBuffers from the config sounds relevant, but the
default value of 2048 is already much higher than slots-per-TM^2 * #TMs * 4 =
4^2 * 1 * 4 = 64.

Attaching the Flink config, job execution plan, and a thread dump with some
sensitive parts redacted.

flink-conf.yml

jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 512
taskmanager.heap.mb: 8192
taskmanager.numberOfTaskSlots: 4
taskmanager.memory.preallocate: false
parallelism.default: 4
jobmanager.web.port: 8081
state.backend: rocksdb
state.backend.fs.checkpointdir: 
file:///Users/andrey.melentyev/tmp/flink-checkpoints

Job execution plan

{
  "nodes": [
    {
      "contents": "IterationSource-10",
      "id": -1,
      "pact": "Data Source",
      "parallelism": 8,
      "type": "IterationSource-10"
    },
    {
      "contents": "Source: Custom File Source",
      "id": 1,
      "pact": "Data Source",
      "parallelism": 1,
      "type": "Source: Custom File Source"
    },
    {
      "contents": "Split Reader: Custom File Source",
      "id": 2,
      "pact": "Operator",
      "parallelism": 8,
      "predecessors": [
        {
          "id": 1,
          "ship_strategy": "REBALANCE",
          "side": "second"
        }
      ],
      "type": "Split Reader: Custom File Source"
    },
    {
      "contents": "Parse JSON",
      "id": 3,
      "pact": "Operator",

Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Aljoscha Krettek
I created a PR for the revert: https://github.com/apache/flink/pull/3664

> On 3. Apr 2017, at 18:32, Stephan Ewen  wrote:
> 
> +1 for options (1), but also invest the time to fix it properly for 1.2.2
> 
> 
> On Mon, Apr 3, 2017 at 9:10 AM, Kostas Kloudas 
> wrote:
> 
>> +1 for 1
>> 
>>> On Apr 3, 2017, at 5:52 PM, Till Rohrmann  wrote:
>>> 
>>> +1 for option 1)
>>> 
>>> On Mon, Apr 3, 2017 at 5:48 PM, Fabian Hueske  wrote:
>>> 
 +1 to option 1)
 
 2017-04-03 16:57 GMT+02:00 Ted Yu :
 
> Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
> 
> Cheers
> 
> On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
> wrote:
> 
>> Just so we’re all on the same page. ;-)
>> 
>> There was https://issues.apache.org/jira/browse/FLINK-5808 which was
>> a
>> bug that we initially discovered in Flink 1.2 which was/is about
 missing
>> verification for the correctness of the combination of parallelism and
>> max-parallelism. Due to lacking test coverage this introduced two more
> bugs:
>> - https://issues.apache.org/jira/browse/FLINK-6188: Some
>> setParallelism() methods can't cope with default parallelism
>> - https://issues.apache.org/jira/browse/FLINK-6209:
>> StreamPlanEnvironment always has a parallelism of 1
>> 
>> IMHO, the options are:
>> 1) revert the changes made for FLINK-5808 on the release-1.2 branch
 and
>> live with the bug still being present
>> 2) put in more work to fix FLINK-5808 which requires fixing some
> problems
>> that have existed for a long time with how the parallelism is set in
>> streaming programs
>> 
>> Best,
>> Aljoscha
>> 
>>> On 31. Mar 2017, at 21:34, Robert Metzger 
 wrote:
>>> 
>>> I don't know what is best to do, but I think releasing 1.2.1 with
>>> potentially more bugs than 1.2.0 is not a good option.
>>> I suspect a good workaround for FLINK-6188
>>>  is setting the
>>> parallelism manually for operators that can't cope with the default
 -1
>>> parallelism.
>>> 
>>> On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek <
 aljos...@apache.org
>> 
>>> wrote:
>>> 
 You mean reverting the changes around FLINK-5808 [1]? This is what
 introduced the follow-up FLINK-6188 [2].
 
 [1] https://issues.apache.org/jira/browse/FLINK-5808
 [2]https://issues.apache.org/jira/browse/FLINK-6188
 
 On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
> I think reverting FLINK-6188 for the 1.2 branch might be a good
 idea.
> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix
> will
> lead
> only to one known bug in 1.2.1, instead of an uncertain number of
>> issues.
> So 1.2.1 is not going to be worse than 1.2.0
> 
> The fix will hopefully make it into 1.2.2 then.
> 
> Any other thoughts on this?
> 
> 
> 
> 
> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
 wrote:
> 
>> I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
 branch.
>> 
>> 2017-03-31 15:02 GMT+02:00 Fabian Hueske :
>> 
>>> We should also backport the fix for FLINK-6044 to Flink 1.2.1.
>>> 
>>> I'll take care of that.
>>> 
>>> 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek  :
>>> 
 https://issues.apache.org/jira/browse/FLINK-6188 turns out to
 be
> a
 bit
 more involved, see my comments on the PR:
 https://github.com/apache/flink/pull/3616.
 
 As I said there, maybe we should revert the commits regarding
 parallelism/max-parallelism changes and release and then fix it
 later.
 
 On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> I commented on FLINK-6214: I think it's working as intended,
 although
>> we
> could fix the javadoc/doc.
> 
> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
>> A user reported that all tumbling and sliding window assigners contain
>> a pretty obvious bug about offsets.
>> 
>> https://issues.apache.org/jira/browse/FLINK-6214
>> 
>> I think we should also fix this for 1.2.1. What do you think?
>> 
>> Regards,
>> Timo
>> 
>> 
>> Am 29/03/17 um 11:30 schrieb Robert Metzger:

Re: [DISCUSS] Code style / checkstyle

2017-04-03 Thread Aljoscha Krettek
I think enough people have already looked at the checkstyle rules proposed in
the PR.

For most of the rules, reaching consensus is easy, so I propose to enable all
rules except those regarding the placement of curly braces and control flow
formatting. The two styles in the Flink code base are:

1)
if () {
} else {
}

try {
} catch () {
}

and 

2)

if () {
}
else {
}

try {
}
catch () {
}

I think it’s hard to reach consensus on these, so I suggest we keep allowing
both styles.
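To make the proposal concrete, a checkstyle fragment of roughly this shape would result. The module names are real Checkstyle checks, but the authoritative rule set is the one in the PR; treat this as an illustrative sketch:

```xml
<!-- Illustrative sketch only: the actual rule set lives in the proposed
     checkstyle.xml in the PR. -->
<module name="Checker">
  <module name="TreeWalker">
    <!-- Uncontroversial rules: enabled everywhere. -->
    <module name="NeedBraces"/>
    <module name="WhitespaceAround"/>
    <!-- Brace-placement rules stay disabled so that both existing styles
         (same-line and next-line else/catch) remain legal:
    <module name="LeftCurly"/>
    <module name="RightCurly"/>
    -->
  </module>
</module>
```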

Any comments very welcome! :-)

Best,
Aljoscha
> On 19. Mar 2017, at 17:09, Aljoscha Krettek  wrote:
> 
> I played around with this over the weekend and it turns out that the
> required changes in flink-streaming-java are not that big. I created a PR
> with a proposed checkstyle.xml and the required code changes:
> https://github.com/apache/flink/pull/3567. There’s a longer description of
> the style in the PR. The commits add progressively more invasive changes, so
> we can start with the more lightweight changes if we want to.
> 
> If we want to go forward with this I would also encourage other people to use 
> this for different modules and see how it turns out.
> 
> Best,
> Aljoscha
> 
>> On 18 Mar 2017, at 08:00, Aljoscha Krettek wrote:
>> 
>> I added an issue for adding a custom checkstyle.xml for flink-streaming-java
>> so that we can gradually add checks:
>> https://issues.apache.org/jira/browse/FLINK-6107. I outlined the procedure
>> in the Jira. We can use this as a pilot project, see how it goes, and then
>> gradually apply it to other modules as well.
>> 
>> What do you think?
>> 
>>> On 6 Mar 2017, at 12:42, Stephan Ewen wrote:
>>> 
>>> A singular "all reformat in one instant" will cause immense damage to the
>>> project, in my opinion.
>>> 
>>> - There are so many pull requests that we are having a hard time keeping
>>> up, and merging will a lot more time intensive.
>>> - I personally have many forked branches with WIP features that will
>>> probably never go in if the branches become unmergeable. I expect that to
>>> be true for many other committers and contributors.
>>> - Some companies have Flink forks and are rebasing patches onto master
>>> regularly. They will be completely screwed by a full reformat.
>>> 
>>> If we do something, the only thing that really is possible is:
>>> 
>>> (1) Define a style. Ideally not too far away from Flink's style.
>>> (2) Apply it to new projects/modules
>>> (3) Coordinate carefully to pull it into existing modules, module by
>>> module. Leaving time to adopt pull requests bit by bit, and allowing forks
>>> to go through minor merges, rather than total conflict.
>>> 
>>> 
>>> 
>>> On Wed, Mar 1, 2017 at 5:57 PM, Henry Saputra wrote:
>>> 
It is actually the Databricks Scala guide, meant to help contributions to
Apache Spark, so it is not really an official Spark Scala guide.
The style guide feels a bit more like asking people to write Scala in Java
mode, so I am -1 on following that style for Apache Flink Scala, if that is
what you are recommending.
 
 If the "unification" means ONE style guide for both Java and Scala I would
 vote -1 to it because both languages have different semantics and styles to
 make them readable and understandable.
 
 We could start with improving the Scala maven style guide to follow more
 Scala official style guide [1] and add IntelliJ Idea or Eclipse plugin
 style to follow suit.
 
The Java side has a somewhat stricter style check, but we could make it
tighter by embracing the Google Java guide more closely, with minor exceptions.
 
 - Henry
 
 [1] http://docs.scala-lang.org/style/ 
 
 On Mon, Feb 27, 2017 at 11:54 AM, Stavros Kontopoulos <
 st.kontopou...@gmail.com > wrote:
 
> +1 to providing and enforcing a unified code style for both Java and Scala.
> Unification should apply where it makes sense, comments for example.
> 
> Eventually the code base should be refactored. I would vote for the
> one-module-at-a-time fix approach.
> The style guide should be part of any PR review.
> 
> We could also have a look at the spark style guide:
> https://github.com/databricks/scala-style-guide 
> 
> 
> The style guide and general guidelines help keep code more readable and
> keep things simple with many contributors and different styles of code
> writing + language features.
> 
> 
> On Mon, Feb 27, 2017 at 8:01 PM, Stephan Ewen  > wrote:
> 
>> I agree, reformatting 90% of the 

Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Stephan Ewen
+1 for options (1), but also invest the time to fix it properly for 1.2.2


On Mon, Apr 3, 2017 at 9:10 AM, Kostas Kloudas 
wrote:

> +1 for 1
>
> > On Apr 3, 2017, at 5:52 PM, Till Rohrmann  wrote:
> >
> > +1 for option 1)
> >
> > On Mon, Apr 3, 2017 at 5:48 PM, Fabian Hueske  wrote:
> >
> >> +1 to option 1)
> >>
> >> 2017-04-03 16:57 GMT+02:00 Ted Yu :
> >>
> >>> Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
> >>>
> >>> Cheers
> >>>
> >>> On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
> >>> wrote:
> >>>
>  Just so we’re all on the same page. ;-)
> 
>  There was https://issues.apache.org/jira/browse/FLINK-5808 which was
> a
>  bug that we initially discovered in Flink 1.2 which was/is about
> >> missing
>  verification for the correctness of the combination of parallelism and
>  max-parallelism. Due to lacking test coverage this introduced two more
> >>> bugs:
>   - https://issues.apache.org/jira/browse/FLINK-6188: Some
>  setParallelism() methods can't cope with default parallelism
>   - https://issues.apache.org/jira/browse/FLINK-6209:
>  StreamPlanEnvironment always has a parallelism of 1
> 
>  IMHO, the options are:
>  1) revert the changes made for FLINK-5808 on the release-1.2 branch
> >> and
>  live with the bug still being present
>  2) put in more work to fix FLINK-5808 which requires fixing some
> >>> problems
>  that have existed for a long time with how the parallelism is set in
>  streaming programs
> 
>  Best,
>  Aljoscha
> 
> > On 31. Mar 2017, at 21:34, Robert Metzger 
> >> wrote:
> >
> > I don't know what is best to do, but I think releasing 1.2.1 with
> > potentially more bugs than 1.2.0 is not a good option.
> > I suspect a good workaround for FLINK-6188
> >  is setting the
> > parallelism manually for operators that can't cope with the default
> >> -1
> > parallelism.
> >
> > On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek <
> >> aljos...@apache.org
> 
> > wrote:
> >
> >> You mean reverting the changes around FLINK-5808 [1]? This is what
> >> introduced the follow-up FLINK-6188 [2].
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-5808
> >> [2]https://issues.apache.org/jira/browse/FLINK-6188
> >>
> >> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
> >>> I think reverting FLINK-6188 for the 1.2 branch might be a good
> >> idea.
> >>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix
> >>> will
> >>> lead
> >>> only to one known bug in 1.2.1, instead of an uncertain number of
>  issues.
> >>> So 1.2.1 is not going to be worse than 1.2.0
> >>>
> >>> The fix will hopefully make it into 1.2.2 then.
> >>>
> >>> Any other thoughts on this?
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
> >> wrote:
> >>>
>  I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
> >> branch.
> 
>  2017-03-31 15:02 GMT+02:00 Fabian Hueske :
> 
> > We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> >
> > I'll take care of that.
> >
> > 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek  >>> :
> >
> >> https://issues.apache.org/jira/browse/FLINK-6188 turns out to
> >> be
> >>> a
> >> bit
> >> more involved, see my comments on the PR:
> >> https://github.com/apache/flink/pull/3616.
> >>
> >> As I said there, maybe we should revert the commits regarding
> >> parallelism/max-parallelism changes and release and then fix it
> >> later.
> >>
> >> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> >>> I commented on FLINK-6214: I think it's working as intended,
> >> although
>  we
> >>> could fix the javadoc/doc.
> >>>
> >>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
>  A user reported that all tumbling and sliding window
> >> assigners
> >> contain
>  a pretty obvious bug about offsets.
> 
>  https://issues.apache.org/jira/browse/FLINK-6214
> 
>  I think we should also fix this for 1.2.1. What do you think?
> 
>  Regards,
>  Timo
> 
> 
>  Am 29/03/17 um 11:30 schrieb Robert Metzger:
> > Hi Haohui,
> > I agree that we should fix the parallelism issue. Otherwise,
> >> the
> >> 1.2.1
> > release would introduce a new bug.
> >
> 

[jira] [Created] (FLINK-6257) Post-pass OVER windows

2017-04-03 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-6257:


 Summary: Post-pass OVER windows
 Key: FLINK-6257
 URL: https://issues.apache.org/jira/browse/FLINK-6257
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Affects Versions: 1.3.0
Reporter: Fabian Hueske
Priority: Critical


The OVER windows have been implemented by several contributors.
We should do a post-pass over the contributed code and check:

* Functionality
** Currently, any time attribute is allowed as an ORDER BY attribute. We must
check that it is actually a time indicator ({{procTime()}}, {{rowTime()}}) and
that the order is ASCENDING.
* Documentation
** Add documentation for OVER windows.
* Code style
** Consistent naming of {{ProcessFunctions}} and methods.
* Tests
** Move the OVER window tests out of SqlITCase into a dedicated class.
** Move the OVER window tests out of WindowAggregateTest into a dedicated class.
** Add tests based on the test harness for all {{ProcessFunctions}}, similar to
{{BoundedProcessingOverRangeProcessFunction}}. The tests should include exact
boundary checks for range windows and check for proper parallelization with
multiple keys.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Kostas Kloudas
+1 for 1

> On Apr 3, 2017, at 5:52 PM, Till Rohrmann  wrote:
> 
> +1 for option 1)
> 
> On Mon, Apr 3, 2017 at 5:48 PM, Fabian Hueske  wrote:
> 
>> +1 to option 1)
>> 
>> 2017-04-03 16:57 GMT+02:00 Ted Yu :
>> 
>>> Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
>>> 
>>> Cheers
>>> 
>>> On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
>>> wrote:
>>> 
 Just so we’re all on the same page. ;-)
 
 There was https://issues.apache.org/jira/browse/FLINK-5808 which was a
 bug that we initially discovered in Flink 1.2 which was/is about
>> missing
 verification for the correctness of the combination of parallelism and
 max-parallelism. Due to lacking test coverage this introduced two more
>>> bugs:
  - https://issues.apache.org/jira/browse/FLINK-6188: Some
 setParallelism() methods can't cope with default parallelism
  - https://issues.apache.org/jira/browse/FLINK-6209:
 StreamPlanEnvironment always has a parallelism of 1
 
 IMHO, the options are:
 1) revert the changes made for FLINK-5808 on the release-1.2 branch
>> and
 live with the bug still being present
 2) put in more work to fix FLINK-5808 which requires fixing some
>>> problems
 that have existed for a long time with how the parallelism is set in
 streaming programs
 
 Best,
 Aljoscha
 
> On 31. Mar 2017, at 21:34, Robert Metzger 
>> wrote:
> 
> I don't know what is best to do, but I think releasing 1.2.1 with
> potentially more bugs than 1.2.0 is not a good option.
> I suspect a good workaround for FLINK-6188
>  is setting the
> parallelism manually for operators that can't cope with the default
>> -1
> parallelism.
> 
> On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek <
>> aljos...@apache.org
 
> wrote:
> 
>> You mean reverting the changes around FLINK-5808 [1]? This is what
>> introduced the follow-up FLINK-6188 [2].
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-5808
>> [2]https://issues.apache.org/jira/browse/FLINK-6188
>> 
>> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
>>> I think reverting FLINK-6188 for the 1.2 branch might be a good
>> idea.
>>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix
>>> will
>>> lead
>>> only to one known bug in 1.2.1, instead of an uncertain number of
 issues.
>>> So 1.2.1 is not going to be worse than 1.2.0
>>> 
>>> The fix will hopefully make it into 1.2.2 then.
>>> 
>>> Any other thoughts on this?
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
>> wrote:
>>> 
 I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
>> branch.
 
 2017-03-31 15:02 GMT+02:00 Fabian Hueske :
 
> We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> 
> I'll take care of that.
> 
> 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek >> :
> 
>> https://issues.apache.org/jira/browse/FLINK-6188 turns out to
>> be
>>> a
>> bit
>> more involved, see my comments on the PR:
>> https://github.com/apache/flink/pull/3616.
>> 
>> As I said there, maybe we should revert the commits regarding
>> parallelism/max-parallelism changes and release and then fix it
>> later.
>> 
>> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
>>> I commented on FLINK-6214: I think it's working as intended,
>> although
 we
>>> could fix the javadoc/doc.
>>> 
>>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
A user reported that all tumbling and sliding window
>> assigners
>> contain
 a pretty obvious bug about offsets.
 
 https://issues.apache.org/jira/browse/FLINK-6214
 
 I think we should also fix this for 1.2.1. What do you think?
 
 Regards,
 Timo
 
 
 Am 29/03/17 um 11:30 schrieb Robert Metzger:
> Hi Haohui,
> I agree that we should fix the parallelism issue. Otherwise,
>> the
>> 1.2.1
> release would introduce a new bug.
> 
> On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <
>> ricet...@gmail.com>
>> wrote:
> 
>> -1 (non-binding)
>> 
>> We recently found out that all jobs submitted via UI will
>> have a
>> parallelism of 1, potentially due to FLINK-5808.
>> 
>> Filed FLINK-6209 to track it.
>> 

Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Till Rohrmann
+1 for option 1)

On Mon, Apr 3, 2017 at 5:48 PM, Fabian Hueske  wrote:

> +1 to option 1)
>
> 2017-04-03 16:57 GMT+02:00 Ted Yu :
>
> > Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
> >
> > Cheers
> >
> > On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
> > wrote:
> >
> > > Just so we’re all on the same page. ;-)
> > >
> > > There was https://issues.apache.org/jira/browse/FLINK-5808 which was a
> > > bug that we initially discovered in Flink 1.2 which was/is about
> missing
> > > verification for the correctness of the combination of parallelism and
> > > max-parallelism. Due to lacking test coverage this introduced two more
> > bugs:
> > >   - https://issues.apache.org/jira/browse/FLINK-6188: Some
> > > setParallelism() methods can't cope with default parallelism
> > >   - https://issues.apache.org/jira/browse/FLINK-6209:
> > > StreamPlanEnvironment always has a parallelism of 1
> > >
> > > IMHO, the options are:
> > >  1) revert the changes made for FLINK-5808 on the release-1.2 branch
> and
> > > live with the bug still being present
> > >  2) put in more work to fix FLINK-5808 which requires fixing some
> > problems
> > > that have existed for a long time with how the parallelism is set in
> > > streaming programs
> > >
> > > Best,
> > > Aljoscha
> > >
> > > > On 31. Mar 2017, at 21:34, Robert Metzger 
> wrote:
> > > >
> > > > I don't know what is best to do, but I think releasing 1.2.1 with
> > > > potentially more bugs than 1.2.0 is not a good option.
> > > > I suspect a good workaround for FLINK-6188
> > > >  is setting the
> > > > parallelism manually for operators that can't cope with the default
> -1
> > > > parallelism.
> > > >
> > > > On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek <
> aljos...@apache.org
> > >
> > > > wrote:
> > > >
> > > >> You mean reverting the changes around FLINK-5808 [1]? This is what
> > > >> introduced the follow-up FLINK-6188 [2].
> > > >>
> > > >> [1] https://issues.apache.org/jira/browse/FLINK-5808
> > > >> [2]https://issues.apache.org/jira/browse/FLINK-6188
> > > >>
> > > >> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
> > > >>> I think reverting FLINK-6188 for the 1.2 branch might be a good
> idea.
> > > >>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix
> > will
> > > >>> lead
> > > >>> only to one known bug in 1.2.1, instead of an uncertain number of
> > > issues.
> > > >>> So 1.2.1 is not going to be worse than 1.2.0
> > > >>>
> > > >>> The fix will hopefully make it into 1.2.2 then.
> > > >>>
> > > >>> Any other thoughts on this?
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
> > > >> wrote:
> > > >>>
> > >  I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
> > > >> branch.
> > > 
> > >  2017-03-31 15:02 GMT+02:00 Fabian Hueske :
> > > 
> > > > We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> > > >
> > > > I'll take care of that.
> > > >
> > > > 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek  >:
> > > >
> > > >> https://issues.apache.org/jira/browse/FLINK-6188 turns out to
> be
> > a
> > > >> bit
> > > >> more involved, see my comments on the PR:
> > > >> https://github.com/apache/flink/pull/3616.
> > > >>
> > > >> As I said there, maybe we should revert the commits regarding
> > > >> parallelism/max-parallelism changes and release and then fix it
> > > >> later.
> > > >>
> > > >> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> > > >>> I commented on FLINK-6214: I think it's working as intended,
> > > >> although
> > >  we
> > > >>> could fix the javadoc/doc.
> > > >>>
> > > >>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
> > >  A user reported that all tumbling and sliding window
> assigners
> > > >> contain
> > >  a pretty obvious bug about offsets.
> > > 
> > >  https://issues.apache.org/jira/browse/FLINK-6214
> > > 
> > >  I think we should also fix this for 1.2.1. What do you think?
> > > 
> > >  Regards,
> > >  Timo
> > > 
> > > 
> > >  Am 29/03/17 um 11:30 schrieb Robert Metzger:
> > > > Hi Haohui,
> > > > I agree that we should fix the parallelism issue. Otherwise,
> > > >> the
> > > >> 1.2.1
> > > > release would introduce a new bug.
> > > >
> > > > On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <
> > > >> ricet...@gmail.com>
> > > >> wrote:
> > > >
> > > >> -1 (non-binding)
> > > >>
> > > >> We recently found out that all jobs submitted via UI will
> > > >> have a
> > > >> parallelism of 1, potentially due to FLINK-5808.
> > > 

Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Fabian Hueske
+1 to option 1)

2017-04-03 16:57 GMT+02:00 Ted Yu :

> Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
>
> Cheers
>
> On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
> wrote:
>
> > Just so we’re all on the same page. ;-)
> >
> > There was https://issues.apache.org/jira/browse/FLINK-5808 which was a
> > bug that we initially discovered in Flink 1.2 which was/is about missing
> > verification for the correctness of the combination of parallelism and
> > max-parallelism. Due to lacking test coverage this introduced two more
> bugs:
> >   - https://issues.apache.org/jira/browse/FLINK-6188: Some
> > setParallelism() methods can't cope with default parallelism
> >   - https://issues.apache.org/jira/browse/FLINK-6209:
> > StreamPlanEnvironment always has a parallelism of 1
> >
> > IMHO, the options are:
> >  1) revert the changes made for FLINK-5808 on the release-1.2 branch and
> > live with the bug still being present
> >  2) put in more work to fix FLINK-5808 which requires fixing some
> problems
> > that have existed for a long time with how the parallelism is set in
> > streaming programs
> >
> > Best,
> > Aljoscha
> >
> > > On 31. Mar 2017, at 21:34, Robert Metzger  wrote:
> > >
> > > I don't know what is best to do, but I think releasing 1.2.1 with
> > > potentially more bugs than 1.2.0 is not a good option.
> > > I suspect a good workaround for FLINK-6188
> > >  is setting the
> > > parallelism manually for operators that can't cope with the default -1
> > > parallelism.
> > >
> > > On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek  >
> > > wrote:
> > >
> > >> You mean reverting the changes around FLINK-5808 [1]? This is what
> > >> introduced the follow-up FLINK-6188 [2].
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-5808
> > >> [2]https://issues.apache.org/jira/browse/FLINK-6188
> > >>
> > >> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
> > >>> I think reverting FLINK-6188 for the 1.2 branch might be a good idea.
> > >>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix
> will
> > >>> lead
> > >>> only to one known bug in 1.2.1, instead of an uncertain number of
> > issues.
> > >>> So 1.2.1 is not going to be worse than 1.2.0
> > >>>
> > >>> The fix will hopefully make it into 1.2.2 then.
> > >>>
> > >>> Any other thoughts on this?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
> > >> wrote:
> > >>>
> >  I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
> > >> branch.
> > 
> >  2017-03-31 15:02 GMT+02:00 Fabian Hueske :
> > 
> > > We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> > >
> > > I'll take care of that.
> > >
> > > 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek :
> > >
> > >> https://issues.apache.org/jira/browse/FLINK-6188 turns out to be
> a
> > >> bit
> > >> more involved, see my comments on the PR:
> > >> https://github.com/apache/flink/pull/3616.
> > >>
> > >> As I said there, maybe we should revert the commits regarding
> > >> parallelism/max-parallelism changes and release and then fix it
> > >> later.
> > >>
> > >> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> > >>> I commented on FLINK-6214: I think it's working as intended,
> > >> although
> >  we
> > >>> could fix the javadoc/doc.
> > >>>
> > >>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
>  A user reported that all tumbling and sliding window assigners
> > >> contain
> >  a pretty obvious bug about offsets.
> > 
> >  https://issues.apache.org/jira/browse/FLINK-6214
> > 
> >  I think we should also fix this for 1.2.1. What do you think?
> > 
> >  Regards,
> >  Timo
> > 
> > 
> >  Am 29/03/17 um 11:30 schrieb Robert Metzger:
> > > Hi Haohui,
> > > I agree that we should fix the parallelism issue. Otherwise,
> > >> the
> > >> 1.2.1
> > > release would introduce a new bug.
> > >
> > > On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <
> > >> ricet...@gmail.com>
> > >> wrote:
> > >
> > >> -1 (non-binding)
> > >>
> > >> We recently found out that all jobs submitted via UI will
> > >> have a
> > >> parallelism of 1, potentially due to FLINK-5808.
> > >>
> > >> Filed FLINK-6209 to track it.
> > >>
> > >> ~Haohui
> > >>
> > >> On Mon, Mar 27, 2017 at 2:59 AM Chesnay Schepler <
> > >> ches...@apache.org>
> > >> wrote:
> > >>
> > >>> If possible I would like to include FLINK-6183 & FLINK-6184
> > >> as
> > >> well.
> > >>>
> > >>> 

[jira] [Created] (FLINK-6256) Fix documentation of ProcessFunction.

2017-04-03 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-6256:
-

 Summary: Fix documentation of ProcessFunction.
 Key: FLINK-6256
 URL: https://issues.apache.org/jira/browse/FLINK-6256
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.3.0
Reporter: Kostas Kloudas
Priority: Blocker
 Fix For: 1.3.0


In the code example showing how to define an {{OutputTag}} and how to use it to 
extract the side-output stream, the name of the defined output tag differs from 
that of the tag passed to {{getSideOutput()}}.
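The lookup behaviour behind this bug can be illustrated with a toy model (hypothetical stand-in code, not Flink's actual implementation), in which side outputs are keyed by the tag's id: emitting under one id and querying under a slightly different one silently yields nothing, which is why the two names in the documentation example must match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (NOT Flink's implementation) of side-output lookup by tag id.
class SideOutputModel {
    private final Map<String, List<String>> sideOutputs = new HashMap<>();

    // Emit a value into the side output identified by tagId.
    void emit(String tagId, String value) {
        sideOutputs.computeIfAbsent(tagId, k -> new ArrayList<>()).add(value);
    }

    // Retrieve the side output for tagId; a mismatched id returns an empty list.
    List<String> getSideOutput(String tagId) {
        return sideOutputs.getOrDefault(tagId, List.of());
    }

    public static void main(String[] args) {
        SideOutputModel op = new SideOutputModel();
        op.emit("side-output", "record-1");
        System.out.println(op.getSideOutput("side-output")); // [record-1]
        System.out.println(op.getSideOutput("sideoutput"));  // [] -- mismatched id, as in the doc example
    }
}
```

The model only shows why the documentation fix matters: the id used at emit time and at retrieval time must be the same string.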



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6255) Remove PatternStream.getSideOutput()

2017-04-03 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-6255:
---

 Summary: Remove PatternStream.getSideOutput()
 Key: FLINK-6255
 URL: https://issues.apache.org/jira/browse/FLINK-6255
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Reporter: Aljoscha Krettek


We cannot currently use the result of {{select()/flatSelect()}} to get the side 
output stream because the operator that emits the side output is not the same 
operator that is returned from the {{select()}} method. There is always a map 
or flatMap after the actual CEP operator.

We first have to change the CEP operator(s) to directly execute the map/flatMap 
function inside the operator.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6254) Consolidate late data methods on PatternStream and WindowedStream

2017-04-03 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-6254:
---

 Summary: Consolidate late data methods on PatternStream and 
WindowedStream
 Key: FLINK-6254
 URL: https://issues.apache.org/jira/browse/FLINK-6254
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Reporter: Aljoscha Krettek
Priority: Blocker
 Fix For: 1.3.0


{{WindowedStream}} has {{sideOutputLateData(OutputTag outputTag)}} while 
{{PatternStream}} has {{withLateDataOutputTag(OutputTag outputTag)}}.

{{WindowedStream}} had the method first so we should stick to that naming 
scheme.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6253) Distinct rowTime with time range boundaries

2017-04-03 Thread radu (JIRA)
radu created FLINK-6253:
---

 Summary: Distinct rowTime with time range boundaries
 Key: FLINK-6253
 URL: https://issues.apache.org/jira/browse/FLINK-6253
 Project: Flink
  Issue Type: Sub-task
Reporter: radu


Support distinct aggregates with rowtime order and time range boundaries

Q1.4. `SELECT SUM( DISTINCT  b) OVER (ORDER BY rowTime() RANGE BETWEEN INTERVAL 
'1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`

Q1.4. `SELECT COUNT(b), SUM( DISTINCT  b) OVER (ORDER BY rowTime() RANGE 
BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6252) Distinct rowTime with Rows boundaries

2017-04-03 Thread radu (JIRA)
radu created FLINK-6252:
---

 Summary: Distinct rowTime with Rows boundaries
 Key: FLINK-6252
 URL: https://issues.apache.org/jira/browse/FLINK-6252
 Project: Flink
  Issue Type: Sub-task
Reporter: radu


Support distinct aggregates over row time order with rows boundaries

Q1.3. `SELECT SUM( DISTINCT  b) OVER (ORDER BY rowTime() ROWS BETWEEN 2 
PRECEDING AND CURRENT ROW) FROM stream1`

Q1.3. `SELECT COUNT(b), SUM( DISTINCT  b) OVER (ORDER BY rowTime() ROWS BETWEEN 
2 PRECEDING AND CURRENT ROW) FROM stream1`



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6251) Distinct procTime with time range boundaries

2017-04-03 Thread radu (JIRA)
radu created FLINK-6251:
---

 Summary: Distinct procTime with time range boundaries
 Key: FLINK-6251
 URL: https://issues.apache.org/jira/browse/FLINK-6251
 Project: Flink
  Issue Type: Sub-task
Reporter: radu


Support proctime distinct aggregates with time boundaries
Q1.2. `SELECT SUM( DISTINCT  b) OVER (ORDER BY procTime() RANGE BETWEEN 
INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`

Q1.2. `SELECT COUNT(b), SUM( DISTINCT  b) OVER (ORDER BY procTime() RANGE 
BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6250) Distinct procTime with Rows boundaries

2017-04-03 Thread radu (JIRA)
radu created FLINK-6250:
---

 Summary: Distinct procTime with Rows boundaries
 Key: FLINK-6250
 URL: https://issues.apache.org/jira/browse/FLINK-6250
 Project: Flink
  Issue Type: Sub-task
Reporter: radu


Support distinct aggregates over proctime with rows boundaries

Q1.1. `SELECT SUM( DISTINCT  b) OVER (ORDER BY procTime() ROWS BETWEEN 2 
PRECEDING AND CURRENT ROW) FROM stream1`





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Ted Yu
Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0

Cheers

On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek 
wrote:

> Just so we’re all on the same page. ;-)
>
> There was https://issues.apache.org/jira/browse/FLINK-5808 which was a
> bug that we initially discovered in Flink 1.2 which was/is about missing
> verification for the correctness of the combination of parallelism and
> max-parallelism. Due to lacking test coverage this introduced two more bugs:
>   - https://issues.apache.org/jira/browse/FLINK-6188: Some
> setParallelism() methods can't cope with default parallelism
>   - https://issues.apache.org/jira/browse/FLINK-6209:
> StreamPlanEnvironment always has a parallelism of 1
>
> IMHO, the options are:
>  1) revert the changes made for FLINK-5808 on the release-1.2 branch and
> live with the bug still being present
>  2) put in more work to fix FLINK-5808 which requires fixing some problems
> that have existed for a long time with how the parallelism is set in
> streaming programs
>
> Best,
> Aljoscha
>
> > On 31. Mar 2017, at 21:34, Robert Metzger  wrote:
> >
> > I don't know what is best to do, but I think releasing 1.2.1 with
> > potentially more bugs than 1.2.0 is not a good option.
> > I suspect a good workaround for FLINK-6188
> >  is setting the
> > parallelism manually for operators that can't cope with the default -1
> > parallelism.
> >
> > On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek 
> > wrote:
> >
> >> You mean reverting the changes around FLINK-5808 [1]? This is what
> >> introduced the follow-up FLINK-6188 [2].
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-5808
> >> [2]https://issues.apache.org/jira/browse/FLINK-6188
> >>
> >> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
> >>> I think reverting FLINK-6188 for the 1.2 branch might be a good idea.
> >>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix will
> >>> lead
> >>> only to one known bug in 1.2.1, instead of an uncertain number of
> issues.
> >>> So 1.2.1 is not going to be worse than 1.2.0
> >>>
> >>> The fix will hopefully make it into 1.2.2 then.
> >>>
> >>> Any other thoughts on this?
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
> >> wrote:
> >>>
>  I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
> >> branch.
> 
>  2017-03-31 15:02 GMT+02:00 Fabian Hueske :
> 
> > We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> >
> > I'll take care of that.
> >
> > 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek :
> >
> >> https://issues.apache.org/jira/browse/FLINK-6188 turns out to be a
> >> bit
> >> more involved, see my comments on the PR:
> >> https://github.com/apache/flink/pull/3616.
> >>
> >> As I said there, maybe we should revert the commits regarding
> >> parallelism/max-parallelism changes and release and then fix it
> >> later.
> >>
> >> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> >>> I commented on FLINK-6214: I think it's working as intended,
> >> although
>  we
> >>> could fix the javadoc/doc.
> >>>
> >>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
>  A user reported that all tumbling and sliding window assigners
> >> contain
>  a pretty obvious bug about offsets.
> 
>  https://issues.apache.org/jira/browse/FLINK-6214
> 
>  I think we should also fix this for 1.2.1. What do you think?
> 
>  Regards,
>  Timo
> 
> 
>  Am 29/03/17 um 11:30 schrieb Robert Metzger:
> > Hi Haohui,
> > I agree that we should fix the parallelism issue. Otherwise,
> >> the
> >> 1.2.1
> > release would introduce a new bug.
> >
> > On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <
> >> ricet...@gmail.com>
> >> wrote:
> >
> >> -1 (non-binding)
> >>
> >> We recently found out that all jobs submitted via UI will
> >> have a
> >> parallelism of 1, potentially due to FLINK-5808.
> >>
> >> Filed FLINK-6209 to track it.
> >>
> >> ~Haohui
> >>
> >> On Mon, Mar 27, 2017 at 2:59 AM Chesnay Schepler <
> >> ches...@apache.org>
> >> wrote:
> >>
> >>> If possible I would like to include FLINK-6183 & FLINK-6184
> >> as
> >> well.
> >>>
> >>> They fix 2 metric-related issues that could arise when a
> >> Task is
> >>> cancelled very early. (like, right away)
> >>>
> >>> FLINK-6183 fixes a memory leak where the TaskMetricGroup was
> >> never closed
> >>> FLINK-6184 fixes a NullPointerExceptions in the buffer
> >> metrics
> >>>
> >>> PR 

[jira] [Created] (FLINK-6249) Distinct Aggregates for OVER window

2017-04-03 Thread radu (JIRA)
radu created FLINK-6249:
---

 Summary: Distinct Aggregates for OVER window
 Key: FLINK-6249
 URL: https://issues.apache.org/jira/browse/FLINK-6249
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Affects Versions: 1.3.0
Reporter: radu


Time target: ProcTime/EventTime

SQL targeted query examples:


Q1. Boundaries are expressed in windows and meant for the elements to be 
aggregated

Q1.1. `SELECT SUM( DISTINCT  b) OVER (ORDER BY procTime() ROWS BETWEEN 2 
PRECEDING AND CURRENT ROW) FROM stream1`

Q1.2. `SELECT SUM( DISTINCT  b) OVER (ORDER BY procTime() RANGE BETWEEN 
INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`

Q1.3. `SELECT SUM( DISTINCT  b) OVER (ORDER BY rowTime() ROWS BETWEEN 2 
PRECEDING AND CURRENT ROW) FROM stream1`

Q1.4. `SELECT SUM( DISTINCT  b) OVER (ORDER BY rowTime() RANGE BETWEEN INTERVAL 
'1' HOUR PRECEDING AND CURRENT ROW) FROM stream1`

General comments:

-   DISTINCT operation makes sense only within the context of windows or
some other bounded structures. Otherwise the operation would keep
an infinite amount of data to ensure uniqueness and would not
trigger for certain functions (e.g. aggregates)

-   We can consider as a sub-JIRA issue the implementation of DISTINCT
for UNBOUNDED sliding windows. However, there would be no control over 
the data structure that keeps seen data (to check it is not re-processed). -> It 
needs to be decided whether we want to support this (and create appropriate JIRA issues)
=> We will open sub-JIRA issues to extend the current functionality of 
aggregates for the DISTINCT case (Q1.{1-4}).  (This is the main target of this 
JIRA)

=>   Aggregations over distinct elements without any boundary (i.e.
within SELECT clause) do not make sense just as aggregations do not
make sense without groupings or windows.

Other similar query support


Q2. Boundaries are expressed in GROUP BY clause and distinct is applied for the 
elements of the aggregate(s)

`SELECT MIN( DISTINCT rowtime), prodID FROM stream1 GROUP BY FLOOR(procTime() 
TO HOUR)`

=> We need to decide if we aim to support for this release distinct aggregates 
for the group by (Q2). If so sub-JIRA issues need to be created. We can follow 
the same design/implementation.


=> We can consider as a sub-JIRA issue the implementation of DISTINCT
for select clauses. However, there is no control over the growing
size of the data structure and it will unavoidably exhaust memory.
Q3. Distinct is applied to the collection of outputs to be selected.

`SELECT STREAM DISTINCT procTime(), prodId  FROM stream1 GROUP BY 
FLOOR(procTime() TO DAY)`

Description:


The DISTINCT operator requires processing the elements to ensure
uniqueness. Whether the operation is used to SELECT ALL distinct
elements or to apply typical aggregation functions over a set of
elements, a collection of elements must first be formed.
This brings the need for windows or grouping methods. Therefore the 
distinct function will be implemented within windows. Depending on the
type of window definition there are several options:
-   Main Scope: If distinct is applied as in the Q1 examples for window aggregations, 
then we either extend the implementation with distinct aggregates (less 
preferred) or extend the sliding window aggregates implementation in the 
processFunction with distinct-element identification support (preferred). The 
latter option is preferred because a query can carry multiple aggregates, 
including several that use the DISTINCT keyword. Implementing the 
distinction between elements in the process function avoids the 
need to duplicate the data structure that marks what was seen across multiple 
aggregates. It also makes the implementation more robust and resilient, as we can 
keep the data structure marking the seen elements in state (a MapState).
-   If distinct is applied as in the Q2 example on grouped elements, then
we either define a new implementation (if the selection is general) or
extend the current implementation of grouped aggregates with
distinct group aggregates.
-   If distinct is applied as in the Q3 example to select all elements,
then a new implementation needs to be defined. This would work over
a specific window, and the uniqueness of the results would be
enforced within the window function.
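The preferred approach above can be sketched in plain Java (an illustrative toy, not Flink code: a HashMap of seen-counts stands in for Flink's MapState, and a deque models a "ROWS BETWEEN 2 PRECEDING AND CURRENT ROW" window). Retracting an evicted row only removes a value from the distinct set when its count drops to zero, so SUM(DISTINCT b) counts each value in the window exactly once.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of distinct-aware sliding-row aggregation with a seen-count map.
class DistinctRowsSum {
    private final Deque<Long> window = new ArrayDeque<>();       // last 3 rows
    private final Map<Long, Integer> seenCount = new HashMap<>(); // MapState stand-in

    // Add one row's value b and return SUM(DISTINCT b) over the current window.
    long add(long b) {
        window.addLast(b);
        seenCount.merge(b, 1, Integer::sum);
        if (window.size() > 3) {                  // retract the row leaving the window
            long evicted = window.removeFirst();
            if (seenCount.merge(evicted, -1, Integer::sum) == 0) {
                seenCount.remove(evicted);        // value no longer present in window
            }
        }
        // Each distinct value contributes once, regardless of its count.
        return seenCount.keySet().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        DistinctRowsSum sum = new DistinctRowsSum();
        for (long b : new long[] {5, 5, 7, 2}) {
            System.out.println("SUM(DISTINCT b) = " + sum.add(b)); // 5, 5, 12, 14
        }
    }
}
```

For input 5, 5, 7 the duplicate 5 is counted once (sum 12); when 2 arrives, the oldest 5 is retracted but 5 stays in the distinct set because another 5 remains in the window (sum 14).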


Functionality example
-

We exemplify below the functionality of DISTINCT when working with
streams.

`Q1: SELECT STREAM DISTINCT b FROM stream1 GROUP BY FLOOR(PROCTIME TO HOUR) `

`Q2: SELECT  COUNT(DISTINCT  b) FROM stream1 GROUP BY FLOOR(PROCTIME() TO HOUR) 
`

`Q3:  SELECT  sum(DISTINCT  a) OVER (ORDER BY procTime() ROWS BETWEEN 2 
PRECEDING AND CURRENT ROW) FROM stream1`






  

[jira] [Created] (FLINK-6248) Make the optional() available to all offered patterns.

2017-04-03 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-6248:
-

 Summary: Make the optional() available to all offered patterns.
 Key: FLINK-6248
 URL: https://issues.apache.org/jira/browse/FLINK-6248
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Affects Versions: 1.3.0
Reporter: Kostas Kloudas
 Fix For: 1.3.0


Currently the {{optional()}} quantifier is available as a separate pattern. 
This issue proposes to make it available as a flag to all patterns.

This implies that: 
1) a singleton pattern with {{optional=true}} will become the current 
{{OPTIONAL}}, 
2) a {{oneToMany}} will become {{zeroToMany}}, 
3) the {{zeroToMany}} will not exist as a direct option in the {{Pattern}} 
class, and 
4) the {{times()}} will require some changes in the {{NFACompiler}}.
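The quantifier algebra behind points 1-3 mirrors ordinary regular expressions: making a one-or-more quantifier optional yields zero-or-more. A small sketch using java.util.regex (only an analogy, not the CEP API itself):

```java
import java.util.regex.Pattern;

// Regex analogy for the proposed optional() flag:
//   oneToMany  ~ "a+" (at least one occurrence)
//   zeroToMany ~ "a*" (i.e. oneToMany with optional=true)
class QuantifierAnalogy {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("a+", ""));    // false: oneToMany needs a match
        System.out.println(Pattern.matches("a*", ""));    // true: optional flag allows zero
        System.out.println(Pattern.matches("a+", "aaa")); // true
        System.out.println(Pattern.matches("a*", "aaa")); // true
    }
}
```

This is why, once every pattern carries an optional flag, a dedicated zeroToMany option in the {{Pattern}} class becomes redundant.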



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6247) Build a jar-with-dependencies for flink-table and put it into ./opt

2017-04-03 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-6247:


 Summary: Build a jar-with-dependencies for flink-table and put it 
into ./opt
 Key: FLINK-6247
 URL: https://issues.apache.org/jira/browse/FLINK-6247
 Project: Flink
  Issue Type: Improvement
  Components: Build System, Table API & SQL
Affects Versions: 1.3.0
Reporter: Fabian Hueske
Assignee: sunjincheng


Due to a problem with Calcite and the unloading of classes, user-code 
classloaders that include Calcite cannot be garbage collected. This is a 
problem for long-running clusters that execute multiple Table API / SQL 
programs with fat JARs that include the flink-table dependency. Each executed 
program comes with its own user-code classloader that cannot be cleaned up later.

As a workaround, we recommend copying the flink-table dependency into the ./lib 
folder. However, we do not have a jar file with all required transitive 
dependencies (Calcite, Janino, etc). Hence, users would need to build this jar 
file themselves or copy all jars into ./lib.

This issue is about creating a jar-with-dependencies and adding it to the ./opt 
folder. Users can then copy the jar file from ./opt to ./lib to include the 
table API in the classpath of Flink.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-04-03 Thread Aljoscha Krettek
Just so we’re all on the same page. ;-)

There was https://issues.apache.org/jira/browse/FLINK-5808, a bug that 
we initially discovered in Flink 1.2, about missing verification 
of the correctness of the combination of parallelism and max-parallelism. Due 
to lacking test coverage, the fix introduced two more bugs:
  - https://issues.apache.org/jira/browse/FLINK-6188: Some setParallelism() 
methods can't cope with default parallelism
  - https://issues.apache.org/jira/browse/FLINK-6209: StreamPlanEnvironment 
always has a parallelism of 1

IMHO, the options are:
 1) revert the changes made for FLINK-5808 on the release-1.2 branch and live 
with the bug still being present 
 2) put in more work to fix FLINK-5808 which requires fixing some problems that 
have existed for a long time with how the parallelism is set in streaming 
programs

Best,
Aljoscha

> On 31. Mar 2017, at 21:34, Robert Metzger  wrote:
> 
> I don't know what is best to do, but I think releasing 1.2.1 with
> potentially more bugs than 1.2.0 is not a good option.
> I suspect a good workaround for FLINK-6188
>  is setting the
> parallelism manually for operators that can't cope with the default -1
> parallelism.
> 
> On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek 
> wrote:
> 
>> You mean reverting the changes around FLINK-5808 [1]? This is what
>> introduced the follow-up FLINK-6188 [2].
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-5808
>> [2]https://issues.apache.org/jira/browse/FLINK-6188
>> 
>> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
>>> I think reverting FLINK-6188 for the 1.2 branch might be a good idea.
>>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix will
>>> lead
>>> only to one known bug in 1.2.1, instead of an uncertain number of issues.
>>> So 1.2.1 is not going to be worse than 1.2.0
>>> 
>>> The fix will hopefully make it into 1.2.2 then.
>>> 
>>> Any other thoughts on this?
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske 
>> wrote:
>>> 
 I merged the fix for FLINK-6044 to the release-1.2 and release-1.1
>> branch.
 
 2017-03-31 15:02 GMT+02:00 Fabian Hueske :
 
> We should also backport the fix for FLINK-6044 to Flink 1.2.1.
> 
> I'll take care of that.
> 
> 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek :
> 
>> https://issues.apache.org/jira/browse/FLINK-6188 turns out to be a
>> bit
>> more involved, see my comments on the PR:
>> https://github.com/apache/flink/pull/3616.
>> 
>> As I said there, maybe we should revert the commits regarding
>> parallelism/max-parallelism changes and release and then fix it
>> later.
>> 
>> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
>>> I commented on FLINK-6214: I think it's working as intended,
>> although
 we
>>> could fix the javadoc/doc.
>>> 
>>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
 A user reported that all tumbling and sliding window assigners
>> contain
 a pretty obvious bug about offsets.
 
 https://issues.apache.org/jira/browse/FLINK-6214
 
 I think we should also fix this for 1.2.1. What do you think?
 
 Regards,
 Timo
 
 
 Am 29/03/17 um 11:30 schrieb Robert Metzger:
> Hi Haohui,
> I agree that we should fix the parallelism issue. Otherwise,
>> the
>> 1.2.1
> release would introduce a new bug.
> 
> On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <
>> ricet...@gmail.com>
>> wrote:
> 
>> -1 (non-binding)
>> 
>> We recently found out that all jobs submitted via UI will
>> have a
>> parallelism of 1, potentially due to FLINK-5808.
>> 
>> Filed FLINK-6209 to track it.
>> 
>> ~Haohui
>> 
>> On Mon, Mar 27, 2017 at 2:59 AM Chesnay Schepler <
>> ches...@apache.org>
>> wrote:
>> 
>>> If possible I would like to include FLINK-6183 & FLINK-6184
>> as
>> well.
>>> 
>>> They fix 2 metric-related issues that could arise when a
>> Task is
>>> cancelled very early. (like, right away)
>>> 
>>> FLINK-6183 fixes a memory leak where the TaskMetricGroup was
>> never closed
>>> FLINK-6184 fixes a NullPointerExceptions in the buffer
>> metrics
>>> 
>>> PR here: https://github.com/apache/flink/pull/3611
>>> 
>>> On 26.03.2017 12:35, Aljoscha Krettek wrote:
 I opened a PR for FLINK-6188: https://github.com/apache/
>> flink/pull/3616
>>> 
 This improves the previously very sparse test coverage for
>>> timestamp/watermark 

Re: Side Outputs - type bounds.

2017-04-03 Thread Aljoscha Krettek
Hi,
You’re absolutely right. I created an Issue for this: 
https://issues.apache.org/jira/browse/FLINK-6246

Best,
Aljoscha
> On 3. Apr 2017, at 10:22, Dawid Wysakowicz  wrote:
> 
> Hi,
> I am implementing emitting discarded patterns in CEP library through Side
> Outputs and I have a question about the Output::collect method which is:
> 
>  <X> void collect(OutputTag<?> outputTag, StreamRecord<X> record);
> 
> Why the type of the outputTag is not also X? Or at least T extends X?
> 
> Thanks in advance for some info on it.
> 
> Regards
> Dawid Wysakowicz



[jira] [Created] (FLINK-6246) Fix generic type of OutputTag in operator Output

2017-04-03 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-6246:
---

 Summary: Fix generic type of OutputTag in operator Output
 Key: FLINK-6246
 URL: https://issues.apache.org/jira/browse/FLINK-6246
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Reporter: Aljoscha Krettek


The current signature is
{code}
<X> void collect(OutputTag<?> outputTag, StreamRecord<X> record)
{code}

which can be improved to
{code}
<X> void collect(OutputTag<X> outputTag, StreamRecord<X> record)
{code}

This is probably leftover from an intermediate stage of development.
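The benefit of sharing the type parameter can be shown with minimal stand-in classes (hypothetical Tag/Record types, not Flink's): once the tag and the record use the same type variable, passing a record whose element type does not match the tag becomes a compile-time error instead of a latent runtime mismatch.

```java
// Stand-in types (NOT Flink's classes) illustrating the improved signature.
class Tag<X> {
    final String name;
    Tag(String name) { this.name = name; }
}

class Record<X> {
    final X value;
    Record(X value) { this.value = value; }
}

class SafeOutput {
    // Tag and record share X, so the compiler rejects mismatched element types.
    static <X> String collect(Tag<X> tag, Record<X> record) {
        return tag.name + "=" + record.value;
    }

    public static void main(String[] args) {
        Tag<String> lateData = new Tag<>("late-data");
        // Types line up: Tag<String> with Record<String> compiles and runs.
        System.out.println(collect(lateData, new Record<>("hello")));
        // collect(lateData, new Record<>(42)); // would NOT compile: Integer != String
    }
}
```

With the old wildcard signature the commented-out call would compile, deferring the type mismatch to runtime.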



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Flink streaming job with iterations gets stuck waiting for network buffers

2017-04-03 Thread Paris Carbone
Hi Andrey,

If I am not mistaken this sounds like a known deadlock case and can be caused 
by the combination of Flink's backpressure mechanism with iterations (more 
likely when there is heavy feedback load).
Keep in mind that, currently, iterations are (perhaps the only) unstable 
feature. The good news is that a complete redesign is planned for 
them (partly FLIP-15 [1]) that has to entirely address this pending flow-control 
issue as well.

Increasing network buffers or feedback queue capacity to a really high number 
decreases the possibility of the deadlock but does not eliminate it.
I really cannot think of a quick solution to the problem that does not involve 
some deep changes.

I am CCing dev since this seems like a very relevant use case to revive the 
discussion for the loops redesign and also keep you in the loop (no pun 
intended) regarding this specific issue.
Will also update FLIP-15 with several interesting proposals under discussion 
from Stephan to tackle this issue.

cheers,
Paris

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-15+Scoped+Loops+and+Job+Termination
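The deadlock mechanics can be sketched with a bounded feedback channel (a toy model in plain Java, not Flink's network stack): when every consumed element produces a feedback element and nothing can drain the channel, its capacity is exhausted quickly, and a real blocking put() at that point would never return.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Toy model of feedback-edge saturation in an iteration under backpressure.
class FeedbackSaturation {

    // Count how many feedback elements a bounded channel accepts when
    // nothing drains it; offer() fails once capacity is exhausted.
    static int accepted(int capacity, int attempts) {
        ArrayBlockingQueue<Integer> feedback = new ArrayBlockingQueue<>(capacity);
        int produced = 0;
        for (int i = 0; i < attempts; i++) {
            if (!feedback.offer(i)) {
                // In the real job a blocking put() here would deadlock:
                // the operator can neither emit feedback nor consume input.
                break;
            }
            produced++;
        }
        return produced;
    }

    public static void main(String[] args) {
        System.out.println("feedback elements accepted before saturation: "
                + accepted(4, 10)); // prints 4
    }
}
```

Raising the capacity (more network buffers) only moves the saturation point; it does not remove it, which matches the observation above.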


On 3 Apr 2017, at 12:54, Andrey Melentyev 
> wrote:

Hi,

I have a Flink 1.2.0 streaming job using a number of stateful operators and an 
iteration loop with a RichFlatMapFunction inside. On the high level, the app 
reads some data, massages it and feeds into an iterative algorithm which 
produces some output and feedback while keeping the state. All stateful 
operators are on KeyedStreams. Input is some data on file system and output is 
stdout.

The implementation passes functional tests but when tested with noticeable 
amounts of input data (tens of thousands records, dozens of MB raw data) after 
a few seconds of good throughput, backpressure kicks in and the application 
essentially gets stuck: most of the threads are blocked waiting for buffers, 
an occasional message gets processed every few minutes. There's nothing strange in 
the log files.

The behaviour is reproducible both in local execution environment and in Flink 
standalone cluster (started using jobmanager.sh and taskmanager.sh)

The problematic part is likely in the iterations since the part of the job 
before iterations works fine with the same data.

I would appreciate pointers as to how to debug this. 
taskmanager.network.numberOfBuffers from the config sounds relevant but the 
default value of 2048 is already much higher than slots-per-TM^2 * #TMs * 4 = 
4^2 * 1 * 4 = 64.

Attaching flink config, job execution plan and thread dump with some sensitive 
parts retracted.

flink-conf.yml

jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 512
taskmanager.heap.mb: 8192
taskmanager.numberOfTaskSlots: 4
taskmanager.memory.preallocate: false
parallelism.default: 4
jobmanager.web.port: 8081
state.backend: rocksdb
state.backend.fs.checkpointdir: 
file:///Users/andrey.melentyev/tmp/flink-checkpoints

Job execution plan

{
  "nodes": [
{
  "contents": "IterationSource-10",
  "id": -1,
  "pact": "Data Source",
  "parallelism": 8,
  "type": "IterationSource-10"
},
{
  "contents": "Source: Custom File Source",
  "id": 1,
  "pact": "Data Source",
  "parallelism": 1,
  "type": "Source: Custom File Source"
},
{
  "contents": "Split Reader: Custom File Source",
  "id": 2,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [
{
  "id": 1,
  "ship_strategy": "REBALANCE",
  "side": "second"
}
  ],
  "type": "Split Reader: Custom File Source"
},
{
  "contents": "Parse JSON",
  "id": 3,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [
{
  "id": 2,
  "ship_strategy": "FORWARD",
  "side": "second"
}
  ],
  "type": "Parse JSON"
},
{
  "contents": "Split records",
  "id": 4,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [
{
  "id": 3,
  "ship_strategy": "FORWARD",
  "side": "second"
}
  ],
  "type": "Split records (Stateless)"
},
{
  "contents": "Produce Some Data",
  "id": 6,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [
{
  "id": 3,
  "ship_strategy": "FORWARD",
  "side": "second"
}
  ],
  "type": "Produce Some Data (Stateless)"
},
{
  "contents": "Produce Some More Data (Stateful)",
  "id": 7,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [
{
  "id": 4,
  "ship_strategy": "HASH",
  "side": "second"
}
  ],
  "type": "Produce Some More Data (Stateful)"
},
{
  "contents": "Map",
  "id": 9,
  "pact": "Operator",
  "parallelism": 8,
  "predecessors": [

Side Outputs - type bounds.

2017-04-03 Thread Dawid Wysakowicz
Hi,
I am implementing emitting discarded patterns in CEP library through Side
Outputs and I have a question about the Output::collect method which is:

 <X> void collect(OutputTag<?> outputTag, StreamRecord<X> record);

Why is the type of the outputTag not also X? Or at least T extends X?

Thanks in advance for some info on it.

Regards
Dawid Wysakowicz


[jira] [Created] (FLINK-6244) Emit timeouted Patterns as Side Output

2017-04-03 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-6244:
---

 Summary: Emit timeouted Patterns as Side Output
 Key: FLINK-6244
 URL: https://issues.apache.org/jira/browse/FLINK-6244
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Reporter: Dawid Wysakowicz


Now that we have SideOutputs, I think timed-out patterns should be emitted into 
them rather than producing a stream of `Either`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)