[jira] [Comment Edited] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400473#comment-15400473
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4280 at 7/30/16 5:48 AM:
-

It seems that this JIRA and FLINK-3398 have snowballed into quite a few 
thoughts and changes on the Kafka connector configuration, and possibly on 
other supported connectors in general (I personally think we have quite a bit 
of divergence in configuration and naming across our supported connectors; it 
would be good if we could unify the differences).

Would we want a formally structured FLIP, with a migration plan, on the issue, 
so that we can gather more consensus and thoughts from the community? I'm not 
sure whether discussions like this are considered major changes for Flink and 
require FLIPs.


was (Author: tzulitai):
It seems that this JIRA and FLINK-3398 have snowballed into quite a few 
thoughts and changes on the Kafka connector configuration, and possibly on 
other supported connectors in general (I personally think we have quite a bit 
of divergence in configuration and naming across our supported connectors; it 
would be good if we could unify the differences).

Would we want a formally structured FLIP, with a migration plan, on the issue, 
so that we can gather more consensus and thoughts from the community on this? 
I'm not sure whether discussions like this are considered major changes for 
Flink and require FLIPs.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" or "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be misleading if users 
> were trying to find a way to "read topics from a starting position". The way 
> the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
> Kafka's original intent for the setting: first, existing external offsets 
> committed to ZK / the brokers are checked; only if none exist is 
> {{auto.offset.reset}} respected.
> I propose to add Flink-specific ways to define the starting position, 
> without taking the external offsets into account. The original behaviour 
> (reference external offsets first) can become a user option, so that it can 
> be retained for Kafka users who need to interoperate with existing non-Flink 
> Kafka consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a warning)
> props.setProperty("group.id", "..."); // this no longer affects the starting position (may still be used for external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be latest
> props.setProperty("group.id", "..."); // will be used to look up external offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, even though it wasn't supported 
> in Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not to manipulate how 
> Kafka topics are read in Flink (unless users opt to do so)" becomes even 
> more definite and solid. There was some discussion on this aspect in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this further decouples Flink's internal offset checkpointing 
> from Kafka's external offset store.
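
For illustration, a minimal sketch of the startup resolution order the 
description refers to; {{fetchCommittedOffset}} and {{offsetForResetStrategy}} 
are hypothetical helper names, not connector API:

{code}
// Hypothetical sketch of the resolution order described above.
Long committed = fetchCommittedOffset(groupId, topicPartition); // from ZK / broker
long startOffset;
if (committed != null) {
    // externally committed offsets take precedence ...
    startOffset = committed;
} else {
    // ... and only if none exist is auto.offset.reset respected
    startOffset = offsetForResetStrategy(props.getProperty("auto.offset.reset"));
}
{code}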


[jira] [Comment Edited] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400473#comment-15400473
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4280 at 7/30/16 5:46 AM:
-

It seems that this JIRA and FLINK-3398 have snowballed into quite a few 
thoughts and changes on the Kafka connector configuration, and possibly on 
other supported connectors in general (I personally think we have quite a bit 
of divergence in configuration and naming across our supported connectors; it 
would be good if we could unify the differences).

Would we want a formally structured FLIP, with a migration plan, on the issue, 
so that we can gather more consensus and thoughts from the community on this? 
I'm not sure whether discussions like this are considered major changes for 
Flink and require FLIPs.


was (Author: tzulitai):
It seems that this JIRA and FLINK-3398 have snowballed into quite a few 
thoughts and changes on the Kafka connector configuration, and possibly on 
other supported connectors in general (I personally think we have quite a bit 
of divergence in configuration and naming across our supported connectors; it 
would be good if we could unify the differences). Would we want a formally 
structured FLIP, with a migration plan, on the issue, so that we can gather 
more consensus and thoughts from the community on this? I'm not sure whether 
discussions like this are considered major changes for Flink and require FLIPs.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>

[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400475#comment-15400475
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3398:


I've given this idea a second thought and commented on it in FLINK-4280.

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or otherwise periodically, 
> based on {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to 
> Zookeeper. Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).
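
For illustration, a sketch of how an opt-out could be expressed with the Kafka 
keys named above (whether the Flink connector should honour these keys is 
exactly what this issue proposes):

{code}
Properties props = new Properties();
// Kafka 0.8 key:
props.setProperty("auto.commit.enable", "false");
// Kafka 0.9 key:
props.setProperty("enable.auto.commit", "false");
{code}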



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400473#comment-15400473
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4280:


It seems that this JIRA and FLINK-3398 have snowballed into quite a few 
thoughts and changes on the Kafka connector configuration, and possibly on 
other supported connectors in general (I personally think we have quite a bit 
of divergence in configuration and naming across our supported connectors; it 
would be good if we could unify the differences). Would we want a formally 
structured FLIP, with a migration plan, on the issue, so that we can gather 
more consensus and thoughts from the community on this? I'm not sure whether 
discussions like this are considered major changes for Flink and require FLIPs.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400472#comment-15400472
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4280:


[~StephanEwen] On second thought, I think I misunderstood what you meant in the 
first place.

What you're proposing is this (I think this is a clearer design than what I 
mentioned above):

{code}
Properties props = new Properties();
...

FlinkKafkaConsumer kafka = new FlinkKafkaConsumer("topic", schema, props);
kafka.setStartFromEarliest();
kafka.setStartFromLatest();
kafka.setEnableCommitOffsets(boolean); // if true, commits on checkpoint if checkpointing is enabled, otherwise periodically
kafka.setForwardMetrics(boolean);
...

env.addSource(kafka) ...
{code}

Echoing your statement, if we go with this approach:

Now that we are trying to separate Flink-specific configs from Kafka configs, 
I think we should clearly state (and change the implementation so) that the 
{{Properties}} provided in the constructor will only be used to configure the 
internal Kafka consumers the connector uses, by simply passing the 
{{Properties}} through. So, the only configs given in the {{Properties}} that 
will take effect are the ones the Kafka API supports, i.e. in 
{{FlinkKafkaConsumer08}}, only the configs that the Kafka {{SimpleConsumer}} 
API supports take effect; in {{FlinkKafkaConsumer09}}, only the configs that 
the new consumer API {{KafkaConsumer}} supports take effect. Any additional 
functionality or Flink-specific behaviour on top of the internal Kafka 
consumers should go through setter methods.
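
For illustration, a minimal sketch of this pass-through rule with the Kafka 
0.9 client; the deserializer settings here are assumptions needed to make the 
snippet self-contained, not part of the proposal:

{code}
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.util.Properties;

// The connector hands the user-supplied Properties straight to the internal
// client; only keys that the client API recognizes take effect.
Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("key.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.setProperty("value.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
{code}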

The general problem to solve with the current configuration is that we are 
trying to "mimic" high-level consumer functions with the original config keys. 
Take {{FlinkKafkaConsumer08}} for example: the {{SimpleConsumer}} API actually 
doesn't use the {{group.id}} or {{auto.offset.reset}} configs. We're 
re-implementing the behaviour of these configs ourselves, and exposing them 
through the original config keys in the {{Properties}}. When it comes to 
adding functionality on top of the internally used {{SimpleConsumer}}, we tend 
to stretch the original definition of these keys and try to make them work 
with our re-implementations of configs such as {{group.id}} and 
{{auto.offset.reset}}. An example of the confusion users can run into when we 
re-implement configs that the internal API doesn't actually use can be seen in 
this user ML thread: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html.
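
For illustration, a sketch of the kind of re-implementation involved: with the 
0.8 {{SimpleConsumer}}, honouring "earliest" / "latest" means issuing an 
offset request manually. This follows the standard Kafka 0.8 SimpleConsumer 
example pattern; host, port, and client id values are placeholders:

{code}
import java.util.HashMap;
import java.util.Map;
import kafka.api.PartitionOffsetRequestInfo;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.consumer.SimpleConsumer;

SimpleConsumer consumer =
    new SimpleConsumer("broker-host", 9092, 100000, 64 * 1024, "clientId");

// "earliest" maps to EarliestTime(), "latest" to LatestTime()
long whichTime = kafka.api.OffsetRequest.EarliestTime();
TopicAndPartition tp = new TopicAndPartition("topic", 0);
Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<>();
requestInfo.put(tp, new PartitionOffsetRequestInfo(whichTime, 1));

OffsetResponse response = consumer.getOffsetsBefore(new kafka.javaapi.OffsetRequest(
    requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "clientId"));
long startOffset = response.offsets("topic", 0)[0];
{code}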

This also supports the idea you mentioned in FLINK-3398 that we should drop 
Kafka's {{group.id}} and perhaps have Flink's own groupId. Since Kafka's 
{{group.id}} was never actually used by the internal {{SimpleConsumer}} of 
{{FlinkKafkaConsumer08}} in the first place, we should have setter methods for 
functions like "start from offsets" or "offset committing", to which the user 
supplies a groupId. For {{FlinkKafkaConsumer09}}, we won't need a setter 
method for "periodic offset committing", because the internal {{KafkaConsumer}} 
supports that function through {{group.id}} and {{enable.auto.commit}}; 
instead, we have a setter method to opt to switch to "commit offsets on 
checkpoint".

To summarize in code:

{code}
// for FlinkKafkaConsumer08
Properties props = new Properties();
...
FlinkKafkaConsumer08 kafka = new FlinkKafkaConsumer08("topic", schema, props);
kafka.setStartFromEarliest();
kafka.setStartFromLatest();
kafka.setStartFromExternalOffsets("groupId");
kafka.setEnableCommitOffsets("groupId"); // periodic if checkpointing is not enabled, otherwise on notifyCheckpointComplete()
kafka.setForwardMetrics(boolean);
...
{code}

{code}
// for FlinkKafkaConsumer09
Properties props = new Properties();
...
FlinkKafkaConsumer09 kafka = new FlinkKafkaConsumer09("topic", schema, props);
kafka.setStartFromEarliest();
kafka.setStartFromLatest();
kafka.setStartFromExternalOffsets(); // doesn't take a "group.id", because in FlinkKafkaConsumer09, "group.id" is a recognized config of the new KafkaConsumer API
kafka.setCommitOffsetsOnCheckpoint(boolean); // if true (checkpointing should be enabled), overrides periodic committing if "enable.auto.commit" is set in props
kafka.setForwardMetrics(boolean);
...
{code}


So, the general rule is:

- Supplied configuration is used only to configure the internally used client 
APIs of the external system.
- All Flink-specific configuration, or functions that the internal API does 
not support, go through connector-specific setter methods.

This might be a general rule we would like all Flink-supported connectors to 
follow in the long run. Users would have a clear understanding and full 
control of the behaviour of the internal API that the connectors use, and we'd 
also have a clear line on how new functionality should be added on top of them.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker

[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400424#comment-15400424
 ] 

ASF GitHub Bot commented on FLINK-3674:
---

Github user ramkrish86 closed the pull request at:

https://github.com/apache/flink/pull/2301


> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>
> I suggest adding an interface that UDFs can implement, which will let them 
> be notified upon watermark updates.
> Example usage:
> {code}
> public interface EventTimeFunction {
>     void onWatermark(Watermark watermark);
> }
> 
> public class MyMapper implements MapFunction<String, String>, 
> EventTimeFunction {
> 
>     private long currentEventTime = Long.MIN_VALUE;
> 
>     public String map(String value) {
>         return value + " @ " + currentEventTime;
>     }
> 
>     public void onWatermark(Watermark watermark) {
>         currentEventTime = watermark.getTimestamp();
>     }
> }
> {code}
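
For illustration, a hypothetical sketch of how an operator could dispatch 
watermarks to such functions; this is not the actual Flink internals, and the 
{{userFunction}} / {{output}} fields are illustrative:

{code}
// Inside a hypothetical operator wrapping a user function:
public void processWatermark(Watermark mark) {
    if (userFunction instanceof EventTimeFunction) {
        // notify event-time aware UDFs before forwarding the watermark
        ((EventTimeFunction) userFunction).onWatermark(mark);
    }
    output.emitWatermark(mark); // then forward downstream as usual
}
{code}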



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400425#comment-15400425
 ] 

ASF GitHub Bot commented on FLINK-3674:
---

Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/2301
  
Closing this as discussed. I will wait for more feedback on what is 
expected here and then proceed.


> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2301: FLINK-3674 Add an interface for EventTime aware User Func...

2016-07-29 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/2301
  
Closing this as discussed. I will wait for more feedback on what is 
expected here and then proceed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2301: FLINK-3674 Add an interface for EventTime aware Us...

2016-07-29 Thread ramkrish86
Github user ramkrish86 closed the pull request at:

https://github.com/apache/flink/pull/2301


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400422#comment-15400422
 ] 

ramkrishna.s.vasudevan commented on FLINK-3674:
---

Thanks for the comments/feedback.
I could see that the initial thought was to just expose some interface that 
UDFs can implement to get a callback in onWatermark. Later, a Timer-based 
interface was talked about.

So my initial thought was to just make the changes so that Timer is exposed as 
an interface based on EventTimeFunction. The idea was not to make this PR a 
final one, but to bring in the discussion. If the practice in Flink is to have 
design-doc-based discussions, I can ensure that for such PRs I will first add 
a doc and then a PR. This happened with another PR also, so I will learn and 
change my methodology.
bq. Right now, WindowOperator has a custom implementation of this. This should 
be taken as the basis for a generic implementation that can then also be 
exposed to users.
My thought of exposing the Timer as a first step and then building on it based 
on feedback came from this. Since the Timer in WindowOperator is a custom one, 
I thought first converting it to an interface would help us add on top and see 
what we can do to make it generic.
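
For illustration, a hypothetical sketch of what "exposing the Timer as an 
interface" could look like as a first step; the names are illustrative, not 
the actual WindowOperator internals:

{code}
// A minimal timer-facing interface extracted from the operator's custom
// implementation; UDFs or wrappers could register event-time callbacks on it.
public interface TimerService {
    long currentWatermark();
    void registerEventTimeTimer(long timestamp, Runnable onFire);
}
{code}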

> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399691#comment-15399691
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4280 at 7/30/16 3:30 AM:
-

Migration plan I have in mind:

1) Keep the current {{FlinkKafkaConsumer}} and {{FlinkKafkaProducer}} 
connectors taking {{Properties}} config. Support the new {{flink.*}} keys for 
Flink-specific settings through the {{Properties}}.
2) Mark the original constructors as deprecated, and add a new constructor 
that accepts the proposed typed configuration class. The new config class can 
take in {{Properties}} to build up the settings, both Flink-specific and Kafka 
(or simply use the setter methods).
3) At some point, perhaps 2.0, remove the original constructors that accept 
{{Properties}}.


was (Author: tzulitai):
Migration plan I have in mind:

1) Keep the current {{FlinkKafkaConsumer}} and {{FlinkKafkaProducer}} 
connectors taking {{Properties}} config. Support the new "flink.*" keys for 
Flink-specific settings through the {{Properties}}.
2) Mark the original constructors as deprecated, and add a new constructor 
that accepts the proposed typed configuration class. The new config class can 
take in {{Properties}} to build up the settings, both Flink-specific and Kafka 
(or simply use the setter methods).
3) At some point, perhaps 2.0, remove the original constructors that accept 
{{Properties}}.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399667#comment-15399667
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-3398 at 7/30/16 3:25 AM:
-

Would this be better?
Have settings, probably like {{flink.start-from-offsets}} (or 
{{flink.starting-position=external-offsets}} as proposed in FLINK-4280) and 
{{flink.commit-offsets}}. If these settings are present, the Kafka settings 
supplied must contain the {{group.id}}.
A specific "group" setting for Flink might not make sense and could be 
confusing, because the Kafka consumer doesn't really use any consumer group 
management functionality. The setting would imply that the user wants to 
trigger Flink-specific behaviour that depends on a Kafka setting.

Does this make sense?


was (Author: tzulitai):
Would this be better?
Have settings, probably like {{flink.start-from-offsets}} (or 
{{flink.starting-position=external-offsets}} as proposed in FLINK-4280) and 
{{flink.commit-offsets}}. If these settings are present, the Kafka settings 
supplied must contain the {{group.id}}.
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, otherwise periodically based 
> on {{auto.commit.interval.ms}}
> It should be possible to opt-out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399699#comment-15399699
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-3398 at 7/29/16 5:46 PM:
-

I've left some thoughts on the migration plan in the comments of FLINK-4280.
Sorry for having to jump around the JIRAs; the issues seem to be somewhat 
related.

For migration, the default values for {{flink.start-from-offsets}} and 
{{flink.commit-offsets}}, if not set, will need to be true (or 
{{flink.starting-position}} defaulting to {{external-offsets}}), so that we 
don't break the behaviour of existing user configs, which won't have these 
new settings.


was (Author: tzulitai):
I've left some thoughts on the migration plan in the comments of FLINK-4280.
Sorry for having to jump around the JIRAs; the issues seem to be somewhat 
related.

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399699#comment-15399699
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3398:


I've left some thoughts on the migration plan in the comments of FLINK-4280.
Sorry for having to jump around the JIRAs; the issues seem to be somewhat 
related.

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399691#comment-15399691
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4280:


Migration plan I have in mind:

1) Keep the current {{FlinkKafkaConsumer}} and {{FlinkKafkaProducer}} 
connectors taking {{Properties}} config. Support the new "flink.*" keys for 
Flink-specific settings through the {{Properties}}.
2) Mark the original constructors as deprecated, and add a new constructor 
that accepts the proposed typed configuration class. The new config class can 
take in {{Properties}} to build up the settings, both Flink-specific and Kafka 
(or simply use the setter methods).
3) At some point, perhaps 2.0, remove the original constructors that accept 
{{Properties}}.
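
For illustration, a minimal sketch of what step 2 could look like; the class 
and method names here are hypothetical, not a committed design:

{code}
import java.util.Properties;

// Hypothetical typed config class (illustrative names only).
public class FlinkKafkaConnectorConfig {

    private final Properties kafkaProps = new Properties();
    private String startingPosition = "external-offsets"; // backwards-compatible default

    // Seed the typed config from a plain Properties, picking up both
    // Kafka keys and the proposed "flink.*" keys.
    public static FlinkKafkaConnectorConfig fromProperties(Properties props) {
        FlinkKafkaConnectorConfig config = new FlinkKafkaConnectorConfig();
        config.kafkaProps.putAll(props);
        String pos = props.getProperty("flink.starting-position");
        if (pos != null) {
            config.startingPosition = pos;
        }
        return config;
    }

    public FlinkKafkaConnectorConfig setStartingPosition(String position) {
        this.startingPosition = position;
        return this;
    }
}
{code}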

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399667#comment-15399667
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3398:


Would this be better?
Have settings, probably like "flink.start-from-offsets" (or 
"flink.starting-position=external-offsets" as proposed in FLINK-4280) and 
"flink.commit-offsets". If these settings are present, the Kafka settings 
supplied must contain the "group.id".
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?
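
For illustration, a sketch of the validation rule described above, using the 
proposed (hypothetical) key names:

{code}
Properties props = new Properties();
props.setProperty("flink.start-from-offsets", "true");
props.setProperty("group.id", "my-flink-job");

// If either Flink-specific offset setting is present, require group.id.
if (props.containsKey("flink.start-from-offsets")
        || props.containsKey("flink.commit-offsets")) {
    if (props.getProperty("group.id") == null) {
        throw new IllegalArgumentException(
            "group.id must be set in the Kafka properties when "
                + "flink.start-from-offsets / flink.commit-offsets are used");
    }
}
{code}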

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2313: [FLINK-4273] Modify JobClient to attach to running...

2016-07-29 Thread mxm
GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/2313

[FLINK-4273] Modify JobClient to attach to running jobs

These changes are required for FLINK-4272 (introduce a JobClient class
for job control). Essentially, we want to be able to re-attach to a
running job and monitor it. It shouldn't make any difference whether we
just submitted the job or we recover it from an existing JobID.

This PR modifies the JobClientActor to support two different operation
modes: a) submitJob and monitor b) re-attach to job and monitor

The JobClient class has been updated with methods to access this
functionality. Before, it just had `submitJobAndWait` and
`submitJobDetached`. Additionally, it now has `submitJob`,
`attachToRunningJob`, and `awaitJobResult`.

`submitJob` -> Submits the job and returns a future which can be completed to
get the result with `awaitJobResult`

`attachToRunningJob` -> Re-attaches the JobClientActor to a running
job, registering at the JobManager and downloading its class loader

`awaitJobResult` -> Blocks until the future returned from either
`submitJob` or `attachToRunningJob` has been completed

TODO
- missing an integration test for downloading the user code
class loader from the JobManager and for end-to-end testing of the
re-attachment.
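
For illustration, a toy model of the control flow described above; this is
not the actual JobClient API, just the shape of the two entry points:

    import java.util.concurrent.CompletableFuture;

    // Toy model (hypothetical types/names): both entry points yield a future
    // that awaitJobResult blocks on, so submitting and waiting are decoupled.
    class ToyJobClient {
        static CompletableFuture<String> submitJob(String jobGraph) {
            return CompletableFuture.supplyAsync(() -> "result of " + jobGraph);
        }
        static CompletableFuture<String> attachToRunningJob(String jobId) {
            return CompletableFuture.supplyAsync(() -> "result of job " + jobId);
        }
        static String awaitJobResult(CompletableFuture<String> future) {
            return future.join(); // blocks until the job result future completes
        }
    }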

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink FLINK-4273

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2313.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2313


commit 7832810728abea5824e2fe0d9e9dc75b14ef61af
Author: Maximilian Michels 
Date:   2016-07-29T15:37:41Z

[FLINK-4273] Modify JobClient to attach to running jobs





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399667#comment-15399667
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-3398 at 7/29/16 4:54 PM:
-

Would this be better?
Have settings, probably like {{flink.start-from-offsets}} (or 
{{flink.starting-position=external-offsets}} as proposed in FLINK-4280) and 
{{flink.commit-offsets}}. If these settings are present, the Kafka settings 
supplied must contain the {{group.id}}.
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?


was (Author: tzulitai):
Would this be better?
Have settings, probably like {{flink.start-from-offsets}} (or 
{{flink.starting-position=external-offsets}} as proposed in FLINK-4280) and 
{{flink.commit-offsets}}. If these settings are present, the Kafka settings 
supplied must contain the "group.id".
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399667#comment-15399667
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-3398 at 7/29/16 4:54 PM:
-

Would this be better?
Have settings, probably like {{flink.start-from-offsets}} (or 
{{flink.starting-position=external-offsets}} as proposed in FLINK-4280) and 
{{flink.commit-offsets}}. If these settings are present, the Kafka settings 
supplied must contain the "group.id".
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?


was (Author: tzulitai):
Would this be better?
Have settings, probably like "flink.start-from-offsets" (or 
"flink.starting-position=external-offsets" as proposed in FLINK-4280) and 
"flink.commit-offsets". If these settings are present, the Kafka settings 
supplied must contain the "group.id".
A specific "group" setting for Flink might not make sense, because the Kafka 
consumer doesn't really use any consumer group management functionality. The 
setting would imply that the user wants to trigger Flink-specific behaviour 
that depends on a Kafka setting.

Does this make sense?

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4273) Refactor JobClientActor to watch already submitted jobs

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399660#comment-15399660
 ] 

ASF GitHub Bot commented on FLINK-4273:
---

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/2313

[FLINK-4273] Modify JobClient to attach to running jobs





> Refactor JobClientActor to watch already submitted jobs 
> 
>
> Key: FLINK-4273
> URL: https://issues.apache.org/jira/browse/FLINK-4273
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.2.0
>
>
> The JobClientActor assumes that it receives a job, submits it, and waits for 
> the result. This process should be broken up into a submission process and a 
> waiting process which can both be entered independently. This leads to two 
> different entry points:
> 1) submit(job) -> wait
> 2) retrieve(jobID) -> wait



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399650#comment-15399650
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4280 at 7/29/16 4:45 PM:
-

Or perhaps we can seed the config with the Kafka properties by taking it in the 
constructor.


was (Author: tzulitai):
Or perhaps we can seed the config with the Kafka properties, and take them in 
the constructor.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" and "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if 
> users were trying to find a way to "read topics from a starting position". 
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer 
> resembles Kafka's original intent for the setting: first, existing external 
> offsets committed to the ZK / brokers will be checked; if none exists, then 
> will {{auto.offset.reset}} be respected.
> I propose to add Flink-specific ways to define the starting position, without 
> taking into account the external offsets. The original behaviour (reference 
> external offsets first) can be changed to be a user option, so that the 
> behaviour can be retained for frequent Kafka users that may need some 
> collaboration with existing non-Flink Kafka consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
> warning)
> props.setProperty("group.id", "...") // this won't have effect on the 
> starting position anymore (may still be used in external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
> latest
> props.setProperty("group.id", "..."); // will be used to lookup external 
> offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide on is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, although it wasn't supported in 
> Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not used to manipulate 
> how Kafka topics are read in Flink (unless users opt to do so)" is even more 
> definite and solid. There was some discussion in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this "decouples" more Flink's internal offset checkpointing from 
> the external Kafka's offset store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399650#comment-15399650
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4280 at 7/29/16 4:46 PM:
-

Or perhaps we can seed the config with the Kafka properties by taking it in the 
constructor. Will need a migration plan for this.


was (Author: tzulitai):
Or perhaps we can seed the config with the Kafka properties by taking it in the 
constructor.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" and "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if 
> users were trying to find a way to "read topics from a starting position". 
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer 
> resembles Kafka's original intent for the setting: first, existing external 
> offsets committed to the ZK / brokers will be checked; only if none exist 
> will {{auto.offset.reset}} be respected.
> I propose to add Flink-specific ways to define the starting position, without 
> taking into account the external offsets. The original behaviour (reference 
> external offsets first) can become a user option, so that it can be retained 
> for Kafka users who need to interoperate with existing non-Flink Kafka 
> consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
> warning)
> props.setProperty("group.id", "...") // this won't have effect on the 
> starting position anymore (may still be used in external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
> latest
> props.setProperty("group.id", "..."); // will be used to lookup external 
> offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide on is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, although it wasn't supported in 
> Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not used to manipulate 
> how Kafka topics are read in Flink (unless users opt to do so)" is even more 
> definite and solid. There was some discussion in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this "decouples" more Flink's internal offset checkpointing from 
> the external Kafka's offset store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399650#comment-15399650
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4280:


Or perhaps we can seed the config with the Kafka properties, and take them in 
the constructor.

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" and "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if 
> users were trying to find a way to "read topics from a starting position". 
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer 
> resembles Kafka's original intent for the setting: first, existing external 
> offsets committed to the ZK / brokers will be checked; only if none exist 
> will {{auto.offset.reset}} be respected.
> I propose to add Flink-specific ways to define the starting position, without 
> taking into account the external offsets. The original behaviour (reference 
> external offsets first) can become a user option, so that it can be retained 
> for Kafka users who need to interoperate with existing non-Flink Kafka 
> consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
> warning)
> props.setProperty("group.id", "...") // this won't have effect on the 
> starting position anymore (may still be used in external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
> latest
> props.setProperty("group.id", "..."); // will be used to lookup external 
> offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide on is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, although it wasn't supported in 
> Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not used to manipulate 
> how Kafka topics are read in Flink (unless users opt to do so)" is even more 
> definite and solid. There was some discussion in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this "decouples" more Flink's internal offset checkpointing from 
> the external Kafka's offset store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4195) Dedicated Configuration classes for Kinesis Consumer / Producer

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4195:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Dedicated Configuration classes for Kinesis Consumer / Producer
> ---
>
> Key: FLINK-4195
> URL: https://issues.apache.org/jira/browse/FLINK-4195
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector, Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> While fixing FLINK-4170, I feel that configuration and default value setting 
> & validation is quite messy and unconsolidated in the current state of the 
> code, and will likely become worse as more configs are added for the Kinesis 
> connector.
> I propose to have a dedicated configuration class (instead of only Java 
> properties) along the lines of Flink's own {{Configuration}}, so that the 
> usage pattern is similar. There will be separate configuration classes for 
> {{FlinkKinesisConsumerConfig}} and {{FlinkKinesisProducerConfig}}.
> [~uce] [~rmetzger] What do you think? This will break the interface, so if 
> we're to change this, I'd prefer to fix it along with FLINK-4170 for Flink 
> 1.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399648#comment-15399648
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4280:


I like the idea of typed configuration classes too. For the Kinesis connector, 
we also have the same idea in mind (JIRA: FLINK-4195).
This also gives a better separation between Flink-specific configs and Kafka 
configs.
So, I think it'd be something like this:
{code}
FlinkKafkaConsumerConfig config = new FlinkKafkaConsumerConfig();
config.setStartPosition(...);
config.setCommitOffsets(boolean);
config.setForwardMetrics(boolean);
config.setKafkaProperties(Properties);
...
{code}
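
Combined with the idea, also floated in this thread, of seeding the config 
with the Kafka properties via the constructor, usage could hypothetically look 
like this (the config class and setters do not exist yet; all names are 
illustrative):

{code}
// Hypothetical sketch; FlinkKafkaConsumerConfig is a proposed class.
Properties kafkaProps = new Properties();
kafkaProps.setProperty("bootstrap.servers", "localhost:9092");

// Kafka settings seed the typed config through the constructor;
// Flink-specific settings go through type-safe setters.
FlinkKafkaConsumerConfig config = new FlinkKafkaConsumerConfig(kafkaProps);
config.setCommitOffsets(true);
config.setForwardMetrics(true);
{code}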

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" and "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if 
> users were trying to find a way to "read topics from a starting position". 
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer 
> resembles Kafka's original intent for the setting: first, existing external 
> offsets committed to the ZK / brokers will be checked; only if none exist 
> will {{auto.offset.reset}} be respected.
> I propose to add Flink-specific ways to define the starting position, without 
> taking into account the external offsets. The original behaviour (reference 
> external offsets first) can become a user option, so that it can be retained 
> for Kafka users who need to interoperate with existing non-Flink Kafka 
> consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
> warning)
> props.setProperty("group.id", "...") // this won't have effect on the 
> starting position anymore (may still be used in external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
> latest
> props.setProperty("group.id", "..."); // will be used to lookup external 
> offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide on is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, although it wasn't supported in 
> Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not used to manipulate 
> how Kafka topics are read in Flink (unless users opt to do so)" is even more 
> definite and solid. There was some discussion in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this "decouples" more Flink's internal offset checkpointing from 
> the external Kafka's offset store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399636#comment-15399636
 ] 

Stephan Ewen commented on FLINK-3398:
-

We could do that; we then need to think about whether we simply want to break 
behavior, or have a migration plan.

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).
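
For context, this is what the opt-out would look like on the user side if the 
Flink consumer honoured the Kafka property directly (a sketch with an 
illustrative group name; whether and how the property is honoured is exactly 
what this issue is about):

{code}
Properties props = new Properties();
props.setProperty("group.id", "my-consumer-group"); // illustrative group name
// Kafka 0.9 key; for 0.8 the key is "auto.commit.enable"
props.setProperty("enable.auto.commit", "false");
{code}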



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399633#comment-15399633
 ] 

Stephan Ewen commented on FLINK-3398:
-

[~tzulitai] If we use a separate flink group-id setting for this, should we 
simply ignore the kafka group.id property?

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399629#comment-15399629
 ] 

ASF GitHub Bot commented on FLINK-3398:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/1690
  
I left a few thoughts in the corresponding JIRA issue: 
https://issues.apache.org/jira/browse/FLINK-3398


> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #1690: FLINK-3398: Allow for opting-out from Kafka offset auto-c...

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/1690
  
I left a few thoughts in the corresponding JIRA issue: 
https://issues.apache.org/jira/browse/FLINK-3398


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399625#comment-15399625
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3398:


[~StephanEwen] That's actually a very good summary! From the usage pattern, I 
think it makes sense. Our Kafka consumers are using "group.id" for start offset 
& offset committing only, so the change will also fit into the current 
implementation.

However, this again is stretching Kafka's original intent of the "group.id" 
config. I think we need to decide whether or not these Flink-specific 
behaviours should be kept separate from Kafka's original configs.

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3779) Add support for queryable state

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399592#comment-15399592
 ] 

ASF GitHub Bot commented on FLINK-3779:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2051
  
Good from my side.


> Add support for queryable state
> ---
>
> Key: FLINK-3779
> URL: https://issues.apache.org/jira/browse/FLINK-3779
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> Flink offers state abstractions for user functions in order to guarantee 
> fault-tolerant processing of streams. Users can work with both 
> non-partitioned (Checkpointed interface) and partitioned state 
> (getRuntimeContext().getState(ValueStateDescriptor) and other variants).
> The partitioned state interface provides access to different types of state 
> that are all scoped to the key of the current input element. This type of 
> state can only be used on a KeyedStream, which is created via stream.keyBy().
> Currently, all of this state is internal to Flink and used in order to 
> provide processing guarantees in failure cases (e.g. exactly-once processing).
> The goal of Queryable State is to expose this state outside of Flink by 
> supporting queries against the partitioned key value state.
> This will help to eliminate the need for distributed operations/transactions 
> with external systems such as key-value stores which are often the bottleneck 
> in practice. Exposing the local state to the outside moves a good part of the 
> database work into the stream processor, allowing both high throughput 
> queries and immediate access to the computed state.
> This is the initial design doc for the feature: 
> https://docs.google.com/document/d/1NkQuhIKYmcprIU5Vjp04db1HgmYSsZtCMxgDi_iTN-g.
>  Feel free to comment.
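
As a refresher, the partitioned (keyed) state referred to above is used 
roughly as follows inside a user function (a minimal sketch against the 
current keyed-state API; the per-key counting logic is only an example):

{code}
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Keeps a running sum per key; usable only on a KeyedStream,
// e.g. stream.keyBy(0).flatMap(new CountPerKey()).
public class CountPerKey extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private transient ValueState<Long> sum;

    @Override
    public void open(Configuration parameters) {
        // State is automatically scoped to the key of the current element.
        sum = getRuntimeContext().getState(
                new ValueStateDescriptor<>("sum", Long.class, 0L));
    }

    @Override
    public void flatMap(Tuple2<String, Long> in, Collector<Tuple2<String, Long>> out) throws Exception {
        sum.update(sum.value() + in.f1);
        out.collect(Tuple2.of(in.f0, sum.value()));
    }
}
{code}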



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2051: [FLINK-3779] Add support for queryable state

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2051
  
Good from my side.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399575#comment-15399575
 ] 

Stephan Ewen commented on FLINK-3398:
-

Let's reboot the discussion and work on this issue. Some observations:

  1. Many cases do not need committing of offsets externally. Users often 
specify some random "group.id" if no meaningful group can be specified.
  2. In some cases, committing is even harmful, as the picked-up start offsets 
override the reset strategies (like earliest offset). To avoid that, people use 
a random "group.id".
  3. If one wants to use external offsets, one typically also wants to update 
them, and vice versa. Both are tied to a "group.id".

With these observations, it seems we can tie this to the existence of a group:

  - If one sets a "group.id", we use it for start offset and commit offsets.
  - If no "group.id" exists, no start offsets are obtained and no offsets are 
committed.

Does that make sense?
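
In code, the proposed rule would boil down to roughly the following inside the 
consumer (a hypothetical sketch of the decision only, not actual Flink code):

{code}
// Hypothetical sketch of the proposed rule.
Properties props = new Properties(); // the user-supplied consumer properties
String groupId = props.getProperty("group.id");
boolean hasGroup = groupId != null && !groupId.trim().isEmpty();

// With a group: obtain start offsets from ZK / broker and commit back to it.
// Without a group: no start offsets are obtained and no offsets are committed.
boolean useExternalStartOffsets = hasGroup;
boolean commitOffsetsExternally = hasGroup;
{code}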

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-3398:

Component/s: Kafka Connector

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-3398:

Priority: Blocker  (was: Major)

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
>Priority: Blocker
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-3398:

Fix Version/s: 1.2.0

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-3398:

Issue Type: Improvement  (was: Bug)

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3398) Flink Kafka consumer should support auto-commit opt-outs

2016-07-29 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-3398:

Priority: Major  (was: Blocker)

> Flink Kafka consumer should support auto-commit opt-outs
> 
>
> Key: FLINK-3398
> URL: https://issues.apache.org/jira/browse/FLINK-3398
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Shikhar Bhushan
> Fix For: 1.2.0
>
>
> Currently the Kafka source will commit consumer offsets to Zookeeper, either 
> upon a checkpoint if checkpointing is enabled, or periodically based on 
> {{auto.commit.interval.ms}}.
> It should be possible to opt out of committing consumer offsets to Zookeeper. 
> Kafka has this config as {{auto.commit.enable}} (0.8) and 
> {{enable.auto.commit}} (0.9).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399544#comment-15399544
 ] 

Stephan Ewen commented on FLINK-4280:
-

A more general thought on the configuration of Flink-specific properties in the 
FlinkKafkaConsumer:

Do we want to make all configurations go through the {{Properties}}? I 
personally find setter methods quite nice, and they are type- and compile-time 
safe.
One possible approach could be that all Kafka settings go through the 
Properties object (as they do in Kafka), and that we use getters/setters for 
Flink properties.

That would also apply to other settings like
  - start offsets
  - commit to kafka/zookeeper or not
  - activate metric forwarding
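
A sketch of that split (the Flink-specific setter names below are made up for 
illustration; only the consumer constructor matches the current API):

{code}
// Kafka settings stay in the Properties object, as in Kafka itself.
Properties kafkaProps = new Properties();
kafkaProps.setProperty("bootstrap.servers", "localhost:9092");

FlinkKafkaConsumer08<String> consumer =
    new FlinkKafkaConsumer08<>("my-topic", new SimpleStringSchema(), kafkaProps);

// Flink-specific settings via hypothetical type-safe setters:
consumer.setStartFromEarliest();          // start offsets
consumer.setCommitOffsetsToKafka(false);  // commit to kafka/zookeeper or not
consumer.setForwardKafkaMetrics(true);    // activate metric forwarding
{code}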

> New Flink-specific option to set starting position of Kafka consumer without 
> respecting external offsets in ZK / Broker
> ---
>
> Key: FLINK-4280
> URL: https://issues.apache.org/jira/browse/FLINK-4280
> Project: Flink
>  Issue Type: New Feature
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently, to start reading from the "earliest" and "latest" position in 
> topics for the Flink Kafka consumer, users set the Kafka config 
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if 
> users were trying to find a way to "read topics from a starting position". 
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer 
> resembles Kafka's original intent for the setting: first, existing external 
> offsets committed to the ZK / brokers will be checked; only if none exist 
> will {{auto.offset.reset}} be respected.
> I propose to add Flink-specific ways to define the starting position, without 
> taking into account the external offsets. The original behaviour (reference 
> external offsets first) can become a user option, so that it can be retained 
> for Kafka users who need to interoperate with existing non-Flink Kafka 
> consumer applications.
> How users will interact with the Flink Kafka consumer after this is added, 
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
> warning)
> props.setProperty("group.id", "...") // this won't have effect on the 
> starting position anymore (may still be used in external offset committing)
> ...
> {code}
> Or, reference external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
> latest
> props.setProperty("group.id", "..."); // will be used to lookup external 
> offsets in ZK / broker on startup
> ...
> {code}
> One thing we would need to decide on is the default value for 
> {{flink.starting-position}}.
> Two merits I see in adding this:
> 1. This matches the way users generally interpret "read from a starting 
> position". As the Flink Kafka connector is essentially a "high-level" Kafka 
> consumer for Flink users, I think it is reasonable to add Flink-specific 
> functionality that users will find useful, although it wasn't supported in 
> Kafka's original consumer designs.
> 2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is 
> used only to expose progress to the outside world, and not used to manipulate 
> how Kafka topics are read in Flink (unless users opt to do so)" is even more 
> definite and solid. There was some discussion in this PR 
> (https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I 
> think adding this "decouples" more Flink's internal offset checkpointing from 
> the external Kafka's offset store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399487#comment-15399487
 ] 

Stephan Ewen commented on FLINK-4282:
-

Big +1 for this feature

> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> An offset parameter would allow adapting the window assigner to the timezone.
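
For illustration, the offset could hypothetically be passed to the window 
assigner like this (an API sketch only; the exact form is what this issue 
should decide):

{code}
// Hypothetical API sketch: align daily windows to UTC-8 instead of EPOCH/GMT.
stream
    .keyBy(0)
    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
    .sum(1);
{code}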



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399484#comment-15399484
 ] 

Stephan Ewen commented on FLINK-4283:
-

To the attention of the distributed coordination shepherd ([~till.rohrmann])

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399481#comment-15399481
 ] 

Stephan Ewen commented on FLINK-3674:
-

I would also like to take a step back here from the pull request.
We have not even decided what approach we want to take. Right now, there are 
two suggestions:

  1. Add an interface and simply get calls on watermarks
  2. Expose timers in all functions.



> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>
> I suggest to add an interface that UDFs can implement, which will let them be 
> notified upon watermark updates.
> Example usage:
> {code}
> public interface EventTimeFunction {
> void onWatermark(Watermark watermark);
> }
> public class MyMapper implements MapFunction<String, String>, 
> EventTimeFunction {
> private long currentEventTime = Long.MIN_VALUE;
> public String map(String value) {
> return value + " @ " + currentEventTime;
> }
> public void onWatermark(Watermark watermark) {
> currentEventTime = watermark.getTimestamp();
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4284) DataSet/CEP link to non-existent "Linking with Flink" section

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399472#comment-15399472
 ] 

ASF GitHub Bot commented on FLINK-4284:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2311
  
If this `[transformations](#dataset-transformations)` is a pure anchor link, 
then it isn't broken.


> DataSet/CEP link to non-existent "Linking with Flink" section
> -
>
> Key: FLINK-4284
> URL: https://issues.apache.org/jira/browse/FLINK-4284
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> Relevant section for DataSet (apis/batch/index.md; L 57-60):
> {code}
> The following program is a complete, working example of WordCount. You can 
> copy  paste the code
> to run it locally. You only have to include the correct Flink's library into 
> your project
> (see Section [Linking with Flink](#linking-with-flink)) and specify the 
> imports. Then you are ready
> to go!
> {code}
> Relevant section for CEP (apis/streaming/libs/cep.md; L 45-48):
> {code}
> ## Getting Started
> If you want to jump right in, you have to [set up a Flink program]({{ 
> site.baseurl }}/apis/batch/index.html#linking-with-flink).
> Next, you have to add the FlinkCEP dependency to the `pom.xml` of your 
> project.
> {code}
> The CEP doc probably shouldn't refer to the DataSet documentation at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2311: [FLINK-4284] [docu] Fix broken links

2016-07-29 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2311
  
If this `[transformations](#dataset-transformations)` is a pure anchor link, 
then it isn't broken.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4279) [py] Set flink dependencies to provided

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399438#comment-15399438
 ] 

ASF GitHub Bot commented on FLINK-4279:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2308
  
For all the connectors and other libraries (that were included in user 
fat jars), these changes were good.
So, if the Python module is used in a similar way, it should make sense 
there as well...

+1


> [py] Set flink dependencies to provided
> ---
>
> Key: FLINK-4279
> URL: https://issues.apache.org/jira/browse/FLINK-4279
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399447#comment-15399447
 ] 

ASF GitHub Bot commented on FLINK-3674:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2301
  
Can we go back to the JIRA issue for a moment, and first decide how we 
actually want this feature to look?
For API additions, it is crucial to not do "something fast", but discuss 
and understand deeply what the feature should actually be like.


> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>
> I suggest to add an interface that UDFs can implement, which will let them be 
> notified upon watermark updates.
> Example usage:
> {code}
> public interface EventTimeFunction {
> void onWatermark(Watermark watermark);
> }
> public class MyMapper implements MapFunction<String, String>, 
> EventTimeFunction {
> private long currentEventTime = Long.MIN_VALUE;
> public String map(String value) {
> return value + " @ " + currentEventTime;
> }
> public void onWatermark(Watermark watermark) {
> currentEventTime = watermark.getTimestamp();
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2301: FLINK-3674 Add an interface for EventTime aware User Func...

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2301
  
Can we go back to the JIRA issue for a moment, and first decide how we 
actually want this feature to look?
For API additions, it is crucial to not do "something fast", but discuss 
and understand deeply what the feature should actually be like.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2308: [FLINK-4279] [py] Set flink dependencies to provided

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2308
  
For all the connectors and other libraries (that were included in user 
fat jars), these changes were good.
So, if the Python module is used in a similar way, it should make sense 
there as well...

+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4161) Quickstarts can exclude more flink-dist dependencies

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399433#comment-15399433
 ] 

ASF GitHub Bot commented on FLINK-4161:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2309
  
Probably good for master (1.2) and 1.1.x


> Quickstarts can exclude more flink-dist dependencies
> 
>
> Key: FLINK-4161
> URL: https://issues.apache.org/jira/browse/FLINK-4161
> Project: Flink
>  Issue Type: Improvement
>  Components: Quickstarts
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>
> The Quickstart poms exclude several dependencies that flink-dist contains 
> from being packaged into the fat-jar.
> However, the following flink-dist dependencies are not excluded:
> {code}
> org.apache.flink:flink-streaming-scala_2.10
> org.apache.flink:flink-scala-shell_2.10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2309: [FLINK-4161] Add Quickstart exclusion for flink-dist depe...

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2309
  
Probably good for master (1.2) and 1.1.x


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4161) Quickstarts can exclude more flink-dist dependencies

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399432#comment-15399432
 ] 

ASF GitHub Bot commented on FLINK-4161:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2309
  
I think this change makes sense.

+1 from my side


> Quickstarts can exclude more flink-dist dependencies
> 
>
> Key: FLINK-4161
> URL: https://issues.apache.org/jira/browse/FLINK-4161
> Project: Flink
>  Issue Type: Improvement
>  Components: Quickstarts
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>
> The Quickstart poms exclude several dependencies that flink-dist contains 
> from being packaged into the fat-jar.
> However, the following flink-dist dependencies are not excluded:
> {code}
> org.apache.flink:flink-streaming-scala_2.10
> org.apache.flink:flink-scala-shell_2.10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2309: [FLINK-4161] Add Quickstart exclusion for flink-dist depe...

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2309
  
I think this change makes sense.

+1 from my side


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4284) DataSet/CEP link to non-existent "Linking with Flink" section

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399428#comment-15399428
 ] 

ASF GitHub Bot commented on FLINK-4284:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2311
  
Out of curiosity: Are pure anchor links in general broken in the docs?


> DataSet/CEP link to non-existent "Linking with Flink" section
> -
>
> Key: FLINK-4284
> URL: https://issues.apache.org/jira/browse/FLINK-4284
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> Relevant section for DataSet (apis/batch/index.md; L 57-60):
> {code}
> The following program is a complete, working example of WordCount. You can 
> copy & paste the code
> to run it locally. You only have to include the correct Flink's library into 
> your project
> (see Section [Linking with Flink](#linking-with-flink)) and specify the 
> imports. Then you are ready
> to go!
> {code}
> Relevant section for CEP (apis/streaming/libs/cep.md; L 45-48):
> {code}
> ## Getting Started
> If you want to jump right in, you have to [set up a Flink program]({{ 
> site.baseurl }}/apis/batch/index.html#linking-with-flink).
> Next, you have to add the FlinkCEP dependency to the `pom.xml` of your 
> project.
> {code}
> The CEP doc probably shouldn't refer to the DataSet documentation at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2311: [FLINK-4284] [docu] Fix broken links

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2311
  
Out of curiosity: Are pure anchor links in general broken in the docs?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399426#comment-15399426
 ] 

ASF GitHub Bot commented on FLINK-4277:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2310
  
good catch, thanks!

 +1 to merge this


> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>  Labels: test-stability
>
> When running this test on Ubuntu I encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2310: [FLINK-4277] Fix TaskManagerConfigurationTest#testDefault...

2016-07-29 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2310
  
good catch, thanks!

 +1 to merge this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-4277:

Labels: test-stability  (was: )

> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>  Labels: test-stability
>
> When running this test on Ubuntu I encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-4276:

Labels: test-stability  (was: )

> TextInputFormatTest.testNestedFileRead fails on Windows OS
> --
>
> Key: FLINK-4276
> URL: https://issues.apache.org/jira/browse/FLINK-4276
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>  Labels: test-stability
>
> Stack trace I got when running the test on W10:
> Running org.apache.flink.api.java.io.TextInputFormatTest
> test failed with exception: null
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.flink.api.java.io.TextInputFormatTest.testNestedFileRead(TextInputFormatTest.java:133)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4283) ExecutionGraphRestartTest fails

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-4283:

Labels: test-stability  (was: )

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4255) Unstable test WebRuntimeMonitorITCase.testNoEscape

2016-07-29 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399306#comment-15399306
 ] 

Chesnay Schepler commented on FLINK-4255:
-

The same exception occurred here: 
https://issues.apache.org/jira/browse/FLINK-3746

> Unstable test WebRuntimeMonitorITCase.testNoEscape
> --
>
> Key: FLINK-4255
> URL: https://issues.apache.org/jira/browse/FLINK-4255
> Project: Flink
>  Issue Type: Bug
>Reporter: Kostas Kloudas
>  Labels: test-stability
>
> An instance of the problem can be found here:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/146615994/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399286#comment-15399286
 ] 

ASF GitHub Bot commented on FLINK-4276:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2312

[FLINK-4276] Fix TextInputFormatTest#testNestedFileRead

The problem with the test was a simple formatting mismatch between the paths 
returned by `"file:" + File.getAbsolutePath()` and `Path#toString()`. The 
orientation of the slashes was different, and one slash was missing after `file:`.

The test was changed to make sure that the paths were formatted the same 
way.
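
For illustration only, the mismatch looks roughly like the following on Windows 
(a sketch with a made-up path; `Path` here is `org.apache.flink.core.fs.Path`):

```java
import java.io.File;
import org.apache.flink.core.fs.Path;

public class PathMismatchSketch {
    public static void main(String[] args) {
        File file = new File("C:\\tmp\\data.txt");
        // Naive concatenation keeps backslashes and has no slash after "file:":
        String naive = "file:" + file.getAbsolutePath();                  // file:C:\tmp\data.txt
        // Flink's Path normalizes to forward slashes and a "file:/" prefix:
        String normalized = new Path(file.toURI().toString()).toString(); // file:/C:/tmp/data.txt
        // Comparing the two strings directly therefore fails on Windows:
        System.out.println(naive.equals(normalized));                     // false
    }
}
```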

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4276_test_text

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2312.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2312


commit 7ca74b10c645ce7b1db7cfa3e3b51fdc32554f71
Author: zentol 
Date:   2016-07-29T12:48:28Z

[FLINK-4276] Fix TextInputFormatTest#testNestedFileRead




> TextInputFormatTest.testNestedFileRead fails on Windows OS
> --
>
> Key: FLINK-4276
> URL: https://issues.apache.org/jira/browse/FLINK-4276
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> Stack trace I got when running the test on W10:
> Running org.apache.flink.api.java.io.TextInputFormatTest
> test failed with exception: null
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.flink.api.java.io.TextInputFormatTest.testNestedFileRead(TextInputFormatTest.java:133)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-4276:
---

Assignee: Chesnay Schepler

> TextInputFormatTest.testNestedFileRead fails on Windows OS
> --
>
> Key: FLINK-4276
> URL: https://issues.apache.org/jira/browse/FLINK-4276
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>
> Stack trace I got when running the test on W10:
> Running org.apache.flink.api.java.io.TextInputFormatTest
> test failed with exception: null
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.flink.api.java.io.TextInputFormatTest.testNestedFileRead(TextInputFormatTest.java:133)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2312: [FLINK-4276] Fix TextInputFormatTest#testNestedFil...

2016-07-29 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2312

[FLINK-4276] Fix TextInputFormatTest#testNestedFileRead

The problem with the test was a simple formatting mismatch between the paths 
returned by `"file:" + File.getAbsolutePath()` and `Path#toString()`. The 
orientation of the slashes was different, and one slash was missing after `file:`.

The test was changed to make sure that the paths were formatted the same 
way.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4276_test_text

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2312.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2312


commit 7ca74b10c645ce7b1db7cfa3e3b51fdc32554f71
Author: zentol 
Date:   2016-07-29T12:48:28Z

[FLINK-4276] Fix TextInputFormatTest#testNestedFileRead




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4269) Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399253#comment-15399253
 ] 

ASF GitHub Bot commented on FLINK-4269:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2307


> Decrease log level in RuntimeMonitorHandler
> ---
>
> Key: FLINK-4269
> URL: https://issues.apache.org/jira/browse/FLINK-4269
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.1.0
>Reporter: Ufuk Celebi
>Priority: Minor
> Fix For: 1.1.0
>
>
> Having a browser window open which points to a page of an unavailable job 
> (for example from a previous cluster setup) leads to many log warning 
> messages like this in the job manager log file:
> {code}
> 2016-06-30 10:50:55,812 WARN  
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
> handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job 
> with id 343bd3c0c7415370d79e04f305b5a2e9
>   at 
> org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> {code}
> Since the REST API calls are refreshed every X seconds, this can lead to 
> quite some log pollution. We might consider decreasing the log level to DEBUG 
> or TRACE even.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4269) Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-4269.
---
   Resolution: Fixed
Fix Version/s: 1.1.0

Implemented in d618b13e8ce21a965a01d47e90eb607211a09bdc
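
For reference, the change amounts to logging these expected errors at a lower 
level, roughly like this (a sketch only, not the actual commit; it assumes the 
handler logs through SLF4J, which Flink does):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RequestHandlingSketch {
    private static final Logger LOG = LoggerFactory.getLogger(RequestHandlingSketch.class);

    void handle(Runnable request) {
        try {
            request.run();
        } catch (RuntimeException e) {
            // Expected "job not found" style errors from polling browser tabs:
            // log at DEBUG instead of WARN so they no longer flood the log.
            LOG.debug("Error while handling request", e);
        }
    }
}
```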

> Decrease log level in RuntimeMonitorHandler
> ---
>
> Key: FLINK-4269
> URL: https://issues.apache.org/jira/browse/FLINK-4269
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.1.0
>Reporter: Ufuk Celebi
>Priority: Minor
> Fix For: 1.1.0
>
>
> Having a browser window open which points to a page of an unavailable job 
> (for example from a previous cluster setup) leads to many log warning 
> messages like this in the job manager log file:
> {code}
> 2016-06-30 10:50:55,812 WARN  
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
> handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job 
> with id 343bd3c0c7415370d79e04f305b5a2e9
>   at 
> org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> {code}
> Since the REST API calls are refreshed every X seconds, this can lead to 
> quite some log pollution. We might consider decreasing the log level to DEBUG 
> or TRACE even.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2307: [FLINK-4269] Decrease log level in RuntimeMonitorH...

2016-07-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2307


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4269) Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399240#comment-15399240
 ] 

ASF GitHub Bot commented on FLINK-4269:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2307
  
merging


> Decrease log level in RuntimeMonitorHandler
> ---
>
> Key: FLINK-4269
> URL: https://issues.apache.org/jira/browse/FLINK-4269
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.1.0
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Having a browser window open which points to a page of an unavailable job 
> (for example from a previous cluster setup) leads to many log warning 
> messages like this in the job manager log file:
> {code}
> 2016-06-30 10:50:55,812 WARN  
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
> handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job 
> with id 343bd3c0c7415370d79e04f305b5a2e9
>   at 
> org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> {code}
> Since the REST API calls are refreshed every X seconds, this can lead to 
> quite some log pollution. We might consider decreasing the log level to DEBUG 
> or TRACE even.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2307: [FLINK-4269] Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2307
  
merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2310: [FLINK-4277] Fix TaskManagerConfigurationTest#testDefault...

2016-07-29 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2310
  
@mxm addressed your comments


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399225#comment-15399225
 ] 

ASF GitHub Bot commented on FLINK-4277:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2310
  
@mxm addressed your comments


> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When running this test on Ubuntu I encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2311: [FLINK-4284] [docu] Fix broken links

2016-07-29 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2311

[FLINK-4284] [docu] Fix broken links

Fixes several broken links in the documentation. These were all links to 
the Batch documentation that are now under "Basic API Concepts".

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4284_docs_links

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2311.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2311


commit 9c0ee4daa5c040c2131d3794aadc7ec574859219
Author: zentol 
Date:   2016-07-29T12:06:14Z

[FLINK-4284] [docu] Fix broken links




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4284) DataSet/CEP link to non-existent "Linking with Flink" section

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399219#comment-15399219
 ] 

ASF GitHub Bot commented on FLINK-4284:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2311

[FLINK-4284] [docu] Fix broken links

Fixes several broken links in the documentation. These were all links to 
the Batch documentation that are now under "Basic API Concepts".

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4284_docs_links

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2311.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2311


commit 9c0ee4daa5c040c2131d3794aadc7ec574859219
Author: zentol 
Date:   2016-07-29T12:06:14Z

[FLINK-4284] [docu] Fix broken links




> DataSet/CEP link to non-existent "Linking with Flink" section
> -
>
> Key: FLINK-4284
> URL: https://issues.apache.org/jira/browse/FLINK-4284
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> Relevant section for DataSet (apis/batch/index.md; L 57-60):
> {code}
> The following program is a complete, working example of WordCount. You can 
> copy & paste the code
> to run it locally. You only have to include the correct Flink's library into 
> your project
> (see Section [Linking with Flink](#linking-with-flink)) and specify the 
> imports. Then you are ready
> to go!
> {code}
> Relevant section for CEP (apis/streaming/libs/cep.md; L 45-48):
> {code}
> ## Getting Started
> If you want to jump right in, you have to [set up a Flink program]({{ 
> site.baseurl }}/apis/batch/index.html#linking-with-flink).
> Next, you have to add the FlinkCEP dependency to the `pom.xml` of your 
> project.
> {code}
> The CEP doc probably shouldn't refer to the DataSet documentation at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399214#comment-15399214
 ] 

ASF GitHub Bot commented on FLINK-4277:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2310
  
Thanks for the PR @zentol. 

The test didn't fail because the exception is swallowed in a catch block:

```java
} catch (Exception e) {
e.printStackTrace();
}
```

It has to be changed in this PR to not swallow exceptions.
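
A straightforward way to do that (a sketch, not necessarily the exact change in 
the PR; `fail` is `org.junit.Assert.fail`) is to turn the caught exception into 
a test failure:

```java
} catch (Exception e) {
    e.printStackTrace();
    fail("Test failed with exception: " + e.getMessage());
}
```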


> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When running this test on Ubuntu I encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2310: [FLINK-4277] Fix TaskManagerConfigurationTest#testDefault...

2016-07-29 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2310
  
Thanks for the PR @zentol. 

The test didn't fail because the exception is swallowed in a catch block:

```java
} catch (Exception e) {
e.printStackTrace();
}
```

It has to be changed in this PR to not swallow exceptions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399208#comment-15399208
 ] 

ASF GitHub Bot commented on FLINK-4277:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2310#discussion_r72780980
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/taskmanager/TaskManagerConfigurationTest.java
 ---
@@ -110,7 +110,7 @@ public void testActorSystemPortConfig() {
@Test
public void testDefaultFsParameterLoading() {
final File tmpDir = getTmpDir();
-   final File confFile =  new File(tmpDir, 
UUID.randomUUID().toString() + ".yaml");
+   final File confFile =  new File(tmpDir, "flink-conf.yaml");
--- End diff --

"flink-conf.yaml" can be replaced by 
`GlobalConfiguration.FLINK_CONF_FILENAME`.
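
That is, the assignment would then read (sketch of the suggested change):

```java
final File confFile = new File(tmpDir, GlobalConfiguration.FLINK_CONF_FILENAME);
```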


> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When running this test on Ubuntu I encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2310: [FLINK-4277] Fix TaskManagerConfigurationTest#test...

2016-07-29 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2310#discussion_r72780980
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/taskmanager/TaskManagerConfigurationTest.java
 ---
@@ -110,7 +110,7 @@ public void testActorSystemPortConfig() {
@Test
public void testDefaultFsParameterLoading() {
final File tmpDir = getTmpDir();
-   final File confFile =  new File(tmpDir, 
UUID.randomUUID().toString() + ".yaml");
+   final File confFile =  new File(tmpDir, "flink-conf.yaml");
--- End diff --

"flink-conf.yaml" can be replaced by 
`GlobalConfiguration.FLINK_CONF_FILENAME`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-4284) DataSet/CEP link to non-existent "Linking with Flink" section

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-4284:
---

Assignee: Chesnay Schepler

> DataSet/CEP link to non-existent "Linking with Flink" section
> -
>
> Key: FLINK-4284
> URL: https://issues.apache.org/jira/browse/FLINK-4284
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> Relevant section for DataSet (apis/batch/index.md; L 57-60):
> {code}
> The following program is a complete, working example of WordCount. You can 
> copy & paste the code
> to run it locally. You only have to include the correct Flink's library into 
> your project
> (see Section [Linking with Flink](#linking-with-flink)) and specify the 
> imports. Then you are ready
> to go!
> {code}
> Relevant section for CEP (apis/streaming/libs/cep.md; L 45-48):
> {code}
> ## Getting Started
> If you want to jump right in, you have to [set up a Flink program]({{ 
> site.baseurl }}/apis/batch/index.html#linking-with-flink).
> Next, you have to add the FlinkCEP dependency to the `pom.xml` of your 
> project.
> {code}
> The CEP doc probably shouldn't refer to the DataSet documentation at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4284) DataSet/CEP link to non-existent "Linking with Flink" section

2016-07-29 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4284:
---

 Summary: DataSet/CEP link to non-existent "Linking with Flink" 
section
 Key: FLINK-4284
 URL: https://issues.apache.org/jira/browse/FLINK-4284
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.0
Reporter: Chesnay Schepler


Relevant section for DataSet (apis/batch/index.md; L 57-60):

{code}
The following program is a complete, working example of WordCount. You can copy 
& paste the code
to run it locally. You only have to include the correct Flink's library into 
your project
(see Section [Linking with Flink](#linking-with-flink)) and specify the 
imports. Then you are ready
to go!
{code}

Relevant section for CEP (apis/streaming/libs/cep.md; L 45-48):

{code}
## Getting Started

If you want to jump right in, you have to [set up a Flink program]({{ 
site.baseurl }}/apis/batch/index.html#linking-with-flink).
Next, you have to add the FlinkCEP dependency to the `pom.xml` of your project.
{code}

The CEP doc probably shouldn't refer to the DataSet documentation at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4283) ExecutionGraphRestartTest fails

2016-07-29 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4283:
---

 Summary: ExecutionGraphRestartTest fails
 Key: FLINK-4283
 URL: https://issues.apache.org/jira/browse/FLINK-4283
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.0
 Environment: Ubuntu 14.04
W10
Reporter: Chesnay Schepler


I encounter reliable failures for the following tests:

testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
  Time elapsed: 120.089 sec  <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
at 
org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)

taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
  Time elapsed: 2.055 sec  <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
at 
org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)

testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
  Time elapsed: 120.079 sec  <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4275) Generic Folding, Reducing and List states behave differently from other state backends

2016-07-29 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399143#comment-15399143
 ] 

Gyula Fora commented on FLINK-4275:
---

I would not remove them in any case. They provide a very nice way to use custom 
state implementations without having to implement everything you don't care 
about.

(We also use them extensively)

> Generic Folding, Reducing and List states behave differently from other state 
> backends
> --
>
> Key: FLINK-4275
> URL: https://issues.apache.org/jira/browse/FLINK-4275
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing, Streaming
>Reporter: Gyula Fora
>Priority: Critical
>
> In 
> https://github.com/apache/flink/commit/12bf7c1a0b81d199085fe874c64763c51a93b3bf
>  the expected behaviour of Folding/Reducing/List states have been changed to 
> return null instead of empty collections/default values.
> This was adapted for the included state backends (Memory, FS, Rocks) but not 
> for the Generic state wrappers. As there are no tests for the Generic backend 
> using the StateBackendTestBase, this didn't show.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2307: [FLINK-4269] Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2307
  
Please don't spend your time writing an actual test for this; for this 
change it is not necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4269) Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399134#comment-15399134
 ] 

ASF GitHub Bot commented on FLINK-4269:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2307
  
Please don't spend your time writing an actual test for this; it isn't 
necessary for this change.


> Decrease log level in RuntimeMonitorHandler
> ---
>
> Key: FLINK-4269
> URL: https://issues.apache.org/jira/browse/FLINK-4269
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.1.0
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Having a browser window open which points to a page of an unavailable job 
> (for example from a previous cluster setup) leads to many log warning 
> messages like this in the job manager log file:
> {code}
> 2016-06-30 10:50:55,812 WARN  
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
> handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job 
> with id 343bd3c0c7415370d79e04f305b5a2e9
>   at 
> org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> {code}
> Since the REST API calls are refreshed every X seconds, this can lead to 
> quite some log pollution. We might consider decreasing the log level to DEBUG 
> or even TRACE.
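
A hedged sketch of what the proposed demotion could look like (the handler and 
exception wiring below are stand-ins, not the actual patch):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogLevelSketch {

    private static final Logger LOG = LoggerFactory.getLogger(LogLevelSketch.class);

    // Local stand-in for org.apache.flink.runtime.webmonitor.NotFoundException.
    static class NotFoundException extends Exception {
        NotFoundException(String msg) { super(msg); }
    }

    void onRequestError(Throwable e) {
        if (e instanceof NotFoundException) {
            // Expected during periodic REST polls against a vanished job:
            // log at DEBUG so it no longer floods the job manager log.
            LOG.debug("Error while handling request", e);
        } else {
            LOG.warn("Error while handling request", e);
        }
    }
}
{code}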



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4269) Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399111#comment-15399111
 ] 

ASF GitHub Bot commented on FLINK-4269:
---

Github user aditivin commented on the issue:

https://github.com/apache/flink/pull/2307
  
Thanks @nssalian, @zentol - I'll keep these points in mind for next time.

I will try testing the logging and update this thread. Also, I've enabled 
Travis in my repository :)


> Decrease log level in RuntimeMonitorHandler
> ---
>
> Key: FLINK-4269
> URL: https://issues.apache.org/jira/browse/FLINK-4269
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.1.0
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Having a browser window open which points to a page of an unavailable job 
> (for example from a previous cluster setup) leads to many log warning 
> messages like this in the job manager log file:
> {code}
> 2016-06-30 10:50:55,812 WARN  
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
> handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job 
> with id 343bd3c0c7415370d79e04f305b5a2e9
>   at 
> org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
>   at 
> org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
>   at 
> io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
>   at 
> org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> {code}
> Since the REST API calls are refreshed every X seconds, this can lead to 
> quite some log pollution. We might consider decreasing the log level to DEBUG 
> or even TRACE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2307: [FLINK-4269] Decrease log level in RuntimeMonitorHandler

2016-07-29 Thread aditivin
Github user aditivin commented on the issue:

https://github.com/apache/flink/pull/2307
  
Thanks @nssalian, @zentol - I'll keep these points in mind for next time.

I will try testing the logging and update this thread. Also, I've enabled 
Travis in my repository :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-07-29 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-4282:
---

 Summary: Add Offset Parameter to WindowAssigners
 Key: FLINK-4282
 URL: https://issues.apache.org/jira/browse/FLINK-4282
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Aljoscha Krettek


Currently, windows are always aligned to EPOCH, which basically means days are 
aligned with GMT. This is somewhat problematic for people living in different 
timezones.

An offset parameter would allow adapting the window assigner to the timezone.
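
A minimal, runnable sketch of the effect such an offset would have, assuming 
the usual align-to-epoch window-start formula extended with an offset term 
(the method and its wiring are illustrative, not an existing Flink API):

{code}
public class WindowOffsetSketch {

    // Start of the window containing 'timestamp', for the given offset and
    // window size (all values in milliseconds).
    static long getWindowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long day = 24 * 60 * 60 * 1000L;
        long offset = -8 * 60 * 60 * 1000L;  // align daily windows to UTC-8

        long ts = 1469750400000L;  // 2016-07-29 00:00:00 GMT
        System.out.println(getWindowStart(ts, 0, day));      // day boundary at GMT
        System.out.println(getWindowStart(ts, offset, day)); // boundary shifted by -8h
    }
}
{code}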



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399100#comment-15399100
 ] 

ASF GitHub Bot commented on FLINK-4277:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2310

[FLINK-4277] Fix TaskManagerConfigurationTest#testDefaultFsParameterLoading

This PR fixes the 
`TaskManagerConfigurationTest#testDefaultFsParameterLoading` by creating a 
proper `flink-conf.yaml` file and passing the correct `--configDir` parameter.

Previously, the generated yaml had a random name (`.yaml`) 
even though it should be called `flink-conf.yaml`. In addition, the 
`--configDir` parameter was set to the file name, and not the directory it 
resides in.

I don't know why this test did not fail earlier; maybe @mxm has an idea as 
he recently did some work on the `GlobalConfiguration`.
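
For reference, a hedged sketch of the corrected setup (not the actual PR diff; 
the config entry is an arbitrary example):

{code}
import java.io.File;
import java.io.PrintWriter;
import java.nio.file.Files;

public class ConfigDirSketch {
    public static void main(String[] args) throws Exception {
        // Write a file literally named flink-conf.yaml...
        File confDir = Files.createTempDirectory("flink-test").toFile();
        File confFile = new File(confDir, "flink-conf.yaml");
        try (PrintWriter pw = new PrintWriter(confFile)) {
            pw.println("fs.default-scheme: file:///");
        }
        // ...and pass its parent directory, not the file itself, as --configDir.
        String[] tmArgs = { "--configDir", confDir.getAbsolutePath() };
        System.out.println(String.join(" ", tmArgs));
    }
}
{code}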

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4277_tm_configuration_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2310.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2310


commit fecd900b1030d0850ff8698ce85ee720c8cae613
Author: zentol 
Date:   2016-07-29T10:15:54Z

[FLINK-4277] Fix TaskManagerConfigurationTest#testDefaultFsParameterLoading




> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When running this test on Ubuntu i encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA

[GitHub] flink pull request #2310: [FLINK-4277] Fix TaskManagerConfigurationTest#test...

2016-07-29 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2310

[FLINK-4277] Fix TaskManagerConfigurationTest#testDefaultFsParameterLoading

This PR fixes the 
`TaskManagerConfigurationTest#testDefaultFsParameterLoading` by creating a 
proper `flink-conf.yaml` file and passing the correct `--configDir` parameter.

Previously, the generated yaml had a random name (`.yaml`) 
even though it should be called `flink-conf.yaml`. In addition, the 
`--configDir` parameter was set to the file name, and not the directory it 
resides in.

I don't know why this test did not fail earlier; maybe @mxm has an idea as 
he recently did some work on the `GlobalConfiguration`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4277_tm_configuration_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2310.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2310


commit fecd900b1030d0850ff8698ce85ee720c8cae613
Author: zentol 
Date:   2016-07-29T10:15:54Z

[FLINK-4277] Fix TaskManagerConfigurationTest#testDefaultFsParameterLoading




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3674) Add an interface for EventTime aware User Function

2016-07-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399092#comment-15399092
 ] 

Aljoscha Krettek commented on FLINK-3674:
-

Could you please go a bit into the ideas behind the changes in your PR, i.e. 
what can users do now with these changes? As I see it, they don't address the 
things that we want to tackle for this issue.

> Add an interface for EventTime aware User Function
> --
>
> Key: FLINK-3674
> URL: https://issues.apache.org/jira/browse/FLINK-3674
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: ramkrishna.s.vasudevan
>
> I suggest to add an interface that UDFs can implement, which will let them be 
> notified upon watermark updates.
> Example usage:
> {code}
> public interface EventTimeFunction {
>     void onWatermark(Watermark watermark);
> }
>
> public class MyMapper implements MapFunction<String, String>, EventTimeFunction {
>
>     private long currentEventTime = Long.MIN_VALUE;
>
>     public String map(String value) {
>         return value + " @ " + currentEventTime;
>     }
>
>     public void onWatermark(Watermark watermark) {
>         currentEventTime = watermark.getTimestamp();
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4277) TaskManagerConfigurationTest fails

2016-07-29 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-4277:
---

Assignee: Chesnay Schepler

> TaskManagerConfigurationTest fails
> --
>
> Key: FLINK-4277
> URL: https://issues.apache.org/jira/browse/FLINK-4277
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When running this test on Ubuntu i encountered this error:
> Running org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest
> java.lang.Exception: Could not load configuration
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$.parseArgsAndLoadConfig(TaskManager.scala:1526)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.parseArgsAndLoadConfig(TaskManager.scala)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManagerConfigurationTest.testDefaultFsParameterLoading(TaskManagerConfigurationTest.java:125)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: The 
> Flink config file 
> '/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml/flink-conf.yaml'
>  
> (/tmp/26d1532e-a1f2-4007-bf77-f66c92669241/33c51581-617c-4029-8b6e-11b9a55c792b.yaml)
>  does not exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4094) Off heap memory deallocation might not properly work

2016-07-29 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399085#comment-15399085
 ] 

ramkrishna.s.vasudevan commented on FLINK-4094:
---

[~mxm]
Thanks for the comment. My earlier comment was a bit vague, as I added it 
while travelling.
Let me try to explain based on what I see in the code. If I am missing 
something or am wrong, please do correct me.
The MemoryManager manages both heap and off-heap memory segments.
{code}
@Override
HybridMemorySegment allocateNewSegment(Object owner) {
    ByteBuffer memory = ByteBuffer.allocateDirect(segmentSize);
    return HybridMemorySegment.FACTORY.wrapPooledOffHeapMemory(memory, owner);
}

@Override
HybridMemorySegment requestSegmentFromPool(Object owner) {
    ByteBuffer buf = availableMemory.remove();
    return HybridMemorySegment.FACTORY.wrapPooledOffHeapMemory(buf, owner);
}

@Override
void returnSegmentToPool(MemorySegment segment) {
    if (segment.getClass() == HybridMemorySegment.class) {
        HybridMemorySegment hybridSegment = (HybridMemorySegment) segment;
        ByteBuffer buf = hybridSegment.getOffHeapBuffer();
        availableMemory.add(buf);
        hybridSegment.free();
    }
    else {
        throw new IllegalArgumentException("Memory segment is not a "
            + HeapMemorySegment.class.getSimpleName());
    }
}
{code}
If you look at the usage of the above APIs:
{code}
if (isPreAllocated) {
    for (int i = numPages; i > 0; i--) {
        MemorySegment segment = memoryPool.requestSegmentFromPool(owner);
        target.add(segment);
        segmentsForOwner.add(segment);
    }
}
else {
    for (int i = numPages; i > 0; i--) {
        MemorySegment segment = memoryPool.allocateNewSegment(owner);
        target.add(segment);
        segmentsForOwner.add(segment);
    }
    numNonAllocatedPages -= numPages;
}
{code}
So if preallocation is enabled the memory buffer is requested from the pool; 
otherwise a new segment is allocated every time.
Coming to the release of these buffers:
{code}
if (isPreAllocated) {
    // release the memory in any case
    memoryPool.returnSegmentToPool(segment);
}
else {
    segment.free();
    numNonAllocatedPages++;
}
{code}
Again, we return segments to the pool only if preallocation is enabled. As you 
clearly pointed out, otherwise we just do dynamic allocation, and on memory 
manager shutdown we clear the allocated buffers. But for off-heap memory this 
will not be enough, as the GC will not be able to reclaim the direct buffers 
unless a full GC happens.
I would rather say it is better that we do internal management of off-heap 
buffers: we should create a pool from which the buffers are allocated, and if 
the pool is of fixed size and we get requests for more buffers than the pool 
can hold, we should allocate the extra ones on-heap only (if that is 
acceptable).

Currently the memory pool is backed by an ArrayDeque. We only specify an 
initial size, and I think it can grow beyond that too. So for off-heap buffers 
we should have a fixed-size pool; as demand grows we allocate a few buffers 
on-heap, and once the pool can offer buffers again we use them. A sketch of 
this idea follows below.
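
A minimal sketch of the proposed fixed-size off-heap pool with on-heap 
fallback (class and method names are hypothetical, not Flink's actual 
MemoryManager internals):

{code}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

class FixedOffHeapPool {

    private final ArrayDeque<ByteBuffer> available = new ArrayDeque<>();
    private final int segmentSize;

    FixedOffHeapPool(int poolSize, int segmentSize) {
        this.segmentSize = segmentSize;
        // Allocate all direct buffers up front; the pool never grows.
        for (int i = 0; i < poolSize; i++) {
            available.add(ByteBuffer.allocateDirect(segmentSize));
        }
    }

    // Hands out a pooled direct buffer, falling back to the heap when the
    // pool is drained.
    synchronized ByteBuffer request() {
        ByteBuffer buf = available.poll();
        return buf != null ? buf : ByteBuffer.allocate(segmentSize);
    }

    // Only direct buffers go back into the pool; heap fallbacks are simply
    // left to the GC.
    synchronized void release(ByteBuffer buf) {
        if (buf.isDirect()) {
            buf.clear();
            available.add(buf);
        }
    }
}
{code}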

bq. I don't think just disallowing preallocation:false is a good fix.
Yes, I agree. That would be a hacky one.

> Off heap memory deallocation might not properly work
> 
>
> Key: FLINK-4094
> URL: https://issues.apache.org/jira/browse/FLINK-4094
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 1.1.0
>
>
> A user reported that off-heap memory is not properly deallocated when setting 
> {{taskmanager.memory.preallocate:false}} 

[jira] [Updated] (FLINK-4281) Wrap all Calcite Exceptions in Flink Exceptions

2016-07-29 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4281:

Affects Version/s: 1.2.0

> Wrap all Calcite Exceptions in Flink Exceptions
> ---
>
> Key: FLINK-4281
> URL: https://issues.apache.org/jira/browse/FLINK-4281
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Timo Walther
>
> Some exceptions are already wrapped in Flink exceptions but there are still 
> exceptions thrown by Calcite. I would propose that all Exceptions thrown by 
> the Table API are Flink's Exceptions, esp. the FlinkPlannerImpl exceptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4281) Wrap all Calcite Exceptions in Flink Exceptions

2016-07-29 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4281:

Issue Type: Improvement  (was: Bug)

> Wrap all Calcite Exceptions in Flink Exceptions
> ---
>
> Key: FLINK-4281
> URL: https://issues.apache.org/jira/browse/FLINK-4281
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Timo Walther
>
> Some exceptions are already wrapped in Flink exceptions but there are still 
> exceptions thrown by Calcite. I would propose that all Exceptions thrown by 
> the Table API are Flink's Exceptions, esp. the FlinkPlannerImpl exceptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4281) Wrap all Calcite Exceptions in Flink Exceptions

2016-07-29 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4281:
---

 Summary: Wrap all Calcite Exceptions in Flink Exceptions
 Key: FLINK-4281
 URL: https://issues.apache.org/jira/browse/FLINK-4281
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: Timo Walther


Some exceptions are already wrapped in Flink exceptions but there are still 
exceptions thrown by Calcite. I would propose that all Exceptions thrown by the 
Table API are Flink's Exceptions, esp. the FlinkPlannerImpl exceptions.
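
As a rough illustration of the wrapping, a hedged sketch (Calcite's parser API 
is real; {{FlinkTableException}} is a stand-in for whichever Flink exception 
type is chosen, and the call site is hypothetical):

{code}
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParseException;
import org.apache.calcite.sql.parser.SqlParser;

public class CalciteWrapSketch {

    // Stand-in for the Flink exception type that would wrap Calcite errors.
    static class FlinkTableException extends RuntimeException {
        FlinkTableException(String msg, Throwable cause) { super(msg, cause); }
    }

    static SqlNode parseQuery(String sql) {
        try {
            return SqlParser.create(sql).parseQuery();
        } catch (SqlParseException e) {
            // Surface a Flink exception instead of leaking Calcite's checked one.
            throw new FlinkTableException("SQL parsing failed: " + e.getMessage(), e);
        }
    }
}
{code}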



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4275) Generic Folding, Reducing and List states behave differently from other state backends

2016-07-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399081#comment-15399081
 ] 

Aljoscha Krettek commented on FLINK-4275:
-

We can either fix those or remove them altogether. 

> Generic Folding, Reducing and List states behave differently from other state 
> backends
> --
>
> Key: FLINK-4275
> URL: https://issues.apache.org/jira/browse/FLINK-4275
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing, Streaming
>Reporter: Gyula Fora
>Priority: Critical
>
> In 
> https://github.com/apache/flink/commit/12bf7c1a0b81d199085fe874c64763c51a93b3bf
>  the expected behaviour of Folding/Reducing/List states has been changed to 
> return null instead of empty collections/default values.
> This was adapted for the included state backends (Memory, FS, Rocks) but not 
> for the Generic state wrappers. As there are no tests for the Generic backend 
> using the StateBackendTestBase, this didn't show up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4161) Quickstarts can exclude more flink-dist dependencies

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399067#comment-15399067
 ] 

ASF GitHub Bot commented on FLINK-4161:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2309

[FLINK-4161] Add Quickstart exclusion for flink-dist dependencies

Updated version of #2218 which got closed as I accidentally deleted the 
branch.

In addition to the previous changes:
* `flink-dist` now explicitly depends on `flink-metrics-core`
* excluded `flink-python`
* excluded `flink-metrics-core`
* excluded `flink-metrics-jmx`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4161_qs_include

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2309


commit fe733860c4e1e8ea2bdbd5530b53064d00f27e27
Author: zentol 
Date:   2016-07-29T09:53:24Z

[FLINK-4161] Add Quickstart exclusion for flink-dist dependencies




> Quickstarts can exclude more flink-dist dependencies
> 
>
> Key: FLINK-4161
> URL: https://issues.apache.org/jira/browse/FLINK-4161
> Project: Flink
>  Issue Type: Improvement
>  Components: Quickstarts
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>
> The Quickstart poms exclude several dependencies that flink-dist contains 
> from being packaged into the fat-jar.
> However, the following flink-dist dependencies are not excluded:
> {code}
> org.apache.flink:flink-streaming-scala_2.10
> org.apache.flink:flink-scala-shell_2.10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2309: [FLINK-4161] Add Quickstart exclusion for flink-di...

2016-07-29 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2309

[FLINK-4161] Add Quickstart exclusion for flink-dist dependencies

Updated version of #2218 which got closed as I accidentally deleted the 
branch.

In addition to the previous changes:
* `flink-dist` now explicitly depends on `flink-metrics-core`
* excluded `flink-python`
* excluded `flink-metrics-core`
* excluded `flink-metrics-jmx`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4161_qs_include

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2309


commit fe733860c4e1e8ea2bdbd5530b53064d00f27e27
Author: zentol 
Date:   2016-07-29T09:53:24Z

[FLINK-4161] Add Quickstart exclusion for flink-dist dependencies




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4280:
---
Description: 
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added, with 
a newly introduced {{flink.starting-position}} config:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker on startup
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This compensates the way users generally interpret "read from a starting 
position". As the Flink Kafka connector is somewhat essentially a "high-level" 
Kafka consumer for Flink users, I think it is reasonable to add Flink-specific 
functionality that users will find useful, although it wasn't supported in 
Kafka's original consumer designs.

2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is used 
only to expose progress to the outside world, and not used to manipulate how 
Kafka topics are read in Flink (unless users opt to do so)" is even more 
definite and solid. There was some discussion in this PR 
(https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I think 
adding this "decouples" more Flink's internal offset checkpointing from the 
external Kafka's offset store.

  was:
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker on startup
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This 

[jira] [Updated] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4280:
---
Description: 
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker on startup
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This compensates the way users generally interpret "read from a starting 
position". As the Flink Kafka connector is somewhat essentially a "high-level" 
Kafka consumer for Flink users, I think it is reasonable to add Flink-specific 
functionality that users will find useful, although it wasn't supported in 
Kafka's original consumer designs.

2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is used 
only to expose progress to the outside world, and not used to manipulate how 
Kafka topics are read in Flink (unless users opt to do so)" is even more 
definite and solid. There was some discussion in this PR 
(https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I think 
adding this "decouples" more Flink's internal offset checkpointing from the 
external Kafka's offset store.

  was:
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This compensates the way users generally interpret "read from a starting 

[jira] [Updated] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker

2016-07-29 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4280:
---
Description: 
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This compensates the way users generally interpret "read from a starting 
position". As the Flink Kafka connector is somewhat essentially a "high-level" 
Kafka consumer for Flink users, I think it is reasonable to add Flink-specific 
functionality that users will find useful, although it wasn't supported in 
Kafka's original consumer designs.

2. By adding this, the idea that "the Kafka offset store (ZK / brokers) is used 
only to expose progress to the outside world, and not used to manipulate how 
Kafka topics are read in Flink (unless users opt to do so)" is even more 
definite and solid. There was some discussion in this PR 
(https://github.com/apache/flink/pull/1690, FLINK-3398) on this aspect. I think 
adding this "decouples" more Flink's internal offset checkpointing from the 
external Kafka's offset store.

  was:
Currently, to start reading from the "earliest" and "latest" position in topics 
for the Flink Kafka consumer, users set the Kafka config {{auto.offset.reset}} 
in the provided properties configuration.

However, the way this config actually works might be a bit misleading if users 
were trying to find a way to "read topics from a starting position". The way 
the {{auto.offset.reset}} config works in the Flink Kafka consumer resembles 
Kafka's original intent for the setting: first, existing external offsets 
committed to the ZK / brokers will be checked; if none exists, then will 
{{auto.offset.reset}} be respected.

I propose to add Flink-specific ways to define the starting position, without 
taking into account the external offsets. The original behaviour (reference 
external offsets first) can be changed to be a user option, so that the 
behaviour can be retained for frequent Kafka users that may need some 
collaboration with existing non-Flink Kafka consumer applications.

How users will interact with the Flink Kafka consumer after this is added:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "earliest/latest");
props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a 
warning)
props.setProperty("group.id", "...") // this won't have effect on the starting 
position anymore (may still be used in external offset committing)
...
{code}

Or, reference external offsets in ZK / broker:

{code}
Properties props = new Properties();
props.setProperty("flink.starting-position", "external-offsets");
props.setProperty("auto.offset.reset", "earliest/latest"); // default will be 
latest
props.setProperty("group.id", "..."); // will be used to lookup external 
offsets in ZK / broker
...
{code}

A thing we would need to decide on is what would the default value be for 
{{flink.starting-position}}.


Two merits I see in adding this:

1. This compensates the way users generally interpret "read from a starting 
position". As the 

[jira] [Commented] (FLINK-4279) [py] Set flink dependencies to provided

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399043#comment-15399043
 ] 

ASF GitHub Bot commented on FLINK-4279:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2308

[FLINK-4279] [py] Set flink dependencies to provided

* removed unused `flink-optimizer` and `flink-clients` dependency
* set the remaining dependencies to `provided`
 * as otherwise several shaded classes were included in the jar

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4279_python_dep

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2308


commit 6ccb61ba228ade45f6a4e1de1ecd87cbc1129136
Author: zentol 
Date:   2016-07-29T09:38:06Z

[FLINK-4279] [py] Set flink dependencies to provided




> [py] Set flink dependencies to provided
> ---
>
> Key: FLINK-4279
> URL: https://issues.apache.org/jira/browse/FLINK-4279
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

