[jira] [Assigned] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas reassigned FLINK-3769:
---

Assignee: Subhankar Biswas

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key routes directly to the 
> queue name. Because of this, the current sink is limited to 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on publish. These headers may vary per message, so the 
> sink will need to take them as input as well. This fourth exchange type could 
> very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
> enable headers" Improvement might be the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html
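
For illustration, a minimal sketch of what exchange-aware publishing could 
look like with the plain RabbitMQ Java client. The ExchangeAwareSink class, 
its exchangeName and routingKey fields, and the invoke() method are 
hypothetical additions (not the current RMQSink API); only 
Channel.basicPublish() is real:

{code}
import java.io.IOException;

import com.rabbitmq.client.Channel;

// Hypothetical sketch: publish to a user-supplied exchange instead of the
// default ("") exchange. The exchange and its bindings are assumed to exist.
public class ExchangeAwareSink {

    private final Channel channel;      // assumed to be opened elsewhere
    private final String exchangeName;  // e.g. a fanout or topic exchange
    private final String routingKey;    // used by direct/topic, ignored by fanout

    public ExchangeAwareSink(Channel channel, String exchangeName, String routingKey) {
        this.channel = channel;
        this.exchangeName = exchangeName;
        this.routingKey = routingKey;
    }

    public void invoke(byte[] serializedValue) throws IOException {
        // The current sink effectively does basicPublish("", queueName, null, body).
        // Passing a non-default exchange covers Direct, Fanout and Topic; the
        // Headers exchange type would additionally need per-message
        // AMQP.BasicProperties instead of the null passed here.
        channel.basicPublish(exchangeName, routingKey, null, serializedValue);
    }
}
{code}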



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3763) RabbitMQ Source/Sink standardize connection parameters

2016-04-15 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas reassigned FLINK-3763:
---

Assignee: Subhankar Biswas

> RabbitMQ Source/Sink standardize connection parameters
> --
>
> Key: FLINK-3763
> URL: https://issues.apache.org/jira/browse/FLINK-3763
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>
> The RabbitMQ source and sink should have the same capabilities in terms of 
> establishing a connection; currently the sink is lacking connection 
> parameters that are available on the source. Additionally, VirtualHost should 
> be an offered parameter for multi-tenant RabbitMQ clusters (if not specified, 
> it defaults to the vhost '/').
> Connection Parameters
> ===
> - Host - Offered on both
> - Port - Source only
> - Virtual Host - Neither
> - User - Source only
> - Password - Source only
> Additionally, it might be worth offering the URI as a valid constructor 
> argument, because that would provide all 5 of the above parameters in a 
> single String.
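
For reference, a minimal sketch of how the five parameters (and the URI form) 
map onto the RabbitMQ Java client's ConnectionFactory; the host, credentials, 
and vhost values below are placeholders:

{code}
import com.rabbitmq.client.ConnectionFactory;

public class RMQConnectionSketch {

    // The five individual parameters proposed above.
    public static ConnectionFactory fromParameters() {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rmq.example.com");
        factory.setPort(5672);
        factory.setVirtualHost("tenant-a"); // defaults to "/" when not set
        factory.setUsername("flink");
        factory.setPassword("secret");
        return factory;
    }

    // The equivalent single-String URI form, covering all five at once.
    public static ConnectionFactory fromUri() throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setUri("amqp://flink:secret@rmq.example.com:5672/tenant-a");
        return factory;
    }
}
{code}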



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas updated FLINK-3769:

Assignee: (was: Subhankar Biswas)

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key routes directly to the 
> queue name. Because of this, the current sink is limited to 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on publish. These headers may vary per message, so the 
> sink will need to take them as input as well. This fourth exchange type could 
> very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
> enable headers" Improvement might be the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas reassigned FLINK-3769:
---

Assignee: Subhankar Biswas

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key routes directly to the 
> queue name. Because of this, the current sink is limited to 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on publish. These headers may vary per message, so the 
> sink will need to take them as input as well. This fourth exchange type could 
> very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
> enable headers" Improvement might be the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2259][ml] Add Train-Testing Splitters

2016-04-15 Thread rawkintrevo
Github user rawkintrevo commented on the pull request:

https://github.com/apache/flink/pull/1898#issuecomment-210728264
  
One build failed with the error: scala.reflect.internal.MissingRequirementError: 
object scala.runtime in compiler mirror not found.

Another failed on some weird YARN stuff, which I didn't touch. Flaky builds?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3742) Add Multi Layer Perceptron Predictor

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243999#comment-15243999
 ] 

ASF GitHub Bot commented on FLINK-3742:
---

Github user rawkintrevo commented on the pull request:

https://github.com/apache/flink/pull/1875#issuecomment-210727904
  
One build failed with the error: 
scala.reflect.internal.MissingRequirementError: object scala.runtime in 
compiler mirror not found.



> Add Multi Layer Perceptron Predictor
> 
>
> Key: FLINK-3742
> URL: https://issues.apache.org/jira/browse/FLINK-3742
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Trevor Grant
>Assignee: Trevor Grant
>Priority: Minor
>
> https://en.wikipedia.org/wiki/Multilayer_perceptron
> The multilayer perceptron is a simple kind of artificial neural network.  It 
> forms a directed graph in which the edges are parameter weights and the nodes 
> are non-linear activation functions.  It is trained via a method known as 
> backpropagation.
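
For concreteness, the forward pass of a one-hidden-layer perceptron is the 
standard textbook formula (not code from this issue):

{code}
\hat{y} = \sigma_2( W_2 \, \sigma_1( W_1 x + b_1 ) + b_2 )
{code}

where W_1, W_2 are the edge-weight matrices, b_1, b_2 the biases, and 
\sigma_1, \sigma_2 the non-linear activations; backpropagation computes the 
gradient of a loss L with respect to each W_i by the chain rule and updates 
W_i <- W_i - \eta \nabla_{W_i} L.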



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3742][ml] Add Multilayer Perceptron

2016-04-15 Thread rawkintrevo
Github user rawkintrevo commented on the pull request:

https://github.com/apache/flink/pull/1875#issuecomment-210727904
  
One build failed with the error: 
scala.reflect.internal.MissingRequirementError: object scala.runtime in 
compiler mirror not found.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3211) Add AWS Kinesis streaming connector

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-3211:
---
Affects Version/s: (was: 1.0.0)
   1.1.0

> Add AWS Kinesis streaming connector
> ---
>
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> AWS Kinesis is a widely adopted message queue used by AWS users, much like a 
> cloud service version of Apache Kafka. Support for AWS Kinesis will be a 
> great addition to the handful of Flink's streaming connectors to external 
> systems and a great way to reach out to the AWS community.
> AWS supports two different ways to consume Kinesis data: with the low-level 
> AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. AWS SDK 
> can be used to consume Kinesis data, including stream read beginning from a 
> specific offset (or "record sequence number" in Kinesis terminology). On the 
> other hand, AWS officially recommends using the KCL, which offers a higher 
> level of abstraction that also comes with checkpointing and failure recovery 
> by using a KCL-managed AWS DynamoDB "lease table" as the checkpoint state 
> storage.
> However, KCL is essentially a stream processing library that wraps all the 
> partition-to-task (or "shard" in Kinesis terminology) determination and 
> checkpointing to allow the user to focus only on streaming application logic. 
> This leads to the understanding that we cannot use the KCL to implement the 
> Kinesis streaming connector if we are aiming for a deep integration of Flink 
> with Kinesis that provides exactly once guarantees (KCL promises only 
> at-least-once). Therefore, AWS SDK will be the way to go for the 
> implementation of this feature.
> With the ability to read from specific offsets, and also the fact that 
> Kinesis and Kafka share a lot of similarities, the basic principles of the 
> implementation of Flink's Kinesis streaming connector will very much resemble 
> the Kafka connector. We can basically follow the outlines described in 
> [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector 
> implementation [4]. A few tweaks due to Kinesis vs. Kafka 
> differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I 
> currently don't think this is a good idea for Kinesis streams (a Kinesis 
> Stream is logically equivalent to a Kafka topic). Kinesis streams can exist 
> in different AWS regions, and each Kinesis stream under the same AWS user 
> account may have completely independent access settings with different 
> authorization keys. Overall, a Kinesis stream feels like a much more 
> consolidated resource compared to Kafka topics. It would be great to hear 
> more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only 
> partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast 
> to the Kafka connector having per broker connections where the connections 
> can handle multiple Kafka partitions, the Kinesis connector will only need to 
> have simple per shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we 
> were to implement this feature like the Kafka connector with Kafka / ZK to 
> sync outside view of progress, we probably could use ZK or DynamoDB like the 
> way KCL works. More thoughts on this part will be very helpful too.
> As for the Kinesis Sink, it should be possible to use the AWS KPL (Kinesis 
> Producer Library) [5]. However, for higher code consistency with the proposed 
> Kinesis Consumer, I think it will be better to stick with the AWS SDK for the 
> implementation. The implementation should be straightforward, being almost 
> if not completely the same as the Kafka sink.
> References:
> [1] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-sdk.html
> [2] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
> [3] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-td2872.html
> [4] http://data-artisans.com/kafka-flink-a-practical-how-to/
> [5] 
> http://docs.aws.amazon.com//kinesis/latest/dev/developing-producers-with-kpl.html#d0e4998
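
To make the low-level SDK option concrete, here is a minimal per-shard read 
loop with the AWS Java SDK; the stream name and shard id are placeholders, and 
a real connector would checkpoint the sequence number and resume with 
AFTER_SEQUENCE_NUMBER instead of LATEST:

{code}
import com.amazonaws.services.kinesis.AmazonKinesisClient;
import com.amazonaws.services.kinesis.model.GetRecordsRequest;
import com.amazonaws.services.kinesis.model.GetRecordsResult;
import com.amazonaws.services.kinesis.model.GetShardIteratorRequest;
import com.amazonaws.services.kinesis.model.Record;

public class ShardReadSketch {

    public static void main(String[] args) throws InterruptedException {
        // Picks up credentials from the default provider chain.
        AmazonKinesisClient client = new AmazonKinesisClient();

        // One iterator per shard; "LATEST" starts reading at the shard tip.
        GetShardIteratorRequest iteratorRequest = new GetShardIteratorRequest()
                .withStreamName("my-stream")            // placeholder
                .withShardId("shardId-000000000000")    // placeholder
                .withShardIteratorType("LATEST");
        String shardIterator = client.getShardIterator(iteratorRequest).getShardIterator();

        while (shardIterator != null) {
            GetRecordsResult result = client.getRecords(
                    new GetRecordsRequest().withShardIterator(shardIterator).withLimit(100));
            for (Record record : result.getRecords()) {
                // The sequence number is what a Flink checkpoint would store
                // in order to resume exactly from this position.
                System.out.println(record.getSequenceNumber());
            }
            shardIterator = result.getNextShardIterator();
            Thread.sleep(1000); // stay under the per-shard GetRecords rate limit
        }
    }
}
{code}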



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3211) Add AWS Kinesis streaming connector

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243988#comment-15243988
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-3211 at 4/16/16 3:53 AM:
-

https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis

Here is the initial working version of FlinkKinesisConsumer that I have been 
testing in off-production environments, updated to reflect the recent 
Flink 1.0 changes.

The description of https://issues.apache.org/jira/browse/FLINK-3229 contains 
example code showing how to use the consumer.

I'm still refactoring the code a bit for easier unit tests. The PR will follow 
very soon.


was (Author: tzulitai):
https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis

Here is the initial working version of FlinkKinesisConsumer that I have been 
testing in off-production environments, updated to reflect the recent 
Flink 1.0 changes.
I'm still refactoring the code a bit for easier unit tests. The PR will follow 
very soon.

> Add AWS Kinesis streaming connector
> ---
>
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> AWS Kinesis is a widely adopted message queue used by AWS users, much like a 
> cloud service version of Apache Kafka. Support for AWS Kinesis will be a 
> great addition to the handful of Flink's streaming connectors to external 
> systems and a great way to reach out to the AWS community.
> AWS supports two different ways to consume Kinesis data: with the low-level 
> AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. AWS SDK 
> can be used to consume Kinesis data, including stream read beginning from a 
> specific offset (or "record sequence number" in Kinesis terminology). On the 
> other hand, AWS officially recommends using the KCL, which offers a higher 
> level of abstraction that also comes with checkpointing and failure recovery 
> by using a KCL-managed AWS DynamoDB "lease table" as the checkpoint state 
> storage.
> However, KCL is essentially a stream processing library that wraps all the 
> partition-to-task (or "shard" in Kinesis terminology) determination and 
> checkpointing to allow the user to focus only on streaming application logic. 
> This leads to the understanding that we cannot use the KCL to implement the 
> Kinesis streaming connector if we are aiming for a deep integration of Flink 
> with Kinesis that provides exactly once guarantees (KCL promises only 
> at-least-once). Therefore, AWS SDK will be the way to go for the 
> implementation of this feature.
> With the ability to read from specific offsets, and also the fact that 
> Kinesis and Kafka share a lot of similarities, the basic principles of the 
> implementation of Flink's Kinesis streaming connector will very much resemble 
> the Kafka connector. We can basically follow the outlines described in 
> [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector 
> implementation [4]. A few tweaks due to Kinesis vs. Kafka 
> differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I 
> currently don't think this is a good idea for Kinesis streams (a Kinesis 
> Stream is logically equivalent to a Kafka topic). Kinesis streams can exist 
> in different AWS regions, and each Kinesis stream under the same AWS user 
> account may have completely independent access settings with different 
> authorization keys. Overall, a Kinesis stream feels like a much more 
> consolidated resource compared to Kafka topics. It would be great to hear 
> more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only 
> partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast 
> to the Kafka connector having per broker connections where the connections 
> can handle multiple Kafka partitions, the Kinesis connector will only need to 
> have simple per shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we 
> were to implement this feature like the Kafka connector with Kafka / ZK to 
> sync outside view of progress, we probably could use ZK or DynamoDB like the 
> way KCL works. More thoughts on this part will be very helpful too.
> As for the Kinesis Sink, it should be possible to use the AWS KPL (Kinesis 
> Producer Library) [5]. However, for higher code consistency with the proposed 
> Kinesis Consumer, I think it will be better to stick with the AWS SDK 

[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-3229:
---
Description: 
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
"BASIC");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, 
"aws_access_key_id_here");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
"aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
"LATEST"); // or TRIM_HORIZON

DataStream<String> kinesisRecords = env.addSource(new FlinkKinesisConsumer<>(
"kinesis_stream_name",
new SimpleStringSchema(),
kinesisConfig));
{code}

  was:
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
"BASIC");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, 
"aws_access_key_id_here");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
"aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
"LATEST"); // or TRIM_HORIZON

DataStream<String> kinesisRecords = env
.addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
SimpleStringSchema(), kinesisConfig));
{code}


> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> --
>
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>
> Opening a sub-task to implement the data source consumer for the Kinesis 
> streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
> An example of the planned user API for Flink Kinesis Consumer:
> {code}
> Properties kinesisConfig = new Properties();
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
> "BASIC");
> kinesisConfig.put(
> KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, 
> "aws_access_key_id_here");
> kinesisConfig.put(
> KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
> "aws_secret_key_here");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
> "LATEST"); // or TRIM_HORIZON
> DataStream<String> kinesisRecords = env.addSource(new FlinkKinesisConsumer<>(
> "kinesis_stream_name",
> new SimpleStringSchema(),
> kinesisConfig));
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-3229:
---
Description: 
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
"BASIC");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, 
"aws_access_key_id_here");
kinesisConfig.put(
KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
"aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
"LATEST"); // or TRIM_HORIZON

DataStream<String> kinesisRecords = env
.addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
SimpleStringSchema(), kinesisConfig));
{code}

  was:
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
"BASIC");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
 "aws_access_key_id_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
 "aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
"LATEST"); // or TRIM_HORIZON

DataStream<String> kinesisRecords = env
.addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
SimpleStringSchema(), kinesisConfig));
{code}


> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> --
>
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>
> Opening a sub-task to implement the data source consumer for the Kinesis 
> streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
> An example of the planned user API for Flink Kinesis Consumer:
> {code}
> Properties kinesisConfig = new Properties();
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
> "BASIC");
> kinesisConfig.put(
> KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, 
> "aws_access_key_id_here");
> kinesisConfig.put(
> KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
> "aws_secret_key_here");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
> "LATEST"); // or TRIM_HORIZON
> DataStream<String> kinesisRecords = env
> .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
> SimpleStringSchema(), kinesisConfig));
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-3229:
---
Description: 
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
"BASIC");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
 "aws_access_key_id_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
 "aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
"LATEST"); // or TRIM_HORIZON

DataStream<String> kinesisRecords = env
.addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
SimpleStringSchema(), kinesisConfig));
{code}

  was:
Opening a sub-task to implement the data source consumer for the Kinesis 
streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).

An example of the planned user API for Flink Kinesis Consumer:

{code}
Properties config = new Properties();
config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_RETRIES, "3");
config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_BACKOFF_MILLIS, "1000");
config.put(FlinkKinesisConsumer.CONFIG_STREAM_START_POSITION_TYPE, "latest");
config.put(FlinkKinesisConsumer.CONFIG_AWS_REGION, "us-east-1");

AWSCredentialsProvider credentials = // credentials API in AWS SDK

DataStream<String> kinesisRecords = env
.addSource(new FlinkKinesisConsumer<>(
listOfStreams, credentials, new SimpleStringSchema(), config
));
{code}

Currently still considering which read start positions to support 
("TRIM_HORIZON", "LATEST", "AT_SEQUENCE_NUMBER"). The discussions for this can 
be found in https://issues.apache.org/jira/browse/FLINK-3211.


> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> --
>
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>
> Opening a sub-task to implement the data source consumer for the Kinesis 
> streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
> An example of the planned user API for Flink Kinesis Consumer:
> {code}
> Properties kinesisConfig = new Properties();
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, 
> "BASIC");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
>  "aws_access_key_id_here");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
>  "aws_secret_key_here");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, 
> "LATEST"); // or TRIM_HORIZON
> DataStream<String> kinesisRecords = env
> .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new 
> SimpleStringSchema(), kinesisConfig));
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243991#comment-15243991
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3229:


(Duplicate comment from FLINK-3211. Posting it here also to keep the issue 
updated.)

https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis

Here is the initial working version of FlinkKinesisConsumer that I have been 
testing in off-production environments, updated to reflect the recent 
Flink 1.0 changes.
I'm still refactoring the code a bit for easier unit tests. The PR will follow 
very soon.

> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> --
>
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>
> Opening a sub-task to implement the data source consumer for the Kinesis 
> streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
> An example of the planned user API for Flink Kinesis Consumer:
> {code}
> Properties config = new Properties();
> config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_RETRIES, "3");
> config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_BACKOFF_MILLIS, 
> "1000");
> config.put(FlinkKinesisConsumer.CONFIG_STREAM_START_POSITION_TYPE, "latest");
> config.put(FlinkKinesisConsumer.CONFIG_AWS_REGION, "us-east-1");
> AWSCredentialsProvider credentials = // credentials API in AWS SDK
> DataStream<String> kinesisRecords = env
> .addSource(new FlinkKinesisConsumer<>(
> listOfStreams, credentials, new SimpleStringSchema(), config
> ));
> {code}
> Currently still considering which read start positions to support 
> ("TRIM_HORIZON", "LATEST", "AT_SEQUENCE_NUMBER"). The discussions for this 
> can be found in https://issues.apache.org/jira/browse/FLINK-3211.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3211) Add AWS Kinesis streaming connector

2016-04-15 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243988#comment-15243988
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3211:


https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis

Here is the initial working version of FlinkKinesisConsumer that I have been 
testing in off-production environments, updated to reflect the recent 
Flink 1.0 changes.
I'm still refactoring the code a bit for easier unit tests. The PR will follow 
very soon.

> Add AWS Kinesis streaming connector
> ---
>
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> AWS Kinesis is a widely adopted message queue used by AWS users, much like a 
> cloud service version of Apache Kafka. Support for AWS Kinesis will be a 
> great addition to the handful of Flink's streaming connectors to external 
> systems and a great way to reach out to the AWS community.
> AWS supports two different ways to consume Kinesis data: with the low-level 
> AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. AWS SDK 
> can be used to consume Kinesis data, including stream read beginning from a 
> specific offset (or "record sequence number" in Kinesis terminology). On the 
> other hand, AWS officially recommends using the KCL, which offers a higher 
> level of abstraction that also comes with checkpointing and failure recovery 
> by using a KCL-managed AWS DynamoDB "lease table" as the checkpoint state 
> storage.
> However, KCL is essentially a stream processing library that wraps all the 
> partition-to-task (or "shard" in Kinesis terminology) determination and 
> checkpointing to allow the user to focus only on streaming application logic. 
> This leads to the understanding that we cannot use the KCL to implement the 
> Kinesis streaming connector if we are aiming for a deep integration of Flink 
> with Kinesis that provides exactly once guarantees (KCL promises only 
> at-least-once). Therefore, AWS SDK will be the way to go for the 
> implementation of this feature.
> With the ability to read from specific offsets, and also the fact that 
> Kinesis and Kafka share a lot of similarities, the basic principles of the 
> implementation of Flink's Kinesis streaming connector will very much resemble 
> the Kafka connector. We can basically follow the outlines described in 
> [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector 
> implementation [4]. A few tweaks due to Kinesis vs. Kafka 
> differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I 
> currently don't think this is a good idea for Kinesis streams (a Kinesis 
> Stream is logically equivalent to a Kafka topic). Kinesis streams can exist 
> in different AWS regions, and each Kinesis stream under the same AWS user 
> account may have completely independent access settings with different 
> authorization keys. Overall, a Kinesis stream feels like a much more 
> consolidated resource compared to Kafka topics. It would be great to hear 
> more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only 
> partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast 
> to the Kafka connector having per broker connections where the connections 
> can handle multiple Kafka partitions, the Kinesis connector will only need to 
> have simple per shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we 
> were to implement this feature like the Kafka connector with Kafka / ZK to 
> sync outside view of progress, we probably could use ZK or DynamoDB like the 
> way KCL works. More thoughts on this part will be very helpful too.
> As for the Kinesis Sink, it should be possible to use the AWS KPL (Kinesis 
> Producer Library) [5]. However, for higher code consistency with the proposed 
> Kinesis Consumer, I think it will be better to stick with the AWS SDK for the 
> implementation. The implementation should be straightforward, being almost 
> if not completely the same as the Kafka sink.
> References:
> [1] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-sdk.html
> [2] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
> [3] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-td2872.html
> [4] http://data-artisans.com/kafka-flink-a-practical-how-to/
> [5] 
> 

[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243944#comment-15243944
 ] 

ASF GitHub Bot commented on FLINK-3768:
---

Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/flink/pull/1896#discussion_r59960582
  
--- Diff: 
flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/TriangleListing.java
 ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.examples;
+
+import org.apache.commons.math3.random.JDKRandomGenerator;
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.io.CsvOutputFormat;
+import org.apache.flink.api.java.utils.DataSetUtils;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.generator.RMatGraph;
+import org.apache.flink.graph.generator.random.JDKRandomGeneratorFactory;
+import org.apache.flink.graph.generator.random.RandomGenerableFactory;
+import org.apache.flink.graph.library.TriangleEnumerator;
+import org.apache.flink.graph.utils.Translate;
+import org.apache.flink.types.IntValue;
+import org.apache.flink.types.LongValue;
+import org.apache.flink.types.NullValue;
+
+import java.text.NumberFormat;
+
+public class TriangleListing {
--- End diff --

Could you add JavaDoc to ALL new Java or Scala classes added to the source? 
An explanation of why each class exists and how it interacts with other code 
is what we are looking for. Thanks.


> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).
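
For reference, the standard definition for an undirected graph (a textbook 
formula, not code from this issue): for a vertex v with degree deg(v) >= 2 and 
neighborhood N(v),

{code}
C(v) = \frac{2\,\lvert\{(u,w) \in E : u, w \in N(v)\}\rvert}{\deg(v)\,(\deg(v)-1)}
{code}

i.e. the number of edges among v's neighbors divided by the number of pairs of 
neighbors.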



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3768] [gelly] Clustering Coefficient

2016-04-15 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/flink/pull/1896#discussion_r59960582
  
--- Diff: 
flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/TriangleListing.java
 ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.examples;
+
+import org.apache.commons.math3.random.JDKRandomGenerator;
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.io.CsvOutputFormat;
+import org.apache.flink.api.java.utils.DataSetUtils;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.generator.RMatGraph;
+import org.apache.flink.graph.generator.random.JDKRandomGeneratorFactory;
+import org.apache.flink.graph.generator.random.RandomGenerableFactory;
+import org.apache.flink.graph.library.TriangleEnumerator;
+import org.apache.flink.graph.utils.Translate;
+import org.apache.flink.types.IntValue;
+import org.apache.flink.types.LongValue;
+import org.apache.flink.types.NullValue;
+
+import java.text.NumberFormat;
+
+public class TriangleListing {
--- End diff --

Could you add JavaDoc to ALL new Java or Scala classes added to the source? 
An explanation of why each class exists and how it interacts with other code 
is what we are looking for. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3211) Add AWS Kinesis streaming connector

2016-04-15 Thread Marut Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243940#comment-15243940
 ] 

Marut Singh commented on FLINK-3211:


Is there any update on this issue?



> Add AWS Kinesis streaming connector
> ---
>
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 1.0.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> AWS Kinesis is a widely adopted message queue used by AWS users, much like a 
> cloud service version of Apache Kafka. Support for AWS Kinesis will be a 
> great addition to the handful of Flink's streaming connectors to external 
> systems and a great way to reach out to the AWS community.
> AWS supports two different ways to consume Kinesis data: with the low-level 
> AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. AWS SDK 
> can be used to consume Kinesis data, including stream read beginning from a 
> specific offset (or "record sequence number" in Kinesis terminology). On the 
> other hand, AWS officially recommends using the KCL, which offers a higher 
> level of abstraction that also comes with checkpointing and failure recovery 
> by using a KCL-managed AWS DynamoDB "lease table" as the checkpoint state 
> storage.
> However, KCL is essentially a stream processing library that wraps all the 
> partition-to-task (or "shard" in Kinesis terminology) determination and 
> checkpointing to allow the user to focus only on streaming application logic. 
> This leads to the understanding that we cannot use the KCL to implement the 
> Kinesis streaming connector if we are aiming for a deep integration of Flink 
> with Kinesis that provides exactly once guarantees (KCL promises only 
> at-least-once). Therefore, AWS SDK will be the way to go for the 
> implementation of this feature.
> With the ability to read from specific offsets, and also the fact that 
> Kinesis and Kafka share a lot of similarities, the basic principles of the 
> implementation of Flink's Kinesis streaming connector will very much resemble 
> the Kafka connector. We can basically follow the outlines described in 
> [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector 
> implementation [4]. A few tweaks due to Kinesis vs. Kafka 
> differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I 
> currently don't think this is a good idea for Kinesis streams (a Kinesis 
> Stream is logically equivalent to a Kafka topic). Kinesis streams can exist 
> in different AWS regions, and each Kinesis stream under the same AWS user 
> account may have completely independent access settings with different 
> authorization keys. Overall, a Kinesis stream feels like a much more 
> consolidated resource compared to Kafka topics. It would be great to hear 
> more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only 
> partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast 
> to the Kafka connector having per broker connections where the connections 
> can handle multiple Kafka partitions, the Kinesis connector will only need to 
> have simple per shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we 
> were to implement this feature like the Kafka connector with Kafka / ZK to 
> sync outside view of progress, we probably could use ZK or DynamoDB like the 
> way KCL works. More thoughts on this part will be very helpful too.
> As for the Kinesis Sink, it should be possible to use the AWS KPL (Kinesis 
> Producer Library) [5]. However, for higher code consistency with the proposed 
> Kinesis Consumer, I think it will be better to stick with the AWS SDK for the 
> implementation. The implementation should be straightforward, being almost 
> if not completely the same as the Kafka sink.
> References:
> [1] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-sdk.html
> [2] 
> http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
> [3] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-td2872.html
> [4] http://data-artisans.com/kafka-flink-a-practical-how-to/
> [5] 
> http://docs.aws.amazon.com//kinesis/latest/dev/developing-producers-with-kpl.html#d0e4998



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3768] [gelly] Clustering Coefficient

2016-04-15 Thread greghogan
Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1896#issuecomment-210709650
  
It's certainly nicer to review small PRs, but I also like to present the 
big picture and context in which the features will be used. What if I leave 
this here, break out the dependencies into new PRs, and then rebase this PR 
down if those features are accepted? I'm not quite sure what a design document 
would cover. There's nothing here as sophisticated as with SG or GSA.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243938#comment-15243938
 ] 

ASF GitHub Bot commented on FLINK-3768:
---

Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1896#issuecomment-210709650
  
It's certainly nicer to review small PRs, but I also like to present the 
big picture and context in which the features will be used. What if I leave 
this here, break out the dependencies into new PRs, and then rebase this PR 
down if those features are accepted? I'm not quite sure what a design document 
would cover. There's nothing here as sophisticated as with SG or GSA.


> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2259][ml] Add Train-Testing Splitters

2016-04-15 Thread rawkintrevo
GitHub user rawkintrevo opened a pull request:

https://github.com/apache/flink/pull/1898

[FLINK-2259][ml] Add Train-Testing Splitters

This PR adds an object in ml/pipeline called Splitter with the following 
methods:

randomSplit: Splits a DataSet into two data sets using DataSet.sample
multiRandomSplit: Splits a DataSet into multiple data sets according to an 
array of proportions
kFoldSplit: Splits a DataSet into k TrainTest objects, each with a testing 
data set holding 1/k of the original dataset and the remainder of the dataset 
as the training set (see the sketch below)
trainTestSplit: A wrapper for randomSplit that returns a TrainTest object
trainTestHoldoutSplit: A wrapper for multiRandomSplit that returns a 
TrainTestHoldout object

The TrainTest and TrainTestHoldout objects are case classes.  randomSplit 
and multiRandomSplit return arrays of DataSets.
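
To illustrate the kFoldSplit semantics with a minimal, self-contained Java 
sketch (plain collections with round-robin assignment, not the PR's Scala 
DataSet.sample-based implementation; all names here are illustrative only):

{code}
import java.util.ArrayList;
import java.util.List;

public class KFoldSketch {

    // Loosely analogous to the PR's TrainTest case class.
    static class TrainTest<T> {
        final List<T> training;
        final List<T> testing;

        TrainTest(List<T> training, List<T> testing) {
            this.training = training;
            this.testing = testing;
        }
    }

    // Fold i is the test set (~1/k of the data); the other k-1 folds
    // together form the training set, so every element is tested exactly once.
    static <T> List<TrainTest<T>> kFoldSplit(List<T> data, int k) {
        List<TrainTest<T>> splits = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            List<T> train = new ArrayList<>();
            List<T> test = new ArrayList<>();
            for (int j = 0; j < data.size(); j++) {
                (j % k == i ? test : train).add(data.get(j));
            }
            splits.add(new TrainTest<>(train, test));
        }
        return splits;
    }
}
{code}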

- [x] General
  
- [ ] Documentation
  - Documentation is in code, will write markdown after 
review/feedback/finalization

- [x] Tests & Build


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rawkintrevo/flink train-test-split

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1898.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1898


commit ec1e65a31d80b33589b73619f2a5dd0a8e09c568
Author: Trevor Grant 
Date:   2016-04-15T22:37:51Z

Add Splitter Pre-processing

commit 3ecdc3818dd11a847136510dabe96f444924d319
Author: Trevor Grant 
Date:   2016-04-15T22:40:38Z

Add Splitter Pre-processing




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2259) Support training Estimators using a (train, validation, test) split of the available data

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243774#comment-15243774
 ] 

ASF GitHub Bot commented on FLINK-2259:
---

GitHub user rawkintrevo opened a pull request:

https://github.com/apache/flink/pull/1898

[FLINK-2259][ml] Add Train-Testing Splitters

This PR adds an object in ml/pipeline called Splitter with the following 
methods:

randomSplit: Splits a DataSet into two data sets using DataSet.sample
multiRandomSplit: Splits a DataSet into multiple data sets according to an 
array of proportions
kFoldSplit: Splits a DataSet into k TrainTest objects, each with a testing 
data set holding 1/k of the original dataset and the remainder of the dataset 
as the training set
trainTestSplit: A wrapper for randomSplit that returns a TrainTest object
trainTestHoldoutSplit: A wrapper for multiRandomSplit that returns a 
TrainTestHoldout object

The TrainTest and TrainTestHoldout objects are case classes.  randomSplit 
and multiRandomSplit return arrays of DataSets.

- [x] General
  
- [ ] Documentation
  - Documentation is in code, will write markdown after 
review/feedback/finalization

- [x] Tests & Build


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rawkintrevo/flink train-test-split

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1898.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1898


commit ec1e65a31d80b33589b73619f2a5dd0a8e09c568
Author: Trevor Grant 
Date:   2016-04-15T22:37:51Z

Add Splitter Pre-processing

commit 3ecdc3818dd11a847136510dabe96f444924d319
Author: Trevor Grant 
Date:   2016-04-15T22:40:38Z

Add Splitter Pre-processing




> Support training Estimators using a (train, validation, test) split of the 
> available data
> -
>
> Key: FLINK-2259
> URL: https://issues.apache.org/jira/browse/FLINK-2259
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Trevor Grant
>Priority: Minor
>  Labels: ML
>
> When there is an abundance of data available, a good way to train models is 
> to split the available data into 3 parts: Train, Validation and Test.
> We use the Train data to train the model, the Validation part to estimate 
> the test error and select hyperparameters, and the Test part to evaluate the 
> performance of the model and assess its generalization [1].
> This is a common approach when training Artificial Neural Networks, and a 
> good strategy to choose in data-rich environments. Therefore we should have 
> some support of this data-analysis process in our Estimators.
> [1] Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of 
> statistical learning. Vol. 1. Springer, Berlin: Springer series in 
> statistics, 2001.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243730#comment-15243730
 ] 

ASF GitHub Bot commented on FLINK-3768:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1896#issuecomment-210663735
  
Hi @greghogan,
thank you for the PR. This is a big addition! Are all the changes related 
to the clustering coefficient JIRA? Could it maybe be split into smaller PRs in 
order to make reviewing easier?
For big changes like this, it is usually a good idea to create a design 
document to show the high-level approach and functionality to be added and 
discuss it with the community. Do you think it would make sense to do this?
Thank you!


> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3768] [gelly] Clustering Coefficient

2016-04-15 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1896#issuecomment-210663735
  
Hi @greghogan,
thank you for the PR. This is a big addition! Are all the changes related 
to the clustering coefficient JIRA? Could it maybe be split into smaller PRs in 
order to make reviewing easier?
For big changes like this, it is usually a good idea to create a design 
document to show the high-level approach and functionality to be added and 
discuss it with the community. Do you think it would make sense to do this?
Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'

2016-04-15 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243680#comment-15243680
 ] 

Fabian Hueske commented on FLINK-3657:
--

Implemented for 1.1.0 with 5f993c65eeb1ae0d805e98b14bcb6603f9bff05f

> Change access of DataSetUtils.countElements() to 'public' 
> --
>
> Key: FLINK-3657
> URL: https://issues.apache.org/jira/browse/FLINK-3657
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Affects Versions: 1.0.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 1.0.1
>
>
> The access of DatasetUtils.countElements() is presently 'private', change 
> that to be 'public'. We happened to be replicating the functionality in our 
> project and realized the method already existed in Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3657: Change access of DataSetUtils.coun...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1829


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243678#comment-15243678
 ] 

ASF GitHub Bot commented on FLINK-3657:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1829


> Change access of DataSetUtils.countElements() to 'public' 
> --
>
> Key: FLINK-3657
> URL: https://issues.apache.org/jira/browse/FLINK-3657
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Affects Versions: 1.0.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 1.0.1
>
>
> The access of DatasetUtils.countElements() is presently 'private', change 
> that to be 'public'. We happened to be replicating the functionality in our 
> project and realized the method already existed in Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243657#comment-15243657
 ] 

ASF GitHub Bot commented on FLINK-3657:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1829#issuecomment-210647072
  
Thanks for the update @smarthi. Will merge this PR.


> Change access of DataSetUtils.countElements() to 'public' 
> --
>
> Key: FLINK-3657
> URL: https://issues.apache.org/jira/browse/FLINK-3657
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Affects Versions: 1.0.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 1.0.1
>
>
> The access of DatasetUtils.countElements() is presently 'private', change 
> that to be 'public'. We happened to be replicating the functionality in our 
> project and realized the method already existed in Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3657: Change access of DataSetUtils.coun...

2016-04-15 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1829#issuecomment-210647072
  
Thanks for the update @smarthi. Will merge this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3587) Bump Calcite version to 1.7.0

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243652#comment-15243652
 ] 

ASF GitHub Bot commented on FLINK-3587:
---

GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/1897

[FLINK-3587] Bump Calcite version to 1.7.0

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

This pull request updates the Apache Calcite dependency to 1.7. With 
Calcite 1.7, a few method APIs and some behavior changed.

- Fix changed API for optimizer cost and cardinality estimates.
- Add DataSetValues and DataStreamValues due to changed Calcite RelNode 
generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream 
translation.
- Add methods to create new DataSources to BatchTableEnvironment and 
StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink calcite1.7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1897.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1897


commit 92e2539b4178b3049f97bb1a6137ee91c943870a
Author: Fabian Hueske 
Date:   2016-03-22T17:51:46Z

[FLINK-3587] Bump Calcite version to 1.7.0

- Add DataSetValues and DataStreamValues due to changed Calcite RelNode 
generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream 
translation.
- Add methods to create new DataSources to BatchTableEnvironment and 
StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.




> Bump Calcite version to 1.7.0
> -
>
> Key: FLINK-3587
> URL: https://issues.apache.org/jira/browse/FLINK-3587
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Vasia Kalavri
>Assignee: Fabian Hueske
>
> We currently depend on Calcite 1.5.0. The latest stable release is 1.6.0, but 
> I propose we bump the version to 1.7.0-SNAPSHOT to benefit from latest 
> features. If we do that, we can also get rid of the custom 
> {{FlinkJoinUnionTransposeRule}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3587] Bump Calcite version to 1.7.0

2016-04-15 Thread fhueske
GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/1897

[FLINK-3587] Bump Calcite version to 1.7.0

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

This pull request updates the Apache Calcite dependency to 1.7. With 
Calcite 1.7, a few method APIs and some behavior changed.

- Fix changed API for optimizer cost and cardinality estimates.
- Add DataSetValues and DataStreamValues due to changed Calcite RelNode 
generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream 
translation.
- Add methods to create new DataSources to BatchTableEnvironment and 
StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink calcite1.7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1897.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1897


commit 92e2539b4178b3049f97bb1a6137ee91c943870a
Author: Fabian Hueske 
Date:   2016-03-22T17:51:46Z

[FLINK-3587] Bump Calcite version to 1.7.0

- Add DataSetValues and DataStreamValues due to changed Calcite RelNode 
generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream 
translation.
- Add methods to create new DataSources to BatchTableEnvironment and 
StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243625#comment-15243625
 ] 

Greg Hogan commented on FLINK-3768:
---

The algorithms are very different, so I don't see them as exclusive, but I see 
that that pull request was not committed.

> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Robert Batts (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Batts updated FLINK-3769:

Description: 
The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key will route directly to the 
queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
to 1 exchange which routes to 1 queue). Additionally, I believe that if a user 
decides to use a different exchange, the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing). 

The fourth exchange type relies on the message headers, which are currently set 
to null by default on the publish. These headers may be set on a per-message 
level, so the sink will need to take them as input as well. This fourth 
exchange could very well be outside of the scope of this Improvement and a 
"RabbitMQ Sink enable headers" Improvement might be the better way to go with 
this.

Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html

  was:
The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key will route directly to the 
queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
to 1 exchange which routes to 1 queue). Additionally, I believe that if a user 
decides to use a different exchange, the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing). 

The fourth exchange type relies on the message headers, which are currently set 
to null by default on the publish. These headers may be set on a per-message 
level, so the sink will need to take them as input as well. This fourth 
exchange could very well be outside of the scope of this Improvement and a 
"RabbitMQ Sink enable headers" Improvement might be the better way to go with 
this.


> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key will route directly to the 
> queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, I believe that if a 
> user decides to use a different exchange, the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on the publish. These headers may be set on a 
> per-message level, so the sink will need to take them as input as well. 
> This fourth exchange could very well be outside of the scope of this 
> Improvement and a "RabbitMQ Sink enable headers" Improvement might be the 
> better way to go with this.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html
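
A minimal sketch of publishing to a named (non-default) exchange with the 
RabbitMQ Java client, under the assumptions listed above (the exchange and its 
bindings already exist); the exchange name, routing key, and host are 
hypothetical:

{code}
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class PublishToExchange {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumption: broker on localhost

        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // Publish to a pre-declared exchange. Direct and Topic exchanges
        // interpret the routing key; Fanout ignores it. Passing null
        // properties is what leaves the headers unset, as described above.
        channel.basicPublish("myExchange", "my.routing.key", null,
                "payload".getBytes("UTF-8"));

        channel.close();
        connection.close();
    }
}
{code}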



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Robert Batts (JIRA)
Robert Batts created FLINK-3769:
---

 Summary: RabbitMQ Sink ability to publish to a different exchange
 Key: FLINK-3769
 URL: https://issues.apache.org/jira/browse/FLINK-3769
 Project: Flink
  Issue Type: Improvement
  Components: Streaming Connectors
Affects Versions: 1.0.1
Reporter: Robert Batts


The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key will route directly to the 
queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
to 1 exchange which routes to 1 queue). Additionally, I believe that if a user 
decides to use a different exchange, the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing). 

The fourth exchange type relies on the message headers, which are currently set 
to null by default on the publish. These headers may be set on a per-message 
level, so the sink will need to take them as input as well. This fourth 
exchange could very well be outside of the scope of this Improvement and a 
"RabbitMQ Sink enable headers" Improvement might be the better way to go with 
this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243542#comment-15243542
 ] 

Vasia Kalavri commented on FLINK-3768:
--

There is also FLINK-1528 for local clustering coefficient. Is this about the 
same algorithm? We should mark it as duplicate if so.

> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-3768:
--
Fix Version/s: 1.1.0

> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243506#comment-15243506
 ] 

ASF GitHub Bot commented on FLINK-3768:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/1896

[FLINK-3768] [gelly] Clustering Coefficient

Provides an algorithm for local clustering coefficient and dependent 
functions for degree annotation, algorithm caching, and graph translation.

I worked to improve the performance of `TriangleEnumerator`. Perhaps the 
API has changed since `Edge.reverse()` is not in-place and the edges were not 
being sorted by degree. The `JoinHint` is also important so that the `Triad`s 
are not spilled to disk.

On an AWS ec2.4xlarge (16 vcores, 30 GiB) I am seeing the following timings 
of 5s, 29s, and 183s for `TriangleListing`. With `TriangleEnumerator` 
the timings are 7s, 45s, and 281s. Without the `JoinHint` the latter 
`TriangleEnumerator` timings are 58s and 347s.

Scale | ChecksumHashCode | Count
---|---|---
16 | 0xd9086985f4ce | 15616010
18 | 0x0010eeb32a441365 | 82781436
20 | 0x014a9434bb57ddef | 423780284

The command I had used to run the tests:
```
./bin/flink run -class org.apache.flink.graph.examples.TriangleListing 
~/flink-gelly-examples_2.10-1.1-SNAPSHOT.jar --clip_and_flip false --output 
print --output hash --scale 16 --listing
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 3768_clustering_coefficient

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1896.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1896


commit aa1141f4d34f7af9c092ec76bf1a81de310aed16
Author: Greg Hogan 
Date:   2016-04-13T13:28:38Z

[FLINK-3768] [gelly] Clustering Coefficient

Provides an algorithm for local clustering coefficient and dependent
functions for degree annotation, algorithm caching, and graph translation.




> Clustering Coefficient
> --
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> The local clustering coefficient measures the connectedness of each vertex's 
> neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
> (neighborhood is a clique).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3768] [gelly] Clustering Coefficient

2016-04-15 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/1896

[FLINK-3768] [gelly] Clustering Coefficient

Provides an algorithm for local clustering coefficient and dependent 
functions for degree annotation, algorithm caching, and graph translation.

I worked to improve the performance of `TriangleEnumerator`. Perhaps the 
API has changed since `Edge.reverse()` is not in-place and the edges were not 
being sorted by degree. The `JoinHint` is also important so that the `Triad`s 
are not spilled to disk.

On an AWS ec2.4xlarge (16 vcores, 30 GiB) I am seeing the following timings 
of 5s, 29s, and 183s for `TriangleListing`. With `TriangleEnumerator` 
the timings are 7s, 45s, and 281s. Without the `JoinHint` the latter 
`TriangleEnumerator` timings are 58s and 347s.

Scale | ChecksumHashCode | Count
---|---|---
16 | 0xd9086985f4ce | 15616010
18 | 0x0010eeb32a441365 | 82781436
20 | 0x014a9434bb57ddef | 423780284

The command I had used to run the tests:
```
./bin/flink run -class org.apache.flink.graph.examples.TriangleListing 
~/flink-gelly-examples_2.10-1.1-SNAPSHOT.jar --clip_and_flip false --output 
print --output hash --scale 16 --listing
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 3768_clustering_coefficient

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1896.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1896


commit aa1141f4d34f7af9c092ec76bf1a81de310aed16
Author: Greg Hogan 
Date:   2016-04-13T13:28:38Z

[FLINK-3768] [gelly] Clustering Coefficient

Provides an algorithm for local clustering coefficient and dependent
functions for degree annotation, algorithm caching, and graph translation.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-3732) Potential null deference in ExecutionConfig#equals()

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-3732.
--
   Resolution: Fixed
Fix Version/s: 1.0.2
   1.1.0

Fixed for 1.0.2 with 5b69dd8ce5485d6f8cbd4d94d1ea1870efb53c6a
Fixed for 1.1.0 with f3d3a4493ae786d421176396ee68f01a0e6dbb64

Thanks for the contribution!

> Potential null deference in ExecutionConfig#equals()
> 
>
> Key: FLINK-3732
> URL: https://issues.apache.org/jira/browse/FLINK-3732
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Fix For: 1.1.0, 1.0.2
>
>
> {code}
> ((restartStrategyConfiguration == null && 
> other.restartStrategyConfiguration == null) ||
> restartStrategyConfiguration.equals(other.restartStrategyConfiguration)) &&
> {code}
> If restartStrategyConfiguration is null but 
> other.restartStrategyConfiguration is not null, the above would result in NPE.
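
A minimal illustration of the null-safe pattern (a sketch, not necessarily the 
exact patch that was merged; the strings stand in for the configuration 
fields):

{code}
import java.util.Objects;

public class NullSafeEqualsDemo {

    public static void main(String[] args) {
        String restartStrategyConfiguration = null;  // this side is null
        String other = "fixedDelay";                 // the other side is not

        // The quoted condition falls through to
        // restartStrategyConfiguration.equals(other) and throws an NPE.
        // Objects.equals handles both null cases explicitly.
        System.out.println(Objects.equals(restartStrategyConfiguration, other));
    }
}
{code}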



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20

2016-04-15 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243418#comment-15243418
 ] 

Fabian Hueske commented on FLINK-2544:
--

Hi [~spkavulya], I gave you contributor permissions for the Flink JIRA. 
Now you can assign JIRAs to yourself.

Thanks for your contribution!

> Some test cases using PowerMock fail with Java 8u20
> ---
>
> Key: FLINK-2544
> URL: https://issues.apache.org/jira/browse/FLINK-2544
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Soila Kavulya
>Priority: Minor
> Fix For: 1.1.0, 1.0.2
>
>
> I observed that some of the test cases using {{PowerMockRunner}} fail with 
> Java 8u20 with the following error:
> {code}
> java.lang.VerifyError: Bad <init> method call from inside of a branch
> Exception Details:
>   Location:
> 
> org/apache/flink/client/program/ClientTest$SuccessReturningActor.<init>()V 
> @32: invokespecial
>   Reason:
> Error exists in the bytecode
>   Bytecode:
> 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f
> 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b
> 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01
> 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8
> 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12
> 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800
> 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5
> 0x070: 0046 b1
>   Stackmap Table:
> full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{})
> full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{})
> full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]})
> full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]})
> 
> full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]})
> 
> full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]})
>   at java.lang.Class.getDeclaredConstructors0(Native Method)
>   at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658)
>   at java.lang.Class.getConstructor0(Class.java:3062)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2165)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at scala.util.Try$.apply(Try.scala:161)
>   at akka.util.Reflect$.findConstructor(Reflect.scala:86)
>   at akka.actor.NoArgsReflectConstructor.<init>(Props.scala:359)
>   at akka.actor.IndirectActorProducer$.apply(Props.scala:308)
>   at akka.actor.Props.producer(Props.scala:176)
>   at akka.actor.Props.<init>(Props.scala:189)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props.create(Props.scala)
>   at 
> org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
>   at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
>   at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282)
>   at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
>   at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207)
>   at 
> 

[jira] [Resolved] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-2544.
--
   Resolution: Fixed
Fix Version/s: 1.0.2
   1.1.0

Fixed for 1.0.2 with c1eb247e968a54f3f252ece792847c00b8ef5464
Fixed for 1.1.0 with d938c5f592addcee75227502a930264b36306aed

> Some test cases using PowerMock fail with Java 8u20
> ---
>
> Key: FLINK-2544
> URL: https://issues.apache.org/jira/browse/FLINK-2544
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Soila Kavulya
>Priority: Minor
> Fix For: 1.1.0, 1.0.2
>
>
> I observed that some of the test cases using {{PowerMockRunner}} fail with 
> Java 8u20 with the following error:
> {code}
> java.lang.VerifyError: Bad <init> method call from inside of a branch
> Exception Details:
>   Location:
> 
> org/apache/flink/client/program/ClientTest$SuccessReturningActor.<init>()V 
> @32: invokespecial
>   Reason:
> Error exists in the bytecode
>   Bytecode:
> 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f
> 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b
> 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01
> 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8
> 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12
> 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800
> 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5
> 0x070: 0046 b1
>   Stackmap Table:
> full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{})
> full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{})
> full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]})
> full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]})
> 
> full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]})
> 
> full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]})
>   at java.lang.Class.getDeclaredConstructors0(Native Method)
>   at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658)
>   at java.lang.Class.getConstructor0(Class.java:3062)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2165)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at scala.util.Try$.apply(Try.scala:161)
>   at akka.util.Reflect$.findConstructor(Reflect.scala:86)
>   at akka.actor.NoArgsReflectConstructor.<init>(Props.scala:359)
>   at akka.actor.IndirectActorProducer$.apply(Props.scala:308)
>   at akka.actor.Props.producer(Props.scala:176)
>   at akka.actor.Props.<init>(Props.scala:189)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props.create(Props.scala)
>   at 
> org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
>   at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
>   at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282)
>   at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
>   at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207)
>   at 
> 

[jira] [Updated] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-2544:
-
Assignee: Soila Kavulya

> Some test cases using PowerMock fail with Java 8u20
> ---
>
> Key: FLINK-2544
> URL: https://issues.apache.org/jira/browse/FLINK-2544
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Soila Kavulya
>Priority: Minor
>
> I observed that some of the test cases using {{PowerMockRunner}} fail with 
> Java 8u20 with the following error:
> {code}
> java.lang.VerifyError: Bad <init> method call from inside of a branch
> Exception Details:
>   Location:
> 
> org/apache/flink/client/program/ClientTest$SuccessReturningActor.<init>()V 
> @32: invokespecial
>   Reason:
> Error exists in the bytecode
>   Bytecode:
> 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f
> 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b
> 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01
> 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8
> 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12
> 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800
> 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5
> 0x070: 0046 b1
>   Stackmap Table:
> full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{})
> full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{})
> full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]})
> full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]})
> 
> full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]})
> 
> full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]})
>   at java.lang.Class.getDeclaredConstructors0(Native Method)
>   at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658)
>   at java.lang.Class.getConstructor0(Class.java:3062)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2165)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at scala.util.Try$.apply(Try.scala:161)
>   at akka.util.Reflect$.findConstructor(Reflect.scala:86)
>   at akka.actor.NoArgsReflectConstructor.<init>(Props.scala:359)
>   at akka.actor.IndirectActorProducer$.apply(Props.scala:308)
>   at akka.actor.Props.producer(Props.scala:176)
>   at akka.actor.Props.<init>(Props.scala:189)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props.create(Props.scala)
>   at 
> org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
>   at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
>   at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282)
>   at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
>   at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:146)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$1.run(PowerMockJUnit44RunnerDelegateImpl.java:120)
>   at 
> 

[jira] [Resolved] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-3762.
--
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed for 1.0.2 with b4b08ca24f2dc4897565da59361ad71281910619
Fixed for 1.1.0 with dc78a7470a5da086a08140b200a20d840460ef79

Thanks for the fix [~Andrew_Palumbo]!

>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.1.0, 1.0.2
>
>
> As discussed on the dev list,
> In {{KryoSerializer.java}}
> Kryo Reference tracking is disabled by default:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive references:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  
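
A minimal standalone illustration of the failure mode and the fix (a sketch 
against the Kryo API, not Flink's {{KryoSerializer}} itself):

{code}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Output;

public class ReferenceTrackingDemo {

    static class Node {
        Node next;
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.next = node; // a self-referential object graph

        Kryo kryo = new Kryo();
        kryo.setReferences(true); // with false, the write below recurses
                                  // until it throws StackOverflowError

        Output out = new Output(1024, -1);
        kryo.writeObject(out, node);
        out.close();
    }
}
{code}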



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243362#comment-15243362
 ] 

ASF GitHub Bot commented on FLINK-3762:
---

Github user andrewpalumbo commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210576438
  
Thank you guys!


>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.0.2
>
>
> As discussed on the dev list,
> In {{KryoSerializer.java}}
> Kryo Reference tracking is disabled by default:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive references:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...

2016-04-15 Thread andrewpalumbo
Github user andrewpalumbo commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210576438
  
Thank you guys!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-3759) Table API should throw exception if null value is encountered in non-null mode.

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-3759.

   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed with 722155a1fb95ddb45b6ee4c6cc6d0438cdbafac6

> Table API should throw exception if null value is encountered in non-null 
> mode.
> ---
>
> Key: FLINK-3759
> URL: https://issues.apache.org/jira/browse/FLINK-3759
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Priority: Critical
> Fix For: 1.1.0
>
>
> The Table API can be configured to omit null-checks in generated code to 
> speed up processing. Currently, the generated code replaces a null value with 
> a data type specific default value if it is encountered in non-null-check 
> mode. 
> This can silently cause wrong results and should be changed such that an 
> exception is thrown if a null value is encountered in non-null-check mode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3700) Replace Guava Preconditions class with Flink Preconditions

2016-04-15 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243305#comment-15243305
 ] 

Fabian Hueske commented on FLINK-3700:
--

Commit 760a0d9e7fd9fa88e9f7408b137d78df384d764f removed Guava as a dependency 
from {{flink-core}}.

> Replace Guava Preconditions class with Flink Preconditions
> --
>
> Key: FLINK-3700
> URL: https://issues.apache.org/jira/browse/FLINK-3700
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.1.0
>
>
> In order to reduce the dependency on Guava (which has caused us quite a bit 
> of pain in the past with its version conflicts), I suggest adding a Flink 
> {{Preconditions}} class.
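
A minimal sketch of what such a dependency-free utility looks like 
(illustrative only, not necessarily the class that was committed):

{code}
public final class Preconditions {

    private Preconditions() {}

    /** Throws a NullPointerException with the given message if the reference is null. */
    public static <T> T checkNotNull(T reference, String errorMessage) {
        if (reference == null) {
            throw new NullPointerException(errorMessage);
        }
        return reference;
    }

    /** Throws an IllegalArgumentException with the given message if the condition is false. */
    public static void checkArgument(boolean condition, String errorMessage) {
        if (!condition) {
            throw new IllegalArgumentException(errorMessage);
        }
    }
}
{code}

Keeping such checks in {{flink-core}} itself removes the need to shade or pin 
a Guava version just for argument validation.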



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-3738) Refactor TableEnvironment and TranslationContext

2016-04-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-3738.

   Resolution: Done
Fix Version/s: 1.1.0

Done with 1acd844539d3e0ea19d492d2f5f89e32f29827d8

> Refactor TableEnvironment and TranslationContext
> 
>
> Key: FLINK-3738
> URL: https://issues.apache.org/jira/browse/FLINK-3738
> Project: Flink
>  Issue Type: Task
>  Components: Table API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 1.1.0
>
>
> Currently the TableAPI uses a static object called {{TranslationContext}} 
> which holds the Calcite table catalog and a Calcite planner instance. 
> Whenever a {{DataSet}} or {{DataStream}} is converted into a {{Table}} or 
> registered as a {{Table}} on the {{TableEnvironment}}, a new entry is added 
> to the catalog. The first time a {{Table}} is added, a planner instance is 
> created. The planner is used to optimize the query (defined by one or more 
> Table API operations and/or one or more SQL queries) when a {{Table}} is 
> converted into a {{DataSet}} or {{DataStream}}. Since a planner may only be 
> used to optimize a single program, the choice of a single static object is 
> problematic.
> I propose to refactor the {{TableEnvironment}} to take over the 
> responsibility of holding the catalog and the planner instance. 
> - A {{TableEnvironment}} holds a catalog of registered tables and a single 
> planner instance.
> - A {{TableEnvironment}} will only allow to translate a single {{Table}} 
> (possibly composed of several Table API operations and SQL queries) into a 
> {{DataSet}} or {{DataStream}}. 
> - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a 
> {{StreamExecutionEnvironment}}. This is necessary to create data source or 
> source functions to read external tables or streams.
> - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} 
> to be converted into a {{Table}}. This will prohibit implicit casts as 
> currently supported for the DataSet Scala API.
> - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. 
> Only tables from the same {{TableEnvironment}} can be processed together.
> - The {{TranslationContext}} will be completely removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3738) Refactor TableEnvironment and TranslationContext

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243286#comment-15243286
 ] 

ASF GitHub Bot commented on FLINK-3738:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1887


> Refactor TableEnvironment and TranslationContext
> 
>
> Key: FLINK-3738
> URL: https://issues.apache.org/jira/browse/FLINK-3738
> Project: Flink
>  Issue Type: Task
>  Components: Table API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> Currently the TableAPI uses a static object called {{TranslationContext}} 
> which holds the Calcite table catalog and a Calcite planner instance. 
> Whenever a {{DataSet}} or {{DataStream}} is converted into a {{Table}} or 
> registered as a {{Table}} on the {{TableEnvironment}}, a new entry is added 
> to the catalog. The first time a {{Table}} is added, a planner instance is 
> created. The planner is used to optimize the query (defined by one or more 
> Table API operations and/or one or more SQL queries) when a {{Table}} is 
> converted into a {{DataSet}} or {{DataStream}}. Since a planner may only be 
> used to optimize a single program, the choice of a single static object is 
> problematic.
> I propose to refactor the {{TableEnvironment}} to take over the 
> responsibility of holding the catalog and the planner instance. 
> - A {{TableEnvironment}} holds a catalog of registered tables and a single 
> planner instance.
> - A {{TableEnvironment}} will only allow to translate a single {{Table}} 
> (possibly composed of several Table API operations and SQL queries) into a 
> {{DataSet}} or {{DataStream}}. 
> - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a 
> {{StreamExecutionEnvironment}}. This is necessary to create data source or 
> source functions to read external tables or streams.
> - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} 
> to be converted into a {{Table}}. This will prohibit implicit casts as 
> currently supported for the DataSet Scala API.
> - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. 
> Only tables from the same {{TableEnvironment}} can be processed together.
> - The {{TranslationContext}} will be completely removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243285#comment-15243285
 ] 

ASF GitHub Bot commented on FLINK-2544:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1882


> Some test cases using PowerMock fail with Java 8u20
> ---
>
> Key: FLINK-2544
> URL: https://issues.apache.org/jira/browse/FLINK-2544
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Priority: Minor
>
> I observed that some of the test cases using {{PowerMockRunner}} fail with 
> Java 8u20 with the following error:
> {code}
> java.lang.VerifyError: Bad <init> method call from inside of a branch
> Exception Details:
>   Location:
> 
> org/apache/flink/client/program/ClientTest$SuccessReturningActor.<init>()V 
> @32: invokespecial
>   Reason:
> Error exists in the bytecode
>   Bytecode:
> 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f
> 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b
> 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01
> 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8
> 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12
> 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800
> 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5
> 0x070: 0046 b1
>   Stackmap Table:
> full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{})
> full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{})
> full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]})
> full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]})
> 
> full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]})
> 
> full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]})
>   at java.lang.Class.getDeclaredConstructors0(Native Method)
>   at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658)
>   at java.lang.Class.getConstructor0(Class.java:3062)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2165)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at scala.util.Try$.apply(Try.scala:161)
>   at akka.util.Reflect$.findConstructor(Reflect.scala:86)
>   at akka.actor.NoArgsReflectConstructor.<init>(Props.scala:359)
>   at akka.actor.IndirectActorProducer$.apply(Props.scala:308)
>   at akka.actor.Props.producer(Props.scala:176)
>   at akka.actor.Props.<init>(Props.scala:189)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props.create(Props.scala)
>   at 
> org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
>   at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
>   at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282)
>   at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
>   at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:146)
>   at 
> 

[GitHub] flink pull request: [FLINK-3732] [Core] Potential null deference i...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1871


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3738] [tableAPI] Refactor TableEnvironm...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1887


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3700] [core] Remove Guava as a dependen...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1854


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3759] [table] Table API should throw ex...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1892


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3732) Potential null dereference in ExecutionConfig#equals()

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243284#comment-15243284
 ] 

ASF GitHub Bot commented on FLINK-3732:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1871


> Potential null dereference in ExecutionConfig#equals()
> 
>
> Key: FLINK-3732
> URL: https://issues.apache.org/jira/browse/FLINK-3732
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> ((restartStrategyConfiguration == null &&
>   other.restartStrategyConfiguration == null) ||
>   restartStrategyConfiguration.equals(other.restartStrategyConfiguration)) &&
> {code}
> If restartStrategyConfiguration is null but other.restartStrategyConfiguration 
> is not null, the above would result in an NPE.
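
For illustration, a null-safe version of this comparison can be written with 
{{java.util.Objects.equals}}; this is only a sketch of the pattern, not 
necessarily the fix that was merged in the pull request above:

{code}
import java.util.Objects;

public final class NullSafe {
    // Objects.equals returns true if both arguments are null, false if exactly
    // one is null, and otherwise delegates to a.equals(b) -- no NPE possible.
    public static boolean configsEqual(Object mine, Object theirs) {
        return Objects.equals(mine, theirs);
    }
}
{code}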



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2544] Add Java 8 version for building P...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1882


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243283#comment-15243283
 ] 

ASF GitHub Bot commented on FLINK-3762:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1891


>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.0.2
>
>
> As discussed on the dev list,
> In {{KryoSerializer.java}}
> Kryo Reference tracking is disabled by default:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive objects:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  
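
A minimal standalone sketch of the failure mode and the fix, using the stock 
Kryo API directly (the class and object names here are illustrative, not taken 
from Flink's code):

{code}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Output;

public class ReferenceTrackingDemo {
    // A self-referential type: serializing a cycle without reference
    // tracking recurses until the stack overflows.
    static class Node {
        Node next;
    }

    public static void main(String[] args) {
        Node n = new Node();
        n.next = n; // cycle

        Kryo kryo = new Kryo();
        // With setReferences(false) -- the current default in KryoSerializer --
        // the following write throws a StackOverflowError.
        kryo.setReferences(true);
        kryo.writeObject(new Output(new byte[4096]), n);
    }
}
{code}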



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3759) Table API should throw exception if a null value is encountered in non-null mode.

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243281#comment-15243281
 ] 

ASF GitHub Bot commented on FLINK-3759:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1892


> Table API should throw exception if a null value is encountered in non-null 
> mode.
> ---
>
> Key: FLINK-3759
> URL: https://issues.apache.org/jira/browse/FLINK-3759
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Table API can be configured to omit null-checks in generated code to 
> speed up processing. Currently, the generated code replaces a null value with 
> a data-type-specific default value if it is encountered in non-null-check 
> mode. 
> This can silently cause wrong results and should be changed such that an 
> exception is thrown if a null value is encountered in non-null-check mode.
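
Sketched in plain Java, the proposed change to the generated field access code 
would look roughly like this (all names are hypothetical, for illustration 
only):

{code}
// Hypothetical shape of generated field access code in non-null-check mode;
// getField/row/fieldIndex are placeholders.
Object value = getField(row, fieldIndex);
if (value == null) {
    // old behavior: substitute a type-specific default, silently wrong results
    // proposed behavior: fail fast
    throw new IllegalStateException(
        "Null value encountered although null checking is disabled.");
}
{code}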



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3700) Replace Guava Preconditions class with Flink Preconditions

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243282#comment-15243282
 ] 

ASF GitHub Bot commented on FLINK-3700:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1854


> Replace Guava Preconditions class with Flink Preconditions
> --
>
> Key: FLINK-3700
> URL: https://issues.apache.org/jira/browse/FLINK-3700
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.1.0
>
>
> In order to reduce the dependency on Guava (which has caused us quite a bit of 
> pain in the past with its version conflicts), I suggest adding a Flink 
> {{Preconditions}} class.
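
The core of such a class is small; here is a minimal sketch of the two most 
common checks (Flink's actual {{Preconditions}} class may offer more variants):

{code}
// Minimal sketch of a Guava-free Preconditions utility.
public final class Preconditions {

    private Preconditions() {} // utility class, no instantiation

    public static <T> T checkNotNull(T reference, String message) {
        if (reference == null) {
            throw new NullPointerException(message);
        }
        return reference;
    }

    public static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }
}
{code}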



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1891


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-3768) Clustering Coefficient

2016-04-15 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3768:
-

 Summary: Clustering Coefficient
 Key: FLINK-3768
 URL: https://issues.apache.org/jira/browse/FLINK-3768
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 1.1.0
Reporter: Greg Hogan
Assignee: Greg Hogan


The local clustering coefficient measures the connectedness of each vertex's 
neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 
(neighborhood is a clique).
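
For reference, the standard definition for an undirected graph, where t(v) is 
the number of edges among the neighbors of vertex v and deg(v) is its degree:

{code}
C(v) = \frac{2 \, t(v)}{\deg(v) \, (\deg(v) - 1)}
{code}

A vertex whose neighbors share no edges has t(v) = 0, hence C(v) = 0.0; a 
vertex whose neighborhood is a clique has t(v) = deg(v)(deg(v) - 1)/2, hence 
C(v) = 1.0.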



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243164#comment-15243164
 ] 

ASF GitHub Bot commented on FLINK-1745:
---

Github user danielblazevski commented on the pull request:

https://github.com/apache/flink/pull/1220#issuecomment-210527093
  
@tillrohrmann @chiwanpark does the rebasing look OK now? Some of the CI 
builds didn't go through: 2 passed, 2 failed, and 1 timed out (it seems there is 
a 2-hour limit for the CI process).


> Add exact k-nearest-neighbours algorithm to machine learning library
> 
>
> Key: FLINK-1745
> URL: https://issues.apache.org/jira/browse/FLINK-1745
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Daniel Blazevski
>  Labels: ML, Starter
>
> Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, 
> it is still used as a means to classify data and to do regression. This issue 
> focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as 
> proposed in [2].
> Could be a starter task.
> Resources:
> [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
> [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]
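
For intuition, the exact computation for a single query point is just a 
distance scan; the H-BNLJ/H-BRJ algorithms from [2] distribute essentially 
this loop over blocks of the data (sketch only, class name illustrative):

{code}
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class BruteForceKnn {
    // Returns the k points of 'data' closest to 'query'. Squared Euclidean
    // distance is used; the square root is monotone, so it can be skipped
    // for ranking purposes.
    public static List<double[]> kNearest(List<double[]> data, double[] query, int k) {
        return data.stream()
            .sorted(Comparator.comparingDouble(p -> squaredDistance(p, query)))
            .limit(k)
            .collect(Collectors.toList());
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
{code}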



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...

2016-04-15 Thread danielblazevski
Github user danielblazevski commented on the pull request:

https://github.com/apache/flink/pull/1220#issuecomment-210527093
  
@tillrohrmann @chiwanpark does the rebasing look OK now? Some of the CI 
builds didn't go through: 2 passed, 2 failed, and 1 timed out (it seems there is 
a 2-hour limit for the CI process).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243033#comment-15243033
 ] 

ASF GitHub Bot commented on FLINK-3665:
---

Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1848#issuecomment-210485959
  
Hi @fhueske, I hope I have addressed all your comments; it would be nice if you 
could have a look at it once again.


> Range partitioning lacks support to define sort orders
> --
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Affects Versions: 1.0.0
>Reporter: Fabian Hueske
> Fix For: 1.1.0
>
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of 
> fields. This is fine if range partitioning is used to mitigate skewed 
> partitioning. 
> However, it is not sufficient if range partitioning is used to sort a data 
> set in parallel. 
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily 
> changed, I propose to add a method {{withOrders(Order... orders)}} to 
> {{PartitionOperator}}. The method should throw an exception if the 
> partitioning method of {{PartitionOperator}} is not range partitioning.
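
Hypothetical usage of the proposed method; {{withOrders()}} does not exist in 
the API at the time of writing, while the rest is the existing DataSet API:

{code}
import org.apache.flink.api.common.operators.Order;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class RangeSortSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Integer, String>> data =
            env.fromElements(Tuple2.of(3, "c"), Tuple2.of(1, "a"), Tuple2.of(2, "b"));

        // Range-partition on field 0 and declare its order, so that sorted
        // partitions concatenate into a globally sorted result.
        data.partitionByRange(0)
            .withOrders(Order.ASCENDING)   // proposed method; would throw for
                                           // non-range partitioning
            .sortPartition(0, Order.ASCENDING)
            .print();
    }
}
{code}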



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...

2016-04-15 Thread dawidwys
Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1848#issuecomment-210485959
  
Hi @fhueske, I hope I have addressed all your comments; it would be nice if you 
could have a look at it once again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3428) Add fixed time trailing timestamp/watermark extractor

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242947#comment-15242947
 ] 

ASF GitHub Bot commented on FLINK-3428:
---

Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/1764#issuecomment-210462831
  
If everything is OK, could somebody merge this?
There has been no activity here for some time.


> Add fixed time trailing timestamp/watermark extractor
> -
>
> Key: FLINK-3428
> URL: https://issues.apache.org/jira/browse/FLINK-3428
> Project: Flink
>  Issue Type: Improvement
>Reporter: Robert Metzger
>Assignee: Kostas Kloudas
>
> Flink currently provides only one built-in timestamp extractor, which assumes 
> strictly ascending timestamps. In real-world use cases, timestamps are almost 
> never strictly ascending.
> Therefore, I propose to provide a utility watermark extractor which generates 
> watermarks with a fixed-time trailing.
> The implementation should keep track of the highest event time seen so far 
> and subtract a fixed amount of time from that event time.
> This way, users can for example specify that the watermarks should always 
> "lag behind" by 10 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3428: Adds a fixed time trailing waterma...

2016-04-15 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/1764#issuecomment-210462831
  
If everything is OK, could somebody merge this?
There has been no activity here for some time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242903#comment-15242903
 ] 

ASF GitHub Bot commented on FLINK-3717:
---

GitHub user kl0u opened a pull request:

https://github.com/apache/flink/pull/1895

FLINK-3717 - Be able to re-start reading from a specific point in a file.

This pull request addresses issue FLINK-3717 and is the first step towards 
making the file sources fault-tolerant.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink fif_offset

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1895.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1895


commit a571f8de21d0c48fdb31b4518a4be76c74d32cea
Author: kl0u 
Date:   2016-04-10T14:56:42Z

Added the getOffset() in the FileInputFormat.

commit bc7f97864c56210f79a23f9bdc7c328c19f4c2ee
Author: kl0u 
Date:   2016-04-11T11:49:00Z

Adding tests.

commit 8b0391249fa0902d66857c61c632e3c1a21ee6f3
Author: kl0u 
Date:   2016-04-11T13:01:31Z

Fixed wrong test in FileInputFormatTest.

commit ab6d68e3525e8ed368739b72d1ab90a70207d34d
Author: kl0u 
Date:   2016-04-11T14:04:57Z

Adding tests.

commit 07fbd4848420eeb8f82fc2f51d3dc9879d6034aa
Author: kl0u 
Date:   2016-04-12T09:00:59Z

Fixed the BinaryInputFormat.

commit 8b33caefe3d2eb0735bc38dd26ab9f375c4e211f
Author: kl0u 
Date:   2016-04-12T14:22:10Z

Added tests for binary input format

commit f9502bcf582095afeccfc46e9b6859e0094fde63
Author: kl0u 
Date:   2016-04-13T17:49:17Z

Change in the abstraction for the channel state.

commit e5e95d494978a36006757b78284c52be1d2e9d83
Author: kl0u 
Date:   2016-04-13T22:51:44Z

Adding the avro format

commit ec8e6d49b7d01577c28fd8a0cdc3a352a5de9b77
Author: kl0u 
Date:   2016-04-14T13:35:37Z

Added the Checkpointed Input Interface.

commit 4987b68ef1ae168833edfad9c1b86d94ac33b71b
Author: kl0u 
Date:   2016-04-14T13:52:20Z

Adding test for the recovery.

commit d72117f8355cdf578615ec4f3ae717ffd15f6ed2
Author: kl0u 
Date:   2016-04-15T08:43:18Z

Cleaning up the code.




> Add functionality to be able to restore from a specific point in a 
> FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>
> This is the first step towards making the file sources fault-tolerant. We 
> have to be able to restart from a specific point in a file despite any 
> caching performed during reading. This guarantees that the task that takes 
> over the execution of the failed one will be able to start from the 
> correct point in the file.
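
Conceptually, the checkpoint/restore cycle this enables looks as follows; 
{{getOffset()}} is the accessor added in the pull request, while the restore 
hook and the surrounding names are purely illustrative:

{code}
// Snapshot: record the format's real reading position (not the position of
// any read-ahead buffer) together with the split being processed.
long checkpointedOffset = format.getOffset();   // accessor added in PR #1895

// Recovery (illustrative): the task taking over reopens the same split and
// seeks to the saved position before resuming reading.
format.open(split);
format.seekToOffset(checkpointedOffset);        // hypothetical restore hook
{code}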



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242904#comment-15242904
 ] 

ASF GitHub Bot commented on FLINK-3717:
---

Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/1895#issuecomment-210450817
  
This PR addresses FLINK-3717. 
Please review!



> Add functionality to be able to restore from a specific point in a 
> FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>
> This is the first step towards making the file sources fault-tolerant. We 
> have to be able to restart from a specific point in a file despite any 
> caching performed during reading. This guarantees that the task that takes 
> over the execution of the failed one will be able to start from the 
> correct point in the file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...

2016-04-15 Thread kl0u
GitHub user kl0u opened a pull request:

https://github.com/apache/flink/pull/1895

FLINK-3717 - Be able to re-start reading from a specific point in a file.

This pull request addresses issue FLINK-3717 and is the first step towards 
making the file sources fault-tolerant.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink fif_offset

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1895.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1895


commit a571f8de21d0c48fdb31b4518a4be76c74d32cea
Author: kl0u 
Date:   2016-04-10T14:56:42Z

Added the getOffset() in the FileInputFormat.

commit bc7f97864c56210f79a23f9bdc7c328c19f4c2ee
Author: kl0u 
Date:   2016-04-11T11:49:00Z

Adding tests.

commit 8b0391249fa0902d66857c61c632e3c1a21ee6f3
Author: kl0u 
Date:   2016-04-11T13:01:31Z

Fixed wrong test in FileInputFormatTest.

commit ab6d68e3525e8ed368739b72d1ab90a70207d34d
Author: kl0u 
Date:   2016-04-11T14:04:57Z

Adding tests.

commit 07fbd4848420eeb8f82fc2f51d3dc9879d6034aa
Author: kl0u 
Date:   2016-04-12T09:00:59Z

Fixed the BinaryInputFormat.

commit 8b33caefe3d2eb0735bc38dd26ab9f375c4e211f
Author: kl0u 
Date:   2016-04-12T14:22:10Z

Added tests for binary input format

commit f9502bcf582095afeccfc46e9b6859e0094fde63
Author: kl0u 
Date:   2016-04-13T17:49:17Z

Change in the abstraction for the channel state.

commit e5e95d494978a36006757b78284c52be1d2e9d83
Author: kl0u 
Date:   2016-04-13T22:51:44Z

Adding the avro format

commit ec8e6d49b7d01577c28fd8a0cdc3a352a5de9b77
Author: kl0u 
Date:   2016-04-14T13:35:37Z

Added the Checkpointed Input Interface.

commit 4987b68ef1ae168833edfad9c1b86d94ac33b71b
Author: kl0u 
Date:   2016-04-14T13:52:20Z

Adding test for the recovery.

commit d72117f8355cdf578615ec4f3ae717ffd15f6ed2
Author: kl0u 
Date:   2016-04-15T08:43:18Z

Cleaning up the code.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...

2016-04-15 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/1895#issuecomment-210450817
  
This PR addresses FLINK-3717. 
Please review!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat

2016-04-15 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-3717:
--
Description: This is the first step towards making the file sources 
fault-tolerant. We have to be able to restart from a specific point in a 
file despite any caching performed during reading. This guarantees that the 
task that takes over the execution of the failed one will be able to start 
from the correct point in the file.  (was: This is the first step towards 
making the file sources fault-tolerant. We have to be able to get the latest 
read offset in the file despite any caching performed during reading. This 
guarantees that the task that takes over the execution of the failed one 
will be able to start from the correct point in the file.)

> Add functionality to be able to restore from a specific point in a 
> FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>
> This is the first step towards making the file sources fault-tolerant. We 
> have to be able to restart from a specific point in a file despite any 
> caching performed during reading. This guarantees that the task that takes 
> over the execution of the failed one will be able to start from the 
> correct point in the file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat

2016-04-15 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-3717:
--
Summary: Add functionality to be able to restore from a specific point in a 
FileInputFormat  (was: Add functionality to get the current offset of a source 
with a FileInputFormat)

> Add functionality to be able to restore from a specific point in a 
> FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>
> This is the first step towards making the file sources fault-tolerant. We 
> have to be able to get the latest read offset in the file despite any caching 
> performed during reading. This guarantees that the task that takes 
> over the execution of the failed one will be able to start from the correct 
> point in the file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3479) Add hash-based strategy for CombineFunction

2016-04-15 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-3479:
---
Priority: Minor  (was: Major)

> Add hash-based strategy for CombineFunction
> ---
>
> Key: FLINK-3479
> URL: https://issues.apache.org/jira/browse/FLINK-3479
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Priority: Minor
>
> This issue is similar to FLINK-3477 but adds a hash-based strategy for 
> {{CombineFunction}} instead of {{ReduceFunction}}.
> The interface of {{CombineFunction}} differs from {{ReduceFunction}} by 
> providing an {{Iterable}} instead of two {{T}} values. Hence, if the 
> {{Iterable}} provides two values, we can do the same as with a 
> {{ReduceFunction}}.
> At the moment, {{CombineFunction}} is wrapped in a {{GroupCombineFunction}} 
> and hence executed using the {{GroupReduceCombineDriver}}. 
> We should add two dedicated drivers: {{CombineDriver}} and 
> {{ChainedCombineDriver}}, and two driver strategies: {{HASH_COMBINE}} and 
> {{SORT_COMBINE}}. 
> If FLINK-3477 is resolved, we can reuse the hash table.
> We should also add compiler hints to {{DataSet.reduceGroup()}} and 
> {{Grouping.reduceGroup()}} to allow users to select between {{SORT}}- and 
> {{HASH}}-based combine strategies ({{HASH}} will only be applicable to 
> {{CombineFunction}} and not {{GroupCombineFunction}}).
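
The gist of a hash-based combine strategy, sketched independently of Flink's 
driver interfaces (all names here are hypothetical):

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch: aggregate records in a hash table keyed by the grouping key,
// combining eagerly on insert instead of sorting first.
public class HashCombineSketch<K, V> {

    public interface Combiner<V> {
        V combine(V a, V b);
    }

    private final Map<K, V> table = new HashMap<>();
    private final Combiner<V> combiner;

    public HashCombineSketch(Combiner<V> combiner) {
        this.combiner = combiner;
    }

    public void insert(K key, V value) {
        // each key keeps exactly one partial aggregate; a real driver would
        // emit and clear the table once its memory budget is exhausted
        table.merge(key, value, combiner::combine);
    }

    public Map<K, V> results() {
        return table;
    }
}
{code}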



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3767) Show timeline also for running tasks, not only for finished ones

2016-04-15 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3767:
-

 Summary: Show timeline also for running tasks, not only for 
finished ones
 Key: FLINK-3767
 URL: https://issues.apache.org/jira/browse/FLINK-3767
 Project: Flink
  Issue Type: Sub-task
  Components: Webfrontend
Reporter: Robert Metzger






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2944) Collect, expose and display operator-specific stats

2016-04-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-2944:
--
Issue Type: Sub-task  (was: New Feature)
Parent: FLINK-3766

> Collect, expose and display operator-specific stats
> ---
>
> Key: FLINK-2944
> URL: https://issues.apache.org/jira/browse/FLINK-2944
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime, Webfrontend
>Reporter: Fabian Hueske
>  Labels: requires-design-doc
>
> It would be nice to collect operator-specific stats such as:
> - HashJoin: bytes spilled build side, bytes spilled probe side, num keys 
> (build / probe)
> - GroupBy: bytes spilled, num keys, max group size, avg. group size
> - Combiner: combine rate (in/out ratio)
> - Streaming Ops: avg. throughput
> - etc.
> and display these stats in the web UI. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3766) Add a new tab for monitoring metrics in the web interface

2016-04-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-3766:
--
Attachment: metricsMock-unselected.png
metricsMock-selected.png

> Add a new tab for monitoring metrics in the web interface
> -
>
> Key: FLINK-3766
> URL: https://issues.apache.org/jira/browse/FLINK-3766
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Reporter: Robert Metzger
> Attachments: metricsMock-selected.png, metricsMock-unselected.png
>
>
> Add a new tab for showing operator/task-specific metrics in the Flink web UI.
> I propose to add a drop-down to the tab that allows selecting the metrics.
> For each metric, we show a box in a grid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242881#comment-15242881
 ] 

ASF GitHub Bot commented on FLINK-3764:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1894#discussion_r59865835
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala
 ---
@@ -211,11 +211,18 @@ class TestingCluster(
 
     val jmsAliveFutures = jobManagerActors map {
       _ map {
-        tm => (tm ? Alive)(timeout)
+        jm => (jm ? Alive)(timeout)
+      }
+    } getOrElse(Seq())
+
+    val resourceManagersAliveFutures = resourceManagerActors map {
+      _ map {
+        rm => (rm ? Alive)(timeout)
       }
     } getOrElse(Seq())
 
-    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures)
+    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++
+      resourceManagersAliveFutures)
 
     Await.ready(combinedFuture, timeout)
--- End diff --

No, we don't know. Just a `TimeoutException` will be thrown.


> LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails 
> on Travis
> --
>
> Key: FLINK-3764
> URL: https://issues.apache.org/jira/browse/FLINK-3764
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The 
> {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
> fails spuriously on Travis because of a {{NullPointerException}}. The reason 
> is that the test does not properly wait until the {{ResourceManager}} has been 
> started. Due to this, it can happen that a leader notification message is 
> sent to a {{LeaderRetrievalListener}} which has not yet been set by 
> the {{ResourceManager}}.
> [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3766) Add a new tab for monitoring metrics in the web interface

2016-04-15 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3766:
-

 Summary: Add a new tab for monitoring metrics in the web interface
 Key: FLINK-3766
 URL: https://issues.apache.org/jira/browse/FLINK-3766
 Project: Flink
  Issue Type: New Feature
  Components: Webfrontend
Reporter: Robert Metzger


Add a new tab for showing operator/task-specific metrics in the Flink web UI.
I propose to add a drop-down to the tab that allows selecting the metrics.
For each metric, we show a box in a grid.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...

2016-04-15 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1894#discussion_r59865835
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala
 ---
@@ -211,11 +211,18 @@ class TestingCluster(
 
     val jmsAliveFutures = jobManagerActors map {
       _ map {
-        tm => (tm ? Alive)(timeout)
+        jm => (jm ? Alive)(timeout)
+      }
+    } getOrElse(Seq())
+
+    val resourceManagersAliveFutures = resourceManagerActors map {
+      _ map {
+        rm => (rm ? Alive)(timeout)
       }
     } getOrElse(Seq())
 
-    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures)
+    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++
+      resourceManagersAliveFutures)
 
     Await.ready(combinedFuture, timeout)
--- End diff --

No, we don't know. Just a `TimeoutException` will be thrown.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242870#comment-15242870
 ] 

ASF GitHub Bot commented on FLINK-3764:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/1894#discussion_r59864663
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala
 ---
@@ -211,11 +211,18 @@ class TestingCluster(
 
     val jmsAliveFutures = jobManagerActors map {
       _ map {
-        tm => (tm ? Alive)(timeout)
+        jm => (jm ? Alive)(timeout)
+      }
+    } getOrElse(Seq())
+
+    val resourceManagersAliveFutures = resourceManagerActors map {
+      _ map {
+        rm => (rm ? Alive)(timeout)
       }
     } getOrElse(Seq())
 
-    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures)
+    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++
+      resourceManagersAliveFutures)
 
     Await.ready(combinedFuture, timeout)
--- End diff --

If this check fails, do we know which future caused it?


> LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails 
> on Travis
> --
>
> Key: FLINK-3764
> URL: https://issues.apache.org/jira/browse/FLINK-3764
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The 
> {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
> fails spuriously on Travis because of a {{NullPointerException}}. The reason 
> is that the test does not properly wait until the {{ResourceManager}} has been 
> started. Due to this, it can happen that a leader notification message is 
> sent to a {{LeaderRetrievalListener}} which has not yet been set by 
> the {{ResourceManager}}.
> [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...

2016-04-15 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/1894#discussion_r59864663
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala
 ---
@@ -211,11 +211,18 @@ class TestingCluster(
 
     val jmsAliveFutures = jobManagerActors map {
       _ map {
-        tm => (tm ? Alive)(timeout)
+        jm => (jm ? Alive)(timeout)
+      }
+    } getOrElse(Seq())
+
+    val resourceManagersAliveFutures = resourceManagerActors map {
+      _ map {
+        rm => (rm ? Alive)(timeout)
       }
     } getOrElse(Seq())
 
-    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures)
+    val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++
+      resourceManagersAliveFutures)
 
     Await.ready(combinedFuture, timeout)
--- End diff --

If this check fails, do we know which future caused it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242868#comment-15242868
 ] 

ASF GitHub Bot commented on FLINK-2544:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1882#issuecomment-210438717
  
As a follow-up, could we add some "Assume" statements in the tests that 
check whether the Java version is either Java 7 or Java 8u51+?
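
Such a guard could look like the following sketch; the parsing of the 
{{java.version}} property (e.g. "1.7.0_79", "1.8.0_51") is deliberately 
simplified, and the class name is illustrative:

{code}
import static org.junit.Assume.assumeTrue;

public class JvmVersionGuard {
    // Call from an @Before method to skip PowerMock tests on Java 8u20-8u50.
    public static void assumeCompatibleJvm() {
        String version = System.getProperty("java.version");
        if (version.startsWith("1.8.0_")) {
            int update = Integer.parseInt(
                version.substring("1.8.0_".length()).split("-")[0]);
            assumeTrue("requires Java 8u51 or later", update >= 51);
        } else {
            assumeTrue("requires Java 7 or Java 8",
                version.startsWith("1.7") || version.startsWith("1.8"));
        }
    }
}
{code}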


> Some test cases using PowerMock fail with Java 8u20
> ---
>
> Key: FLINK-2544
> URL: https://issues.apache.org/jira/browse/FLINK-2544
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Priority: Minor
>
> I observed that some of the test cases using {{PowerMockRunner}} fail with 
> Java 8u20 with the following error:
> {code}
> java.lang.VerifyError: Bad <init> method call from inside of a branch
> Exception Details:
>   Location:
> 
> org/apache/flink/client/program/ClientTest$SuccessReturningActor.<init>()V 
> @32: invokespecial
>   Reason:
> Error exists in the bytecode
>   Bytecode:
> 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f
> 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b
> 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01
> 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8
> 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12
> 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800
> 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5
> 0x070: 0046 b1
>   Stackmap Table:
> full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{})
> full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{})
> full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]})
> full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]})
> 
> full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]})
> 
> full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]})
>   at java.lang.Class.getDeclaredConstructors0(Native Method)
>   at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658)
>   at java.lang.Class.getConstructor0(Class.java:3062)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2165)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86)
>   at scala.util.Try$.apply(Try.scala:161)
>   at akka.util.Reflect$.findConstructor(Reflect.scala:86)
>   at akka.actor.NoArgsReflectConstructor.<init>(Props.scala:359)
>   at akka.actor.IndirectActorProducer$.apply(Props.scala:308)
>   at akka.actor.Props.producer(Props.scala:176)
>   at akka.actor.Props.<init>(Props.scala:189)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props$.create(Props.scala:99)
>   at akka.actor.Props.create(Props.scala)
>   at 
> org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
>   at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
>   at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282)
>   at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
>   at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>   at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207)
>   at 
> 

[GitHub] flink pull request: [FLINK-2544] Add Java 8 version for building P...

2016-04-15 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1882#issuecomment-210438717
  
As a follow-up, could we add some "Assume" statements in the tests that 
check whether the Java version is either Java 7 or Java 8u51+?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242864#comment-15242864
 ] 

ASF GitHub Bot commented on FLINK-3764:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1894#issuecomment-210438074
  
Looks good to me!
Please merge once Travis gives green light...


> LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails 
> on Travis
> --
>
> Key: FLINK-3764
> URL: https://issues.apache.org/jira/browse/FLINK-3764
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The 
> {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
> fails spuriously on Travis because of a {{NullPointerException}}. The reason 
> is that the test does not properly wait until the {{ResourceManager}} has been 
> started. Due to this, it can happen that a leader notification message is 
> sent to a {{LeaderRetrievalListener}} which has not yet been set by 
> the {{ResourceManager}}.
> [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...

2016-04-15 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1894#issuecomment-210438074
  
Looks good to me!
Please merge once Travis gives green light...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242777#comment-15242777
 ] 

ASF GitHub Bot commented on FLINK-3764:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/1894

[FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability

The LeaderElectionRetrievalTestingCluster now also waits for the Flink
resource manager to be properly started before trying to send leader
change notification messages to the LeaderRetrievalServices.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
fixLeaderChangeStateCleanupTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1894.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1894


commit c9895ba81db966fb3d48eced6529aa33fe1941ce
Author: Till Rohrmann 
Date:   2016-04-15T10:33:58Z

[FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability

The LeaderElectionRetrievalTestingCluster now also waits for the Flink
resource manager to be properly started before trying to send leader
change notification messages to the LeaderRetrievalServices.




> LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails 
> on Travis
> --
>
> Key: FLINK-3764
> URL: https://issues.apache.org/jira/browse/FLINK-3764
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The 
> {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
> fails spuriously on Travis because of a {{NullPointerException}}. The reason 
> is that the test does not properly wait until the {{ResourceManager}} has been 
> started. Due to this, it can happen that a leader notification message is 
> sent to a {{LeaderRetrievalListener}} which has not yet been set by 
> the {{ResourceManager}}.
> [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...

2016-04-15 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/1894

[FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability

The LeaderElectionRetrievalTestingCluster now also waits for the Flink
resource manager to be properly started before trying to send leader
change notification messages to the LeaderRetrievalServices.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
fixLeaderChangeStateCleanupTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1894.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1894


commit c9895ba81db966fb3d48eced6529aa33fe1941ce
Author: Till Rohrmann 
Date:   2016-04-15T10:33:58Z

[FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability

The LeaderElectionRetrievalTestingCluster now also waits for the Flink
resource manager to be properly started before trying to send leader
change notification messages to the LeaderRetrievalServices.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-3765) ScalaShellITCase deadlocks spuriously

2016-04-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3765:


 Summary: ScalaShellITCase deadlocks spuriously
 Key: FLINK-3765
 URL: https://issues.apache.org/jira/browse/FLINK-3765
 Project: Flink
  Issue Type: Bug
  Components: Scala Shell
Affects Versions: 1.1.0
Reporter: Till Rohrmann
Priority: Critical


The {{ScalaShellITCase}} deadlocks spuriously. Stephan already observed it 
locally and now it also happened on Travis [1].

[1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123206554/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-3764:
-
Description: 
The {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
fails spuriously on Travis because of a {{NullPointerException}}. The reason is 
that the test does not properly wait until the {{ResourceManager}} has been 
started. Due to this, it can happen that a leader notification message is sent 
to a {{LeaderRetrievalListener}} which has not yet been set by the 
{{ResourceManager}}.

[1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt

  was:The 
{{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
fails spuriously on Travis because of a {{NullPointerException}}. The reason is 
that the test does not properly wait until the {{ResourceManager}} has been 
started. Due to this, it can happen that a leader notification message is sent 
to a {{LeaderRetrievalListener}} which has not yet been set by the 
{{ResourceManager}}.


> LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails 
> on Travis
> --
>
> Key: FLINK-3764
> URL: https://issues.apache.org/jira/browse/FLINK-3764
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The 
> {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
> fails spuriously on Travis because of a {{NullPointerException}}. The reason 
> is that the test does not properly wait until the {{ResourceManager}} has been 
> started. Due to this, it can happen that a leader notification message is 
> sent to a {{LeaderRetrievalListener}} which has not yet been set by 
> the {{ResourceManager}}.
> [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis

2016-04-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3764:


 Summary: 
LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on 
Travis
 Key: FLINK-3764
 URL: https://issues.apache.org/jira/browse/FLINK-3764
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Critical


The {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} 
fails spuriously on Travis because of a {{NullPointerException}}. The reason is 
that the test does not properly wait until the {{ResourceManager}} has been 
started. Due to this, it can happen that a leader notification message is sent 
to a {{LeaderRetrievalListener}} which has not yet been set by the 
{{ResourceManager}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242751#comment-15242751
 ] 

ASF GitHub Bot commented on FLINK-3762:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210403044
  
Thanks, will merge this PR to master and release-1.0 branches


>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.0.2
>
>
> As discussed on the dev list,
> In {{KryoSerializer.java}}
> Kryo Reference tracking is disabled by default:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive objects:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...

2016-04-15 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210403044
  
Thanks, will merge this PR to master and release-1.0 branches


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3748) Add CASE function to Table API

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242749#comment-15242749
 ] 

ASF GitHub Bot commented on FLINK-3748:
---

GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/1893

[FLINK-3748] [table] Add CASE function to Table API

This PR adds an evaluation function to the Table API that allows for 
if/else branching. It also allows `CASE WHEN` syntax in SQL.

I will update the EBNF grammar in the documentation as part of FLINK-3086 
next.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink TableApiCaseWhen

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1893.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1893


commit 538b7bfae208ef68131bc76fb05c8835cbd55928
Author: twalthr 
Date:   2016-04-13T12:36:36Z

[FLINK-3748] [table] Add CASE function to Table API




> Add CASE function to Table API
> --
>
> Key: FLINK-3748
> URL: https://issues.apache.org/jira/browse/FLINK-3748
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Add a CASE/WHEN functionality to Java/Scala Table API and add support for SQL 
> API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3748] [table] Add CASE function to Tabl...

2016-04-15 Thread twalthr
GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/1893

[FLINK-3748] [table] Add CASE function to Table API

This PR adds an evaluation function to the Table API that allows for 
if/else branching. It also allows `CASE WHEN` syntax in SQL.

I will update the EBNF grammar in the documentation as part of FLINK-3086 
next.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink TableApiCaseWhen

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1893.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1893


commit 538b7bfae208ef68131bc76fb05c8835cbd55928
Author: twalthr 
Date:   2016-04-13T12:36:36Z

[FLINK-3748] [table] Add CASE function to Table API




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242748#comment-15242748
 ] 

ASF GitHub Bot commented on FLINK-3762:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210402429
  
Thanks for your contribution @andrewpalumbo :-)


>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.0.2
>
>
> As discussed on the dev list,
> In {{KryoSerializer.java}}
> Kryo Reference tracking is disabled by default:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive objects:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem; a minimal sketch 
> follows after this description.
> [1] https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  
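
To make the failure mode concrete, here is a minimal standalone sketch (plain 
Kryo, not Flink code; the class and field names are made up for illustration): 
a cyclic object graph only serializes once reference tracking is enabled.

{code}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Output;

import java.io.ByteArrayOutputStream;

// Standalone sketch: serializing a self-referential object graph.
public class KryoReferenceDemo {

    static class Node {
        Node next; // can point back to this node, forming a cycle
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.next = node; // self-reference

        Kryo kryo = new Kryo();
        // With setReferences(false), writeObject below recurses through
        // the cycle until it hits a StackOverflowError; with true, the
        // cycle is written as a back-reference and serialization succeeds.
        kryo.setReferences(true);

        Output output = new Output(new ByteArrayOutputStream());
        kryo.writeObject(output, node);
        output.close();
    }
}
{code}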



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...

2016-04-15 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210402429
  
Thanks for your contribution @andrewpalumbo :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242747#comment-15242747
 ] 

ASF GitHub Bot commented on FLINK-3762:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210402401
  
Changes look good to me. +1 for merging.


>  Kryo StackOverflowError due to disabled Kryo Reference tracking
> 
>
> Key: FLINK-3762
> URL: https://issues.apache.org/jira/browse/FLINK-3762
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Andrew Palumbo
>Assignee: Andrew Palumbo
> Fix For: 1.0.2
>
>
> As discussed on the dev list, Kryo reference tracking is disabled by default 
> in {{KryoSerializer.java}}:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many 
> objects that may contain recursive references:
> {code}
> java.lang.StackOverflowError
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...

2016-04-15 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1891#issuecomment-210402401
  
Changes look good to me. +1 for merging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3759) Table API should throw exception if a null value is encountered in non-null mode.

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242744#comment-15242744
 ] 

ASF GitHub Bot commented on FLINK-3759:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1892#issuecomment-210401616
  
Thanks for the fix. 
Will merge this PR.


> Table API should throw exception if a null value is encountered in non-null 
> mode.
> ---
>
> Key: FLINK-3759
> URL: https://issues.apache.org/jira/browse/FLINK-3759
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Priority: Critical
>
> The Table API can be configured to omit null checks in generated code to 
> speed up processing. Currently, the generated code replaces a null value 
> with a data-type-specific default value if one is encountered in 
> non-null-check mode.
> This can silently produce wrong results and should be changed such that an 
> exception is thrown if a null value is encountered in non-null-check mode; 
> a sketch of the contrast follows after this description.
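
Purely as an illustration of the contrast (this is not the actual generated 
code; the method names and the boxed Integer field are made up for the 
sketch):

{code}
// Illustrative sketch only, not Flink's generated code.
public class NullCheckModes {

    // Current non-null-check behavior: a null is silently replaced by
    // the type's default value, so wrong results can go unnoticed.
    static int currentBehavior(Integer field) {
        return field == null ? 0 : field;
    }

    // Proposed behavior: fail fast so the offending record is noticed.
    static int proposedBehavior(Integer field) {
        if (field == null) {
            throw new NullPointerException(
                "Encountered null value while null checking is disabled");
        }
        return field;
    }
}
{code}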



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3759] [table] Table API should throw ex...

2016-04-15 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1892#issuecomment-210401616
  
Thanks for the fix. 
Will merge this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

