[jira] [Assigned] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange
[ https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhankar Biswas reassigned FLINK-3769: --- Assignee: Subhankar Biswas
> RabbitMQ Sink ability to publish to a different exchange
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
> Issue Type: Improvement
> Components: Streaming Connectors
> Affects Versions: 1.0.1
> Reporter: Robert Batts
> Assignee: Subhankar Biswas
> Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This exchange is a direct exchange, so the routing key routes directly to the queue name. Because of this, the current sink is limited to 1-to-1-to-1 (1 job to 1 exchange which routes to 1 queue). Additionally, if a user decides to use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists.
> 2.) The user has declared the appropriate bindings, and the appropriate queues exist in RabbitMQ (therefore, nothing needs to be created).
> RabbitMQ currently provides four types of exchanges. Three of these (Direct, Fanout, Topic) will be covered by simply enabling exchanges, because they use the routing key (or nothing).
> The fourth exchange type (Headers) relies on the message headers, which are currently set to null by default on publish. These headers may vary per message, so the sink would need to take them as input as well. This fourth exchange type could well be outside the scope of this improvement, and a separate "RabbitMQ Sink enable headers" improvement might be the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
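The point that Direct, Fanout, and Topic exchanges are covered by the routing key alone can be sketched with a small stdlib-only toy. All names here are made up for the illustration (this is neither broker nor connector code), and the topic matching is only a rough approximation of AMQP's '*'/'#' wildcards:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of how the three routing-key-based exchange types pick queues.
public class ExchangeToy {

    /** Returns the queues a message with the given routing key would reach. */
    public static List<String> route(String exchangeType, String routingKey,
                                     Map<String, String> queueBindings) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, String> binding : queueBindings.entrySet()) {
            String bindingKey = binding.getValue();
            boolean hit;
            switch (exchangeType) {
                case "direct": // exact routing-key match
                    hit = bindingKey.equals(routingKey);
                    break;
                case "fanout": // routing key is ignored; every bound queue gets the message
                    hit = true;
                    break;
                case "topic":  // '*' ~ one word, '#' ~ the rest (rough approximation)
                    String regex = bindingKey
                            .replace(".", "\\.")
                            .replace("*", "[^.]+")
                            .replace("#", ".*");
                    hit = routingKey.matches(regex);
                    break;
                default:
                    throw new IllegalArgumentException("unknown exchange type: " + exchangeType);
            }
            if (hit) {
                matched.add(binding.getKey());
            }
        }
        return matched;
    }
}
```

A Headers exchange would not fit this sketch at all, since it ignores the routing key entirely and matches on per-message headers, which is exactly why it is the part that may warrant a separate improvement.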
[jira] [Assigned] (FLINK-3763) RabbitMQ Source/Sink standardize connection parameters
[ https://issues.apache.org/jira/browse/FLINK-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhankar Biswas reassigned FLINK-3763: --- Assignee: Subhankar Biswas
> RabbitMQ Source/Sink standardize connection parameters
> --
>
> Key: FLINK-3763
> URL: https://issues.apache.org/jira/browse/FLINK-3763
> Project: Flink
> Issue Type: Improvement
> Components: Streaming Connectors
> Affects Versions: 1.0.1
> Reporter: Robert Batts
> Assignee: Subhankar Biswas
>
> The RabbitMQ source and sink should have the same capabilities in terms of establishing a connection; currently the sink is lacking connection parameters that are available on the source. Additionally, VirtualHost should be an offered parameter for multi-tenant RabbitMQ clusters (if not specified, it defaults to the vhost '/').
> Connection Parameters
> ===
> - Host - Offered on both
> - Port - Source only
> - Virtual Host - Neither
> - User - Source only
> - Password - Source only
> Additionally, it might be worth offering the URI as a valid constructor argument, because that would supply all five of the above parameters in a single String.
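The suggested URI constructor would carry all five parameters in one string. Below is a minimal stdlib-only sketch of pulling them out of an AMQP URI, with the usual AMQP defaults (port 5672, vhost "/", guest/guest credentials) applied when absent. The class and method names are made up for the sketch; this is not the RabbitMQ client's own URI handling:

```java
import java.net.URI;

// Illustrative parser for an AMQP URI of the form
// amqp://user:password@host:port/vhost
public class AmqpUriSketch {

    /** Returns {host, port, virtualHost, user, password} with AMQP defaults applied. */
    public static String[] parse(String amqpUri) {
        URI uri = URI.create(amqpUri);
        String user = "guest";      // AMQP default credentials
        String password = "guest";
        if (uri.getUserInfo() != null) {
            String[] creds = uri.getUserInfo().split(":", 2);
            user = creds[0];
            if (creds.length == 2) {
                password = creds[1];
            }
        }
        int port = uri.getPort() == -1 ? 5672 : uri.getPort(); // 5672 is the AMQP default
        String path = uri.getPath();
        // An absent path means the default virtual host "/"
        String vhost = (path == null || path.isEmpty()) ? "/" : path.substring(1);
        return new String[] {uri.getHost(), String.valueOf(port), vhost, user, password};
    }
}
```

For example, "amqp://alice:secret@mq.example.com:5673/tenant1" (a made-up URI) yields all five parameters, while "amqp://mq.example.com" falls back to the defaults.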
[jira] [Updated] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange
[ https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhankar Biswas updated FLINK-3769: Assignee: (was: Subhankar Biswas)
> RabbitMQ Sink ability to publish to a different exchange
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
> Issue Type: Improvement
> Components: Streaming Connectors
> Affects Versions: 1.0.1
> Reporter: Robert Batts
> Labels: rabbitmq
[jira] [Assigned] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange
[ https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhankar Biswas reassigned FLINK-3769: --- Assignee: Subhankar Biswas
> RabbitMQ Sink ability to publish to a different exchange
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
> Issue Type: Improvement
> Components: Streaming Connectors
> Affects Versions: 1.0.1
> Reporter: Robert Batts
> Assignee: Subhankar Biswas
> Labels: rabbitmq
[GitHub] flink pull request: [FLINK-2259][ml] Add Train-Testing Splitters
Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1898#issuecomment-210728264
One build failed on error: scala.reflect.internal.MissingRequirementError: object scala.runtime in compiler mirror not found. Another failed on some unrelated YARN issue, which I didn't touch. Flaky builds?
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3742) Add Multi Layer Perceptron Predictor
[ https://issues.apache.org/jira/browse/FLINK-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243999#comment-15243999 ] ASF GitHub Bot commented on FLINK-3742: --- Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1875#issuecomment-210727904
Failed on one build @ error: scala.reflect.internal.MissingRequirementError: object scala.runtime in compiler mirror not found.
> Add Multi Layer Perceptron Predictor
>
> Key: FLINK-3742
> URL: https://issues.apache.org/jira/browse/FLINK-3742
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Reporter: Trevor Grant
> Assignee: Trevor Grant
> Priority: Minor
>
> https://en.wikipedia.org/wiki/Multilayer_perceptron
> A multilayer perceptron is a simple kind of artificial neural network. It forms a directed graph in which the edges are parameter weights and the nodes are non-linear activation functions. It is trained via a method known as backpropagation.
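The description above can be made concrete with a minimal forward pass for one hidden layer: weights on the edges, a sigmoid non-linearity at each node. This is purely illustrative and unrelated to the actual flink-ml implementation:

```java
// Minimal sketch of a multilayer perceptron forward pass:
// y = sigmoid(w2 . sigmoid(W1 * x + b1) + b2)
public class MlpSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Forward pass through one hidden layer and a single output node. */
    public static double forward(double[] x, double[][] w1, double[] b1,
                                 double[] w2, double b2) {
        double[] hidden = new double[w1.length];
        for (int i = 0; i < w1.length; i++) {
            double sum = b1[i];
            for (int j = 0; j < x.length; j++) {
                sum += w1[i][j] * x[j]; // weighted edge into hidden node i
            }
            hidden[i] = sigmoid(sum);   // non-linear activation at the node
        }
        double out = b2;
        for (int i = 0; i < hidden.length; i++) {
            out += w2[i] * hidden[i];
        }
        return sigmoid(out);
    }
}
```

Backpropagation then adjusts the `w1`/`w2` weights by pushing the output error backwards through this same graph, which is the training step the issue refers to.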
[jira] [Updated] (FLINK-3211) Add AWS Kinesis streaming connector
[ https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-3211: --- Affects Version/s: (was: 1.0.0) 1.1.0
> Add AWS Kinesis streaming connector
> ---
>
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
> Project: Flink
> Issue Type: New Feature
> Components: Streaming Connectors
> Affects Versions: 1.1.0
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> AWS Kinesis is a widely adopted message queue used by AWS users, much like a cloud-service version of Apache Kafka. Support for AWS Kinesis would be a great addition to Flink's handful of streaming connectors to external systems and a great way to reach out to the AWS community.
> AWS supports two different ways to consume Kinesis data: with the low-level AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. The AWS SDK can be used to consume Kinesis data, including reading a stream from a specific offset (or "record sequence number" in Kinesis terminology). On the other hand, AWS officially recommends using the KCL, which offers a higher level of abstraction that also comes with checkpointing and failure recovery, using a KCL-managed AWS DynamoDB "lease table" as the checkpoint state storage.
> However, the KCL is essentially a stream processing library that wraps all the partition-to-task (or "shard" in Kinesis terminology) assignment and checkpointing, to let the user focus only on streaming application logic. This means we cannot use the KCL to implement the Kinesis streaming connector if we are aiming for a deep integration of Flink with Kinesis that provides exactly-once guarantees (the KCL promises only at-least-once). Therefore, the AWS SDK is the way to go for the implementation of this feature.
> With the ability to read from specific offsets, and given that Kinesis and Kafka share a lot of similarities, the basic principles of the implementation of Flink's Kinesis streaming connector will very much resemble the Kafka connector. We can basically follow the outline described in [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector implementation [4]. A few tweaks due to Kinesis vs. Kafka differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I currently don't think this is a good idea for Kinesis streams (a Kinesis stream is logically equivalent to a Kafka topic). Kinesis streams can exist in different AWS regions, and each Kinesis stream under the same AWS user account may have completely independent access settings with different authorization keys. Overall, a Kinesis stream feels like a much more consolidated resource compared to Kafka topics. It would be great to hear more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast to the Kafka connector having per-broker connections where each connection can handle multiple Kafka partitions, the Kinesis connector will only need simple per-shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we were to implement this feature like the Kafka connector, which syncs an outside view of progress with Kafka / ZK, we could probably use ZK or DynamoDB the way the KCL does. More thoughts on this part will be very helpful too.
> As for the Kinesis sink, it should be possible to use the AWS KPL (Kinesis Producer Library) [5]. However, for higher code consistency with the proposed Kinesis consumer, I think it will be better to stick with the AWS SDK for the implementation. The implementation should be straightforward, being almost if not completely the same as the Kafka sink.
> References:
> [1] http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-sdk.html
> [2] http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
> [3] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-td2872.html
> [4] http://data-artisans.com/kafka-flink-a-practical-how-to/
> [5] http://docs.aws.amazon.com/kinesis/latest/dev/developing-producers-with-kpl.html#d0e4998
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
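The reason "read from a specific sequence number" matters for the checkpointing integration can be illustrated with a toy in-memory shard: on restart, the consumer resumes exactly at the checkpointed position instead of relying on at-least-once replay. All names below are made up for the sketch; this is neither connector nor AWS SDK code:

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-in for a Kinesis shard: records are addressed by their position
// (a stand-in for the record sequence number).
public class ShardResumeToy {

    /** Returns the records at and after the given sequence number. */
    public static List<String> readFrom(List<String> shard, int fromSequenceNumber) {
        return shard.subList(fromSequenceNumber, shard.size());
    }

    public static void main(String[] args) {
        List<String> shard = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        int checkpointedSeq = 0;
        // First run: emit two records, advancing the checkpointed position after each
        // (a real source would store this in Flink's checkpoint state).
        for (String rec : readFrom(shard, checkpointedSeq).subList(0, 2)) {
            checkpointedSeq++;
        }
        // Simulated failure and restart: resume exactly where the checkpoint left off.
        List<String> resumed = readFrom(shard, checkpointedSeq);
        System.out.println(resumed); // [r2, r3, r4]
    }
}
```

A KCL-style at-least-once consumer cannot guarantee this resume point lines up with Flink's own checkpoint barriers, which is the core of the argument above for building on the raw SDK.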
[jira] [Comment Edited] (FLINK-3211) Add AWS Kinesis streaming connector
[ https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243988#comment-15243988 ] Tzu-Li (Gordon) Tai edited comment on FLINK-3211 at 4/16/16 3:53 AM:
https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis
Here is the initial working version of FlinkKinesisConsumer that I have been testing in off-production environments, updated for the recent Flink 1.0 changes. The description of https://issues.apache.org/jira/browse/FLINK-3229 contains example code showing how to use the consumer. I'm still refactoring the code a bit for easier unit tests. The PR will come very soon.
was (Author: tzulitai):
https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis
Here is the initial working version of FlinkKinesisConsumer that I have been testing in off-production environments, updated for the recent Flink 1.0 changes. I'm still refactoring the code a bit for easier unit tests. The PR will come very soon.
> Add AWS Kinesis streaming connector
> Key: FLINK-3211
> URL: https://issues.apache.org/jira/browse/FLINK-3211
[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-3229: --- Description:
Opening a sub-task to implement the data source consumer for the Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for the Flink Kinesis Consumer:
{code}
Properties kinesisConfig = new Properties();
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
kinesisConfig.put(
    KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
    "aws_access_key_id_here");
kinesisConfig.put(
    KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
    "aws_secret_key_here");
kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
DataStream kinesisRecords = env.addSource(new FlinkKinesisConsumer<>(
    "kinesis_stream_name",
    new SimpleStringSchema(),
    kinesisConfig));
{code}
was:
Opening a sub-task to implement data source consumer for Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for Flink Kinesis Consumer:
{code}
Properties kinesisConfig = new Properties();
config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
config.put(
  KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
  "aws_access_key_id_here");
config.put(
  KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
  "aws_secret_key_here");
config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
DataStream kinesisRecords = env
  .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new SimpleStringSchema(), kinesisConfig);
{code}
> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Affects Versions: 1.0.0
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-3229: --- Description:
Opening a sub-task to implement data source consumer for Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for Flink Kinesis Consumer:
{code}
Properties kinesisConfig = new Properties();
config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
config.put(
  KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
  "aws_access_key_id_here");
config.put(
  KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
  "aws_secret_key_here");
config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
DataStream kinesisRecords = env
  .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new SimpleStringSchema(), kinesisConfig);
{code}
was:
Opening a sub-task to implement data source consumer for Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for Flink Kinesis Consumer:
{code}
Properties kinesisConfig = new Properties();
config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, "aws_access_key_id_here");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY, "aws_secret_key_here");
config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
DataStream kinesisRecords = env
  .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new SimpleStringSchema(), kinesisConfig);
{code}
> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Affects Versions: 1.0.0
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
[jira] [Updated] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-3229: --- Description:
Opening a sub-task to implement data source consumer for Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for Flink Kinesis Consumer:
{code}
Properties kinesisConfig = new Properties();
config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, "aws_access_key_id_here");
config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY, "aws_secret_key_here");
config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
DataStream kinesisRecords = env
  .addSource(new FlinkKinesisConsumer<>("kinesis_stream_name", new SimpleStringSchema(), kinesisConfig);
{code}
was:
Opening a sub-task to implement data source consumer for Kinesis streaming connector (https://issues.apache.org/jira/browse/FLINK-3211).
An example of the planned user API for Flink Kinesis Consumer:
{code}
Properties config = new Properties();
config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_RETRIES, "3");
config.put(FlinkKinesisConsumer.CONFIG_STREAM_DESCRIBE_BACKOFF_MILLIS, "1000");
config.put(FlinkKinesisConsumer.CONFIG_STREAM_START_POSITION_TYPE, "latest");
config.put(FlinkKinesisConsumer.CONFIG_AWS_REGION, "us-east-1");
AWSCredentialsProvider credentials = // credentials API in AWS SDK
DataStream kinesisRecords = env
  .addSource(new FlinkKinesisConsumer<>(
    listOfStreams, credentials, new SimpleStringSchema(), config
  ));
{code}
Currently still considering which read start positions to support ("TRIM_HORIZON", "LATEST", "AT_SEQUENCE_NUMBER"). The discussions for this can be found in https://issues.apache.org/jira/browse/FLINK-3211.
> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Affects Versions: 1.0.0
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
[jira] [Commented] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243991#comment-15243991 ] Tzu-Li (Gordon) Tai commented on FLINK-3229:
(Duplicate comment from FLINK-3211. Posting it here also to keep the issue updated.)
https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis
Here is the initial working version of FlinkKinesisConsumer that I have been testing in off-production environments, updated for the recent Flink 1.0 changes. I'm still refactoring the code a bit for easier unit tests. The PR will come very soon.
> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
[jira] [Commented] (FLINK-3211) Add AWS Kinesis streaming connector
[ https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243988#comment-15243988 ] Tzu-Li (Gordon) Tai commented on FLINK-3211: https://github.com/tzulitai/flink/tree/FLINK-3229/flink-streaming-connectors/flink-connector-kinesis Here is the initial working version of FlinkKinesisConsumer that I have been testing in off-production environments, updated corresponding to the recent Flink 1.0 changes. I'm still refactoring the code just a bit for easier unit tests. The PR will be very soon. > Add AWS Kinesis streaming connector > --- > > Key: FLINK-3211 > URL: https://issues.apache.org/jira/browse/FLINK-3211 > Project: Flink > Issue Type: New Feature > Components: Streaming Connectors >Affects Versions: 1.0.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Original Estimate: 336h > Remaining Estimate: 336h > > AWS Kinesis is a widely adopted message queue used by AWS users, much like a > cloud service version of Apache Kafka. Support for AWS Kinesis will be a > great addition to the handful of Flink's streaming connectors to external > systems and a great reach out to the AWS community. > AWS supports two different ways to consume Kinesis data: with the low-level > AWS SDK [1], or with the high-level KCL (Kinesis Client Library) [2]. AWS SDK > can be used to consume Kinesis data, including stream read beginning from a > specific offset (or "record sequence number" in Kinesis terminology). On the > other hand, AWS officially recommends using KCL, which offers a higher-level > of abstraction that also comes with checkpointing and failure recovery by > using a KCL-managed AWS DynamoDB "leash table" as the checkpoint state > storage. > However, KCL is essentially a stream processing library that wraps all the > partition-to-task (or "shard" in Kinesis terminology) determination and > checkpointing to allow the user to focus only on streaming application logic. 
> This leads to the understanding that we cannot use the KCL to implement the Kinesis streaming connector if we are aiming for a deep integration of Flink with Kinesis that provides exactly-once guarantees (the KCL promises only at-least-once). Therefore, the AWS SDK will be the way to go for the implementation of this feature.
> With the ability to read from specific offsets, and also the fact that Kinesis and Kafka share a lot of similarities, the basic principles of the implementation of Flink's Kinesis streaming connector will very much resemble the Kafka connector. We can basically follow the outlines described in [~StephanEwen]'s description [3] and [~rmetzger]'s Kafka connector implementation [4]. A few tweaks due to some of the Kinesis vs. Kafka differences are described as follows:
> 1. While the Kafka connector can support reading from multiple topics, I currently don't think this is a good idea for Kinesis streams (a Kinesis stream is logically equivalent to a Kafka topic). Kinesis streams can exist in different AWS regions, and each Kinesis stream under the same AWS user account may have completely independent access settings with different authorization keys. Overall, a Kinesis stream feels like a much more consolidated resource compared to Kafka topics. It would be great to hear more thoughts on this part.
> 2. While Kafka has brokers that can hold multiple partitions, the only partitioning abstraction for AWS Kinesis is "shards". Therefore, in contrast to the Kafka connector having per-broker connections where the connections can handle multiple Kafka partitions, the Kinesis connector will only need to have simple per-shard connections.
> 3. Kinesis itself does not support committing offsets back to Kinesis. If we were to implement this feature like the Kafka connector, with Kafka / ZK to sync an outside view of progress, we could probably use ZK or DynamoDB the way the KCL works.
> More thoughts on this part will be very helpful too.
> As for the Kinesis sink, it should be possible to use the AWS KPL (Kinesis Producer Library) [5]. However, for higher code consistency with the proposed Kinesis consumer, I think it will be better to stick with the AWS SDK for the implementation. The implementation should be straightforward, being almost if not completely the same as the Kafka sink.
> References:
> [1] http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-sdk.html
> [2] http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
> [3] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-td2872.html
> [4] http://data-artisans.com/kafka-flink-a-practical-how-to/
> [5] >
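Point 2 above (simple per-shard connections) can be sketched as a round-robin assignment of shards to parallel consumer subtasks. The sketch below is illustrative only; the class and method names are hypothetical and are not the connector's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of distributing Kinesis shards over parallel consumer
// subtasks. Each subtask would then open one simple connection per shard
// assigned to it, in contrast to Kafka's per-broker connections.
public class ShardAssignmentSketch {

    /** Returns the shard indices a given subtask would consume (round-robin). */
    public static List<Integer> shardsForSubtask(int numShards, int subtaskIndex, int parallelism) {
        List<Integer> assigned = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            // Shard i goes to subtask (i mod parallelism).
            if (shard % parallelism == subtaskIndex) {
                assigned.add(shard);
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        // 5 shards over 2 subtasks: subtask 0 consumes shards 0, 2, 4.
        System.out.println(shardsForSubtask(5, 0, 2));
    }
}
```

Under this scheme every shard is owned by exactly one subtask, so checkpointed read offsets (sequence numbers) can be tracked per shard without coordination between subtasks.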
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243944#comment-15243944 ] ASF GitHub Bot commented on FLINK-3768: --- Github user hsaputra commented on a diff in the pull request: https://github.com/apache/flink/pull/1896#discussion_r59960582
--- Diff: flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/TriangleListing.java ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.examples;
+
+import org.apache.commons.math3.random.JDKRandomGenerator;
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.io.CsvOutputFormat;
+import org.apache.flink.api.java.utils.DataSetUtils;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.generator.RMatGraph;
+import org.apache.flink.graph.generator.random.JDKRandomGeneratorFactory;
+import org.apache.flink.graph.generator.random.RandomGenerableFactory;
+import org.apache.flink.graph.library.TriangleEnumerator;
+import org.apache.flink.graph.utils.Translate;
+import org.apache.flink.types.IntValue;
+import org.apache.flink.types.LongValue;
+import org.apache.flink.types.NullValue;
+
+import java.text.NumberFormat;
+
+public class TriangleListing {
--- End diff --
Could you add JavaDoc to ALL new Java or Scala classes added to the source? An explanation of why the class exists and how it interacts with other code is something we are looking for. Thanks.
> Clustering Coefficient
> ---
>
> Key: FLINK-3768
> URL: https://issues.apache.org/jira/browse/FLINK-3768
> Project: Flink
> Issue Type: New Feature
> Components: Gelly
> Affects Versions: 1.1.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
> Fix For: 1.1.0
>
> The local clustering coefficient measures the connectedness of each vertex's neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 (neighborhood is a clique).
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
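As an illustration of the reviewer's request, a class-level JavaDoc comment might look like the sketch below. The class name, wording, and helper method are hypothetical and are not taken from the pull request.

```java
/**
 * Sketch of the class-level JavaDoc the reviewer asks for: it explains what
 * the class is for and how it interacts with other code.
 *
 * <p>Drives a triangle-listing example: a triangle is a set of three vertices
 * that are mutually connected, and such a driver would delegate the actual
 * enumeration to a library algorithm such as {@code TriangleEnumerator}.
 */
public class TriangleListingExample {

    /**
     * Returns the number of triangles in a clique of {@code n} vertices,
     * i.e. n choose 3, used here only to give the sketch something testable.
     */
    public static long trianglesInClique(int n) {
        return (long) n * (n - 1) * (n - 2) / 6;
    }
}
```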
[jira] [Commented] (FLINK-3211) Add AWS Kinesis streaming connector
[ https://issues.apache.org/jira/browse/FLINK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243940#comment-15243940 ] Marut Singh commented on FLINK-3211: Is there any update on this issue?
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243938#comment-15243938 ] ASF GitHub Bot commented on FLINK-3768: --- Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1896#issuecomment-210709650 It's certainly nicer to review small PRs, but I also like to present the big picture and the context in which the features will be used. What if I leave this here, break out the dependencies into new PRs, and then rebase this PR down if those features are accepted? I'm not quite sure what a design document would cover. There's nothing as sophisticated here as with SG or GSA.
[jira] [Commented] (FLINK-2259) Support training Estimators using a (train, validation, test) split of the available data
[ https://issues.apache.org/jira/browse/FLINK-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243774#comment-15243774 ] ASF GitHub Bot commented on FLINK-2259: --- GitHub user rawkintrevo opened a pull request: https://github.com/apache/flink/pull/1898 [FLINK-2259][ml] Add Train-Testing Splitters
This PR adds an object in ml/pipeline called splitter with the following methods:
randomSplit: splits a DataSet into two data sets using DataSet.sample
multiRandomSplit: splits a DataSet into multiple data sets according to an array of proportions
kFoldSplit: splits a DataSet into k TrainTest objects, each with a test data set of size 1/k of the original data set and the remainder of the data set as the training set
trainTestSplit: a wrapper for randomSplit that returns a TrainTest object
trainTestHoldoutSplit: a wrapper for multiRandomSplit that returns a TrainTestHoldout object
The TrainTest and TrainTestHoldout objects are case classes. randomSplit and multiRandomSplit return arrays of DataSets.
- [x] General
- [ ] Documentation - documentation is in code; will write markdown after review/feedback/finalization
- [x] Tests & Build
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rawkintrevo/flink train-test-split
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1898.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1898
commit ec1e65a31d80b33589b73619f2a5dd0a8e09c568 Author: Trevor Grant Date: 2016-04-15T22:37:51Z Add Splitter Pre-processing
commit 3ecdc3818dd11a847136510dabe96f444924d319 Author: Trevor Grant Date: 2016-04-15T22:40:38Z Add Splitter Pre-processing
> Support training Estimators using a (train, validation, test) split of the available data
> ---
>
> Key: FLINK-2259
> URL: https://issues.apache.org/jira/browse/FLINK-2259
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: Theodore Vasiloudis
> Assignee: Trevor Grant
> Priority: Minor
> Labels: ML
>
> When there is an abundance of data available, a good way to train models is to split the available data into 3 parts: Train, Validation and Test.
> We use the Train data to train the model, the Validation part is used to estimate the test error and select hyperparameters, and the Test is used to evaluate the performance of the model and assess its generalization [1].
> This is a common approach when training Artificial Neural Networks, and a good strategy to choose in data-rich environments. Therefore we should have some support for this data-analysis process in our Estimators.
> [1] Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The Elements of Statistical Learning. Vol. 1. Springer, Berlin: Springer Series in Statistics, 2001.
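The kFoldSplit idea described in the PR can be sketched in plain Java: each of the k folds serves once as the test set (roughly 1/k of the data) while the remaining elements form the training set. Plain lists stand in for Flink DataSets here, and the class and method names are hypothetical, not the PR's Scala API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of k-fold splitting: for each fold, elements whose
// index is congruent to the fold number (mod k) form the test set and the
// rest form the training set.
public class KFoldSketch {

    /** Returns k (train, test) pairs; pair[0] is the training set, pair[1] the test set. */
    public static <T> List<List<T>[]> kFoldSplit(List<T> data, int k) {
        List<List<T>[]> splits = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<T> train = new ArrayList<>();
            List<T> test = new ArrayList<>();
            for (int i = 0; i < data.size(); i++) {
                // Element i belongs to fold (i mod k); that fold uses it as test data.
                if (i % k == fold) {
                    test.add(data.get(i));
                } else {
                    train.add(data.get(i));
                }
            }
            @SuppressWarnings("unchecked")
            List<T>[] pair = new List[] { train, test };
            splits.add(pair);
        }
        return splits;
    }
}
```

Every element appears in exactly one test set across the k folds, which is the property cross-validation relies on; the actual Flink implementation would use DataSet.sample-style transformations instead of index arithmetic.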
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243730#comment-15243730 ] ASF GitHub Bot commented on FLINK-3768: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1896#issuecomment-210663735 Hi @greghogan, thank you for the PR. This is a big addition! Are all the changes related to the clustering coefficient JIRA? Could it maybe be split into smaller PRs in order to make reviewing easier? For big changes like this, it is usually a good idea to create a design document to show the high-level approach and functionality to be added and discuss it with the community. Do you think it would make sense to do this? Thank you!
[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'
[ https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243680#comment-15243680 ] Fabian Hueske commented on FLINK-3657: -- Implemented for 1.1.0 with 5f993c65eeb1ae0d805e98b14bcb6603f9bff05f
> Change access of DataSetUtils.countElements() to 'public'
> ---
>
> Key: FLINK-3657
> URL: https://issues.apache.org/jira/browse/FLINK-3657
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Suneel Marthi
> Assignee: Suneel Marthi
> Priority: Minor
> Fix For: 1.0.1
>
> The access of DataSetUtils.countElements() is presently 'private'; change it to be 'public'. We happened to be replicating the functionality in our project and realized the method already existed in Flink.
[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'
[ https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243678#comment-15243678 ] ASF GitHub Bot commented on FLINK-3657: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1829
[jira] [Commented] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'
[ https://issues.apache.org/jira/browse/FLINK-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243657#comment-15243657 ] ASF GitHub Bot commented on FLINK-3657: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1829#issuecomment-210647072 Thanks for the update @smarthi. Will merge this PR.
[jira] [Commented] (FLINK-3587) Bump Calcite version to 1.7.0
[ https://issues.apache.org/jira/browse/FLINK-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243652#comment-15243652 ] ASF GitHub Bot commented on FLINK-3587: --- GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/1897 [FLINK-3587] Bump Calcite version to 1.7.0
- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message
- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added
- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis build has passed
This pull request updates the Apache Calcite dependency to 1.7. With Calcite 1.7, a few method APIs and some behavior changed.
- Fix changed API for optimizer cost and cardinality estimates.
- Add DataSetValues and DataStreamValues due to changed Calcite RelNode generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream translation.
- Add methods to create new DataSources to BatchTableEnvironment and StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/fhueske/flink calcite1.7
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1897.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1897
commit 92e2539b4178b3049f97bb1a6137ee91c943870a Author: Fabian Hueske Date: 2016-03-22T17:51:46Z [FLINK-3587] Bump Calcite version to 1.7.0
- Add DataSetValues and DataStreamValues due to changed Calcite RelNode generation.
- Pass TableEnvironment instead of TableConfig for DataSet and DataStream translation.
- Add methods to create new DataSources to BatchTableEnvironment and StreamTableEnvironment.
- Remove copied Calcite rule that got fixed.
> Bump Calcite version to 1.7.0
> ---
>
> Key: FLINK-3587
> URL: https://issues.apache.org/jira/browse/FLINK-3587
> Project: Flink
> Issue Type: Improvement
> Components: Table API
> Reporter: Vasia Kalavri
> Assignee: Fabian Hueske
>
> We currently depend on Calcite 1.5.0. The latest stable release is 1.6.0, but I propose we bump the version to 1.7.0-SNAPSHOT to benefit from the latest features. If we do that, we can also get rid of the custom {{FlinkJoinUnionTransposeRule}}.
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243625#comment-15243625 ] Greg Hogan commented on FLINK-3768: --- The algorithms are very different, so I don't see them as exclusive, but I see that the pull request was not committed.
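The local clustering coefficient described in this issue can be computed directly for a small graph: for a vertex v, it is the number of edges among v's neighbors divided by the number of neighbor pairs. The sketch below uses plain adjacency sets in place of a Gelly Graph; the class name is hypothetical and this is not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Minimal sketch of the local clustering coefficient on an undirected graph
// given as adjacency sets. 0.0 means no edges between neighbors; 1.0 means
// the neighborhood is a clique.
public class LocalClusteringCoefficientSketch {

    public static double coefficient(Map<Integer, Set<Integer>> adj, int v) {
        List<Integer> neighbors = new ArrayList<>(adj.get(v));
        int degree = neighbors.size();
        if (degree < 2) {
            return 0.0; // fewer than two neighbors: no pairs to connect
        }
        int linked = 0;
        // Count edges among all unordered pairs of v's neighbors.
        for (int i = 0; i < degree; i++) {
            for (int j = i + 1; j < degree; j++) {
                if (adj.get(neighbors.get(i)).contains(neighbors.get(j))) {
                    linked++;
                }
            }
        }
        // Divide by the number of possible neighbor pairs: degree choose 2.
        return (double) linked / (degree * (degree - 1) / 2);
    }
}
```

In a triangle (vertices 0-1-2 mutually connected) every vertex scores 1.0; remove the 1-2 edge and vertex 0's neighborhood has no internal edges, giving 0.0.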
[jira] [Updated] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange
[ https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Batts updated FLINK-3769: Description: The RabbitMQ Sink can currently only publish to the "default" exchange. This exchange is a direct exchange, so the routing key will route directly to the queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 exchange which routes to 1 queue). Additionally, I believe that if a user decides to use a different exchange, the following can be assumed: 1.) The provided exchange exists 2.) The user has declared the appropriate mapping and the appropriate queues exist in RabbitMQ (therefore, nothing needs to be created) RabbitMQ currently provides four types of exchanges. Three of these will be covered by just enabling exchanges (Direct, Fanout, Topic) because they use the routing key (or nothing). The fourth exchange type relies on the message headers, which are currently set to null by default on the publish. These headers may be on a per-message level, so the input of this stream will need to take this as input as well. This fourth exchange could very well be outside of the scope of this Improvement and a "RabbitMQ Sink enable headers" Improvement might be the better way to go with this. Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html was: The RabbitMQ Sink can currently only publish to the "default" exchange. This exchange is a direct exchange, so the routing key will route directly to the queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 exchange which routes to 1 queue). Additionally, I believe that if a user decides to use a different exchange, the following can be assumed: 1.) The provided exchange exists 2.) The user has declared the appropriate mapping and the appropriate queues exist in RabbitMQ (therefore, nothing needs to be created) RabbitMQ currently provides four types of exchanges. 
Three of these will be covered by just enabling exchanges (Direct, Fanout, Topic) because they use the routing key (or nothing). The fourth exchange type relies on the message headers, which are currently set to null by default on the publish. These headers may be on a per-message level, so the input of this stream will need to take this as input as well. This fourth exchange could very well be outside of the scope of this Improvement and a "RabbitMQ Sink enable headers" Improvement might be the better way to go with this. > RabbitMQ Sink ability to publish to a different exchange > > > Key: FLINK-3769 > URL: https://issues.apache.org/jira/browse/FLINK-3769 > Project: Flink > Issue Type: Improvement > Components: Streaming Connectors >Affects Versions: 1.0.1 >Reporter: Robert Batts > Labels: rabbitmq > > The RabbitMQ Sink can currently only publish to the "default" exchange. This > exchange is a direct exchange, so the routing key will route directly to the > queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job > to 1 exchange which routes to 1 queue). Additionally, I believe that if a > user decides to use a different exchange, the following can be assumed: > 1.) The provided exchange exists > 2.) The user has declared the appropriate mapping and the appropriate queues > exist in RabbitMQ (therefore, nothing needs to be created) > RabbitMQ currently provides four types of exchanges. Three of these will be > covered by just enabling exchanges (Direct, Fanout, Topic) because they use > the routing key (or nothing). > The fourth exchange type relies on the message headers, which are currently > set to null by default on the publish. These headers may be on a per-message > level, so the input of this stream will need to take this as input as well. > This fourth exchange could very well be outside of the scope of this > Improvement and a "RabbitMQ Sink enable headers" Improvement might be the > better way to go with this. 
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange
Robert Batts created FLINK-3769: --- Summary: RabbitMQ Sink ability to publish to a different exchange Key: FLINK-3769 URL: https://issues.apache.org/jira/browse/FLINK-3769 Project: Flink Issue Type: Improvement Components: Streaming Connectors Affects Versions: 1.0.1 Reporter: Robert Batts The RabbitMQ Sink can currently only publish to the "default" exchange. This exchange is a direct exchange, so the routing key will route directly to the queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 exchange which routes to 1 queue). Additionally, I believe that if a user decides to use a different exchange, the following can be assumed: 1.) The provided exchange exists 2.) The user has declared the appropriate mapping and the appropriate queues exist in RabbitMQ (therefore, nothing needs to be created) RabbitMQ currently provides four types of exchanges. Three of these will be covered by just enabling exchanges (Direct, Fanout, Topic) because they use the routing key (or nothing). The fourth exchange type relies on the message headers, which are currently set to null by default on the publish. These headers may be on a per-message level, so the input of this stream will need to take this as input as well. This fourth exchange could very well be outside of the scope of this Improvement and a "RabbitMQ Sink enable headers" Improvement might be the better way to go with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
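The routing-key-based exchange types discussed in the ticket above (direct, fanout, topic) can be illustrated with a small model. The sketch below is a toy model of AMQP routing semantics in plain Python, not the RabbitMQ client API and not the Flink sink; all names (`route`, the bindings) are illustrative:

```python
import re

def route(exchange_type, bindings, routing_key):
    """Toy model of AMQP exchange routing.

    bindings: list of (binding_key, queue) pairs.
    Returns the queues a message published with `routing_key` would reach.
    """
    if exchange_type == "fanout":
        # Fanout ignores the routing key entirely.
        return [q for _, q in bindings]
    if exchange_type == "direct":
        # Direct requires an exact match on the binding key.
        return [q for k, q in bindings if k == routing_key]
    if exchange_type == "topic":
        # Topic keys are dot-separated; '*' matches one word, '#' matches zero or more.
        def matches(pattern):
            regex = "^" + re.escape(pattern).replace(r"\#", ".*").replace(r"\*", r"[^.]+") + "$"
            return re.match(regex, routing_key) is not None
        return [q for k, q in bindings if matches(k)]
    # The fourth type routes on message headers, which is exactly why the sink
    # would need per-message header input rather than a routing key.
    raise ValueError("headers exchanges route on message headers, not the routing key")

bindings = [("logs.error", "errors"), ("logs.*", "all_logs"), ("#", "firehose")]
print(route("direct", bindings, "logs.error"))  # exact match only
print(route("topic", bindings, "logs.error"))   # wildcard bindings match too
```

This makes the ticket's point concrete: direct, fanout, and topic are all served by letting the user supply an exchange and a routing key, while a headers exchange needs an extra per-message input.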
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243542#comment-15243542 ] Vasia Kalavri commented on FLINK-3768: -- There is also FLINK-1528 for local clustering coefficient. Is this about the same algorithm? We should mark it as duplicate if so. > Clustering Coefficient > -- > > Key: FLINK-3768 > URL: https://issues.apache.org/jira/browse/FLINK-3768 > Project: Flink > Issue Type: New Feature > Components: Gelly >Affects Versions: 1.1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > Fix For: 1.1.0 > > > The local clustering coefficient measures the connectedness of each vertex's > neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 > (neighborhood is a clique). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Hogan updated FLINK-3768: -- Fix Version/s: 1.1.0 > Clustering Coefficient > -- > > Key: FLINK-3768 > URL: https://issues.apache.org/jira/browse/FLINK-3768 > Project: Flink > Issue Type: New Feature > Components: Gelly >Affects Versions: 1.1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > Fix For: 1.1.0 > > > The local clustering coefficient measures the connectedness of each vertex's > neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 > (neighborhood is a clique). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3768) Clustering Coefficient
[ https://issues.apache.org/jira/browse/FLINK-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243506#comment-15243506 ] ASF GitHub Bot commented on FLINK-3768: --- GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1896 [FLINK-3768] [gelly] Clustering Coefficient Provides an algorithm for local clustering coefficient and dependent functions for degree annotation, algorithm caching, and graph translation. I worked to improve the performance of `TriangleEnumerator`. Perhaps the API has changed since `Edge.reverse()` is not in-place and the edges were not being sorted by degree. The `JoinHint` is also important so that the `Triad`s are not spilled to disk. On an AWS ec2.4xlarge (16 vcores, 30 GiB) I am seeing the following timings of 5s, 29s, and 183s for `TriangleListing`. With `TriangleEnumerator` the timings are 7s, 45s, and 281s. Without the `JoinHint` the latter `TriangleEnumerator` timings are 58s and 347s.

Scale | ChecksumHashCode | Count
----- | ---------------- | ---------
16 | 0xd9086985f4ce | 15616010
18 | 0x0010eeb32a441365 | 82781436
20 | 0x014a9434bb57ddef | 423780284

The command I used to run the tests:

```
./bin/flink run --class org.apache.flink.graph.examples.TriangleListing ~/flink-gelly-examples_2.10-1.1-SNAPSHOT.jar --clip_and_flip false --output print --output hash --scale 16 --listing
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3768_clustering_coefficient Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1896.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1896 commit aa1141f4d34f7af9c092ec76bf1a81de310aed16 Author: Greg Hogan, Date: 2016-04-13T13:28:38Z [FLINK-3768] [gelly] Clustering Coefficient Provides an algorithm for local clustering coefficient and dependent functions for degree 
annotation, algorithm caching, and graph translation. > Clustering Coefficient > -- > > Key: FLINK-3768 > URL: https://issues.apache.org/jira/browse/FLINK-3768 > Project: Flink > Issue Type: New Feature > Components: Gelly >Affects Versions: 1.1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > > The local clustering coefficient measures the connectedness of each vertex's > neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 > (neighborhood is a clique). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
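The metric defined in the ticket above can be computed directly: for each vertex, count the edges that exist among its neighbors and divide by the number of possible neighbor pairs. A minimal pure-Python sketch of the definition (not the Gelly implementation in the pull request):

```python
from itertools import combinations

def local_clustering(adj):
    """Local clustering coefficient per vertex of an undirected graph.

    adj: dict mapping vertex -> set of neighbor vertices (symmetric).
    """
    coeffs = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[v] = 0.0  # fewer than two neighbors: no pairs to connect
            continue
        # Count edges that exist between pairs of v's neighbors.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs[v] = links / (k * (k - 1) / 2)
    return coeffs

# A triangle plus a pendant vertex: 0, 1, 2 form a clique; 3 hangs off 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(local_clustering(adj))
```

Vertices 0 and 1 sit in a clique neighborhood (coefficient 1.0), vertex 2 has only one of its three neighbor pairs connected (1/3), and vertex 3 has no neighbor pairs (0.0), matching the 0.0-to-1.0 range in the issue description.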
[GitHub] flink pull request: [FLINK-3768] [gelly] Clustering Coefficient
GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1896 [FLINK-3768] [gelly] Clustering Coefficient Provides an algorithm for local clustering coefficient and dependent functions for degree annotation, algorithm caching, and graph translation. I worked to improve the performance of `TriangleEnumerator`. Perhaps the API has changed since `Edge.reverse()` is not in-place and the edges were not being sorted by degree. The `JoinHint` is also important so that the `Triad`s are not spilled to disk. On an AWS ec2.4xlarge (16 vcores, 30 GiB) I am seeing the following timings of 5s, 29s, and 183s for `TriangleListing`. With `TriangleEnumerator` the timings are 7s, 45s, and 281s. Without the `JoinHint` the latter `TriangleEnumerator` timings are 58s and 347s.

Scale | ChecksumHashCode | Count
----- | ---------------- | ---------
16 | 0xd9086985f4ce | 15616010
18 | 0x0010eeb32a441365 | 82781436
20 | 0x014a9434bb57ddef | 423780284

The command I used to run the tests:

```
./bin/flink run --class org.apache.flink.graph.examples.TriangleListing ~/flink-gelly-examples_2.10-1.1-SNAPSHOT.jar --clip_and_flip false --output print --output hash --scale 16 --listing
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3768_clustering_coefficient Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1896.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1896 commit aa1141f4d34f7af9c092ec76bf1a81de310aed16 Author: Greg Hogan, Date: 2016-04-13T13:28:38Z [FLINK-3768] [gelly] Clustering Coefficient Provides an algorithm for local clustering coefficient and dependent functions for degree annotation, algorithm caching, and graph translation. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (FLINK-3732) Potential null deference in ExecutionConfig#equals()
[ https://issues.apache.org/jira/browse/FLINK-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-3732. -- Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 Fixed for 1.0.2 with 5b69dd8ce5485d6f8cbd4d94d1ea1870efb53c6a Fixed for 1.1.0 with f3d3a4493ae786d421176396ee68f01a0e6dbb64 Thanks for the contribution! > Potential null deference in ExecutionConfig#equals() > > > Key: FLINK-3732 > URL: https://issues.apache.org/jira/browse/FLINK-3732 > Project: Flink > Issue Type: Bug >Reporter: Ted Yu >Priority: Minor > Fix For: 1.1.0, 1.0.2 > > > {code} > ((restartStrategyConfiguration == null && > other.restartStrategyConfiguration == null) || > > restartStrategyConfiguration.equals(other.restartStrategyConfiguration)) && > {code} > If restartStrategyConfiguration is null but > other.restartStrategyConfiguration is not null, the above would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
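The hazard in the snippet quoted above is the short-circuit: if only one side is null, evaluation falls through to a method call on the null reference. The sketch below models the buggy versus null-safe comparison in Python (a method call on `None` plays the role of the Java NPE); `Config` and `strategy` are illustrative names, not Flink code:

```python
class Config:
    def __init__(self, strategy):
        self.strategy = strategy

def buggy_equal(a, b):
    # Mirrors the Java snippet: when a.strategy is None but b.strategy is not,
    # the first clause is False and we fall through to a method call on None,
    # raising AttributeError (the analog of the NullPointerException).
    return (a.strategy is None and b.strategy is None) or \
           a.strategy.upper() == b.strategy.upper()

def safe_equal(a, b):
    # Null-safe: equal when both are None, or both non-None and equal.
    x, y = a.strategy, b.strategy
    return x == y if (x is None) == (y is None) else False

try:
    buggy_equal(Config(None), Config("fixed-delay"))
except AttributeError:
    print("buggy comparison blew up on a one-sided None")

print(safe_equal(Config(None), Config("fixed-delay")))  # False, no exception
```

In Java the same shape is what `java.util.Objects.equals(a, b)` provides in one call.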
[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20
[ https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243418#comment-15243418 ] Fabian Hueske commented on FLINK-2544: -- Hi [~spkavulya], I gave you contributor permissions for the Flink JIRA. Now you can assign JIRAs to yourself. Thanks for your contribution! > Some test cases using PowerMock fail with Java 8u20 > --- > > Key: FLINK-2544 > URL: https://issues.apache.org/jira/browse/FLINK-2544 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Soila Kavulya >Priority: Minor > Fix For: 1.1.0, 1.0.2 > > > I observed that some of the test cases using {{PowerMockRunner}} fail with > Java 8u20 with the following error: > {code} > java.lang.VerifyError: Bad method call from inside of a branch > Exception Details: > Location: > > org/apache/flink/client/program/ClientTest$SuccessReturningActor.()V > @32: invokespecial > Reason: > Error exists in the bytecode > Bytecode: > 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f > 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b > 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01 > 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8 > 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12 > 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800 > 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5 > 0x070: 0046 b1 > Stackmap Table: > full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{}) > full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{}) > full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]}) > full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]}) > > full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]}) > > full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]}) > at java.lang.Class.getDeclaredConstructors0(Native Method) > at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658) > at java.lang.Class.getConstructor0(Class.java:3062) > at 
java.lang.Class.getDeclaredConstructor(Class.java:2165) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at scala.util.Try$.apply(Try.scala:161) > at akka.util.Reflect$.findConstructor(Reflect.scala:86) > at akka.actor.NoArgsReflectConstructor.(Props.scala:359) > at akka.actor.IndirectActorProducer$.apply(Props.scala:308) > at akka.actor.Props.producer(Props.scala:176) > at akka.actor.Props.(Props.scala:189) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props.create(Props.scala) > at > org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310) > at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88) > at > org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82) > at > 
org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282) > at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86) > at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207) > at >
[jira] [Resolved] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20
[ https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-2544. -- Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 Fixed for 1.0.2 with c1eb247e968a54f3f252ece792847c00b8ef5464 Fixed for 1.1.0 with d938c5f592addcee75227502a930264b36306aed > Some test cases using PowerMock fail with Java 8u20 > --- > > Key: FLINK-2544 > URL: https://issues.apache.org/jira/browse/FLINK-2544 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Soila Kavulya >Priority: Minor > Fix For: 1.1.0, 1.0.2 > > > I observed that some of the test cases using {{PowerMockRunner}} fail with > Java 8u20 with the following error: > {code} > java.lang.VerifyError: Bad method call from inside of a branch > Exception Details: > Location: > > org/apache/flink/client/program/ClientTest$SuccessReturningActor.()V > @32: invokespecial > Reason: > Error exists in the bytecode > Bytecode: > 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f > 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b > 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01 > 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8 > 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12 > 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800 > 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5 > 0x070: 0046 b1 > Stackmap Table: > full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{}) > full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{}) > full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]}) > full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]}) > > full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]}) > > full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]}) > at java.lang.Class.getDeclaredConstructors0(Native Method) > at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658) > at java.lang.Class.getConstructor0(Class.java:3062) > at 
java.lang.Class.getDeclaredConstructor(Class.java:2165) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at scala.util.Try$.apply(Try.scala:161) > at akka.util.Reflect$.findConstructor(Reflect.scala:86) > at akka.actor.NoArgsReflectConstructor.(Props.scala:359) > at akka.actor.IndirectActorProducer$.apply(Props.scala:308) > at akka.actor.Props.producer(Props.scala:176) > at akka.actor.Props.(Props.scala:189) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props.create(Props.scala) > at > org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310) > at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88) > at > org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82) > at > 
org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282) > at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86) > at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207) > at >
[jira] [Updated] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20
[ https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2544: - Assignee: Soila Kavulya > Some test cases using PowerMock fail with Java 8u20 > --- > > Key: FLINK-2544 > URL: https://issues.apache.org/jira/browse/FLINK-2544 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Soila Kavulya >Priority: Minor > > I observed that some of the test cases using {{PowerMockRunner}} fail with > Java 8u20 with the following error: > {code} > java.lang.VerifyError: Bad method call from inside of a branch > Exception Details: > Location: > > org/apache/flink/client/program/ClientTest$SuccessReturningActor.()V > @32: invokespecial > Reason: > Error exists in the bytecode > Bytecode: > 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f > 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b > 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01 > 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8 > 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12 > 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800 > 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5 > 0x070: 0046 b1 > Stackmap Table: > full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{}) > full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{}) > full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]}) > full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]}) > > full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]}) > > full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]}) > at java.lang.Class.getDeclaredConstructors0(Native Method) > at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658) > at java.lang.Class.getConstructor0(Class.java:3062) > at java.lang.Class.getDeclaredConstructor(Class.java:2165) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at scala.util.Try$.apply(Try.scala:161) 
> at akka.util.Reflect$.findConstructor(Reflect.scala:86) > at akka.actor.NoArgsReflectConstructor.(Props.scala:359) > at akka.actor.IndirectActorProducer$.apply(Props.scala:308) > at akka.actor.Props.producer(Props.scala:176) > at akka.actor.Props.(Props.scala:189) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props.create(Props.scala) > at > org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310) > at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88) > at > org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282) > at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86) > at 
org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:146) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$1.run(PowerMockJUnit44RunnerDelegateImpl.java:120) > at >
[jira] [Resolved] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-3762. -- Resolution: Fixed Fix Version/s: 1.1.0 Fixed for 1.0.2 with b4b08ca24f2dc4897565da59361ad71281910619 Fixed for 1.1.0 with dc78a7470a5da086a08140b200a20d840460ef79 Thanks for the fix [~Andrew_Palumbo]! > Kryo StackOverflowError due to disabled Kryo Reference tracking > > > Key: FLINK-3762 > URL: https://issues.apache.org/jira/browse/FLINK-3762 > Project: Flink > Issue Type: Bug > Components: Core >Affects Versions: 1.0.1 >Reporter: Andrew Palumbo >Assignee: Andrew Palumbo > Fix For: 1.1.0, 1.0.2 > > > As discussed on the dev list, > In {{KryoSerializer.java}} > Kryo Reference tracking is disabled by default: > {code} > kryo.setReferences(false); > {code} > This can cause {{StackOverflowError}} exceptions when serializing many > objects that may contain recursive objects: > {code} > java.lang.StackOverflowError > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > {code} > By enabling reference tracking, we can fix this problem. 
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419 > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
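Why disabled reference tracking overflows the stack can be seen with a toy serializer: without a memo of already-seen objects, a cyclic structure recurses until the stack limit, while tracking cuts the cycle with a back-reference. This is a hedged Python sketch of the idea, not Kryo's actual implementation:

```python
def serialize(obj, track_references, _seen=None):
    """Toy recursive serializer for nested lists.

    With track_references=False, a cyclic list recurses until the interpreter's
    recursion limit, the analog of Kryo's StackOverflowError with
    setReferences(false). With tracking, cycles become back-references.
    """
    if _seen is None:
        _seen = {}
    if track_references and id(obj) in _seen:
        return ("ref", _seen[id(obj)])  # emit a back-reference instead of recursing
    if isinstance(obj, list):
        if track_references:
            _seen[id(obj)] = len(_seen)
        return ("list", [serialize(x, track_references, _seen) for x in obj])
    return ("value", obj)

cyclic = [1, 2]
cyclic.append(cyclic)  # the list now contains itself

print(serialize(cyclic, track_references=True))  # terminates with a back-reference
try:
    serialize(cyclic, track_references=False)
except RecursionError:
    print("recursion limit hit without reference tracking")
```

Enabling tracking trades a per-object identity lookup for termination on cyclic object graphs, which is the trade-off the fix makes in `KryoSerializer`.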
[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243362#comment-15243362 ] ASF GitHub Bot commented on FLINK-3762: --- Github user andrewpalumbo commented on the pull request: https://github.com/apache/flink/pull/1891#issuecomment-210576438 Thank you guys! > Kryo StackOverflowError due to disabled Kryo Reference tracking > > > Key: FLINK-3762 > URL: https://issues.apache.org/jira/browse/FLINK-3762 > Project: Flink > Issue Type: Bug > Components: Core >Affects Versions: 1.0.1 >Reporter: Andrew Palumbo >Assignee: Andrew Palumbo > Fix For: 1.0.2 > > > As discussed on the dev list, > In {{KryoSerializer.java}} > Kryo Reference tracking is disabled by default: > {code} > kryo.setReferences(false); > {code} > This can cause {{StackOverflowError}} exceptions when serializing many > objects that may contain recursive objects: > {code} > java.lang.StackOverflowError > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > {code} > By enabling reference tracking, we can fix this problem. 
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419 > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...
Github user andrewpalumbo commented on the pull request: https://github.com/apache/flink/pull/1891#issuecomment-210576438 Thank you guys! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Closed] (FLINK-3759) Table API should throw exception is null value is encountered in non-null mode.
[ https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske closed FLINK-3759. Resolution: Fixed Fix Version/s: 1.1.0 Fixed with 722155a1fb95ddb45b6ee4c6cc6d0438cdbafac6 > Table API should throw exception is null value is encountered in non-null > mode. > --- > > Key: FLINK-3759 > URL: https://issues.apache.org/jira/browse/FLINK-3759 > Project: Flink > Issue Type: Bug > Components: Table API >Affects Versions: 1.1.0 >Reporter: Fabian Hueske >Priority: Critical > Fix For: 1.1.0 > > > The Table API can be configured to omit null-checks in generated code to > speed up processing. Currently, the generated code replaces a null value with > a data type specific default value if it is encountered in non-null-check > mode. > This can silently cause wrong results and should be changed such that an > exception is thrown if a null value is encountered in non-null-check mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
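The behavior change described above can be sketched: in no-null-check mode the old generated code substituted a type-specific default, which silently corrupts results, and the fix is to fail loudly instead. The following is an illustrative Python model under assumed names (`read_field`, `DEFAULTS`), not the Table API's generated code:

```python
DEFAULTS = {int: 0, float: 0.0, str: ""}  # type-specific defaults, as in the old behavior

def read_field(value, field_type, null_check):
    if value is not None:
        return value
    if null_check:
        return None  # null-aware mode passes nulls through for proper handling
    # Old behavior: silently substitute a default, so e.g. a SUM treats null as 0.
    # New behavior: raise, so the user learns the data violates the non-null assumption.
    raise ValueError(
        f"null encountered in non-null mode (old code would have returned "
        f"{DEFAULTS[field_type]!r})"
    )

rows = [3, None, 4]
print(sum(v for v in rows if read_field(v, int, null_check=True) is not None))

try:
    [read_field(v, int, null_check=False) for v in rows]
except ValueError as e:
    print("rejected:", e)
```

The model shows why silent substitution is dangerous: the no-null-check path would have produced an aggregate that looks valid but counted a missing value as 0.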
[jira] [Commented] (FLINK-3700) Replace Guava Preconditions class with Flink Preconditions
[ https://issues.apache.org/jira/browse/FLINK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243305#comment-15243305 ] Fabian Hueske commented on FLINK-3700: -- Commit 760a0d9e7fd9fa88e9f7408b137d78df384d764f removed Guava as dependency from {{flink-core}}. > Replace Guava Preconditions class with Flink Preconditions > -- > > Key: FLINK-3700 > URL: https://issues.apache.org/jira/browse/FLINK-3700 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.1.0 > > > In order to reduce the dependency on Guava (which has caused us quite a bit of > pain in the past with its version conflicts), I suggest to add a Flink > {{Preconditions}} class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
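A Guava-free `Preconditions` class of the kind FLINK-3700 proposes can be sketched in a few lines. This is an assumption-laden sketch, not Flink's actual class: the method names follow Guava's conventions (`checkNotNull`, `checkArgument`), and only two of the usual overloads are shown.

```java
// Minimal sketch of a dependency-free Preconditions utility, in the spirit
// of FLINK-3700 (method names follow Guava's conventions; this is NOT the
// actual Flink implementation).
public final class Preconditions {

    private Preconditions() {} // static utility, no instances

    /** Throws NullPointerException with the given message if reference is null. */
    public static <T> T checkNotNull(T reference, String errorMessage) {
        if (reference == null) {
            throw new NullPointerException(errorMessage);
        }
        return reference;
    }

    /** Throws IllegalArgumentException with the given message if the condition fails. */
    public static void checkArgument(boolean condition, String errorMessage) {
        if (!condition) {
            throw new IllegalArgumentException(errorMessage);
        }
    }
}
```

Because `checkNotNull` returns its argument, it can be used inline in constructors, which is the usage pattern Guava established and the replacement preserves.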
[jira] [Closed] (FLINK-3738) Refactor TableEnvironment and TranslationContext
[ https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske closed FLINK-3738. Resolution: Done Fix Version/s: 1.1.0 Done with 1acd844539d3e0ea19d492d2f5f89e32f29827d8 > Refactor TableEnvironment and TranslationContext > > > Key: FLINK-3738 > URL: https://issues.apache.org/jira/browse/FLINK-3738 > Project: Flink > Issue Type: Task > Components: Table API >Reporter: Fabian Hueske >Assignee: Fabian Hueske > Fix For: 1.1.0 > > > Currently the TableAPI uses a static object called {{TranslationContext}} > which holds the Calcite table catalog and a Calcite planner instance. > Whenever a {{DataSet}} or {{DataStream}} is converted into a {{Table}} or > registered as a {{Table}} on the {{TableEnvironment}}, a new entry is added > to the catalog. The first time a {{Table}} is added, a planner instance is > created. The planner is used to optimize the query (defined by one or more > Table API operations and/or one or more SQL queries) when a {{Table}} is > converted into a {{DataSet}} or {{DataStream}}. Since a planner may only be > used to optimize a single program, the choice of a single static object is > problematic. > I propose to refactor the {{TableEnvironment}} to take over the > responsibility of holding the catalog and the planner instance. > - A {{TableEnvironment}} holds a catalog of registered tables and a single > planner instance. > - A {{TableEnvironment}} will only allow to translate a single {{Table}} > (possibly composed of several Table API operations and SQL queries) into a > {{DataSet}} or {{DataStream}}. > - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a > {{StreamExecutionEnvironment}}. This is necessary to create data source or > source functions to read external tables or streams. > - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} > to be converted into a {{Table}}. 
This will prohibit implicit casts as > currently supported for the DataSet Scala API. > - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. > Only tables from the same {{TableEnvironment}} can be processed together. > - The {{TranslationContext}} will be completely removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3738) Refactor TableEnvironment and TranslationContext
[ https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243286#comment-15243286 ] ASF GitHub Bot commented on FLINK-3738: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1887 > Refactor TableEnvironment and TranslationContext > > > Key: FLINK-3738 > URL: https://issues.apache.org/jira/browse/FLINK-3738 > Project: Flink > Issue Type: Task > Components: Table API >Reporter: Fabian Hueske >Assignee: Fabian Hueske > > Currently the TableAPI uses a static object called {{TranslationContext}} > which holds the Calcite table catalog and a Calcite planner instance. > Whenever a {{DataSet}} or {{DataStream}} is converted into a {{Table}} or > registered as a {{Table}} on the {{TableEnvironment}}, a new entry is added > to the catalog. The first time a {{Table}} is added, a planner instance is > created. The planner is used to optimize the query (defined by one or more > Table API operations and/or one or more SQL queries) when a {{Table}} is > converted into a {{DataSet}} or {{DataStream}}. Since a planner may only be > used to optimize a single program, the choice of a single static object is > problematic. > I propose to refactor the {{TableEnvironment}} to take over the > responsibility of holding the catalog and the planner instance. > - A {{TableEnvironment}} holds a catalog of registered tables and a single > planner instance. > - A {{TableEnvironment}} will only allow to translate a single {{Table}} > (possibly composed of several Table API operations and SQL queries) into a > {{DataSet}} or {{DataStream}}. > - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a > {{StreamExecutionEnvironment}}. This is necessary to create data source or > source functions to read external tables or streams. > - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} > to be converted into a {{Table}}. 
This will prohibit implicit casts as > currently supported for the DataSet Scala API. > - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. > Only tables from the same {{TableEnvironment}} can be processed together. > - The {{TranslationContext}} will be completely removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20
[ https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243285#comment-15243285 ] ASF GitHub Bot commented on FLINK-2544: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1882 > Some test cases using PowerMock fail with Java 8u20 > --- > > Key: FLINK-2544 > URL: https://issues.apache.org/jira/browse/FLINK-2544 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Priority: Minor > > I observed that some of the test cases using {{PowerMockRunner}} fail with > Java 8u20 with the following error: > {code} > java.lang.VerifyError: Bad method call from inside of a branch > Exception Details: > Location: > > org/apache/flink/client/program/ClientTest$SuccessReturningActor.()V > @32: invokespecial > Reason: > Error exists in the bytecode > Bytecode: > 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f > 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b > 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01 > 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8 > 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12 > 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800 > 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5 > 0x070: 0046 b1 > Stackmap Table: > full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{}) > full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{}) > full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]}) > full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]}) > > full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]}) > > full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]}) > at java.lang.Class.getDeclaredConstructors0(Native Method) > at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658) > at java.lang.Class.getConstructor0(Class.java:3062) > at java.lang.Class.getDeclaredConstructor(Class.java:2165) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at 
akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at scala.util.Try$.apply(Try.scala:161) > at akka.util.Reflect$.findConstructor(Reflect.scala:86) > at akka.actor.NoArgsReflectConstructor.(Props.scala:359) > at akka.actor.IndirectActorProducer$.apply(Props.scala:308) > at akka.actor.Props.producer(Props.scala:176) > at akka.actor.Props.(Props.scala:189) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props.create(Props.scala) > at > org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310) > at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88) > at > org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282) > at 
org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86) > at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:146) > at >
[GitHub] flink pull request: [FLINK-3732] [Core] Potential null deference i...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1871
[GitHub] flink pull request: [FLINK-3738] [tableAPI] Refactor TableEnvironm...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1887
[GitHub] flink pull request: [FLINK-3700] [core] Remove Guava as a dependen...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1854
[GitHub] flink pull request: [FLINK-3759] [table] Table API should throw ex...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1892
[jira] [Commented] (FLINK-3732) Potential null deference in ExecutionConfig#equals()
[ https://issues.apache.org/jira/browse/FLINK-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243284#comment-15243284 ] ASF GitHub Bot commented on FLINK-3732: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1871 > Potential null deference in ExecutionConfig#equals() > > > Key: FLINK-3732 > URL: https://issues.apache.org/jira/browse/FLINK-3732 > Project: Flink > Issue Type: Bug >Reporter: Ted Yu >Priority: Minor > > {code} > ((restartStrategyConfiguration == null && > other.restartStrategyConfiguration == null) || > > restartStrategyConfiguration.equals(other.restartStrategyConfiguration)) && > {code} > If restartStrategyConfiguration is null but > other.restartStrategyConfiguration is not null, the above would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
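The null dereference described in FLINK-3732 has a standard Java 7+ fix: `java.util.Objects.equals` handles all four null/non-null combinations, replacing the hand-rolled check that NPEs when the left field is null and the right one is not. The class below is a stand-in for illustration only (the real field lives in `ExecutionConfig`):

```java
import java.util.Objects;

// Sketch of the null-safe comparison for FLINK-3732. The original
//   (a == null && b == null) || a.equals(b)
// throws NPE when a == null and b != null; Objects.equals covers every case.
// This class is a stand-in, not Flink's ExecutionConfig.
public class NullSafeEquals {

    public String restartStrategyConfiguration; // stands in for the real field

    public boolean sameStrategy(NullSafeEquals other) {
        // true iff both null, or both non-null and equal; never throws NPE
        return Objects.equals(this.restartStrategyConfiguration,
                              other.restartStrategyConfiguration);
    }
}
```

The same one-call pattern applies to every nullable field compared in an `equals()` method, which is why it is the conventional fix for this class of bug.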
[GitHub] flink pull request: [FLINK-2544] Add Java 8 version for building P...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1882
[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243283#comment-15243283 ] ASF GitHub Bot commented on FLINK-3762: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1891 > Kryo StackOverflowError due to disabled Kryo Reference tracking > > > Key: FLINK-3762 > URL: https://issues.apache.org/jira/browse/FLINK-3762 > Project: Flink > Issue Type: Bug > Components: Core >Affects Versions: 1.0.1 >Reporter: Andrew Palumbo >Assignee: Andrew Palumbo > Fix For: 1.0.2 > > > As discussed on the dev list, > In {{KryoSerializer.java}} > Kryo Reference tracking is disabled by default: > {code} > kryo.setReferences(false); > {code} > This can cause {{StackOverflowError}} exceptions when serializing many > objects that may contain recursive objects: > {code} > java.lang.StackOverflowError > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > {code} > By enabling reference tracking, we can fix this problem. > [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419 > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3759) Table API should throw exception if a null value is encountered in non-null mode.
[ https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243281#comment-15243281 ] ASF GitHub Bot commented on FLINK-3759: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1892 > Table API should throw exception if a null value is encountered in non-null > mode. > --- > > Key: FLINK-3759 > URL: https://issues.apache.org/jira/browse/FLINK-3759 > Project: Flink > Issue Type: Bug > Components: Table API >Affects Versions: 1.1.0 >Reporter: Fabian Hueske >Priority: Critical > > The Table API can be configured to omit null-checks in generated code to > speed up processing. Currently, the generated code replaces a null value with > a data type specific default value if it is encountered in non-null-check > mode. > This can silently cause wrong results and should be changed such that an > exception is thrown if a null value is encountered in non-null-check mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3700) Replace Guava Preconditions class with Flink Preconditions
[ https://issues.apache.org/jira/browse/FLINK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243282#comment-15243282 ] ASF GitHub Bot commented on FLINK-3700: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1854 > Replace Guava Preconditions class with Flink Preconditions > -- > > Key: FLINK-3700 > URL: https://issues.apache.org/jira/browse/FLINK-3700 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.1.0 > > > In order to reduce the dependency on Guava (which has caused us quite a bit of > pain in the past with its version conflicts), I suggest to add a Flink > {{Preconditions}} class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1891
[jira] [Created] (FLINK-3768) Clustering Coefficient
Greg Hogan created FLINK-3768: - Summary: Clustering Coefficient Key: FLINK-3768 URL: https://issues.apache.org/jira/browse/FLINK-3768 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.1.0 Reporter: Greg Hogan Assignee: Greg Hogan The local clustering coefficient measures the connectedness of each vertex's neighborhood. Values range from 0.0 (no edges between neighbors) to 1.0 (neighborhood is a clique). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
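The metric FLINK-3768 describes has a compact definition: for a vertex with d neighbors, the local clustering coefficient is the number of edges among those neighbors divided by the d(d-1)/2 possible ones. A single-machine sketch (plain Java over an adjacency map, not the Gelly implementation) illustrates it:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Single-machine sketch of the local clustering coefficient from FLINK-3768
// (NOT the Gelly implementation): for a vertex v with d neighbors, count the
// edges among those neighbors and divide by the d*(d-1)/2 possible ones.
// 1.0 means the neighborhood is a clique; 0.0 means no edges between neighbors.
public class ClusteringCoefficient {

    /** adj maps each vertex id to its (undirected) neighbor set. */
    public static double local(Map<Integer, Set<Integer>> adj, int v) {
        Set<Integer> nbrs = adj.getOrDefault(v, Collections.emptySet());
        int d = nbrs.size();
        if (d < 2) {
            return 0.0; // fewer than two neighbors: no neighbor pairs exist
        }
        int links = 0;
        for (int u : nbrs) {
            for (int w : adj.getOrDefault(u, Collections.emptySet())) {
                if (u < w && nbrs.contains(w)) {
                    links++; // each neighbor-neighbor edge counted once (u < w)
                }
            }
        }
        return 2.0 * links / (d * (d - 1));
    }
}
```

In a triangle every vertex scores 1.0; the middle vertex of a three-vertex path scores 0.0, since its two neighbors are not connected.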
[jira] [Commented] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243164#comment-15243164 ] ASF GitHub Bot commented on FLINK-1745: --- Github user danielblazevski commented on the pull request: https://github.com/apache/flink/pull/1220#issuecomment-210527093 @tillrohrmann @chiwanpark does the re-basing look OK now? Some of the CI builds didn't go through, 2 passed, 2 failed and 1 timed out (it seems there is a 2hr max limit for the CI process). > Add exact k-nearest-neighbours algorithm to machine learning library > > > Key: FLINK-1745 > URL: https://issues.apache.org/jira/browse/FLINK-1745 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Daniel Blazevski > Labels: ML, Starter > > Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial > it is still used as a means to classify data and to do regression. This issue > focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as > proposed in [2]. > Could be a starter task. > Resources: > [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] > [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...
Github user danielblazevski commented on the pull request: https://github.com/apache/flink/pull/1220#issuecomment-210527093 @tillrohrmann @chiwanpark does the re-basing look OK now? Some of the CI builds didn't go through, 2 passed, 2 failed and 1 timed out (it seems there is a 2hr max limit for the CI process).
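The "exact" in FLINK-1745's exact kNN is the key property: every pairwise distance is computed, with no approximation. A brute-force single-machine sketch captures the core computation; the distributed H-BNLJ/H-BRJ variants in the referenced paper partition this same pairwise work across nodes. All names here are illustrative, not the FlinkML API.

```java
import java.util.Arrays;
import java.util.Comparator;

// Brute-force exact kNN sketch (single machine, Euclidean metric) -- the
// distributed variants in FLINK-1745 parallelize this same pairwise
// distance computation. Not the FlinkML API; names are illustrative.
public class Knn {

    /** Returns indices of the k points in 'data' closest to 'query'. */
    public static int[] nearest(double[][] data, double[] query, int k) {
        Integer[] idx = new Integer[data.length];
        for (int i = 0; i < data.length; i++) {
            idx[i] = i;
        }
        // sort candidate indices by distance to the query point
        Arrays.sort(idx, Comparator.comparingDouble(i -> squaredDist(data[i], query)));
        int[] result = new int[Math.min(k, data.length)];
        for (int i = 0; i < result.length; i++) {
            result[i] = idx[i];
        }
        return result;
    }

    private static double squaredDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            s += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return s; // squared distance preserves the nearest-neighbor ordering
    }
}
```

The full sort is O(n log n) per query; a bounded max-heap of size k would lower that, but the exhaustive distance pass is what makes the result exact either way.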
[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243033#comment-15243033 ] ASF GitHub Bot commented on FLINK-3665: --- Github user dawidwys commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-210485959 Hi @fhueske, I hope I applied all your comments, so it would be nice if you could have a look at it once again. > Range partitioning lacks support to define sort orders > -- > > Key: FLINK-3665 > URL: https://issues.apache.org/jira/browse/FLINK-3665 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0 >Reporter: Fabian Hueske > Fix For: 1.1.0 > > > {{DataSet.partitionByRange()}} does not allow to specify the sort order of > fields. This is fine if range partitioning is used to reduce skewed > partitioning. > However, it is not sufficient if range partitioning is used to sort a data > set in parallel. > Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily > changed, I propose to add a method {{withOrders(Order... orders)}} to > {{PartitionOperator}}. The method should throw an exception if the > partitioning method of {{PartitionOperator}} is not range partitioning. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...
Github user dawidwys commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-210485959 Hi @fhueske, I hope I applied all your comments, so it would be nice if you could have a look at it once again.
[jira] [Commented] (FLINK-3428) Add fixed time trailing timestamp/watermark extractor
[ https://issues.apache.org/jira/browse/FLINK-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242947#comment-15242947 ] ASF GitHub Bot commented on FLINK-3428: --- Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1764#issuecomment-210462831 If everything is ok, could somebody merge this? It has been some time that there is no activity here. > Add fixed time trailing timestamp/watermark extractor > - > > Key: FLINK-3428 > URL: https://issues.apache.org/jira/browse/FLINK-3428 > Project: Flink > Issue Type: Improvement >Reporter: Robert Metzger >Assignee: Kostas Kloudas > > Flink currently provides only one built-in timestamp extractor, which assumes > strictly ascending timestamps. In real world use cases, timestamps are almost > never strictly ascending. > Therefore, I propose to provide a utility watermark extractor which is > generating watermarks with a fixed-time trailing. > The implementation should keep track of the highest event-time seen so far > and subtract a fixed amount of time from that event time. > This way, users can for example specify that the watermarks should always > "lag behind" 10 minutes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3428: Adds a fixed time trailing waterma...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1764#issuecomment-210462831 If everything is ok, could somebody merge this? It has been some time that there is no activity here.
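The logic FLINK-3428 asks for, track the highest event time seen so far and subtract a fixed lag, fits in a few lines. The standalone sketch below does not implement Flink's extractor interface; it only shows the watermark arithmetic, with names of my own choosing.

```java
// Standalone sketch of the fixed-trailing watermark logic from FLINK-3428
// (does NOT implement Flink's extractor interface). The watermark always
// lags the maximum observed event timestamp by a fixed amount, so elements
// up to 'lagMillis' late are still ahead of the watermark.
public class TrailingWatermark {

    private final long lagMillis;
    private long maxSeen = Long.MIN_VALUE;

    public TrailingWatermark(long lagMillis) {
        this.lagMillis = lagMillis;
    }

    /** Record an element's timestamp; out-of-order elements never move maxSeen back. */
    public void onEvent(long timestamp) {
        maxSeen = Math.max(maxSeen, timestamp);
    }

    /** Current watermark: highest timestamp seen minus the configured lag. */
    public long currentWatermark() {
        return maxSeen == Long.MIN_VALUE ? Long.MIN_VALUE : maxSeen - lagMillis;
    }
}
```

With a lag of 10 minutes, an element timestamped t advances the watermark to t - 10min, so any element arriving up to 10 minutes late is still considered on time, which is exactly the "lag behind 10 minutes" behavior the issue describes.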
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242903#comment-15242903 ] ASF GitHub Bot commented on FLINK-3717: --- GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1895 FLINK-3717 - Be able to re-start reading from a specific point in a file. This pull-request addresses issue FLINK-3717 and is the first step to make File Sources fault-tolerant. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink fif_offset Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1895.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1895 commit a571f8de21d0c48fdb31b4518a4be76c74d32cea Author: kl0u Date: 2016-04-10T14:56:42Z Added the getOffset() in the FileInputFormat. commit bc7f97864c56210f79a23f9bdc7c328c19f4c2ee Author: kl0u Date: 2016-04-11T11:49:00Z Adding tests. commit 8b0391249fa0902d66857c61c632e3c1a21ee6f3 Author: kl0u Date: 2016-04-11T13:01:31Z Fixed wrong test in FileInputFormatTest. commit ab6d68e3525e8ed368739b72d1ab90a70207d34d Author: kl0u Date: 2016-04-11T14:04:57Z Adding tests. commit 07fbd4848420eeb8f82fc2f51d3dc9879d6034aa Author: kl0u Date: 2016-04-12T09:00:59Z Fixed the BinaryInputFormat. commit 8b33caefe3d2eb0735bc38dd26ab9f375c4e211f Author: kl0u Date: 2016-04-12T14:22:10Z Added tests for binary input format commit f9502bcf582095afeccfc46e9b6859e0094fde63 Author: kl0u Date: 2016-04-13T17:49:17Z Change in the abstraction for the channel state. commit e5e95d494978a36006757b78284c52be1d2e9d83 Author: kl0u Date: 2016-04-13T22:51:44Z Adding the avro format commit ec8e6d49b7d01577c28fd8a0cdc3a352a5de9b77 Author: kl0u Date: 2016-04-14T13:35:37Z Added the Checkpointed Input Interface. 
commit 4987b68ef1ae168833edfad9c1b86d94ac33b71b Author: kl0u Date: 2016-04-14T13:52:20Z Adding test for the recovery. commit d72117f8355cdf578615ec4f3ae717ffd15f6ed2 Author: kl0u Date: 2016-04-15T08:43:18Z Cleaning up the code. > Add functionality to be able to restore from specific point in a > FileInputFormat > -- > > Key: FLINK-3717 > URL: https://issues.apache.org/jira/browse/FLINK-3717 > Project: Flink > Issue Type: Sub-task > Components: Streaming >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > > This is the first step in order to make the File Sources fault-tolerant. We > have to be able to get the start from a specific point in a file despite any > caching performed during reading. This will guarantee that the task that will > take over the execution of the failed one will be able to start from the > correct point in the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242904#comment-15242904 ] ASF GitHub Bot commented on FLINK-3717: --- Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-210450817 This PR addresses FLINK-3717. Please review! > Add functionality to be able to restore from specific point in a > FileInputFormat > -- > > Key: FLINK-3717 > URL: https://issues.apache.org/jira/browse/FLINK-3717 > Project: Flink > Issue Type: Sub-task > Components: Streaming >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > > This is the first step in order to make the File Sources fault-tolerant. We > have to be able to get the start from a specific point in a file despite any > caching performed during reading. This will guarantee that the task that will > take over the execution of the failed one will be able to start from the > correct point in the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1895 FLINK-3717 - Be able to re-start reading from a specific point in a file. This pull-request addresses issue FLINK-3717 and is the first step to make File Sources fault-tolerant. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink fif_offset Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1895.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1895 commit a571f8de21d0c48fdb31b4518a4be76c74d32cea Author: kl0u Date: 2016-04-10T14:56:42Z Added the getOffset() in the FileInputFormat. commit bc7f97864c56210f79a23f9bdc7c328c19f4c2ee Author: kl0u Date: 2016-04-11T11:49:00Z Adding tests. commit 8b0391249fa0902d66857c61c632e3c1a21ee6f3 Author: kl0u Date: 2016-04-11T13:01:31Z Fixed wrong test in FileInputFormatTest. commit ab6d68e3525e8ed368739b72d1ab90a70207d34d Author: kl0u Date: 2016-04-11T14:04:57Z Adding tests. commit 07fbd4848420eeb8f82fc2f51d3dc9879d6034aa Author: kl0u Date: 2016-04-12T09:00:59Z Fixed the BinaryInputFormat. commit 8b33caefe3d2eb0735bc38dd26ab9f375c4e211f Author: kl0u Date: 2016-04-12T14:22:10Z Added tests for binary input format commit f9502bcf582095afeccfc46e9b6859e0094fde63 Author: kl0u Date: 2016-04-13T17:49:17Z Change in the abstraction for the channel state. commit e5e95d494978a36006757b78284c52be1d2e9d83 Author: kl0u Date: 2016-04-13T22:51:44Z Adding the avro format commit ec8e6d49b7d01577c28fd8a0cdc3a352a5de9b77 Author: kl0u Date: 2016-04-14T13:35:37Z Added the Checkpointed Input Interface. commit 4987b68ef1ae168833edfad9c1b86d94ac33b71b Author: kl0u Date: 2016-04-14T13:52:20Z Adding test for the recovery. commit d72117f8355cdf578615ec4f3ae717ffd15f6ed2 Author: kl0u Date: 2016-04-15T08:43:18Z Cleaning up the code. 
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-210450817 This PR addresses FLINK-3717. Please review!
[jira] [Updated] (FLINK-3717) Add functionality to be able to restore from specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-3717: -- Description: This is the first step in order to make the File Sources fault-tolerant. We have to be able to get the start from a specific point in a file despite any caching performed during reading. This will guarantee that the task that will take over the execution of the failed one will be able to start from the correct point in the file. (was: This is the first step in order to make the File Sources fault-tolerant. We have to be able to get the latest read offset in the file despite any caching performed during reading. This will guarantee that the task that will take over the execution of the failed one will be able to start from the correct point in the file.) > Add functionality to be able to restore from specific point in a > FileInputFormat > -- > > Key: FLINK-3717 > URL: https://issues.apache.org/jira/browse/FLINK-3717 > Project: Flink > Issue Type: Sub-task > Components: Streaming >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > > This is the first step in order to make the File Sources fault-tolerant. We > have to be able to get the start from a specific point in a file despite any > caching performed during reading. This will guarantee that the task that will > take over the execution of the failed one will be able to start from the > correct point in the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-3717: -- Summary: Add functionality to be able to restore from a specific point in a FileInputFormat (was: Add functionality to get the current offset of a source with a FileInputFormat) > Add functionality to be able to restore from a specific point in a > FileInputFormat > -- > > Key: FLINK-3717 > URL: https://issues.apache.org/jira/browse/FLINK-3717 > Project: Flink > Issue Type: Sub-task > Components: Streaming >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > > This is the first step in order to make the File Sources fault-tolerant. We > have to be able to get the latest read offset in the file despite any caching > performed during reading. This will guarantee that the task that will take > over the execution of the failed one will be able to start from the correct > point in the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-3479) Add hash-based strategy for CombineFunction
[ https://issues.apache.org/jira/browse/FLINK-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Gevay updated FLINK-3479: --- Priority: Minor (was: Major) > Add hash-based strategy for CombineFunction > --- > > Key: FLINK-3479 > URL: https://issues.apache.org/jira/browse/FLINK-3479 > Project: Flink > Issue Type: Sub-task > Components: Local Runtime >Reporter: Fabian Hueske >Priority: Minor > > This issue is similar to FLINK-3477 but adds a hash-based strategy for > {{CombineFunction}} instead of {{ReduceFunction}}. > The interface of {{CombineFunction}} differs from {{ReduceFunction}} by > providing an {{Iterable}} instead of two {{T}} values. Hence, if the > {{Iterable}} provides two values, we can do the same as with a > {{ReduceFunction}}. > At the moment, {{CombineFunction}} is wrapped in a {{GroupCombineFunction}} > and hence executed using the {{GroupReduceCombineDriver}}. > We should add two dedicated drivers: {{CombineDriver}} and > {{ChainedCombineDriver}}, and two driver strategies: {{HASH_COMBINE}} and > {{SORT_COMBINE}}. > If FLINK-3477 is resolved, we can reuse the hash-table. > We should also add compiler hints to `DataSet.reduceGroup()` and > `Grouping.reduceGroup()` to allow users to select between {{SORT}}- and > {{HASH}}-based combine strategies ({{HASH}} will only be applicable to > {{CombineFunction}} and not {{GroupCombineFunction}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
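As a rough illustration of the proposed {{HASH_COMBINE}} strategy (names and memory handling below are simplified assumptions, not Flink's actual driver code): instead of sorting the input by key, partial aggregates are kept in a hash table and the reduce step is applied eagerly per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of a hash-based combine: partial aggregates live in a
// hash table keyed by the grouping key, so no sort of the input is needed.
// Real combine drivers work on managed memory pages and spill partial
// results when the table fills up; this sketch ignores both concerns.
class HashCombiner {
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> partials = new HashMap<>();
        for (Map.Entry<String, Integer> record : records) {
            // the same reduce step a ReduceFunction would apply to two values
            partials.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return partials;
    }
}
```

The trade-off this illustrates: a sort-based combine pays for sorting the whole input before grouping, while the hash-based variant touches each record once but must handle the case where the table outgrows its memory budget.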
[jira] [Created] (FLINK-3767) Show timeline also for running tasks, not only for finished ones
Robert Metzger created FLINK-3767: - Summary: Show timeline also for running tasks, not only for finished ones Key: FLINK-3767 URL: https://issues.apache.org/jira/browse/FLINK-3767 Project: Flink Issue Type: Sub-task Components: Webfrontend Reporter: Robert Metzger -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2944) Collect, expose and display operator-specific stats
[ https://issues.apache.org/jira/browse/FLINK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-2944: -- Issue Type: Sub-task (was: New Feature) Parent: FLINK-3766 > Collect, expose and display operator-specific stats > --- > > Key: FLINK-2944 > URL: https://issues.apache.org/jira/browse/FLINK-2944 > Project: Flink > Issue Type: Sub-task > Components: Local Runtime, Webfrontend >Reporter: Fabian Hueske > Labels: requires-design-doc > > It would be nice to collect operator-specific stats such as: > - HashJoin: bytes spilled build side, bytes spilled probe side, num keys > (build / probe) > - GroupBy: bytes spilled, num keys, max group size, avg. group size > - Combiner: combine rate (in/out ratio) > - Streaming Ops: avg. throughput > - etc. > and display these stats in the webUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-3766) Add a new tab for monitoring metrics in the web interface
[ https://issues.apache.org/jira/browse/FLINK-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-3766: -- Attachment: metricsMock-unselected.png metricsMock-selected.png > Add a new tab for monitoring metrics in the web interface > - > > Key: FLINK-3766 > URL: https://issues.apache.org/jira/browse/FLINK-3766 > Project: Flink > Issue Type: New Feature > Components: Webfrontend >Reporter: Robert Metzger > Attachments: metricsMock-selected.png, metricsMock-unselected.png > > > Add a new tab for showing operator/task specific metrics in the Flink web ui. > I propose to add a drop down into the tab that allows selecting the metrics. > For each metric, we show a box in a grid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242881#comment-15242881 ] ASF GitHub Bot commented on FLINK-3764: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1894#discussion_r59865835 --- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala --- @@ -211,11 +211,18 @@ class TestingCluster( val jmsAliveFutures = jobManagerActors map { _ map { -tm => (tm ? Alive)(timeout) +jm => (jm ? Alive)(timeout) + } +} getOrElse(Seq()) + +val resourceManagersAliveFutures = resourceManagerActors map { + _ map { +rm => (rm ? Alive)(timeout) } } getOrElse(Seq()) -val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures) +val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++ + resourceManagersAliveFutures) Await.ready(combinedFuture, timeout) --- End diff -- No we don't know. Just a `TimeoutException` will be thrown. > LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails > on Travis > -- > > Key: FLINK-3764 > URL: https://issues.apache.org/jira/browse/FLINK-3764 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: test-stability > > The > {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} > fails spuriously on Travis because of a {{NullPointerException}}. The reason > is that it's not properly waited until the {{ResourceManager}} has been > started. Due to this, it can happen that a leader notification message is > tried to be sent to a {{LeaderRetrievalListener}} which has not been set by > the {{ResourceManager}}. > [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3766) Add a new tab for monitoring metrics in the web interface
Robert Metzger created FLINK-3766: - Summary: Add a new tab for monitoring metrics in the web interface Key: FLINK-3766 URL: https://issues.apache.org/jira/browse/FLINK-3766 Project: Flink Issue Type: New Feature Components: Webfrontend Reporter: Robert Metzger Add a new tab for showing operator/task specific metrics in the Flink web ui. I propose to add a drop down into the tab that allows selecting the metrics. For each metric, we show a box in a grid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1894#discussion_r59865835 --- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala --- @@ -211,11 +211,18 @@ class TestingCluster( val jmsAliveFutures = jobManagerActors map { _ map { -tm => (tm ? Alive)(timeout) +jm => (jm ? Alive)(timeout) + } +} getOrElse(Seq()) + +val resourceManagersAliveFutures = resourceManagerActors map { + _ map { +rm => (rm ? Alive)(timeout) } } getOrElse(Seq()) -val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures) +val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++ + resourceManagersAliveFutures) Await.ready(combinedFuture, timeout) --- End diff -- No we don't know. Just a `TimeoutException` will be thrown. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242870#comment-15242870 ] ASF GitHub Bot commented on FLINK-3764: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/1894#discussion_r59864663 --- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala --- @@ -211,11 +211,18 @@ class TestingCluster( val jmsAliveFutures = jobManagerActors map { _ map { -tm => (tm ? Alive)(timeout) +jm => (jm ? Alive)(timeout) + } +} getOrElse(Seq()) + +val resourceManagersAliveFutures = resourceManagerActors map { + _ map { +rm => (rm ? Alive)(timeout) } } getOrElse(Seq()) -val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures) +val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++ + resourceManagersAliveFutures) Await.ready(combinedFuture, timeout) --- End diff -- if this check fails, do we know due to which future? > LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails > on Travis > -- > > Key: FLINK-3764 > URL: https://issues.apache.org/jira/browse/FLINK-3764 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: test-stability > > The > {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} > fails spuriously on Travis because of a {{NullPointerException}}. The reason > is that it's not properly waited until the {{ResourceManager}} has been > started. Due to this, it can happen that a leader notification message is > tried to be sent to a {{LeaderRetrievalListener}} which has not been set by > the {{ResourceManager}}. > [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...
Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/1894#discussion_r59864663 --- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala --- @@ -211,11 +211,18 @@ class TestingCluster( val jmsAliveFutures = jobManagerActors map { _ map { -tm => (tm ? Alive)(timeout) +jm => (jm ? Alive)(timeout) + } +} getOrElse(Seq()) + +val resourceManagersAliveFutures = resourceManagerActors map { + _ map { +rm => (rm ? Alive)(timeout) } } getOrElse(Seq()) -val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures) +val combinedFuture = Future.sequence(tmsAliveFutures ++ jmsAliveFutures ++ + resourceManagersAliveFutures) Await.ready(combinedFuture, timeout) --- End diff -- if this check fails, do we know due to which future? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
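The question in the review above (a combined future timeout does not say which participant was slow) can be worked around by probing named futures individually against a shared deadline. The sketch below uses Java's `CompletableFuture` rather than the Scala/Akka futures in the diff, and all names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative workaround: await each named future individually against a
// shared deadline, so a timeout can name the component that never answered.
class AliveChecker {
    /** Returns the name of the first future that failed or timed out, or null. */
    static String firstIncomplete(Map<String, CompletableFuture<?>> futures, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        for (Map.Entry<String, CompletableFuture<?>> entry : futures.entrySet()) {
            long remaining = Math.max(0, deadline - System.currentTimeMillis());
            try {
                entry.getValue().get(remaining, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return entry.getKey(); // this future missed the deadline
            } catch (Exception e) {
                return entry.getKey(); // this future failed outright
            }
        }
        return null; // everything completed in time
    }
}
```

Sequencing the waits this way costs nothing extra in total wall-clock time (all waits share one deadline) but turns an anonymous `TimeoutException` into an actionable component name.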
[jira] [Commented] (FLINK-2544) Some test cases using PowerMock fail with Java 8u20
[ https://issues.apache.org/jira/browse/FLINK-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242868#comment-15242868 ] ASF GitHub Bot commented on FLINK-2544: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1882#issuecomment-210438717 As a followup, could we add some "Assume" statements in the tests that check whether the Java version is either Java 7 or Java 8u51+ ? > Some test cases using PowerMock fail with Java 8u20 > --- > > Key: FLINK-2544 > URL: https://issues.apache.org/jira/browse/FLINK-2544 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Priority: Minor > > I observed that some of the test cases using {{PowerMockRunner}} fail with > Java 8u20 with the following error: > {code} > java.lang.VerifyError: Bad method call from inside of a branch > Exception Details: > Location: > > org/apache/flink/client/program/ClientTest$SuccessReturningActor.()V > @32: invokespecial > Reason: > Error exists in the bytecode > Bytecode: > 0x000: 2a4c 1214 b800 1a03 bd00 0d12 1bb8 001f > 0x010: b800 254e 2db2 0029 a500 0e2a 01c0 002b > 0x020: b700 2ea7 0009 2bb7 0030 0157 2a01 4c01 > 0x030: 4d01 4e2b 01a5 0008 2b4e a700 0912 32b8 > 0x040: 001a 4e2d 1234 03bd 000d 1236 b800 1f12 > 0x050: 32b8 003a 3a04 1904 b200 29a6 000a b800 > 0x060: 3c4d a700 0919 04c0 0011 4d2c b800 42b5 > 0x070: 0046 b1 > Stackmap Table: > full_frame(@38,{UninitializedThis,UninitializedThis,Top,Object[#13]},{}) > full_frame(@44,{Object[#2],Object[#2],Top,Object[#13]},{}) > full_frame(@61,{Object[#2],Null,Null,Null},{Object[#2]}) > full_frame(@67,{Object[#2],Null,Null,Object[#15]},{Object[#2]}) > > full_frame(@101,{Object[#2],Null,Null,Object[#15],Object[#13]},{Object[#2]}) > > full_frame(@107,{Object[#2],Null,Object[#17],Object[#15],Object[#13]},{Object[#2]}) > at java.lang.Class.getDeclaredConstructors0(Native Method) > at java.lang.Class.privateGetDeclaredConstructors(Class.java:2658) > at 
java.lang.Class.getConstructor0(Class.java:3062) > at java.lang.Class.getDeclaredConstructor(Class.java:2165) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at akka.util.Reflect$$anonfun$4.apply(Reflect.scala:86) > at scala.util.Try$.apply(Try.scala:161) > at akka.util.Reflect$.findConstructor(Reflect.scala:86) > at akka.actor.NoArgsReflectConstructor.(Props.scala:359) > at akka.actor.IndirectActorProducer$.apply(Props.scala:308) > at akka.actor.Props.producer(Props.scala:176) > at akka.actor.Props.(Props.scala:189) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props$.create(Props.scala:99) > at akka.actor.Props.create(Props.scala) > at > org.apache.flink.client.program.ClientTest.shouldSubmitToJobClient(ClientTest.java:143) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310) > at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88) > at > org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:127) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82) > at > 
org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:282) > at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86) > at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49) > at > org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:207) > at >
[GitHub] flink pull request: [FLINK-2544] Add Java 8 version for building P...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1882#issuecomment-210438717 As a followup, could we add some "Assume" statements in the tests that check whether the Java version is either Java 7 or Java 8u51+ ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
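The "Assume" guard suggested above boils down to a version predicate: run the PowerMock-based tests only on Java 7, or on Java 8 update 51 and later. A sketch of that predicate (the parsing assumes the legacy "1.x.y_update" version strings those JVMs report; this is not code from the PR):

```java
// Sketch of the version predicate such an "Assume" guard might use: accept
// any Java 7, and Java 8 only from update 51 onwards. Parsing assumes the
// legacy "1.x.y_update" version-string format reported by those JVMs.
class JavaVersionGuard {
    static boolean supported(String javaVersion) {
        if (javaVersion.startsWith("1.7.")) {
            return true; // any Java 7
        }
        if (javaVersion.startsWith("1.8.0_")) {
            // keep only the leading digits of the update part, e.g. "51-b16" -> "51"
            String update = javaVersion.substring("1.8.0_".length()).replaceAll("[^0-9].*", "");
            return !update.isEmpty() && Integer.parseInt(update) >= 51;
        }
        return false; // unknown or older JVMs: skip the test
    }
}
```

In a JUnit test this predicate would feed something like `Assume.assumeTrue(supported(System.getProperty("java.version")))`, so unsupported JVMs skip the test instead of failing it; the exact placement is an assumption here, not part of the PR.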
[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242864#comment-15242864 ] ASF GitHub Bot commented on FLINK-3764: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1894#issuecomment-210438074 Looks good to me! Please merge once Travis gives green light... > LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails > on Travis > -- > > Key: FLINK-3764 > URL: https://issues.apache.org/jira/browse/FLINK-3764 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: test-stability > > The > {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} > fails spuriously on Travis because of a {{NullPointerException}}. The reason > is that it's not properly waited until the {{ResourceManager}} has been > started. Due to this, it can happen that a leader notification message is > tried to be sent to a {{LeaderRetrievalListener}} which has not been set by > the {{ResourceManager}}. > [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1894#issuecomment-210438074 Looks good to me! Please merge once Travis gives green light... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242777#comment-15242777 ] ASF GitHub Bot commented on FLINK-3764: --- GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/1894 [FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability The LeaderElectionRetrievalTestingCluster now also waits for the Flink resource manager to be properly started before trying to send leader change notification messages to the LeaderRetrievalServices. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink fixLeaderChangeStateCleanupTest Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1894.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1894 commit c9895ba81db966fb3d48eced6529aa33fe1941ce Author: Till Rohrmann Date: 2016-04-15T10:33:58Z [FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability The LeaderElectionRetrievalTestingCluster now also waits for the Flink resource manager to be properly started before trying to send leader change notification messages to the LeaderRetrievalServices. > LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails > on Travis > -- > > Key: FLINK-3764 > URL: https://issues.apache.org/jira/browse/FLINK-3764 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: test-stability > > The > {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} > fails spuriously on Travis because of a {{NullPointerException}}. The reason > is that it's not properly waited until the {{ResourceManager}} has been > started.
Due to this, it can happen that a leader notification message is > tried to be sent to a {{LeaderRetrievalListener}} which has not been set by > the {{ResourceManager}}. > [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3764] [tests] Improve LeaderChangeState...
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/1894 [FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability The LeaderElectionRetrievalTestingCluster now also waits for the Flink resource manager to be properly started before trying to send leader change notification messages to the LeaderRetrievalServices. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink fixLeaderChangeStateCleanupTest Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1894.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1894 commit c9895ba81db966fb3d48eced6529aa33fe1941ce Author: Till Rohrmann Date: 2016-04-15T10:33:58Z [FLINK-3764] [tests] Improve LeaderChangeStateCleanupTest stability The LeaderElectionRetrievalTestingCluster now also waits for the Flink resource manager to be properly started before trying to send leader change notification messages to the LeaderRetrievalServices. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
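The race this PR fixes reduces to a small synchronization pattern: the notifier must wait until the component has registered its listener before delivering a leader-change notification, otherwise it dereferences a null listener. The sketch below is conceptual plain Java, not the actual Flink/Akka code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Conceptual sketch of the race (plain Java, not the actual Flink/Akka code):
// the notifier blocks until the component has registered its listener, so the
// notification can never hit a null listener.
class StartupGate {
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile Runnable listener; // null until registration

    /** Called by the component once it has finished starting up. */
    void registerListener(Runnable newListener) {
        this.listener = newListener;
        started.countDown(); // from now on it is safe to notify
    }

    /** Waits for startup instead of racing it; false means it never started. */
    boolean notifyLeaderChange(long timeoutMs) {
        try {
            if (!started.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        listener.run(); // guaranteed non-null here
        return true;
    }
}
```

The latch makes the "has the listener been set?" question explicit, which is the same ordering guarantee the testing cluster now establishes before sending leader change notifications.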
[jira] [Created] (FLINK-3765) ScalaShellITCase deadlocks spuriously
Till Rohrmann created FLINK-3765: Summary: ScalaShellITCase deadlocks spuriously Key: FLINK-3765 URL: https://issues.apache.org/jira/browse/FLINK-3765 Project: Flink Issue Type: Bug Components: Scala Shell Affects Versions: 1.1.0 Reporter: Till Rohrmann Priority: Critical The {{ScalaShellITCase}} deadlocks spuriously. Stephan already observed it locally and now it also happened on Travis [1]. [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123206554/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-3764: - Description: The {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} fails spuriously on Travis because of a {{NullPointerException}}. The reason is that it's not properly waited until the {{ResourceManager}} has been started. Due to this, it can happen that a leader notification message is tried to be sent to a {{LeaderRetrievalListener}} which has not been set by the {{ResourceManager}}. [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt was:The {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} fails spuriously on Travis because of a {{NullPointerException}}. The reason is that it's not properly waited until the {{ResourceManager}} has been started. Due to this, it can happen that a leader notification message is tried to be sent to a {{LeaderRetrievalListener}} which has not been set by the {{ResourceManager}}. > LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails > on Travis > -- > > Key: FLINK-3764 > URL: https://issues.apache.org/jira/browse/FLINK-3764 > Project: Flink > Issue Type: Bug >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: test-stability > > The > {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} > fails spuriously on Travis because of a {{NullPointerException}}. The reason > is that it's not properly waited until the {{ResourceManager}} has been > started. Due to this, it can happen that a leader notification message is > tried to be sent to a {{LeaderRetrievalListener}} which has not been set by > the {{ResourceManager}}. > [1] https://s3.amazonaws.com/archive.travis-ci.org/jobs/123271732/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3764) LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis
Till Rohrmann created FLINK-3764: Summary: LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification fails on Travis Key: FLINK-3764 URL: https://issues.apache.org/jira/browse/FLINK-3764 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Priority: Critical The {{LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification}} fails spuriously on Travis because of a {{NullPointerException}}. The reason is that it's not properly waited until the {{ResourceManager}} has been started. Due to this, it can happen that a leader notification message is tried to be sent to a {{LeaderRetrievalListener}} which has not been set by the {{ResourceManager}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242751#comment-15242751 ] ASF GitHub Bot commented on FLINK-3762: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1891#issuecomment-210403044 Thanks, will merge this PR to master and release-1.0 branches > Kryo StackOverflowError due to disabled Kryo Reference tracking > > > Key: FLINK-3762 > URL: https://issues.apache.org/jira/browse/FLINK-3762 > Project: Flink > Issue Type: Bug > Components: Core >Affects Versions: 1.0.1 >Reporter: Andrew Palumbo >Assignee: Andrew Palumbo > Fix For: 1.0.2 > > > As discussed on the dev list, > In {{KryoSerializer.java}} > Kryo Reference tracking is disabled by default: > {code} > kryo.setReferences(false); > {code} > This can cause {{StackOverflowError}} exceptions when serializing many > objects that may contain recursive objects: > {code} > java.lang.StackOverflowError > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) > {code} > By enabling reference tracking, we can fix this problem. 
> [1]https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419 > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1891#issuecomment-210403044 Thanks, will merge this PR to master and release-1.0 branches --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
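For background on why reference tracking matters here: serializing a cyclic object graph must detect objects it has already written, or it recurses until the stack overflows. The plain-Java sketch below mimics that check with an `IdentityHashMap`; in Kryo itself the analogous switch is `kryo.setReferences(true)`, which is what the PR enables.

```java
import java.util.IdentityHashMap;

// Plain-Java illustration of what reference tracking buys a serializer: a
// traversal over a self-referential object graph only terminates if objects
// that were already visited are detected. In Kryo itself the analogous
// switch is kryo.setReferences(true).
class Node {
    Node next;
}

class GraphWalker {
    /** Counts reachable nodes; without the 'seen' check a cycle would recurse forever. */
    static int countNodes(Node node, IdentityHashMap<Node, Boolean> seen) {
        if (node == null || seen.containsKey(node)) {
            return 0; // already visited: a serializer would emit a back-reference here
        }
        seen.put(node, Boolean.TRUE);
        return 1 + countNodes(node.next, seen);
    }
}
```

Identity-based (not equals-based) tracking is the right choice here, since two distinct but equal objects must still be written separately.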
[jira] [Commented] (FLINK-3748) Add CASE function to Table API
[ https://issues.apache.org/jira/browse/FLINK-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242749#comment-15242749 ] ASF GitHub Bot commented on FLINK-3748: --- GitHub user twalthr opened a pull request: https://github.com/apache/flink/pull/1893 [FLINK-3748] [table] Add CASE function to Table API This PR adds an evaluation function to the Table API that allows for if/else branching. It also allows `CASE WHEN` syntax in SQL. I will update the EBNF grammar in the documentation as part of FLINK-3086 next. You can merge this pull request into a Git repository by running: $ git pull https://github.com/twalthr/flink TableApiCaseWhen Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1893.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1893 commit 538b7bfae208ef68131bc76fb05c8835cbd55928 Author: twalthr Date: 2016-04-13T12:36:36Z [FLINK-3748] [table] Add CASE function to Table API > Add CASE function to Table API > -- > > Key: FLINK-3748 > URL: https://issues.apache.org/jira/browse/FLINK-3748 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Timo Walther >Assignee: Timo Walther > > Add a CASE/WHEN functionality to Java/Scala Table API and add support for SQL > API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3748] [table] Add CASE function to Tabl...
GitHub user twalthr opened a pull request: https://github.com/apache/flink/pull/1893 [FLINK-3748] [table] Add CASE function to Table API This PR adds an evaluation function to the Table API that allows for if/else branching. It also allows `CASE WHEN` syntax in SQL. I will update the EBNF grammar in the documentation as part of FLINK-3086 next. You can merge this pull request into a Git repository by running: $ git pull https://github.com/twalthr/flink TableApiCaseWhen Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1893.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1893 commit 538b7bfae208ef68131bc76fb05c8835cbd55928 Author: twalthr Date: 2016-04-13T12:36:36Z [FLINK-3748] [table] Add CASE function to Table API --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
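The semantics being added (the first matching WHEN wins; the ELSE value is the fallback) can be sketched as a tiny evaluator. The class below is purely illustrative and is not the Table API's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Purely illustrative evaluator (not the Table API implementation) showing
// CASE WHEN semantics: conditions are tested in order, the first match
// selects the result, and the ELSE value is used when nothing matches.
class CaseWhen<T, R> {
    private final List<Predicate<T>> conditions = new ArrayList<>();
    private final List<R> results = new ArrayList<>();
    private R elseResult;

    CaseWhen<T, R> when(Predicate<T> condition, R result) {
        conditions.add(condition);
        results.add(result);
        return this;
    }

    CaseWhen<T, R> otherwise(R result) {
        elseResult = result;
        return this;
    }

    R eval(T value) {
        for (int i = 0; i < conditions.size(); i++) {
            if (conditions.get(i).test(value)) {
                return results.get(i); // first matching WHEN wins
            }
        }
        return elseResult; // ELSE branch
    }
}
```

This mirrors SQL's `CASE WHEN c1 THEN r1 WHEN c2 THEN r2 ELSE r3 END`, where the conditions are evaluated strictly in order.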
[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242748#comment-15242748 ]

ASF GitHub Bot commented on FLINK-3762:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1891#issuecomment-210402429

    Thanks for your contribution @andrewpalumbo :-)

> Kryo StackOverflowError due to disabled Kryo Reference tracking
> ---------------------------------------------------------------
>
>                 Key: FLINK-3762
>                 URL: https://issues.apache.org/jira/browse/FLINK-3762
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>            Reporter: Andrew Palumbo
>            Assignee: Andrew Palumbo
>             Fix For: 1.0.2
>
> As discussed on the dev list, Kryo reference tracking is disabled by default in {{KryoSerializer.java}}:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many objects that may contain recursive references [1]:
> {code}
> java.lang.StackOverflowError
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1] https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1891#issuecomment-210402429

    Thanks for your contribution @andrewpalumbo :-)
[jira] [Commented] (FLINK-3762) Kryo StackOverflowError due to disabled Kryo Reference tracking
[ https://issues.apache.org/jira/browse/FLINK-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242747#comment-15242747 ]

ASF GitHub Bot commented on FLINK-3762:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1891#issuecomment-210402401

    Changes look good to me. +1 for merging.

> Kryo StackOverflowError due to disabled Kryo Reference tracking
> ---------------------------------------------------------------
>
>                 Key: FLINK-3762
>                 URL: https://issues.apache.org/jira/browse/FLINK-3762
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>            Reporter: Andrew Palumbo
>            Assignee: Andrew Palumbo
>             Fix For: 1.0.2
>
> As discussed on the dev list, Kryo reference tracking is disabled by default in {{KryoSerializer.java}}:
> {code}
> kryo.setReferences(false);
> {code}
> This can cause {{StackOverflowError}} exceptions when serializing many objects that may contain recursive references [1]:
> {code}
> java.lang.StackOverflowError
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
> 	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
> {code}
> By enabling reference tracking, we can fix this problem.
> [1] https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
[GitHub] flink pull request: FLINK-3762: StackOverflowError due to disabled...
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1891#issuecomment-210402401

    Changes look good to me. +1 for merging.
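The mechanism behind the FLINK-3762 fix: with reference tracking off, the serializer recurses into every referenced object unconditionally, so a cyclic object graph recurses until the stack overflows; with tracking on, already-seen objects are written as back references. The following is a hypothetical plain-Java sketch of that difference, not Kryo's actual implementation — the real fix is simply enabling `kryo.setReferences(true)` in Flink's {{KryoSerializer}}.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Two-node cyclic object graph used for the demonstration: a -> b -> a
class Node {
    Node next;
}

public class Main {
    // No reference tracking: blindly recurse into each referenced object.
    // On a cyclic graph this never terminates and overflows the stack,
    // mirroring the StackOverflowError in the issue's stack trace.
    static void writeNoTracking(Node n) {
        if (n == null) return;
        writeNoTracking(n.next); // unbounded recursion on cycles
    }

    // With reference tracking: record each object on first visit and stop
    // when an already-seen object is reached (a real serializer would emit
    // a back reference instead of serializing it again).
    static int writeWithTracking(Node n) {
        Map<Node, Integer> seen = new IdentityHashMap<>();
        while (n != null && !seen.containsKey(n)) {
            seen.put(n, seen.size());
            n = n.next;
        }
        return seen.size(); // number of distinct objects actually written
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.next = b;
        b.next = a; // cycle

        boolean overflowed = false;
        try {
            writeNoTracking(a);
        } catch (StackOverflowError e) {
            overflowed = true;
        }
        System.out.println("no tracking overflows: " + overflowed);          // true
        System.out.println("with tracking writes: " + writeWithTracking(a)); // 2
    }
}
```

Note the `IdentityHashMap`: reference tracking must compare by object identity, not `equals()`, or two distinct-but-equal objects would wrongly collapse into one back reference.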
[jira] [Commented] (FLINK-3759) Table API should throw exception if null value is encountered in non-null mode.
[ https://issues.apache.org/jira/browse/FLINK-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242744#comment-15242744 ]

ASF GitHub Bot commented on FLINK-3759:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1892#issuecomment-210401616

    Thanks for the fix. Will merge this PR.

> Table API should throw exception if null value is encountered in non-null mode
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-3759
>                 URL: https://issues.apache.org/jira/browse/FLINK-3759
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Priority: Critical
>
> The Table API can be configured to omit null checks in generated code to speed up processing. Currently, the generated code replaces a null value with a data-type-specific default value when one is encountered in non-null-check mode.
> This can silently cause wrong results and should be changed so that an exception is thrown if a null value is encountered in non-null-check mode.
[GitHub] flink pull request: [FLINK-3759] [table] Table API should throw ex...
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1892#issuecomment-210401616

    Thanks for the fix. Will merge this PR.
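The two null-handling behaviors at issue in FLINK-3759 can be sketched in plain Java. This is a hypothetical illustration, not the Table API's actual generated code: `unboxLenient` models the current default-substitution behavior, `unboxStrict` the proposed fail-fast behavior.

```java
public class Main {
    // Current behavior in non-null-check mode: a null input is silently
    // replaced by the type's default value (0 for ints), so downstream
    // results are wrong with no visible error.
    static int unboxLenient(Integer value) {
        return value == null ? 0 : value;
    }

    // Proposed behavior: fail loudly, so the user learns the data contains
    // nulls and can enable null-check mode instead of getting wrong results.
    static int unboxStrict(Integer value) {
        if (value == null) {
            throw new NullPointerException(
                    "Null value encountered in non-null-check mode");
        }
        return value;
    }

    public static void main(String[] args) {
        Integer[] records = {3, null, 4}; // one null record in the input

        int lenientSum = 0;
        for (Integer r : records) lenientSum += unboxLenient(r);
        // The null silently contributed 0; nothing signals the corruption.
        System.out.println("lenient sum: " + lenientSum);

        try {
            int strictSum = 0;
            for (Integer r : records) strictSum += unboxStrict(r);
            System.out.println("strict sum: " + strictSum);
        } catch (NullPointerException e) {
            System.out.println("strict mode failed: " + e.getMessage());
        }
    }
}
```

The trade-off is the usual one: the lenient path avoids a branch per field in generated code, but as the issue notes, an exception is the only safe signal when the no-null assumption is violated.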