[jira] [Commented] (FLINK-2523) Make task canceling interrupt interval configurable
[ https://issues.apache.org/jira/browse/FLINK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139415#comment-15139415 ]

ASF GitHub Bot commented on FLINK-2523:
---

Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/1612#discussion_r52356401

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java ---
@@ -102,60 +101,150 @@
 	//

 	/**
-	 * Constructs a new job graph with no name and a random job ID.
+	 * Constructs a new job graph with no name, a random job ID, and the default
+	 * {@link ExecutionConfig} parameters.
 	 */
 	public JobGraph() {
 		this((String) null);
 	}

 	/**
-	 * Constructs a new job graph with the given name, a random job ID.
+	 * Constructs a new job graph with the given name, a random job ID and the default
+	 * {@link ExecutionConfig} parameters.
 	 *
 	 * @param jobName The name of the job
 	 */
 	public JobGraph(String jobName) {
-		this(null, jobName);
+		this(null, jobName, new ExecutionConfig());
+	}
+
+	/**
+	 * Constructs a new job graph with the given job ID, the given name, and the default
+	 * {@link ExecutionConfig} parameters.
+	 *
+	 * @param jobID The id of the job. A random ID is generated, if {@code null} is passed.
+	 * @param jobName The name of the job.
+	 */
+	public JobGraph(JobID jobID, String jobName) {
+		this(jobID, jobName, new ExecutionConfig());
 	}

 	/**
-	 * Constructs a new job graph with the given name and a random job ID if null supplied as an id.
+	 * Constructs a new job graph with the given name, the given {@link ExecutionConfig},
+	 * and a random job ID.
+	 *
+	 * @param jobName The name of the job.
+	 * @param config The execution configuration of the job.
+	 */
+	public JobGraph(String jobName, ExecutionConfig config) {
+		this(null, jobName, config);
+	}
+
+	/**
+	 * Constructs a new job graph with the given job ID (or a random ID, if {@code null} is passed),
+	 * the given name and the given execution configuration (see {@link ExecutionConfig}).
 	 *
 	 * @param jobId The id of the job. A random ID is generated, if {@code null} is passed.
 	 * @param jobName The name of the job.
+	 * @param config The execution configuration of the job.
 	 */
-	public JobGraph(JobID jobId, String jobName) {
+	public JobGraph(JobID jobId, String jobName, ExecutionConfig config) {
 		this.jobID = jobId == null ? new JobID() : jobId;
 		this.jobName = jobName == null ? "(unnamed job)" : jobName;
+		this.executionConfig = config;
--- End diff --

Can `executionConfig` be `null`? If not, then we should insert a check here.

> Make task canceling interrupt interval configurable
> ---------------------------------------------------
>
>                 Key: FLINK-2523
>                 URL: https://issues.apache.org/jira/browse/FLINK-2523
>             Project: Flink
>          Issue Type: Improvement
>          Components: TaskManager
>    Affects Versions: 0.10.0
>            Reporter: Stephan Ewen
>            Assignee: Klou
>             Fix For: 1.0.0
>
> When a task is canceled, the cancellation periodically calls "interrupt()" on
> the task thread, if the task thread does not cancel within a certain time.
> Currently, this value is hard coded to 10 seconds. We should make that time
> configurable.
> Until then, I would like to increase the value to 30 seconds, as many tasks
> (here I am observing it for Kafka consumers) can take longer than 10 seconds
> for proper cleanup.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
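[Editor's note] The check the reviewer asks about is a standard fail-fast guard in the constructor. A minimal, self-contained sketch follows; the classes here are stripped-down stand-ins, not the actual Flink code, and the real code base would likely use a `Preconditions.checkNotNull`-style helper instead of a hand-rolled `if`:

```java
// Hypothetical stand-ins for the Flink classes under discussion.
class ExecutionConfig { }

class JobGraph {
    private final ExecutionConfig executionConfig;

    JobGraph(ExecutionConfig config) {
        // Fail fast: reject a null config at construction time instead of
        // failing later with a harder-to-trace NullPointerException.
        if (config == null) {
            throw new NullPointerException("The execution config must not be null.");
        }
        this.executionConfig = config;
    }
}
```

The point of the guard is that the invalid argument is reported at the call site that supplied it, rather than wherever `executionConfig` is first dereferenced.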
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182025617 Shouldn't the current record remain valid if `hasNext()` returned true? I mean the user might be holding on to the object returned in `next`, and expect it to not be changed by a `hasNext` call: ``` T cur = it.next(); if(it.hasNext()) { // here, I would expect cur to not have changed since the next() call } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
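[Editor's note] The surprise described above can be reproduced with a toy reusing iterator. The class below is a hypothetical stand-in, not Flink's `ReusingMutableToRegularIteratorWrapper`; it only illustrates why a record obtained from `next()` must be copied before the following `hasNext()` call if the caller wants to keep it:

```java
// Toy iterator that reuses ONE instance: hasNext() reads ahead into it,
// so the object previously handed out by next() is silently overwritten.
class ReusingIntIterator {
    private final int[] reuse = new int[1]; // the single shared instance
    private int value = 0;
    private final int limit;

    ReusingIntIterator(int limit) {
        this.limit = limit;
    }

    boolean hasNext() {
        if (value >= limit) {
            return false;
        }
        reuse[0] = value++; // read-ahead mutates the shared instance
        return true;
    }

    int[] next() {
        return reuse;
    }
}
```

Holding the object returned by `next()` across a `hasNext()` call is only safe if the caller clones it first (`int[] copy = cur.clone();`); otherwise the read-ahead clobbers it, which is exactly the behavior questioned in the comment.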
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139530#comment-15139530 ] ASF GitHub Bot commented on FLINK-3382: --- Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182025617 Shouldn't the current record remain valid if `hasNext()` returned true? I mean the user might be holding on to the object returned in `next`, and expect it to not be changed by a `hasNext` call: ``` T cur = it.next(); if(it.hasNext()) { // here, I would expect cur to not have changed since the next() call } ``` > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
[ https://issues.apache.org/jira/browse/FLINK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139694#comment-15139694 ] ASF GitHub Bot commented on FLINK-3373: --- GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1615 [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency This makes the HTTP Components dependency disappear from the core classpath, allowing users to use their own version of the dependency. We need shading because we cannot simply bump the HTTP Components version to the newest version. The YARN tests for Hadoop versions >= 2.6.0 fail in that case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink http_shade Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1615.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1615 commit 1be39d12071c7251cd566e692c3a9c7b5440e46d Author: Stephan Ewen Date: 2016-02-09T20:18:43Z [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency > Using a newer library of Apache HttpClient than 4.2.6 will get class loading > problems > - > > Key: FLINK-3373 > URL: https://issues.apache.org/jira/browse/FLINK-3373 > Project: Flink > Issue Type: Bug > Environment: Latest Flink snapshot 1.0 >Reporter: Jakob Sultan Ericsson > > When I try to use Apache HTTP client 4.5.1 in my Flink job, it will crash > with NoClassDefFound. > This is because it loads some classes from the provided httpclient 4.2.5/6 in > core Flink. > {noformat} > 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task >- DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with > exception.
> java.lang.NoSuchFieldError: INSTANCE
>         at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144)
>         at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109)
>         at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116)
>         ...
>         at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>         at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It loads SSLConnectionSocketFactory and finds an earlier version of AllowAllHostnameVerifier that does not have the INSTANCE field (the INSTANCE field was probably added in 4.3).
> {noformat}
> jar tvf lib/flink-dist-1.0-SNAPSHOT.jar |grep AllowAllHostnameVerifier
>    791 Thu Dec 17 09:55:46 CET 2015 org/apache/http/conn/ssl/AllowAllHostnameVerifier.class
> {noformat}
> Solutions would be:
> - Fix the classloader so that my custom job does not conflict with internal flink-core classes... pretty hard
> - Remove the dependency somehow.
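[Editor's note] The dependency relocation described in the linked pull request is what the maven-shade-plugin's `<relocation>` element does. The sketch below is illustrative only; the plugin coordinates are real, but the shaded package prefix is an assumption, not the exact value used in the Flink build:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Move the HTTP Components classes pulled in via Hadoop under a
               Flink-internal package, so user jars can bring their own
               httpclient version without classpath conflicts. -->
          <relocation>
            <pattern>org.apache.http</pattern>
            <shadedPattern>org.apache.flink.hadoop.shaded.org.apache.http</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, the copy bundled in flink-dist no longer occupies the `org.apache.http` namespace, so the user's own httpclient 4.5.x classes are the only ones visible under that name.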
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182035391 Ah, yes, now I see. I'll just burn this.
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139652#comment-15139652 ] ASF GitHub Bot commented on FLINK-3382: --- Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182035391 Ah, yes, now I see. I'll just burn this. > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139653#comment-15139653 ] ASF GitHub Bot commented on FLINK-3382: --- Github user greghogan closed the pull request at: https://github.com/apache/flink/pull/1614 > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Comment Edited] (FLINK-3333) Documentation about object reuse should be improved
[ https://issues.apache.org/jira/browse/FLINK-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139151#comment-15139151 ] Greg Hogan edited comment on FLINK-3333 at 2/9/16 8:07 PM: --- Apache Flink programs can be written and configured to reduce the number of object allocations for better performance. User defined functions (like map() or groupReduce()) process many millions or billions of input and output values. Enabling object reuse and processing mutable objects improves performance by lowering demand on the CPU cache and Java garbage collector. Object reuse is disabled by default, with user defined functions generally getting new objects on each call (or through an iterator). In this case it is safe to store references to the objects inside the function (for example, in a List). <'storing values in a list' example> Apache Flink will chain functions to improve performance when sorting is preserved and the parallelism unchanged. The chainable operators are Map, FlatMap, Reduce on full DataSet, GroupCombine on a Grouped DataSet, or a GroupReduce where the user supplied a RichGroupReduceFunction with a combine method. Objects are passed without copying _even when object reuse is disabled_. In the chaining case, the functions in the chain are receiving the same object instances. So the second map() function is receiving the objects the first map() is returning. This behavior can lead to errors when the first map() function keeps a list of all objects and the second mapper is modifying objects. In that case, the user has to manually create copies of the objects before putting them into the list. There is a switch at the ExecutionConfig which allows users to enable the object reuse mode (enableObjectReuse()). For mutable types, Flink will reuse object instances. In practice that means that a user function will always receive the same object instance (with its fields set to new values). 
The object reuse mode will lead to better performance because fewer objects are created, but the user has to manually take care of what they are doing with the object references. was (Author: greghogan): Apache Flink programs can be written and configured to reduce the number of object allocations for better performance. User defined functions (like map() or groupReduce()) process many millions or billions of input and output values. Enabling object reuse and processing mutable objects improves performance by lowering demand on the CPU cache and Java garbage collector. Object reuse is disabled by default, with user defined functions generally getting new objects on each call (or through an iterator). In this case it is safe to store references to the objects inside the function (for example, in a List). <'storing values in a list' example> Apache Flink will chain functions to improve performance when sorting is preserved and the parallelism unchanged. The chainable operators are Map, FlatMap, Reduce on full DataSet, GroupCombine on a Grouped DataSet, or a GroupReduce where the user supplied a RichGroupReduceFunction with a combine method). Objects are passed without copying _even when object reuse is disabled_. In the chaining case, the functions in the chain are receiving the same object instances. So the the second map() function is receiving the objects the first map() is returning. This behavior can lead to errors when the first map() function keeps a list of all objects and the second mapper is modifying objects. In that case, the user has to manually create copies of the objects before putting them into the list. There is a switch at the ExecutionConfig which allows users to enable the object reuse mode (enableObjectReuse()). For mutable types, Flink will reuse object instances. In practice that means that a user function will always receive the same object instance (with its fields set to new values). 
The object reuse mode will lead to better performance because fewer objects are created, but the user has to manually take care of what they are doing with the object references. > Documentation about object reuse should be improved > --- > > Key: FLINK-3333 > URL: https://issues.apache.org/jira/browse/FLINK-3333 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Gabor Gevay >Assignee: Gabor Gevay >Priority: Blocker > Fix For: 1.0.0 > > > The documentation about object reuse \[1\] has several problems, see \[2\]. > \[1\] > https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html#object-reuse-behavior > \[2\] > https://docs.google.com/document/d/1cgkuttvmj4jUonG7E2RdFVjKlfQDm_hE6gvFcgAfzXg/edit
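[Editor's note] The pitfall described in the draft documentation, a function storing references while each call refills the same instance, can be demonstrated without any Flink APIs. The class below is a plain-Java illustration, not Flink code:

```java
import java.util.ArrayList;
import java.util.List;

// Simulates object reuse: the "runtime" refills ONE instance per element.
// Storing the bare reference makes every list entry alias that instance;
// cloning before storing (the manual copy the text advises) keeps distinct values.
class ObjectReuseDemo {

    // Returns { first value seen via stored references, first value seen via copies }.
    static int[] run() {
        int[] reuse = new int[1]; // the single reused instance
        List<int[]> storedRefs = new ArrayList<>();
        List<int[]> storedCopies = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            reuse[0] = i;                     // runtime sets new field values
            storedRefs.add(reuse);            // unsafe under object reuse
            storedCopies.add(reuse.clone());  // safe: manual copy before storing
        }
        return new int[] { storedRefs.get(0)[0], storedCopies.get(0)[0] };
    }
}
```

After the loop, every entry of `storedRefs` points at the same object and shows the last value written, while `storedCopies` preserves 0, 1, 2 as intended.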
[GitHub] flink pull request: [FLINK-3373] [build] Shade away Hadoop's HTTP ...
GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1615 [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency This makes the HTTP Components dependency disappear from the core classpath, allowing users to use their own version of the dependency. We need shading because we cannot simply bump the HTTP Components version to the newest version. The YARN tests for Hadoop versions >= 2.6.0 fail in that case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink http_shade Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1615.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1615 commit 1be39d12071c7251cd566e692c3a9c7b5440e46d Author: Stephan Ewen Date: 2016-02-09T20:18:43Z [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1614 [FLINK-3382] Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3382_improve_clarity_of_object_reuse_in_ReusingMutableToRegularIteratorWrapper Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1614.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1614
[GitHub] flink pull request: [FLINK-3260] [runtime] Enforce terminal state ...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1613#discussion_r52368138 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java --- @@ -107,7 +108,7 @@ private static final AtomicReferenceFieldUpdater<Execution, ExecutionState> STATE_UPDATER = AtomicReferenceFieldUpdater.newUpdater(Execution.class, ExecutionState.class, "state"); - private static final Logger LOG = ExecutionGraph.LOG; + private static final Logger LOG = LoggerFactory.getLogger(Execution.class); --- End diff -- Did this cause issues in this case? I originally set the logger to the ExecutionGraph logger to get all messages related to the execution and its changes in one log namespace. I always thought that makes searching the log easier.
[jira] [Commented] (FLINK-2380) Allow to configure default FS for file inputs
[ https://issues.apache.org/jira/browse/FLINK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139717#comment-15139717 ] ASF GitHub Bot commented on FLINK-2380: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-182054657 I'm testing the change on a cluster (with YARN) to see if everything is working as expected. > Allow to configure default FS for file inputs > - > > Key: FLINK-2380 > URL: https://issues.apache.org/jira/browse/FLINK-2380 > Project: Flink > Issue Type: Improvement > Components: JobManager >Affects Versions: 0.9, 0.10.0 >Reporter: Ufuk Celebi >Assignee: Klou >Priority: Minor > Labels: starter > Fix For: 1.0.0 > > > File inputs use "file://" as default prefix. A user asked to make this > configurable, e.g. "hdfs://" as default. > (I'm not sure whether this is already possible or not. I will check and > either close or implement this for the user.)
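[Editor's note] The behavior requested here surfaces as a configuration entry. A sketch of the intended flink-conf.yaml usage; the key name `fs.default-scheme` and the example address reflect the change under review and may differ from what was finally merged:

```yaml
# Make paths without an explicit scheme resolve against HDFS
# instead of the local file system ("file://").
fs.default-scheme: hdfs://namenode-host:9000/
```

With such a default in place, a job that reads `/data/input` would resolve it against the configured HDFS namenode rather than the local file system.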
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139441#comment-15139441 ] ASF GitHub Bot commented on FLINK-3382: --- GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1614 [FLINK-3382] Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3382_improve_clarity_of_object_reuse_in_ReusingMutableToRegularIteratorWrapper Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1614.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1614 > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3260) ExecutionGraph gets stuck in state FAILING
[ https://issues.apache.org/jira/browse/FLINK-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139699#comment-15139699 ] ASF GitHub Bot commented on FLINK-3260: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1613#discussion_r52368138 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java --- @@ -107,7 +108,7 @@ private static final AtomicReferenceFieldUpdater<Execution, ExecutionState> STATE_UPDATER = AtomicReferenceFieldUpdater.newUpdater(Execution.class, ExecutionState.class, "state"); - private static final Logger LOG = ExecutionGraph.LOG; + private static final Logger LOG = LoggerFactory.getLogger(Execution.class); --- End diff -- Did this cause issues in this case? I originally set the logger to the ExecutionGraph logger to get all messages related to the execution and its changes in one log namespace. I always thought that makes searching the log easier. > ExecutionGraph gets stuck in state FAILING > -- > > Key: FLINK-3260 > URL: https://issues.apache.org/jira/browse/FLINK-3260 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Till Rohrmann >Priority: Blocker > Fix For: 1.0.0 > > > It is a bit of a rare case, but the following can currently happen: > 1. Jobs runs for a while, some tasks are already finished. > 2. Job fails, goes to state failing and restarting. Non-finished tasks fail > or are canceled. > 3. For the finished tasks, ask-futures from certain messages (for example > for releasing intermediate result partitions) can fail (timeout) and cause > the execution to go from FINISHED to FAILED > 4. This triggers the execution graph to go to FAILING without ever going > further into RESTARTING again > 5. The job is stuck > It initially looks like this is mainly an issue for batch jobs (jobs where > tasks do finish, rather than run infinitely). 
> The log that shows how this manifests:
> {code}
> 17:19:19,782 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:19,844 INFO  Remoting - Starting remoting
> 17:19:20,065 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:56722]
> 17:19:20,090 INFO  org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-6766f51a-1c51-4a03-acfb-08c2c29c11f0
> 17:19:20,096 INFO  org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:43327 - max concurrent requests: 50 - max backlog: 1000
> 17:19:20,113 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
> 17:19:20,115 INFO  org.apache.flink.runtime.checkpoint.SavepointStoreFactory - No savepoint state backend configured. Using job manager savepoint state backend.
> 17:19:20,118 INFO  org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@127.0.0.1:56722/user/jobmanager.
> 17:19:20,123 INFO  org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@127.0.0.1:56722/user/jobmanager was granted leadership with leader session ID None.
> 17:19:25,605 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43702/user/taskmanager) as f213232054587f296a12140d56f63ed1. Current number of registered hosts is 1. Current number of alive task slots is 2.
> 17:19:26,758 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43956/user/taskmanager) as f9e78baa14fb38c69517fb1bcf4f419c. Current number of registered hosts is 2. Current number of alive task slots is 4.
> 17:19:27,064 INFO  org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers
> 17:19:27,071 INFO  org.apache.flink.client.program.Client - Starting client actor system
> 17:19:27,072 INFO  org.apache.flink.runtime.client.JobClient - Starting JobClient actor system
> 17:19:27,110 INFO  akka.event.slf4j.Slf4jLogger
[jira] [Commented] (FLINK-3260) ExecutionGraph gets stuck in state FAILING
[ https://issues.apache.org/jira/browse/FLINK-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139701#comment-15139701 ] ASF GitHub Bot commented on FLINK-3260: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1613#issuecomment-182046481 Looks good, with one inline comment. Otherwise, +1 to merge > ExecutionGraph gets stuck in state FAILING > -- > > Key: FLINK-3260 > URL: https://issues.apache.org/jira/browse/FLINK-3260 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Till Rohrmann >Priority: Blocker > Fix For: 1.0.0 > > > It is a bit of a rare case, but the following can currently happen: > 1. Jobs runs for a while, some tasks are already finished. > 2. Job fails, goes to state failing and restarting. Non-finished tasks fail > or are canceled. > 3. For the finished tasks, ask-futures from certain messages (for example > for releasing intermediate result partitions) can fail (timeout) and cause > the execution to go from FINISHED to FAILED > 4. This triggers the execution graph to go to FAILING without ever going > further into RESTARTING again > 5. The job is stuck > It initially looks like this is mainly an issue for batch jobs (jobs where > tasks do finish, rather than run infinitely). 
> The log that shows how this manifests:
> {code}
> 17:19:19,782 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:19,844 INFO  Remoting - Starting remoting
> 17:19:20,065 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:56722]
> 17:19:20,090 INFO  org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-6766f51a-1c51-4a03-acfb-08c2c29c11f0
> 17:19:20,096 INFO  org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:43327 - max concurrent requests: 50 - max backlog: 1000
> 17:19:20,113 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
> 17:19:20,115 INFO  org.apache.flink.runtime.checkpoint.SavepointStoreFactory - No savepoint state backend configured. Using job manager savepoint state backend.
> 17:19:20,118 INFO  org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@127.0.0.1:56722/user/jobmanager.
> 17:19:20,123 INFO  org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@127.0.0.1:56722/user/jobmanager was granted leadership with leader session ID None.
> 17:19:25,605 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43702/user/taskmanager) as f213232054587f296a12140d56f63ed1. Current number of registered hosts is 1. Current number of alive task slots is 2.
> 17:19:26,758 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43956/user/taskmanager) as f9e78baa14fb38c69517fb1bcf4f419c. Current number of registered hosts is 2. Current number of alive task slots is 4.
> 17:19:27,064 INFO  org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers
> 17:19:27,071 INFO  org.apache.flink.client.program.Client - Starting client actor system
> 17:19:27,072 INFO  org.apache.flink.runtime.client.JobClient - Starting JobClient actor system
> 17:19:27,110 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:27,121 INFO  Remoting - Starting remoting
> 17:19:27,143 INFO  org.apache.flink.runtime.client.JobClient - Started JobClient actor system at 127.0.0.1:51198
> 17:19:27,145 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:51198]
> 17:19:27,325 INFO  org.apache.flink.runtime.client.JobClientActor - Disconnect from JobManager null.
> 17:19:27,362 INFO  org.apache.flink.runtime.client.JobClientActor - Received job Flink Java Job at Mon Jan
[GitHub] flink pull request: [FLINK-3260] [runtime] Enforce terminal state ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1613#issuecomment-182046481 Looks good, with one inline comment. Otherwise, +1 to merge
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281750 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. --- End diff -- "The DataSet can be ...", add "The"
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281721 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. --- End diff -- "...the DataSet on the **an** extracted key...", remove "an"
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138686#comment-15138686 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181790876 I agree with @sachingoel0101 on the import complexity but, from our point of view, Flink is the perfect platform to evaluate models in streaming, and we are using it that way in our architecture. Why do you think it wouldn't be suitable? > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138722#comment-15138722 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181798643 That is a good point. In a streaming setting, it does indeed make sense for the model to be available. However, in my opinion, it would then make sense to just use jpmml and import the object, followed by extracting the model parameters. Granted, it is an added effort on the user side, but I still think it beats the complexity introduced by supporting imports directly. Furthermore, it would be a bad design to have to reject valid PMML models just because a minor thing isn't supported in Flink.
[jira] [Commented] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
[ https://issues.apache.org/jira/browse/FLINK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138787#comment-15138787 ] Stephan Ewen commented on FLINK-3373: - Would it help if Flink simply updated the HTTP Client to the latest version (is it backwards compatible)? If not, then we need to shade the dependency, but if yes, that would be a very lightweight fix. > Using a newer library of Apache HttpClient than 4.2.6 will get class loading > problems > - > > Key: FLINK-3373 > URL: https://issues.apache.org/jira/browse/FLINK-3373 > Project: Flink > Issue Type: Bug > Environment: Latest Flink snapshot 1.0 >Reporter: Jakob Sultan Ericsson > > When I try to use Apache HTTP client 4.5.1 in my Flink job, it crashes > with NoClassDefFound. > This has to do with it loading some classes from the provided httpclient 4.2.5/6 in > core Flink. > {noformat} > 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task >- DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with > exception. > java.lang.NoSuchFieldError: INSTANCE > at > org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144) > at > org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109) > at > org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116) > ... 
> at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SSLConnectionSocketFactory resolves an earlier version of > AllowAllHostnameVerifier that does not have the INSTANCE field (the INSTANCE > field was probably added in 4.3). > {noformat} > jar tvf lib/flink-dist-1.0-SNAPSHOT.jar | grep AllowAllHostnameVerifier >791 Thu Dec 17 09:55:46 CET 2015 > org/apache/http/conn/ssl/AllowAllHostnameVerifier.class > {noformat} > Solutions would be: > - Fix the classloader so that my custom job does not conflict with internal > flink-core classes... pretty hard > - Remove the dependency somehow.
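A quick way to diagnose this kind of conflict is to ask the JVM where a class was actually loaded from. A minimal, standalone sketch (the class name `ClassOrigin` is illustrative, not from the ticket; in a real job you would pass the conflicting httpclient class instead of `String.class`):

```java
// Print the jar or path a class was loaded from, to see whether it
// resolves to flink-dist or to the user's job jar.
public class ClassOrigin {

    static String originOf(Class<?> clazz) {
        java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader report a null CodeSource.
        return src == null ? "bootstrap classpath" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // In a Flink job, pass e.g. AllowAllHostnameVerifier.class here;
        // java.lang.String stands in so this sketch runs anywhere.
        System.out.println("java.lang.String -> " + originOf(String.class));
    }
}
```

Printing the origin of `AllowAllHostnameVerifier` inside the failing job would show whether it comes from `flink-dist-1.0-SNAPSHOT.jar` or from the user code classloader.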
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138788#comment-15138788 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293343 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenUtils.scala --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import java.util.concurrent.atomic.AtomicInteger + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo._ +import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo._ +import org.apache.flink.api.common.typeinfo.{NumericTypeInfo, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +object CodeGenUtils { + + private val nameCounter = new AtomicInteger + + def newName(name: String): String = { +s"$name$$${nameCounter.getAndIncrement}" + } + + // when casting we first need to unbox Primitives, for example, + // float a = 1.0f; + // byte b = (byte) a; + // works, but for boxed types we need this: + // Float a = 1.0f; + // Byte b = (byte)(float) a; + def primitiveTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "int" +case LONG_TYPE_INFO => "long" +case SHORT_TYPE_INFO => "short" +case BYTE_TYPE_INFO => "byte" +case FLOAT_TYPE_INFO => "float" +case DOUBLE_TYPE_INFO => "double" +case BOOLEAN_TYPE_INFO => "boolean" +case CHAR_TYPE_INFO => "char" + +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. 
+case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def boxedTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. +case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def primitiveDefaultValue(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "-1" +case LONG_TYPE_INFO => "-1" +case SHORT_TYPE_INFO => "-1" +case BYTE_TYPE_INFO => "-1" +case FLOAT_TYPE_INFO => "-1.0f" +case DOUBLE_TYPE_INFO => "-1.0d" +case BOOLEAN_TYPE_INFO => "false" +case STRING_TYPE_INFO => "\"\"" +case CHAR_TYPE_INFO => "'\\0'" +case _ => "null" + } + + def requireNumeric(genExpr: GeneratedExpression) = genExpr.resultType match { +case nti:
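The unboxing comment in `primitiveTypeTermForTypeInfo` can be verified with plain Java: a boxed value cannot be narrowed directly, so generated casts must go through the primitive type first. A standalone sketch (not Flink code):

```java
// Demonstrates why generated code casts boxed values through the
// primitive type before narrowing.
public class BoxedCast {
    public static void main(String[] args) {
        float primitive = 1.0f;
        byte a = (byte) primitive;        // narrowing a primitive works directly

        Float boxed = 1.0f;
        // byte b = (byte) boxed;         // does not compile: Float is not convertible to byte
        byte b = (byte) (float) boxed;    // unbox to float first, then narrow

        System.out.println(a + " " + b);  // prints "1 1"
    }
}
```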
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293343 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenUtils.scala --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import java.util.concurrent.atomic.AtomicInteger + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo._ +import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo._ +import org.apache.flink.api.common.typeinfo.{NumericTypeInfo, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +object CodeGenUtils { + + private val nameCounter = new AtomicInteger + + def newName(name: String): String = { +s"$name$$${nameCounter.getAndIncrement}" + } + + // when casting we first need to unbox Primitives, for example, + // float a = 1.0f; + // byte b = (byte) a; + // works, but for boxed types we need this: + // Float a = 1.0f; + // Byte b = (byte)(float) a; + def primitiveTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "int" +case LONG_TYPE_INFO => "long" +case SHORT_TYPE_INFO => "short" +case BYTE_TYPE_INFO => "byte" +case FLOAT_TYPE_INFO => "float" +case DOUBLE_TYPE_INFO => "double" +case BOOLEAN_TYPE_INFO => "boolean" +case CHAR_TYPE_INFO => "char" + +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. 
+case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def boxedTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. +case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def primitiveDefaultValue(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "-1" +case LONG_TYPE_INFO => "-1" +case SHORT_TYPE_INFO => "-1" +case BYTE_TYPE_INFO => "-1" +case FLOAT_TYPE_INFO => "-1.0f" +case DOUBLE_TYPE_INFO => "-1.0d" +case BOOLEAN_TYPE_INFO => "false" +case STRING_TYPE_INFO => "\"\"" +case CHAR_TYPE_INFO => "'\\0'" +case _ => "null" + } + + def requireNumeric(genExpr: GeneratedExpression) = genExpr.resultType match { +case nti: NumericTypeInfo[_] => // ok +case _ => throw new CodeGenException("Numeric expression type expected.") + } + + def requireString(genExpr: GeneratedExpression) = genExpr.resultType match { +case STRING_TYPE_INFO =>
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138570#comment-15138570 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281721 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. --- End diff -- "...the DataSet on the **an** extracted key...", remove "an" > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
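For intuition, the operation requested in the issue — sorting each partition's elements on a key produced by a selector function — can be mimicked outside Flink with a plain comparator. A minimal sketch (class and method names here are illustrative, not the actual DataSet API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class KeySelectorSortSketch {

    // Plain-Java analogue of a KeySelector-based sortPartition:
    // sort one partition's elements on an extracted key, in either order.
    static <T, K extends Comparable<K>> void sortPartition(
            List<T> partition, Function<T, K> keySelector, boolean ascending) {
        Comparator<T> cmp = Comparator.comparing(keySelector);
        partition.sort(ascending ? cmp : cmp.reversed());
    }

    public static void main(String[] args) {
        List<String> partition = new ArrayList<>(Arrays.asList("flink", "is", "fast"));

        sortPartition(partition, String::length, true);   // ascending by length
        System.out.println(partition);                    // [is, fast, flink]

        sortPartition(partition, String::length, false);  // descending by length
        System.out.println(partition);                    // [flink, fast, is]
    }
}
```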
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282118 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -36,27 +40,58 @@ */ public class SortPartitionOperator<T> extends SingleInputOperator<T, T, SortPartitionOperator<T>> { - private int[] sortKeyPositions; + private List<Keys<T>> keys; - private Order[] sortOrders; + private List<Order> orders; private final String sortLocationName; + private boolean useKeySelector; - public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + private SortPartitionOperator(DataSet<T> dataSet, String sortLocationName) { super(dataSet, dataSet.getType()); + + keys = new ArrayList<>(); + orders = new ArrayList<>(); this.sortLocationName = sortLocationName; + } + + + public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); - int[] flatOrderKeys = getFlatFields(sortField); - this.appendSorting(flatOrderKeys, sortOrder); + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); } public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrder, String sortLocationName) { - super(dataSet, dataSet.getType()); - this.sortLocationName = sortLocationName; + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); + + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); + } + + public SortPartitionOperator(DataSet<T> dataSet, Keys<T> sortKey, Order sortOrder, String sortLocationName) { --- End diff -- Change the `sortKey` parameter type to `SelectorFunctionKeys` (or accept the `KeySelector` and create the `SelectorFunctionKeys` in the constructor).
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138775#comment-15138775 ] ASF GitHub Bot commented on FLINK-3355: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1608 > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads.
[GitHub] flink pull request: [FLINK-3371] [api breaking] Move TriggerResult...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181813971 Exactly, the `AlignedTrigger` will have an `AlignedTriggerContext` without Key/Value state.
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1608
[jira] [Commented] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docs
[ https://issues.apache.org/jira/browse/FLINK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138781#comment-15138781 ] Stephan Ewen commented on FLINK-3376: - Some of that information is available, but hidden deep in the Streaming API guide. I vote to link this page on the same level as the Batch and Streaming API guide > Add an illustration of Event Time and Watermarks to the docs > > > Key: FLINK-3376 > URL: https://issues.apache.org/jira/browse/FLINK-3376 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Priority: Critical > Fix For: 1.0.0 > > > Users seem to get confused about how event time and watermarks work. > We need to add documentation with two sections: > 1. Event time and watermark progress in general > - Watermarks are generated at the sources > - How Watermarks progress through the streaming data flow > 2. Ways that users can generate watermarks > - EventTimeSourceFunctions > - AscendingTimestampExtractor > - TimestampExtractor general case
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138784#comment-15138784 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293004 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ --- End diff -- Intellij tells me this import is unused > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). 
The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). > - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293004 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ --- End diff -- Intellij tells me this import is unused
[jira] [Assigned] (FLINK-3359) Make RocksDB file copies asynchronous
[ https://issues.apache.org/jira/browse/FLINK-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek reassigned FLINK-3359: --- Assignee: Aljoscha Krettek > Make RocksDB file copies asynchronous > - > > Key: FLINK-3359 > URL: https://issues.apache.org/jira/browse/FLINK-3359 > Project: Flink > Issue Type: Bug > Components: state backends >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Aljoscha Krettek > Fix For: 1.0.0 > > > While the incremental backup of the RocksDB files needs to be synchronous, > the copying of that file to the backup file system can be fully asynchronous.
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282391 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SortPartitionITCase.scala --- @@ -166,6 +167,58 @@ class SortPartitionITCase(mode: TestExecutionMode) extends MultipleProgramsTestB TestBaseUtils.compareResultAsText(result.asJava, expected) } + @Test + def testSortPartitionWithKeySelector1(): Unit = { +val env = ExecutionEnvironment.getExecutionEnvironment +env.setParallelism(4) +val ds = CollectionDataSets.get3TupleDataSet(env) + +val result = ds + .map { x => x }.setParallelism(4) + .sortPartition(_._2, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test).
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138594#comment-15138594 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282336 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SortPartitionITCase.java --- @@ -197,6 +198,58 @@ public void testSortPartitionParallelismChange() throws Exception { compareResultAsText(result, expected); } + @Test + public void testSortPartitionWithKeySelector1() throws Exception { + /* +* Test sort partition on an extracted key +*/ + + final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + env.setParallelism(4); + + DataSet<Tuple3<Integer, Long, String>> ds = CollectionDataSets.get3TupleDataSet(env); + List<Tuple3<Integer, Long, String>> result = ds + .map(new IdMapper<Tuple3<Integer, Long, String>>()).setParallelism(4) // parallelize input + .sortPartition(new KeySelector<Tuple3<Integer, Long, String>, Long>() { + @Override + public Long getKey(Tuple3<Integer, Long, String> value) throws Exception { + return value.f1; + } + }, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test). > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138593#comment-15138593 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181772422 That said, just for comparison purposes, Spark has its own model export and import feature, along with PMML export. Hoping to fully support PMML import in a framework like Flink or Spark is a next to impossible thing which would require changes to the entire way our pipelines and datasets are represented. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138591#comment-15138591 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282275 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala --- @@ -1508,6 +1508,31 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) { new SortPartitionOperator[T](javaSet, field, order, getCallLocationName())) } + /** +* Locally sorts the partitions of the DataSet on the specified field in the specified order. +* The DataSet can be sorted on multiple fields by chaining sortPartition() calls. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the +* partition by multiple values using KeySelector, the KeySelector must return a tuple +* consisting of the values. +*/ + def sortPartition[K: TypeInformation](fun: T => K, order: Order): DataSet[T] ={ --- End diff -- Copy the method docs from the `DataSet.java`. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > DataSet<MyObject> data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3243] Fix Interplay of TimeCharacterist...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181777122 You're right, I'm changing it. But it was also me who didn't notice when we put it in initially :sweat_smile:
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181790876 I agree with @sachingoel0101 on the import complexity but, from our point of view, Flink is the perfect platform to evaluate models in streaming and we are using it that way in our architecture. Why do you think it wouldn't be suitable?
[jira] [Commented] (FLINK-3366) Rename @Experimental annotation to @PublicEvolving
[ https://issues.apache.org/jira/browse/FLINK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138701#comment-15138701 ] ASF GitHub Bot commented on FLINK-3366: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1599#issuecomment-181794025 Can you update the JavaDoc of the `@PublicEvolving` annotation? Otherwise, good to merge... > Rename @Experimental annotation to @PublicEvolving > -- > > Key: FLINK-3366 > URL: https://issues.apache.org/jira/browse/FLINK-3366 > Project: Flink > Issue Type: Task >Affects Versions: 1.0.0 >Reporter: Fabian Hueske >Assignee: Fabian Hueske >Priority: Blocker > Fix For: 1.0.0 > > > As per discussion on the dev ML, rename the @Experimental annotation to > @PublicEvolving. > Experimental might suggest instable / unreliable functionality which is not > the intended meaning of this annotation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
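For context, a stability-marker annotation of this kind is only a few lines of plain Java. The sketch below is illustrative and not Flink's actual source; it shows the meta-annotations such a marker typically carries:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class PublicEvolvingSketch {

    /**
     * Marks API as public and supported, but still free to evolve between
     * minor releases. Sketch only; not Flink's actual source.
     */
    @Documented
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
    @Retention(RetentionPolicy.RUNTIME)
    public @interface PublicEvolving {}

    // RUNTIME retention keeps the marker visible to reflection-based API audits.
    @PublicEvolving
    public static class EvolvingApi {}

    public static void main(String[] args) {
        System.out.println(EvolvingApi.class.isAnnotationPresent(PublicEvolving.class));
    }
}
```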
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52291201 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ +import org.apache.flink.api.table.codegen.OperatorCodeGen._ +import org.apache.flink.api.table.plan.TypeConverter.sqlTypeToTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +import scala.collection.JavaConversions._ +import scala.collection.mutable + +class CodeGenerator( +config: TableConfig, +input1: TypeInformation[Any], +input2: Option[TypeInformation[Any]] = None) + extends RexVisitor[GeneratedExpression] { + + // set of member statements that will be added only once + // we use a LinkedHashSet to keep the insertion order + private val reusableMemberStatements = mutable.LinkedHashSet[String]() + + // set of constructor statements that will be added only once + // we use a LinkedHashSet to keep the insertion order + private val reusableInitStatements = mutable.LinkedHashSet[String]() + + // map of initial input unboxing expressions that will be added only once + // (inputTerm, index) -> expr + private val reusableInputUnboxingExprs = mutable.Map[(String, Int), GeneratedExpression]() + + def reuseMemberCode(): String = { +reusableMemberStatements.mkString("", "\n", "\n") + } + + def reuseInitCode(): String = { +reusableInitStatements.mkString("", "\n", "\n") + } + + def reuseInputUnboxingCode(): String = { 
+reusableInputUnboxingExprs.values.map(_.code).mkString("", "\n", "\n") + } + + def input1Term = "in1" + + def input2Term = "in2" + + def collectorTerm = "c" + + def outRecordTerm = "out" + + def nullCheck: Boolean = config.getNullCheck + + def generateExpression(rex: RexNode): GeneratedExpression = { +rex.accept(this) + } + + def generateFunction[T <: Function]( + name: String, + clazz: Class[T], + bodyCode: String, + returnType: TypeInformation[Any]) +: GeneratedFunction[T] = { +val funcName = newName(name) + +// Janino does not support generics, that's why we need +// manual casting here +val samHeader = + if (clazz == classOf[FlatMapFunction[_,_]]) { +val inputTypeTerm = boxedTypeTermForTypeInfo(input1) +(s"void flatMap(Object _in1, org.apache.flink.util.Collector $collectorTerm)", + s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;") + } else if (clazz == classOf[MapFunction[_,_]]) { +val inputTypeTerm = boxedTypeTermForTypeInfo(input1) +("Object map(Object _in1)", + s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;") + } else { +// TODO more functions +throw new CodeGenException("Unsupported Function.") + } + +val funcCode = j""" + public class $funcName + implements ${clazz.getCanonicalName} { + +${reuseMemberCode()} + +public $funcName() { +
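The `reusableMemberStatements` pattern quoted above — add each generated member statement once, keep insertion order, then join them into the emitted class body — can be reproduced with a plain `LinkedHashSet`. A minimal Java sketch with illustrative names:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ReusableStatementsSketch {

    // A LinkedHashSet adds each generated statement only once while keeping
    // first-insertion order, exactly the property the code generator relies on.
    private final Set<String> reusableMemberStatements = new LinkedHashSet<>();

    public void addReusableMember(String statement) {
        reusableMemberStatements.add(statement); // no-op for duplicates
    }

    public String reuseMemberCode() {
        return String.join("\n", reusableMemberStatements) + "\n";
    }

    public static void main(String[] args) {
        ReusableStatementsSketch gen = new ReusableStatementsSketch();
        gen.addReusableMember("transient int in1;");
        gen.addReusableMember("transient int in2;");
        gen.addReusableMember("transient int in1;"); // already present, kept once
        System.out.print(gen.reuseMemberCode());
    }
}
```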
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138768#comment-15138768 ] ASF GitHub Bot commented on FLINK-3243: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181810307 I rebased it to master and updated. > Fix Interplay of TimeCharacteristic and Time Windows > > > Key: FLINK-3243 > URL: https://issues.apache.org/jira/browse/FLINK-3243 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.0.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek >Priority: Blocker > > As per the discussion on the Dev ML: > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Time-Behavior-in-Streaming-Jobs-Event-time-processing-time-td9616.html. > The discussion seems to have converged on option 2): > - Add dedicated WindowAssigners for processing time and event time > - {{timeWindow()}} and {{timeWindowAll()}} respect the set > {{TimeCharacteristic}}. > This will make the easy stuff easy, i.e. using time windows and quickly > switching the time characteristic. Users will then have the flexibility to > mix different kinds of window assigners in their job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
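The dispatch that option 2 describes — `timeWindow()` resolving to a dedicated window assigner based on the configured `TimeCharacteristic` — boils down to a small switch. A hedged plain-Java sketch; the returned names are illustrative stand-ins, not guaranteed to match Flink's assigner classes:

```java
public class TimeWindowDispatchSketch {

    public enum TimeCharacteristic { ProcessingTime, IngestionTime, EventTime }

    // timeWindow() under option 2: pick the assigner from the environment's
    // time characteristic instead of a single generic "time window".
    public static String assignerFor(TimeCharacteristic characteristic) {
        switch (characteristic) {
            case ProcessingTime:
                return "TumblingProcessingTimeWindows";
            default:
                // Ingestion time behaves like event time with source-assigned timestamps.
                return "TumblingEventTimeWindows";
        }
    }

    public static void main(String[] args) {
        System.out.println(assignerFor(TimeCharacteristic.EventTime));
    }
}
```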
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1595#issuecomment-181824403 Hi Timo, the PR looks really good :-) I found the following issues / questions: - Accessing of POJO fields might not work. - Can you add method comments to the code generation methods in `CodeGenerator` and `CodeGenUtils`? - Would it make sense to separate the function and expression code gen, i.e., split the `CodeGenerator` class?
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138805#comment-15138805 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1595#issuecomment-181824403 Hi Timo, the PR looks really good :-) I found the following issues / questions: - Accessing of POJO fields might not work. - Can you add method comments to the code generation methods in `CodeGenerator` and `CodeGenUtils`? - Would it make sense to separate the function and expression code gen, i.e., split the `CodeGenerator` class? > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). 
> - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282174 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -79,58 +119,41 @@ public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrd * local partition sorting of the DataSet. * * @param field The field expression referring to the field of the additional sort order of -* the local partition sorting. -* @param order The order of the additional sort order of the local partition sorting. +* the local partition sorting. +* @param order The order of the additional sort order of the local partition sorting. * @return The DataSet with sorted local partitions. */ public SortPartitionOperator<T> sortPartition(String field, Order order) { - int[] flatOrderKeys = getFlatFields(field); - this.appendSorting(flatOrderKeys, order); + if (useKeySelector) { + throw new InvalidProgramException("Expression keys cannot be appended after a KeySelector"); + } + + ensureSortableKey(field); + keys.add(new Keys.ExpressionKeys<>(field, getType())); + orders.add(order); + return this; } - // - // Key Extraction - // - - private int[] getFlatFields(int field) { + public <K> SortPartitionOperator<T> sortPartition(KeySelector<T, K> keyExtractor, Order order) { + throw new InvalidProgramException("KeySelector cannot be chained."); + } - if (!Keys.ExpressionKeys.isSortKey(field, super.getType())) { + private void ensureSortableKey(int field) throws InvalidProgramException { + if (!Keys.ExpressionKeys.isSortKey(field, getType())) { throw new InvalidProgramException("Selected sort key is not a sortable type"); } - - Keys.ExpressionKeys<T> ek = new Keys.ExpressionKeys<>(field, super.getType()); - return ek.computeLogicalKeyPositions(); } - private int[] getFlatFields(String fields) { - - if (!Keys.ExpressionKeys.isSortKey(fields, super.getType())) { + private void ensureSortableKey(String field) throws InvalidProgramException { + 
if (!Keys.ExpressionKeys.isSortKey(field, getType())) { throw new InvalidProgramException("Selected sort key is not a sortable type"); } - - Keys.ExpressionKeys<T> ek = new Keys.ExpressionKeys<>(fields, super.getType()); - return ek.computeLogicalKeyPositions(); } - private void appendSorting(int[] flatOrderFields, Order order) { - - if(this.sortKeyPositions == null) { - // set sorting info - this.sortKeyPositions = flatOrderFields; - this.sortOrders = new Order[flatOrderFields.length]; - Arrays.fill(this.sortOrders, order); - } else { - // append sorting info to existing info - int oldLength = this.sortKeyPositions.length; - int newLength = oldLength + flatOrderFields.length; - this.sortKeyPositions = Arrays.copyOf(this.sortKeyPositions, newLength); - this.sortOrders = Arrays.copyOf(this.sortOrders, newLength); - - for(int i=0; i
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138577#comment-15138577 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282118 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -36,27 +40,58 @@ */ public class SortPartitionOperator<T> extends SingleInputOperator<T, T, SortPartitionOperator<T>> { - private int[] sortKeyPositions; + private List<Keys<T>> keys; - private Order[] sortOrders; + private List<Order> orders; private final String sortLocationName; + private boolean useKeySelector; - public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + private SortPartitionOperator(DataSet<T> dataSet, String sortLocationName) { super(dataSet, dataSet.getType()); + + keys = new ArrayList<>(); + orders = new ArrayList<>(); this.sortLocationName = sortLocationName; + } + + + public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); - int[] flatOrderKeys = getFlatFields(sortField); - this.appendSorting(flatOrderKeys, sortOrder); + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); } public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrder, String sortLocationName) { - super(dataSet, dataSet.getType()); - this.sortLocationName = sortLocationName; + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); + + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); + } + + public SortPartitionOperator(DataSet<T> dataSet, Keys<T> sortKey, Order sortOrder, String sortLocationName) { --- End diff -- Change the `sortKey` parameter type to `SelectorFunctionKeys` (or accept the
`KeySelector` and create the `SelectorFunctionKeys` in the constructor. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > DataSet<MyObject> data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
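The refactor under review replaces the old `int[] sortKeyPositions` / `Order[] sortOrders` arrays with parallel `keys` / `orders` lists and rejects chaining expression keys after a `KeySelector`. A minimal plain-Java sketch of that bookkeeping, with Flink's key and order types replaced by strings for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class SortKeyAppendSketch {

    // Parallel lists stand in for the keys/orders lists of the reviewed diff;
    // both grow together as sortPartition() calls are chained.
    private final List<String> keys = new ArrayList<>();
    private final List<String> orders = new ArrayList<>();
    private final boolean useKeySelector;

    public SortKeyAppendSketch(String firstKey, String firstOrder, boolean useKeySelector) {
        this.useKeySelector = useKeySelector;
        keys.add(firstKey);
        orders.add(firstOrder);
    }

    // Appending an expression key after a key-selector key is rejected,
    // mirroring the guard in the reviewed diff.
    public SortKeyAppendSketch sortPartition(String field, String order) {
        if (useKeySelector) {
            throw new IllegalStateException("Expression keys cannot be appended after a KeySelector");
        }
        keys.add(field);
        orders.add(order);
        return this;
    }

    public int keyCount() {
        return keys.size();
    }
}
```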
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138576#comment-15138576 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181771679 As the original author of this PR, I'd say this: I tried implementing the import features but they aren't worth it. You have to discard most of the valid PMML models because they don't fit in with the Flink framework. Further, in my opinion, the use of Flink is to train the model. Once we export that model in PMML, you can use it pretty much anywhere, say R or MATLAB, which support complete PMML import and export functionality. The exported model is in most cases going to be used for testing, evaluation and prediction purposes, for which Flink isn't a good platform to use anyway. This can be accomplished anywhere. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138575#comment-15138575 ] Till Rohrmann commented on FLINK-3374: -- Is this reproducible? Looks to me as if the testing file could not be created by the test. This might be simply a problem of the Travis machine. > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > 
/tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at >
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181771679 As the original author of this PR, I'd say this: I tried implementing the import features, but they aren't worth it. You have to discard most of the valid PMML models because they don't fit in with the Flink framework. Further, in my opinion, the purpose of Flink is to train the model. Once we export that model in PMML, you can use it pretty much anywhere, say R or MATLAB, which support complete PMML import and export functionality. The exported model is in most cases going to be used for testing, evaluation, and prediction purposes, for which Flink isn't a good platform anyway. This can be accomplished anywhere. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
[ https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138691#comment-15138691 ] Robert Metzger commented on FLINK-2832: --- Another instance: https://s3.amazonaws.com/archive.travis-ci.org/jobs/107973266/log.txt {code} Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.914 sec <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest) Time elapsed: 1.282 sec <<< FAILURE! java.lang.AssertionError: KS test result with p value(0.034000), d value(0.032600) at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289) at org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:194) {code} > Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement > --- > > Key: FLINK-2832 > URL: https://issues.apache.org/jira/browse/FLINK-2832 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 0.10.0 >Reporter: Vasia Kalavri >Priority: Critical > Labels: test-stability > Fix For: 1.0.0 > > > Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec > <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest > testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest) > Time elapsed: 2.534 sec <<< FAILURE! 
> java.lang.AssertionError: KS test result with p value(0.11), d > value(0.103090) > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192) > Results : > Failed tests: > > RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342 > KS test result with p value(0.11), d value(0.103090) > Full log [here|https://travis-ci.org/apache/flink/jobs/84120131]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138728#comment-15138728 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181799783 @sachingoel0101 I agree. Nonetheless, an easy way to store and move a model generated in batch to a streaming environment would be a really useful feature, and we come back to what @chiwanpark was saying about a custom format internal to Flink. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to, and import models from, other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181799783 @sachingoel0101 I agree. Nonetheless, an easy way to store and move a model generated in batch to a streaming environment would be a really useful feature, and we come back to what @chiwanpark was saying about a custom format internal to Flink.
[jira] [Created] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docsq
Stephan Ewen created FLINK-3376: --- Summary: Add an illustration of Event Time and Watermarks to the docsq Key: FLINK-3376 URL: https://issues.apache.org/jira/browse/FLINK-3376 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 0.10.1 Reporter: Stephan Ewen Priority: Critical Fix For: 1.0.0 Users seem to get confused about how event time and watermarks work. We need to add documentation with two sections: 1. Event time and watermark progress in general - Watermarks are generated at the sources - How Watermarks progress through the streaming data flow 2. Ways that users can generate watermarks - EventTimeSourceFunctions - AscendingTimestampExtractor - TimestampExtractor general case -- This message was sent by Atlassian JIRA (v6.3.4#6332)
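The second section of the proposed documentation lists ways users can generate watermarks, including an ascending timestamp extractor. As a rough illustration of that idea (plain Java, not Flink's actual `AscendingTimestampExtractor` API; class and method names here are invented for the sketch): for a stream whose event timestamps only increase, the watermark can simply trail the largest timestamp seen so far.

```java
// Illustrative sketch of ascending-timestamp watermark generation.
// Not Flink code: names and signatures are hypothetical.
class AscendingWatermarkSketch {
    private long currentMaxTimestamp = Long.MIN_VALUE;

    /** Records an element's event timestamp and returns it unchanged. */
    long extractTimestamp(long elementTimestamp) {
        currentMaxTimestamp = Math.max(currentMaxTimestamp, elementTimestamp);
        return elementTimestamp;
    }

    /**
     * The watermark trails the maximum timestamp by one, i.e. it promises
     * that no element with a smaller or equal timestamp will arrive.
     */
    long getCurrentWatermark() {
        return currentMaxTimestamp == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : currentMaxTimestamp - 1;
    }
}
```

Because timestamps are assumed ascending, the watermark never has to retreat; late elements only occur if the ascending assumption is violated.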
[jira] [Updated] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docs
[ https://issues.apache.org/jira/browse/FLINK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen updated FLINK-3376: Summary: Add an illustration of Event Time and Watermarks to the docs (was: Add an illustration of Event Time and Watermarks to the docsq) > Add an illustration of Event Time and Watermarks to the docs > > > Key: FLINK-3376 > URL: https://issues.apache.org/jira/browse/FLINK-3376 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Priority: Critical > Fix For: 1.0.0 > > > Users seem to get confused about how event time and watermarks work. > We need to add documentation with two sections: > 1. Event time and watermark progress in general > - Watermarks are generated at the sources > - How Watermarks progress through the streaming data flow > 2. Ways that users can generate watermarks > - EventTimeSourceFunctions > - AscendingTimestampExtractor > - TimestampExtractor general case -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3286) Remove JDEB Debian Package code from flink-dist
[ https://issues.apache.org/jira/browse/FLINK-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138794#comment-15138794 ] Robert Metzger commented on FLINK-3286: --- Removed the files as well: http://git-wip-us.apache.org/repos/asf/flink/commit/a4f0692e > Remove JDEB Debian Package code from flink-dist > --- > > Key: FLINK-3286 > URL: https://issues.apache.org/jira/browse/FLINK-3286 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Stephan Ewen >Priority: Blocker > Fix For: 1.0.0 > > > There is currently code in the {{flink-dist}} project to create a Debian > package for Flink. This was added by a contributor quite a while back and has > never been maintained (probably also never used). > I vote to remove it. It is out of date with paths and filenames, and there > seems to be no interest in maintaining it so far. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281965 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the --- End diff -- "Note that any key extraction methods cannot be ..." -> "Note that no additional sort keys can be appended to a KeySelector."
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138573#comment-15138573 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281965 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the --- End diff -- "Note that any key extraction methods cannot be ..." -> "Note that no additional sort keys can be appended to a KeySelector." > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet data = ... > DataSet data.sortPartition( > new KeySelector() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138571#comment-15138571 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281750 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. --- End diff -- "The DataSet can be ...", add "The" > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet data = ... > DataSet data.sortPartition( > new KeySelector() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
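The feature FLINK-3234 asks for amounts to ordering each partition's elements by a key extracted via a user function. As a minimal plain-Java sketch of that semantics (not the Flink `sortPartition` API; the helper below is hypothetical and uses `java.util.function.Function` in place of Flink's `KeySelector`):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

class KeySelectorSortSketch {
    /**
     * Sorts one partition's elements in ascending order of a key extracted
     * from each element, mirroring what a KeySelector-based
     * sortPartition(keySelector, Order.ASCENDING) would compute locally.
     */
    static <T, K extends Comparable<K>> List<T> sortPartition(
            List<T> partition, Function<T, K> keySelector) {
        List<T> sorted = new ArrayList<>(partition);
        sorted.sort(Comparator.comparing(keySelector::apply));
        return sorted;
    }
}
```

Sorting on multiple values, as the Javadoc in the diff notes, corresponds to extracting a composite (tuple) key whose comparison is lexicographic.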
[GitHub] flink pull request: [FLINK-3243] Fix Interplay of TimeCharacterist...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181776065 For new classes, it makes sense. It was a mistake on my end to name them like this in the first place. But in my experience, users that adopted this draw more satisfaction from stable code than from a style nuance.
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138626#comment-15138626 ] Till Rohrmann commented on FLINK-3374: -- Probably because it uses {{WriteMode.OVERWRITE}}. > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Assignee: Till Rohrmann >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at 
java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at >
[jira] [Assigned] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-3374: Assignee: Till Rohrmann > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Assignee: Till Rohrmann >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at 
java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at >
[jira] [Commented] (FLINK-3066) Kafka producer fails on leader change
[ https://issues.apache.org/jira/browse/FLINK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138677#comment-15138677 ] Gyula Fora commented on FLINK-3066: --- Thank you Robert for the help, it is a good catch :) > Kafka producer fails on leader change > - > > Key: FLINK-3066 > URL: https://issues.apache.org/jira/browse/FLINK-3066 > Project: Flink > Issue Type: Bug > Components: Kafka Connector, Streaming Connectors >Affects Versions: 0.10.0, 1.0.0 >Reporter: Gyula Fora > > I got the following exception during my streaming job: > {code} > 16:44:50,637 INFO org.apache.flink.runtime.jobmanager.JobManager >- Status of job 4d3f9443df4822e875f1400244a6e8dd (deduplo!) changed to > FAILING. > java.lang.Exception: Failed to send data to Kafka: This server is not the > leader for that topic-partition. > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.checkErroneous(FlinkKafkaProducer.java:275) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.invoke(FlinkKafkaProducer.java:246) > at > org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:37) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:166) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:221) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:588) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.kafka.common.errors.NotLeaderForPartitionException: > This server is not the leader for that topic-partition. > {code} > And then the job crashed and recovered. This should probably be something > that we handle without crashing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181798643 That is a good point. In a streaming setting, it does indeed make sense for the model to be available. However, in my opinion, it would then make sense to just use JPMML and import the object, followed by extracting the model parameters. Granted, it is an added effort on the user side, but I still think it beats the complexity introduced by supporting imports directly. Furthermore, it would be a bad design to have to reject valid PMML models just because a minor thing isn't supported in Flink.
[jira] [Commented] (FLINK-3371) Move TriggerCotext and TriggerResult to their own classes
[ https://issues.apache.org/jira/browse/FLINK-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138754#comment-15138754 ] ASF GitHub Bot commented on FLINK-3371: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181807335 You are not moving `TriggerContext` because it is specific to Trigger, correct? Otherwise it looks good to merge. :+1: > Move TriggerCotext and TriggerResult to their own classes > - > > Key: FLINK-3371 > URL: https://issues.apache.org/jira/browse/FLINK-3371 > Project: Flink > Issue Type: Sub-task > Components: Windowing Operators >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.0.0 > > > As part of adding aligned window operators, we need aligned trigger classes. > To make the {{TriggerResult}} and {{TriggerContext}} accessible to them, they > should move to their own classes, from currently being internal classes of > {{Trigger}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181807481 :+1:
[GitHub] flink pull request: [FLINK-3371] [api breaking] Move TriggerResult...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181807335 You are not moving `TriggerContext` because it is specific to Trigger, correct? Otherwise it looks good to merge. :+1:
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138755#comment-15138755 ] ASF GitHub Bot commented on FLINK-3355: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181807481 :+1: > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3375) Allow Watermark Generation in the Kafka Source
Stephan Ewen created FLINK-3375: --- Summary: Allow Watermark Generation in the Kafka Source Key: FLINK-3375 URL: https://issues.apache.org/jira/browse/FLINK-3375 Project: Flink Issue Type: Improvement Components: Kafka Connector Affects Versions: 1.0.0 Reporter: Stephan Ewen Fix For: 1.0.0 It is a common case that event timestamps are ascending inside one Kafka partition. Ascending timestamps are easy for users, because they are handled by ascending timestamp extraction. If the Kafka source has multiple partitions per source task, then the records become out of order before timestamps can be extracted and watermarks can be generated. If we make the FlinkKafkaConsumer an event time source function, it can generate watermarks itself. It would internally implement the same logic as the regular operators that merge streams, keeping track of event time progress per partition and generating watermarks based on the current guaranteed event time progress.
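The merge logic described in the issue can be sketched as follows: track the highest seen timestamp per partition and only advance the overall watermark to the minimum across all partitions, so one slow partition holds the watermark back. Class and method names here are illustrative, not the actual FlinkKafkaConsumer API.

```java
import java.util.Arrays;

// Minimal sketch of per-partition event-time tracking for a multi-partition
// source task, as proposed in FLINK-3375 (assumed names, not Flink code).
class PerPartitionWatermarks {
    private final long[] partitionWatermarks;
    private long lastEmitted = Long.MIN_VALUE;

    PerPartitionWatermarks(int numPartitions) {
        partitionWatermarks = new long[numPartitions];
        Arrays.fill(partitionWatermarks, Long.MIN_VALUE);
    }

    /**
     * Record a new timestamp for one partition. Returns the new global
     * watermark if it advanced, or null if it is still held back.
     */
    Long onTimestamp(int partition, long timestamp) {
        partitionWatermarks[partition] =
                Math.max(partitionWatermarks[partition], timestamp);
        // The guaranteed event-time progress is the minimum over partitions.
        long min = Arrays.stream(partitionWatermarks).min().getAsLong();
        if (min > lastEmitted) {
            lastEmitted = min;
            return min;
        }
        return null;
    }
}
```

With two partitions, no watermark is emitted until both partitions have reported a timestamp, and afterwards the watermark follows the slower partition.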
[jira] [Created] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
Ufuk Celebi created FLINK-3374: -- Summary: CEPITCase testSimplePatternEventTime fails Key: FLINK-3374 URL: https://issues.apache.org/jira/browse/FLINK-3374 Project: Flink Issue Type: Test Components: Tests Reporter: Ufuk Celebi Priority: Minor {code} testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: 1.68 sec <<< ERROR! org.apache.flink.runtime.client.JobExecutionException: Job execution failed. at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.io.FileNotFoundException: /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or directory) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.<init>(FileOutputStream.java:213) at java.io.FileOutputStream.<init>(FileOutputStream.java:162) at org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56) at
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) at org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) at java.lang.Thread.run(Thread.java:745) testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: 1.68 sec <<< FAILURE! java.lang.AssertionError: Different number of lines in expected and obtained result. 
expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) {code} https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz {code} 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor - 02/09/2016 04:53:46 Map -> Sink: Unnamed(2/4) switched to FAILED java.io.FileNotFoundException: /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or directory) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.<init>(FileOutputStream.java:213) at java.io.FileOutputStream.<init>(FileOutputStream.java:162) at org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) at
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138616#comment-15138616 ] ASF GitHub Bot commented on FLINK-3243: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181777122 You're right, I'm changing it. But it was also me who didn't notice when we put it in initially :sweat_smile: > Fix Interplay of TimeCharacteristic and Time Windows > > > Key: FLINK-3243 > URL: https://issues.apache.org/jira/browse/FLINK-3243 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.0.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek >Priority: Blocker > > As per the discussion on the Dev ML: > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Time-Behavior-in-Streaming-Jobs-Event-time-processing-time-td9616.html. > The discussion seems to have converged on option 2): > - Add dedicated WindowAssigners for processing time and event time > - {{timeWindow()}} and {{timeWindowAll()}} respect the set > {{TimeCharacteristic}}. > This will make the easy stuff easy, i.e. using time windows and quickly > switching the time characteristic. Users will then have the flexibility to > mix different kinds of window assigners in their job.
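Option 2) quoted above boils down to a small dispatch: {{timeWindow()}} inspects the stream's TimeCharacteristic and picks a dedicated window assigner. The sketch below illustrates that dispatch only; the enum and assigner names loosely mirror the DataStream API but are not the actual implementation.

```java
// Illustrative dispatch for FLINK-3243 option 2): processing time gets its
// own assigner, event and ingestion time share the event-time assigner.
// Names are assumptions for illustration, not Flink's real classes.
enum TimeCharacteristic { ProcessingTime, IngestionTime, EventTime }

class TimeWindowDispatch {
    /** Returns the name of the assigner timeWindow(size) would select. */
    static String assignerFor(TimeCharacteristic characteristic) {
        if (characteristic == TimeCharacteristic.ProcessingTime) {
            return "TumblingProcessingTimeWindows";
        }
        return "TumblingEventTimeWindows";
    }
}
```

This keeps the easy case easy: switching the time characteristic changes the window semantics without touching the windowing code, while users can still pass an explicit assigner to mix both kinds in one job.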
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138617#comment-15138617 ] Till Rohrmann commented on FLINK-3374: -- Hmm, I just saw that the CEPITCase creates a file under the parent path instead of a folder. I'm wondering how this could pass before.
[GitHub] flink pull request: [FLINK-3226] Translate logical aggregations to...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1600#issuecomment-181794348 Thanks for the feedback @tillrohrmann, @twalthr! I've moved the classes to `org.apache.flink.api.table.runtime` and tried to shorten the aggregates code using Numerics. I only left `AvgAggregate` as is, because integer average and float/double average are computed differently. We can always replace it with code generation later as @twalthr suggested.
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138705#comment-15138705 ] ASF GitHub Bot commented on FLINK-3226: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1600#issuecomment-181794348 Thanks for the feedback @tillrohrmann, @twalthr! I've moved the classes to `org.apache.flink.api.table.runtime` and tried to shorten the aggregates code using Numerics. I only left `AvgAggregate` as is, because integer average and float/double average are computed differently. We can always replace it with code generation later as @twalthr suggested. > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). 
> - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process
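The straight-forward 1-to-1 mapping described in the issue can be sketched as a rule that turns an optimized logical node into a physical "Flink RelNode" already carrying everything the DataSet operator needs (keys, parallelism). All class names below are invented for illustration; the real rules plug into Calcite's rule framework.

```java
import java.util.List;

// Hypothetical stand-ins for an optimized logical join node and its
// physical counterpart; one FlinkJoinRel corresponds to one DataSet join.
class LogicalJoinNode {
    final List<Integer> leftKeys;
    final List<Integer> rightKeys;
    LogicalJoinNode(List<Integer> leftKeys, List<Integer> rightKeys) {
        this.leftKeys = leftKeys;
        this.rightKeys = rightKeys;
    }
}

class FlinkJoinRel {
    final List<Integer> leftKeys;
    final List<Integer> rightKeys;
    final int parallelism;
    FlinkJoinRel(List<Integer> leftKeys, List<Integer> rightKeys, int parallelism) {
        this.leftKeys = leftKeys;
        this.rightKeys = rightKeys;
        this.parallelism = parallelism;
    }
}

class FlinkJoinRule {
    /** Straight-forward 1-to-1 translation; merging rules would come later. */
    static FlinkJoinRel translate(LogicalJoinNode join, int parallelism) {
        // All join keys and execution hints are copied onto the physical node
        // so the DataSet operator can be built from it directly.
        return new FlinkJoinRel(join.leftKeys, join.rightKeys, parallelism);
    }
}
```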
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181803637 I'm all for that. Flink's models should be transferable at least across Flink. But that should be part of a separate PR, and not block this one as it has been for far too long. It should be pretty easy to accomplish.
[jira] [Commented] (FLINK-3371) Move TriggerContext and TriggerResult to their own classes
[ https://issues.apache.org/jira/browse/FLINK-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138777#comment-15138777 ] ASF GitHub Bot commented on FLINK-3371: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181813971 Exactly, the `AlignedTrigger` will have an `AlignedTriggerContext` without Key/Value state. > Move TriggerContext and TriggerResult to their own classes > - > > Key: FLINK-3371 > URL: https://issues.apache.org/jira/browse/FLINK-3371 > Project: Flink > Issue Type: Sub-task > Components: Windowing Operators >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.0.0 > > > As part of adding aligned window operators, we need aligned trigger classes. > To make the {{TriggerResult}} and {{TriggerContext}} accessible to them, they > should move to their own classes, from currently being internal classes of > {{Trigger}}.
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138528#comment-15138528 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181757426 Well, that wouldn't be a problem for the export: you will create, and therefore export, only models that have `double` as the datatype for parameters, but that's not an issue. It would be a problem for import, though, because PMML supports a wider set of data types and model types, and you can't really achieve any satisfying degree of support for PMML in a platform like Flink; that's why everyone uses JPMML for evaluation. You will only be able to import compatible models with compatible data fields. This would require a simple validation at runtime of the model type and of the fields' data types. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181757426 Well, that wouldn't be a problem for the export: you will create, and therefore export, only models that have `double` as the datatype for parameters, but that's not an issue. It would be a problem for import, though, because PMML supports a wider set of data types and model types, and you can't really achieve any satisfying degree of support for PMML in a platform like Flink; that's why everyone uses JPMML for evaluation. You will only be able to import compatible models with compatible data fields. This would require a simple validation at runtime of the model type and of the fields' data types.
[jira] [Created] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
Jakob Sultan Ericsson created FLINK-3373: Summary: Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems Key: FLINK-3373 URL: https://issues.apache.org/jira/browse/FLINK-3373 Project: Flink Issue Type: Bug Environment: Latest Flink snapshot 1.0 Reporter: Jakob Sultan Ericsson When I try to use Apache HttpClient 4.5.1 in my Flink job, it crashes with NoClassDefFound. This is because it loads some classes from the provided httpclient 4.2.5/6 in core Flink. {noformat} 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task - DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with exception. java.lang.NoSuchFieldError: INSTANCE at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144) at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109) at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116) ... at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:745) {noformat} The constructor of SSLConnectionSocketFactory finds an earlier version of AllowAllHostnameVerifier that does not have the INSTANCE field (the field was probably added in 4.3). {noformat} jar tvf lib/flink-dist-1.0-SNAPSHOT.jar |grep AllowAllHostnameVerifier 791 Thu Dec 17 09:55:46 CET 2015 org/apache/http/conn/ssl/AllowAllHostnameVerifier.class {noformat} Solutions would be: - Fix the classloader so that my custom job does not conflict with internal flink-core classes... pretty hard - Remove the dependency somehow.
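A quick way to confirm the conflict described above is to ask the JVM where a class was actually loaded from. Running something like the following inside the job, with org.apache.http.conn.ssl.AllowAllHostnameVerifier substituted for the JDK class used here, shows whether the class resolves to the flink-dist jar or to the job's own httpclient dependency.

```java
// Diagnostic sketch: report the code source (jar) a class was loaded from.
class WhichJar {
    static String locationOf(Class<?> clazz) {
        java.security.CodeSource src =
                clazz.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader report no code source.
        return src == null ? "(bootstrap)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Substitute the conflicting HttpClient class when run in the job.
        System.out.println(locationOf(WhichJar.class));
    }
}
```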
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138598#comment-15138598 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1585#issuecomment-181773503 The refactoring looks good, @chiwanpark. I have just a few minor remarks. The PR can be resolved after these have been addressed. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138600#comment-15138600 ] Stephan Ewen commented on FLINK-3374: - My guess is that the parent path does not exist. Maybe an issue in the FileOutputFormat.
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138595#comment-15138595 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282391 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SortPartitionITCase.scala --- @@ -166,6 +167,58 @@ class SortPartitionITCase(mode: TestExecutionMode) extends MultipleProgramsTestB TestBaseUtils.compareResultAsText(result.asJava, expected) } + @Test + def testSortPartitionWithKeySelector1(): Unit = { +val env = ExecutionEnvironment.getExecutionEnvironment +env.setParallelism(4) +val ds = CollectionDataSets.get3TupleDataSet(env) + +val result = ds + .map { x => x }.setParallelism(4) + .sortPartition(_._2, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test).
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1585#issuecomment-181773503 The refactoring looks good, @chiwanpark. I have just a few minor remarks. The PR can be resolved after these have been addressed.
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138610#comment-15138610 ] ASF GitHub Bot commented on FLINK-3243: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181776065 For new classes, it makes sense. It was a mistake on my end to name them like this in the first place. But in my experience, users that adopted this draw more satisfaction from stable code than from a style nuance.
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181800676 Looks good :+1:
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138734#comment-15138734 ] ASF GitHub Bot commented on FLINK-3355: --- Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181800676 Looks good :+1: > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads.
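The improvement amounts to a user-supplied hook that tunes the options of the created store. A minimal sketch of that shape in plain Java — the interface, method names, and option keys here are assumptions for illustration, not Flink's or RocksDB's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of letting users tune the store behind a state backend via a
// factory hook, as the issue requests. All names are hypothetical.
public class OptionsSketch {

    /** User-implementable hook that adjusts the backend's default options. */
    interface OptionsFactory {
        Map<String, String> createOptions(Map<String, String> currentOptions);
    }

    /** Backend stand-in: apply the user factory (if any) over the defaults. */
    static Map<String, String> effectiveOptions(OptionsFactory factory) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("write_buffer_size", "4194304"); // illustrative default
        return factory == null ? defaults : factory.createOptions(defaults);
    }

    public static void main(String[] args) {
        // A workload-specific tweak: enlarge the write buffer.
        Map<String, String> opts = effectiveOptions(current -> {
            current.put("write_buffer_size", "67108864");
            return current;
        });
        System.out.println(opts.get("write_buffer_size"));
    }
}
```

The factory style (rather than passing a ready-made options object) matters in Flink because the configuration has to be serializable and shipped to the task managers, while native option objects generally are not.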
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138751#comment-15138751 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181803637 I'm all for that. Flink's models should be transferable at least across Flink. But that should be part of a separate PR, and not block this one as it has been for far too long. It should be pretty easy to accomplish > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to, and > import models from, other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138764#comment-15138764 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52291201
--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala ---
@@ -0,0 +1,661 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.codegen
+
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.`type`.SqlTypeName._
+import org.apache.calcite.sql.fun.SqlStdOperatorTable._
+import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction}
+import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation}
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo}
+import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo
+import org.apache.flink.api.table.TableConfig
+import org.apache.flink.api.table.codegen.CodeGenUtils._
+import org.apache.flink.api.table.codegen.Indenter._
+import org.apache.flink.api.table.codegen.OperatorCodeGen._
+import org.apache.flink.api.table.plan.TypeConverter.sqlTypeToTypeInfo
+import org.apache.flink.api.table.typeinfo.RowTypeInfo
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+
+class CodeGenerator(
+    config: TableConfig,
+    input1: TypeInformation[Any],
+    input2: Option[TypeInformation[Any]] = None)
+  extends RexVisitor[GeneratedExpression] {
+
+  // set of member statements that will be added only once
+  // we use a LinkedHashSet to keep the insertion order
+  private val reusableMemberStatements = mutable.LinkedHashSet[String]()
+
+  // set of constructor statements that will be added only once
+  // we use a LinkedHashSet to keep the insertion order
+  private val reusableInitStatements = mutable.LinkedHashSet[String]()
+
+  // map of initial input unboxing expressions that will be added only once
+  // (inputTerm, index) -> expr
+  private val reusableInputUnboxingExprs = mutable.Map[(String, Int), GeneratedExpression]()
+
+  def reuseMemberCode(): String = {
+    reusableMemberStatements.mkString("", "\n", "\n")
+  }
+
+  def reuseInitCode(): String = {
+    reusableInitStatements.mkString("", "\n", "\n")
+  }
+
+  def reuseInputUnboxingCode(): String = {
+    reusableInputUnboxingExprs.values.map(_.code).mkString("", "\n", "\n")
+  }
+
+  def input1Term = "in1"
+
+  def input2Term = "in2"
+
+  def collectorTerm = "c"
+
+  def outRecordTerm = "out"
+
+  def nullCheck: Boolean = config.getNullCheck
+
+  def generateExpression(rex: RexNode): GeneratedExpression = {
+    rex.accept(this)
+  }
+
+  def generateFunction[T <: Function](
+      name: String,
+      clazz: Class[T],
+      bodyCode: String,
+      returnType: TypeInformation[Any])
+    : GeneratedFunction[T] = {
+    val funcName = newName(name)
+
+    // Janino does not support generics, that's why we need
+    // manual casting here
+    val samHeader =
+      if (clazz == classOf[FlatMapFunction[_,_]]) {
+        val inputTypeTerm = boxedTypeTermForTypeInfo(input1)
+        (s"void flatMap(Object _in1, org.apache.flink.util.Collector $collectorTerm)",
+          s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;")
+      } else if (clazz == classOf[MapFunction[_,_]]) {
+        val inputTypeTerm = boxedTypeTermForTypeInfo(input1)
+        ("Object map(Object _in1)",
+          s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;")
+      } else {
+        // TODO more functions
+        throw new
[jira] [Commented] (FLINK-3377) Remove final flag from ResultPartitionWriter class
[ https://issues.apache.org/jira/browse/FLINK-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139016#comment-15139016 ] ASF GitHub Bot commented on FLINK-3377: --- GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/1609 [FLINK-3377] Remove final flag from ResultPartitionWriter class You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/flink 3377_partitionwriter_final Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1609.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1609 commit 47670f5812c255a3cb992a0a2f396330ed5c519d Author: zentol Date: 2016-02-09T14:51:11Z [FLINK-3377] Remove final flag from ResultPartitionWriter class > Remove final flag from ResultPartitionWriter class > -- > > Key: FLINK-3377 > URL: https://issues.apache.org/jira/browse/FLINK-3377 > Project: Flink > Issue Type: Wish > Components: Distributed Runtime >Affects Versions: 0.10.1 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.00 > > > The final flag on the > org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is > causing issues for me. > The flag requires me to run a test I'm working on with a > @RunWith(PowerMockRunner.class) annotation so that i can use > @PrepareForTest({ResultPartitionWriter.class}). > But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that > there also was a classloader issue, but i could resolve that) > To me these seem like unnecessary problems, as such i propose removing the > final flag. > The
[jira] [Created] (FLINK-3378) Consolidate TestingCluster and ForkableFlinkMiniCluster
Maximilian Michels created FLINK-3378: - Summary: Consolidate TestingCluster and ForkableFlinkMiniCluster Key: FLINK-3378 URL: https://issues.apache.org/jira/browse/FLINK-3378 Project: Flink Issue Type: Improvement Components: Tests Affects Versions: 1.0.0 Reporter: Maximilian Michels Fix For: 1.0.0 {{TestingCluster}} appears to be outdated and should be replaced by or consolidated with the {{ForkableFlinkMiniCluster}}. Both clusters start the testing actors. Additionally, the ForkableFlinkMiniCluster has support for forking, HA, and restarting actors. As of now it looks like the use of both is arbitrary. The TestingCluster may produce test failures because multiple forked test instances could be trying to bind to the same free port. It looks like the ForkableFlinkMiniCluster should also inherit from FlinkMiniCluster instead of LocalFlinkMiniCluster because it overwrites all inherited implementations of LocalFlinkMiniCluster.
[jira] [Updated] (FLINK-3377) Remove final flag from ResultPartitionWriter class
[ https://issues.apache.org/jira/browse/FLINK-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-3377: Summary: Remove final flag from ResultPartitionWriter class (was: Remove final flag from ResultPatitionWriter class) > Remove final flag from ResultPartitionWriter class > -- > > Key: FLINK-3377 > URL: https://issues.apache.org/jira/browse/FLINK-3377 > Project: Flink > Issue Type: Wish > Components: Distributed Runtime >Affects Versions: 0.10.1 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.00 > > > The final flag on the > org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is > causing issues for me. > The flag requires me to run a test I'm working on with a > @RunWith(PowerMockRunner.class) annotation so that i can use > @PrepareForTest({ResultPartitionWriter.class}). > But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that > there also was a classloader issue, but i could resolve that) > To me these seem like unnecessary problems, as such i propose removing the > final flag. > The
[jira] [Created] (FLINK-3377) Remove final flag from ResultPatitionWriter class
Chesnay Schepler created FLINK-3377: --- Summary: Remove final flag from ResultPatitionWriter class Key: FLINK-3377 URL: https://issues.apache.org/jira/browse/FLINK-3377 Project: Flink Issue Type: Wish Components: Distributed Runtime Affects Versions: 0.10.1 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 1.00 The final flag on the org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is causing issues for me. The flag requires me to run a test I'm working on with a @RunWith(PowerMockRunner.class) annotation so that i can use @PrepareForTest({ResultPartitionWriter.class}). But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that there also was a classloader issue, but i could resolve that) To me these seem like unnecessary problems, as such i propose removing the final flag. The
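The friction described above comes from subclass-based mocking: an ordinary mocking library generates a subclass of the target class at runtime, which is impossible when the class is final, forcing the bytecode-manipulating PowerMock runner (and its conflicts with @ClassRule). A self-contained illustration of the underlying constraint — the class names here are made up, not Flink's:

```java
import java.lang.reflect.Modifier;

// Why a final class forces PowerMock: ordinary mocks are dynamically
// generated subclasses, and a final class cannot be subclassed.
// FinalWriter/OpenWriter are made-up stand-ins, not Flink classes.
public class FinalFlagSketch {

    static final class FinalWriter { }  // like the final ResultPartitionWriter
    static class OpenWriter { }         // the same class with the flag removed

    // A subclass-based mocking tool performs exactly this check before
    // attempting to proxy a class.
    static boolean mockableBySubclassing(Class<?> clazz) {
        return !Modifier.isFinal(clazz.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(mockableBySubclassing(FinalWriter.class)); // false
        System.out.println(mockableBySubclassing(OpenWriter.class));  // true
    }
}
```

With the final flag removed, a plain mock runner suffices, so the test can keep its @ClassRule-annotated TemporaryFolder.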
[jira] [Commented] (FLINK-2237) Add hash-based Aggregation
[ https://issues.apache.org/jira/browse/FLINK-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139041#comment-15139041 ] ASF GitHub Bot commented on FLINK-2237: --- Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-181907722 What I'm not sure about is the `closeUserCode` call in `ChainedReduceCombineDriver.closeTask`. Those other chained drivers that have a `running` flag for indicating canceling, make this call only when the driver was not canceled. But those other chained drivers where there is no `running` flag seem to make this call also when they were canceled. What is the reasoning behind this situation? > Add hash-based Aggregation > -- > > Key: FLINK-2237 > URL: https://issues.apache.org/jira/browse/FLINK-2237 > Project: Flink > Issue Type: New Feature >Reporter: Rafiullah Momand >Assignee: Gabor Gevay >Priority: Minor > > Aggregation functions at the moment are implemented in a sort-based way. > How can we implement hash based Aggregation for Flink? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2237] [runtime] Add hash-based combiner...
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-181907722 What I'm not sure about is the `closeUserCode` call in `ChainedReduceCombineDriver.closeTask`. Those other chained drivers that have a `running` flag for indicating canceling, make this call only when the driver was not canceled. But those other chained drivers where there is no `running` flag seem to make this call also when they were canceled. What is the reasoning behind this situation?
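The `running`-flag convention the comment asks about can be sketched in plain Java. This is a simplified stand-in for the pattern, not Flink's actual chained-driver classes:

```java
// Simplified stand-in for the chained-driver pattern: a volatile running
// flag cleared by the cancel path, and closeTask() invoking the user-code
// close hook only when the driver was not cancelled. Not Flink's classes.
public class ChainedDriverSketch {

    private volatile boolean running = true;
    boolean userCodeClosed = false;

    void cancelTask() {
        running = false; // cancellation path: skip the user-code close later
    }

    void closeTask() {
        if (running) {
            closeUserCode();
        }
    }

    private void closeUserCode() {
        // In Flink this would call the user function's close() lifecycle hook.
        userCodeClosed = true;
    }

    public static void main(String[] args) {
        ChainedDriverSketch cancelled = new ChainedDriverSketch();
        cancelled.cancelTask();
        cancelled.closeTask();
        System.out.println(cancelled.userCodeClosed); // false: skipped after cancel

        ChainedDriverSketch completed = new ChainedDriverSketch();
        completed.closeTask();
        System.out.println(completed.userCodeClosed); // true: normal shutdown
    }
}
```

Drivers without the flag call the close hook unconditionally, which is exactly the inconsistency the comment questions: after cancellation, user-code close may run on a half-torn-down task.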
[jira] [Commented] (FLINK-2055) Implement Streaming HBaseSink
[ https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139040#comment-15139040 ] Chesnay Schepler commented on FLINK-2055: - What's the status on this one? > Implement Streaming HBaseSink > - > > Key: FLINK-2055 > URL: https://issues.apache.org/jira/browse/FLINK-2055 > Project: Flink > Issue Type: New Feature > Components: Streaming, Streaming Connectors >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Hilmi Yildirim > > As per : > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html
[GitHub] flink pull request: [FLINK-3107] [runtime] Start checkpoint ID cou...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/1610 [FLINK-3107] [runtime] Start checkpoint ID counter with periodic scheduler Problem: The job manager enables checkpoints during submission of streaming programs. This can lead to a call to `ZooKeeperCheckpointIDCounter.start()`, which communicates with ZooKeeper. This can block the job manager actor. Solution: Start the counter in the `CheckpointCoordinatorDeActivator`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/flink 3107-counter_start Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1610.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1610 commit d70bc79e48dddb658c2240350837000ce9f1f0fe Author: Ufuk Celebi Date: 2016-02-09T15:06:46Z [FLINK-3107] [runtime] Start checkpoint ID counter with periodic scheduler Problem: The job manager enables checkpoints during submission of streaming programs. This can lead to a call to `ZooKeeperCheckpointIDCounter.start()`, which communicates with ZooKeeper. This can block the job manager actor. Solution: Start the counter in the `CheckpointCoordinatorDeActivator`.
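The fix's shape — moving a potentially blocking `start()` call off the actor's message-processing thread and into a component that runs outside it — can be sketched in plain Java with an executor. The interface and method names are illustrative, not Flink's actual classes:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of the fix's shape: the counter that talks to ZooKeeper is no
// longer started inside the actor (blocking it); the start is handed to a
// thread outside the actor. Names are illustrative, not Flink's classes.
public class CounterStartSketch {

    interface CheckpointIDCounter {
        void start() throws Exception; // may block while talking to ZooKeeper
    }

    /** Stand-in for the component that owns the start outside the actor. */
    static void startOutsideActor(CheckpointIDCounter counter,
                                  ExecutorService executor,
                                  CountDownLatch started) {
        executor.submit(() -> {
            counter.start();      // the blocking call no longer stalls the actor
            started.countDown();
            return null;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        // A slow start (stand-in for a ZooKeeper round trip)...
        startOutsideActor(() -> Thread.sleep(50), executor, started);
        // ...yet the "actor" thread reaches this point immediately.
        boolean done = started.await(5, TimeUnit.SECONDS);
        System.out.println(done);
        executor.shutdown();
    }
}
```

The key property is that job submission returns promptly even when ZooKeeper is slow, because the actor never waits on the counter's start.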