[jira] [Commented] (FLINK-2523) Make task canceling interrupt interval configurable
[ https://issues.apache.org/jira/browse/FLINK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139415#comment-15139415 ]

ASF GitHub Bot commented on FLINK-2523:
---

Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/1612#discussion_r52356401

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java ---
@@ -102,60 +101,150 @@
 	//

 	/**
-	 * Constructs a new job graph with no name and a random job ID.
+	 * Constructs a new job graph with no name, a random job ID, and the default
+	 * {@link ExecutionConfig} parameters.
 	 */
 	public JobGraph() {
 		this((String) null);
 	}

 	/**
-	 * Constructs a new job graph with the given name, a random job ID.
+	 * Constructs a new job graph with the given name, a random job ID and the default
+	 * {@link ExecutionConfig} parameters.
 	 *
 	 * @param jobName The name of the job
 	 */
 	public JobGraph(String jobName) {
-		this(null, jobName);
+		this(null, jobName, new ExecutionConfig());
+	}
+
+	/**
+	 * Constructs a new job graph with the given job ID, the given name, and the default
+	 * {@link ExecutionConfig} parameters.
+	 *
+	 * @param jobID The id of the job. A random ID is generated, if {@code null} is passed.
+	 * @param jobName The name of the job.
+	 */
+	public JobGraph(JobID jobID, String jobName) {
+		this(jobID, jobName, new ExecutionConfig());
 	}

 	/**
-	 * Constructs a new job graph with the given name and a random job ID if null supplied as an id.
+	 * Constructs a new job graph with the given name, the given {@link ExecutionConfig},
+	 * and a random job ID.
+	 *
+	 * @param jobName The name of the job.
+	 * @param config The execution configuration of the job.
+	 */
+	public JobGraph(String jobName, ExecutionConfig config) {
+		this(null, jobName, config);
+	}
+
+	/**
+	 * Constructs a new job graph with the given job ID (or a random ID, if {@code null} is passed),
+	 * the given name and the given execution configuration (see {@link ExecutionConfig}).
 	 *
 	 * @param jobId The id of the job. A random ID is generated, if {@code null} is passed.
 	 * @param jobName The name of the job.
+	 * @param config The execution configuration of the job.
 	 */
-	public JobGraph(JobID jobId, String jobName) {
+	public JobGraph(JobID jobId, String jobName, ExecutionConfig config) {
 		this.jobID = jobId == null ? new JobID() : jobId;
 		this.jobName = jobName == null ? "(unnamed job)" : jobName;
+		this.executionConfig = config;
--- End diff --

Can `executionConfig` be `null`? If not, then we should insert a check here.

> Make task canceling interrupt interval configurable
> ---------------------------------------------------
>
>                 Key: FLINK-2523
>                 URL: https://issues.apache.org/jira/browse/FLINK-2523
>             Project: Flink
>          Issue Type: Improvement
>          Components: TaskManager
>    Affects Versions: 0.10.0
>            Reporter: Stephan Ewen
>            Assignee: Klou
>             Fix For: 1.0.0
>
> When a task is canceled, the cancellation periodically calls "interrupt()" on
> the task thread, if the task thread does not cancel within a certain time.
> Currently, this value is hard coded to 10 seconds. We should make that time
> configurable.
> Until then, I would like to increase the value to 30 seconds, as many tasks
> (here I am observing it for Kafka consumers) can take longer than 10 seconds
> for proper cleanup.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
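[Editor's note] The check the reviewer asks about is a standard fail-fast guard in the constructor. A minimal, self-contained sketch follows; the classes here are stripped-down stand-ins, not the actual Flink code, and the real code base would likely use a `Preconditions.checkNotNull`-style helper instead of a hand-rolled `if`:

```java
// Hypothetical stand-ins for the Flink classes under discussion.
class ExecutionConfig { }

class JobGraph {
    private final ExecutionConfig executionConfig;

    JobGraph(ExecutionConfig config) {
        // Fail fast: reject a null config at construction time instead of
        // failing later with a harder-to-trace NullPointerException.
        if (config == null) {
            throw new NullPointerException("The execution config must not be null.");
        }
        this.executionConfig = config;
    }
}
```

The point of the guard is that the invalid argument is reported at the call site that supplied it, rather than wherever `executionConfig` is first dereferenced.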
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182025617 Shouldn't the current record remain valid if `hasNext()` returned true? I mean the user might be holding on to the object returned in `next`, and expect it to not be changed by a `hasNext` call: ``` T cur = it.next(); if(it.hasNext()) { // here, I would expect cur to not have changed since the next() call } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
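[Editor's note] The surprise described above can be reproduced with a toy reusing iterator. The class below is a hypothetical stand-in, not Flink's `ReusingMutableToRegularIteratorWrapper`; it only illustrates why a record obtained from `next()` must be copied before the following `hasNext()` call if the caller wants to keep it:

```java
// Toy iterator that reuses ONE instance: hasNext() reads ahead into it,
// so the object previously handed out by next() is silently overwritten.
class ReusingIntIterator {
    private final int[] reuse = new int[1]; // the single shared instance
    private int value = 0;
    private final int limit;

    ReusingIntIterator(int limit) {
        this.limit = limit;
    }

    boolean hasNext() {
        if (value >= limit) {
            return false;
        }
        reuse[0] = value++; // read-ahead mutates the shared instance
        return true;
    }

    int[] next() {
        return reuse;
    }
}
```

Holding the object returned by `next()` across a `hasNext()` call is only safe if the caller clones it first (`int[] copy = cur.clone();`); otherwise the read-ahead clobbers it, which is exactly the behavior questioned in the comment.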
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139530#comment-15139530 ] ASF GitHub Bot commented on FLINK-3382: --- Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182025617 Shouldn't the current record remain valid if `hasNext()` returned true? I mean the user might be holding on to the object returned in `next`, and expect it to not be changed by a `hasNext` call: ``` T cur = it.next(); if(it.hasNext()) { // here, I would expect cur to not have changed since the next() call } ``` > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
[ https://issues.apache.org/jira/browse/FLINK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139694#comment-15139694 ] ASF GitHub Bot commented on FLINK-3373: --- GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1615 [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency This makes the HTTP Components dependency disappear from the core classpath, allowing users to use their own version of the dependency. We need shading because we cannot simply bump the HTTP Components version to the newest version. The YARN tests for Hadoop versions >= 2.6.0 fail in that case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink http_shade Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1615.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1615 commit 1be39d12071c7251cd566e692c3a9c7b5440e46d Author: Stephan Ewen Date: 2016-02-09T20:18:43Z [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency > Using a newer library of Apache HttpClient than 4.2.6 will get class loading > problems > - > > Key: FLINK-3373 > URL: https://issues.apache.org/jira/browse/FLINK-3373 > Project: Flink > Issue Type: Bug > Environment: Latest Flink snapshot 1.0 >Reporter: Jakob Sultan Ericsson > > When I try to use Apache HTTP client 4.5.1 in my Flink job, it will crash > with NoClassDefFound. > This is because it loads some classes from the provided httpclient 4.2.5/6 in > core Flink. > {noformat} > 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task >- DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with > exception.
> java.lang.NoSuchFieldError: INSTANCE
>         at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144)
>         at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109)
>         at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116)
>         ...
>         at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>         at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It loads SSLConnectionSocketFactory and finds an earlier version of AllowAllHostnameVerifier that does not have the INSTANCE field (the INSTANCE field was probably added in 4.3).
> {noformat}
> jar tvf lib/flink-dist-1.0-SNAPSHOT.jar |grep AllowAllHostnameVerifier
>    791 Thu Dec 17 09:55:46 CET 2015 org/apache/http/conn/ssl/AllowAllHostnameVerifier.class
> {noformat}
> Solutions would be:
> - Fix the classloader so that my custom job does not conflict with internal flink-core classes... pretty hard
> - Remove the dependency somehow.
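[Editor's note] The dependency relocation described in the linked pull request is what the maven-shade-plugin's `<relocation>` element does. The sketch below is illustrative only; the plugin coordinates are real, but the shaded package prefix is an assumption, not the exact value used in the Flink build:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Move the HTTP Components classes pulled in via Hadoop under a
               Flink-internal package, so user jars can bring their own
               httpclient version without classpath conflicts. -->
          <relocation>
            <pattern>org.apache.http</pattern>
            <shadedPattern>org.apache.flink.hadoop.shaded.org.apache.http</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, the copy bundled in flink-dist no longer occupies the `org.apache.http` namespace, so the user's own httpclient 4.5.x classes are the only ones visible under that name.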
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182035391 Ah, yes, now I see. I'll just burn this.
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139652#comment-15139652 ] ASF GitHub Bot commented on FLINK-3382: --- Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1614#issuecomment-182035391 Ah, yes, now I see. I'll just burn this. > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139653#comment-15139653 ] ASF GitHub Bot commented on FLINK-3382: --- Github user greghogan closed the pull request at: https://github.com/apache/flink/pull/1614 > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Comment Edited] (FLINK-3333) Documentation about object reuse should be improved
[ https://issues.apache.org/jira/browse/FLINK-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139151#comment-15139151 ] Greg Hogan edited comment on FLINK-3333 at 2/9/16 8:07 PM: --- Apache Flink programs can be written and configured to reduce the number of object allocations for better performance. User defined functions (like map() or groupReduce()) process many millions or billions of input and output values. Enabling object reuse and processing mutable objects improves performance by lowering demand on the CPU cache and Java garbage collector. Object reuse is disabled by default, with user defined functions generally getting new objects on each call (or through an iterator). In this case it is safe to store references to the objects inside the function (for example, in a List). <'storing values in a list' example> Apache Flink will chain functions to improve performance when sorting is preserved and the parallelism unchanged. The chainable operators are Map, FlatMap, Reduce on full DataSet, GroupCombine on a Grouped DataSet, or a GroupReduce where the user supplied a RichGroupReduceFunction with a combine method. Objects are passed without copying _even when object reuse is disabled_. In the chaining case, the functions in the chain are receiving the same object instances. So the second map() function is receiving the objects the first map() is returning. This behavior can lead to errors when the first map() function keeps a list of all objects and the second mapper is modifying objects. In that case, the user has to manually create copies of the objects before putting them into the list. There is a switch at the ExecutionConfig which allows users to enable the object reuse mode (enableObjectReuse()). For mutable types, Flink will reuse object instances. In practice that means that a user function will always receive the same object instance (with its fields set to new values). 
The object reuse mode will lead to better performance because fewer objects are created, but the user has to manually take care of what they are doing with the object references. was (Author: greghogan): Apache Flink programs can be written and configured to reduce the number of object allocations for better performance. User defined functions (like map() or groupReduce()) process many millions or billions of input and output values. Enabling object reuse and processing mutable objects improves performance by lowering demand on the CPU cache and Java garbage collector. Object reuse is disabled by default, with user defined functions generally getting new objects on each call (or through an iterator). In this case it is safe to store references to the objects inside the function (for example, in a List). <'storing values in a list' example> Apache Flink will chain functions to improve performance when sorting is preserved and the parallelism unchanged. The chainable operators are Map, FlatMap, Reduce on full DataSet, GroupCombine on a Grouped DataSet, or a GroupReduce where the user supplied a RichGroupReduceFunction with a combine method). Objects are passed without copying _even when object reuse is disabled_. In the chaining case, the functions in the chain are receiving the same object instances. So the the second map() function is receiving the objects the first map() is returning. This behavior can lead to errors when the first map() function keeps a list of all objects and the second mapper is modifying objects. In that case, the user has to manually create copies of the objects before putting them into the list. There is a switch at the ExecutionConfig which allows users to enable the object reuse mode (enableObjectReuse()). For mutable types, Flink will reuse object instances. In practice that means that a user function will always receive the same object instance (with its fields set to new values). 
The object reuse mode will lead to better performance because fewer objects are created, but the user has to manually take care of what they are doing with the object references. > Documentation about object reuse should be improved > --- > > Key: FLINK-3333 > URL: https://issues.apache.org/jira/browse/FLINK-3333 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Gabor Gevay >Assignee: Gabor Gevay >Priority: Blocker > Fix For: 1.0.0 > > > The documentation about object reuse \[1\] has several problems, see \[2\]. > \[1\] > https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html#object-reuse-behavior > \[2\] > https://docs.google.com/document/d/1cgkuttvmj4jUonG7E2RdFVjKlfQDm_hE6gvFcgAfzXg/edit
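[Editor's note] The pitfall described in the draft documentation, a function storing references while each call refills the same instance, can be demonstrated without any Flink APIs. The class below is a plain-Java illustration, not Flink code:

```java
import java.util.ArrayList;
import java.util.List;

// Simulates object reuse: the "runtime" refills ONE instance per element.
// Storing the bare reference makes every list entry alias that instance;
// cloning before storing (the manual copy the text advises) keeps distinct values.
class ObjectReuseDemo {

    // Returns { first value seen via stored references, first value seen via copies }.
    static int[] run() {
        int[] reuse = new int[1]; // the single reused instance
        List<int[]> storedRefs = new ArrayList<>();
        List<int[]> storedCopies = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            reuse[0] = i;                     // runtime sets new field values
            storedRefs.add(reuse);            // unsafe under object reuse
            storedCopies.add(reuse.clone());  // safe: manual copy before storing
        }
        return new int[] { storedRefs.get(0)[0], storedCopies.get(0)[0] };
    }
}
```

After the loop, every entry of `storedRefs` points at the same object and shows the last value written, while `storedCopies` preserves 0, 1, 2 as intended.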
[GitHub] flink pull request: [FLINK-3373] [build] Shade away Hadoop's HTTP ...
GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1615 [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency This makes the HTTP Components dependency disappear from the core classpath, allowing users to use their own version of the dependency. We need shading because we cannot simply bump the HTTP Components version to the newest version. The YARN tests for Hadoop versions >= 2.6.0 fail in that case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink http_shade Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1615.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1615 commit 1be39d12071c7251cd566e692c3a9c7b5440e46d Author: Stephan Ewen Date: 2016-02-09T20:18:43Z [FLINK-3373] [build] Shade away Hadoop's HTTP Components dependency
[GitHub] flink pull request: [FLINK-3382] Improve clarity of object reuse i...
GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1614 [FLINK-3382] Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3382_improve_clarity_of_object_reuse_in_ReusingMutableToRegularIteratorWrapper Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1614.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1614
[GitHub] flink pull request: [FLINK-3260] [runtime] Enforce terminal state ...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1613#discussion_r52368138 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java --- @@ -107,7 +108,7 @@ private static final AtomicReferenceFieldUpdater<Execution, ExecutionState> STATE_UPDATER = AtomicReferenceFieldUpdater.newUpdater(Execution.class, ExecutionState.class, "state"); - private static final Logger LOG = ExecutionGraph.LOG; + private static final Logger LOG = LoggerFactory.getLogger(Execution.class); --- End diff -- Did this cause issues in this case? I originally set the logger to the ExecutionGraph logger to get all messages related to the execution and its changes in one log namespace. I always thought that makes searching the log easier.
[jira] [Commented] (FLINK-2380) Allow to configure default FS for file inputs
[ https://issues.apache.org/jira/browse/FLINK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139717#comment-15139717 ] ASF GitHub Bot commented on FLINK-2380: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-182054657 I'm testing the change on a cluster (with YARN) to see if everything is working as expected. > Allow to configure default FS for file inputs > - > > Key: FLINK-2380 > URL: https://issues.apache.org/jira/browse/FLINK-2380 > Project: Flink > Issue Type: Improvement > Components: JobManager >Affects Versions: 0.9, 0.10.0 >Reporter: Ufuk Celebi >Assignee: Klou >Priority: Minor > Labels: starter > Fix For: 1.0.0 > > > File inputs use "file://" as default prefix. A user asked to make this > configurable, e.g. "hdfs://" as default. > (I'm not sure whether this is already possible or not. I will check and > either close or implement this for the user.)
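[Editor's note] The behavior requested here surfaces as a configuration entry. A sketch of the intended flink-conf.yaml usage; the key name `fs.default-scheme` and the example address reflect the change under review and may differ from what was finally merged:

```yaml
# Make paths without an explicit scheme resolve against HDFS
# instead of the local file system ("file://").
fs.default-scheme: hdfs://namenode-host:9000/
```

With such a default in place, a job that reads `/data/input` would resolve it against the configured HDFS namenode rather than the local file system.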
[jira] [Commented] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper
[ https://issues.apache.org/jira/browse/FLINK-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139441#comment-15139441 ] ASF GitHub Bot commented on FLINK-3382: --- GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/1614 [FLINK-3382] Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 3382_improve_clarity_of_object_reuse_in_ReusingMutableToRegularIteratorWrapper Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1614.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1614 > Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper > - > > Key: FLINK-3382 > URL: https://issues.apache.org/jira/browse/FLINK-3382 > Project: Flink > Issue Type: Improvement > Components: Distributed Runtime >Affects Versions: 1.0.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Minor > > Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be > clarified by creating a single object and storing the iterator's next value > into the second reference.
[jira] [Commented] (FLINK-3260) ExecutionGraph gets stuck in state FAILING
[ https://issues.apache.org/jira/browse/FLINK-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139699#comment-15139699 ] ASF GitHub Bot commented on FLINK-3260: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1613#discussion_r52368138 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java --- @@ -107,7 +108,7 @@ private static final AtomicReferenceFieldUpdater<Execution, ExecutionState> STATE_UPDATER = AtomicReferenceFieldUpdater.newUpdater(Execution.class, ExecutionState.class, "state"); - private static final Logger LOG = ExecutionGraph.LOG; + private static final Logger LOG = LoggerFactory.getLogger(Execution.class); --- End diff -- Did this cause issues in this case? I originally set the logger to the ExecutionGraph logger to get all messages related to the execution and its changes in one log namespace. I always thought that makes searching the log easier. > ExecutionGraph gets stuck in state FAILING > -- > > Key: FLINK-3260 > URL: https://issues.apache.org/jira/browse/FLINK-3260 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Till Rohrmann >Priority: Blocker > Fix For: 1.0.0 > > > It is a bit of a rare case, but the following can currently happen: > 1. Jobs runs for a while, some tasks are already finished. > 2. Job fails, goes to state failing and restarting. Non-finished tasks fail > or are canceled. > 3. For the finished tasks, ask-futures from certain messages (for example > for releasing intermediate result partitions) can fail (timeout) and cause > the execution to go from FINISHED to FAILED > 4. This triggers the execution graph to go to FAILING without ever going > further into RESTARTING again > 5. The job is stuck > It initially looks like this is mainly an issue for batch jobs (jobs where > tasks do finish, rather than run infinitely). 
> The log that shows how this manifests:
> {code}
> 17:19:19,782 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:19,844 INFO  Remoting - Starting remoting
> 17:19:20,065 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:56722]
> 17:19:20,090 INFO  org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-6766f51a-1c51-4a03-acfb-08c2c29c11f0
> 17:19:20,096 INFO  org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:43327 - max concurrent requests: 50 - max backlog: 1000
> 17:19:20,113 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
> 17:19:20,115 INFO  org.apache.flink.runtime.checkpoint.SavepointStoreFactory - No savepoint state backend configured. Using job manager savepoint state backend.
> 17:19:20,118 INFO  org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@127.0.0.1:56722/user/jobmanager.
> 17:19:20,123 INFO  org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@127.0.0.1:56722/user/jobmanager was granted leadership with leader session ID None.
> 17:19:25,605 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43702/user/taskmanager) as f213232054587f296a12140d56f63ed1. Current number of registered hosts is 1. Current number of alive task slots is 2.
> 17:19:26,758 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43956/user/taskmanager) as f9e78baa14fb38c69517fb1bcf4f419c. Current number of registered hosts is 2. Current number of alive task slots is 4.
> 17:19:27,064 INFO  org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers
> 17:19:27,071 INFO  org.apache.flink.client.program.Client - Starting client actor system
> 17:19:27,072 INFO  org.apache.flink.runtime.client.JobClient - Starting JobClient actor system
> 17:19:27,110 INFO  akka.event.slf4j.Slf4jLogger
[jira] [Commented] (FLINK-3260) ExecutionGraph gets stuck in state FAILING
[ https://issues.apache.org/jira/browse/FLINK-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139701#comment-15139701 ] ASF GitHub Bot commented on FLINK-3260: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1613#issuecomment-182046481 Looks good, with one inline comment. Otherwise, +1 to merge > ExecutionGraph gets stuck in state FAILING > -- > > Key: FLINK-3260 > URL: https://issues.apache.org/jira/browse/FLINK-3260 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Till Rohrmann >Priority: Blocker > Fix For: 1.0.0 > > > It is a bit of a rare case, but the following can currently happen: > 1. Jobs runs for a while, some tasks are already finished. > 2. Job fails, goes to state failing and restarting. Non-finished tasks fail > or are canceled. > 3. For the finished tasks, ask-futures from certain messages (for example > for releasing intermediate result partitions) can fail (timeout) and cause > the execution to go from FINISHED to FAILED > 4. This triggers the execution graph to go to FAILING without ever going > further into RESTARTING again > 5. The job is stuck > It initially looks like this is mainly an issue for batch jobs (jobs where > tasks do finish, rather than run infinitely). 
> The log that shows how this manifests:
> {code}
> 17:19:19,782 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:19,844 INFO  Remoting - Starting remoting
> 17:19:20,065 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:56722]
> 17:19:20,090 INFO  org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-6766f51a-1c51-4a03-acfb-08c2c29c11f0
> 17:19:20,096 INFO  org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:43327 - max concurrent requests: 50 - max backlog: 1000
> 17:19:20,113 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
> 17:19:20,115 INFO  org.apache.flink.runtime.checkpoint.SavepointStoreFactory - No savepoint state backend configured. Using job manager savepoint state backend.
> 17:19:20,118 INFO  org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@127.0.0.1:56722/user/jobmanager.
> 17:19:20,123 INFO  org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@127.0.0.1:56722/user/jobmanager was granted leadership with leader session ID None.
> 17:19:25,605 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43702/user/taskmanager) as f213232054587f296a12140d56f63ed1. Current number of registered hosts is 1. Current number of alive task slots is 2.
> 17:19:26,758 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at testing-worker-linux-docker-e6d6931f-3200-linux-4 (akka.tcp://flink@172.17.0.253:43956/user/taskmanager) as f9e78baa14fb38c69517fb1bcf4f419c. Current number of registered hosts is 2. Current number of alive task slots is 4.
> 17:19:27,064 INFO  org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers
> 17:19:27,071 INFO  org.apache.flink.client.program.Client - Starting client actor system
> 17:19:27,072 INFO  org.apache.flink.runtime.client.JobClient - Starting JobClient actor system
> 17:19:27,110 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> 17:19:27,121 INFO  Remoting - Starting remoting
> 17:19:27,143 INFO  org.apache.flink.runtime.client.JobClient - Started JobClient actor system at 127.0.0.1:51198
> 17:19:27,145 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:51198]
> 17:19:27,325 INFO  org.apache.flink.runtime.client.JobClientActor - Disconnect from JobManager null.
> 17:19:27,362 INFO  org.apache.flink.runtime.client.JobClientActor - Received job Flink Java Job at Mon Jan
[GitHub] flink pull request: [FLINK-3260] [runtime] Enforce terminal state ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1613#issuecomment-182046481 Looks good, with one inline comment. Otherwise, +1 to merge
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281750 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. --- End diff -- "The DataSet can be ...", add "The"
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281721 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. --- End diff -- "...the DataSet on the **an** extracted key...", remove "an"
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138686#comment-15138686 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181790876 I agree with @sachingoel0101 on the import complexity but, from our point of view, Flink is the perfect platform to evaluate models in streaming, and we are using it that way in our architecture. Why do you think it wouldn't be suitable? > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138722#comment-15138722 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181798643 That is a good point. In a streaming setting, it does indeed make sense for the model to be available. However, in my opinion, it would then make sense to just use jpmml and import the object, followed by extracting the model parameters. Granted, it is an added effort on the user side, but I still think it beats the complexity introduced by supporting imports directly. Furthermore, it would be a bad design to have to reject valid PMML models just because a minor thing isn't supported in Flink.
[jira] [Commented] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
[ https://issues.apache.org/jira/browse/FLINK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138787#comment-15138787 ] Stephan Ewen commented on FLINK-3373: - Would it help if Flink simply updated the HTTP Client to the latest version (is it backwards compatible)? If not, then we need to shade the dependency, but if yes, that would be a very lightweight fix. > Using a newer library of Apache HttpClient than 4.2.6 will get class loading > problems > - > > Key: FLINK-3373 > URL: https://issues.apache.org/jira/browse/FLINK-3373 > Project: Flink > Issue Type: Bug > Environment: Latest Flink snapshot 1.0 >Reporter: Jakob Sultan Ericsson > > When I try to use Apache HTTP client 4.5.1 in my Flink job, it crashes > with NoClassDefFound. > This has to do with it loading some classes from the provided httpclient 4.2.5/6 in > core Flink. > {noformat} > 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task >- DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with > exception. > java.lang.NoSuchFieldError: INSTANCE > at > org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144) > at > org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109) > at > org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116) > ... 
> at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SSLConnectionSocketFactory resolves an earlier version of > AllowAllHostnameVerifier that does not have the INSTANCE field (the INSTANCE > field was probably added in 4.3). > {noformat} > jar tvf lib/flink-dist-1.0-SNAPSHOT.jar | grep AllowAllHostnameVerifier >791 Thu Dec 17 09:55:46 CET 2015 > org/apache/http/conn/ssl/AllowAllHostnameVerifier.class > {noformat} > Solutions would be: > - Fix the classloader so that my custom job does not conflict with internal > flink-core classes... pretty hard > - Remove the dependency somehow.
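A quick way to diagnose this kind of conflict is to ask the JVM where a class was actually loaded from. A minimal, standalone sketch (the class name `ClassOrigin` is illustrative, not from the ticket; in a real job you would pass the conflicting httpclient class instead of `String.class`):

```java
// Print the jar or path a class was loaded from, to see whether it
// resolves to flink-dist or to the user's job jar.
public class ClassOrigin {

    static String originOf(Class<?> clazz) {
        java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader report a null CodeSource.
        return src == null ? "bootstrap classpath" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // In a Flink job, pass e.g. AllowAllHostnameVerifier.class here;
        // java.lang.String stands in so this sketch runs anywhere.
        System.out.println("java.lang.String -> " + originOf(String.class));
    }
}
```

Printing the origin of `AllowAllHostnameVerifier` inside the failing job would show whether it comes from `flink-dist-1.0-SNAPSHOT.jar` or from the user code classloader.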
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138788#comment-15138788 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293343 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenUtils.scala --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import java.util.concurrent.atomic.AtomicInteger + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo._ +import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo._ +import org.apache.flink.api.common.typeinfo.{NumericTypeInfo, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +object CodeGenUtils { + + private val nameCounter = new AtomicInteger + + def newName(name: String): String = { +s"$name$$${nameCounter.getAndIncrement}" + } + + // when casting we first need to unbox Primitives, for example, + // float a = 1.0f; + // byte b = (byte) a; + // works, but for boxed types we need this: + // Float a = 1.0f; + // Byte b = (byte)(float) a; + def primitiveTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "int" +case LONG_TYPE_INFO => "long" +case SHORT_TYPE_INFO => "short" +case BYTE_TYPE_INFO => "byte" +case FLOAT_TYPE_INFO => "float" +case DOUBLE_TYPE_INFO => "double" +case BOOLEAN_TYPE_INFO => "boolean" +case CHAR_TYPE_INFO => "char" + +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. 
+case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def boxedTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. +case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def primitiveDefaultValue(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "-1" +case LONG_TYPE_INFO => "-1" +case SHORT_TYPE_INFO => "-1" +case BYTE_TYPE_INFO => "-1" +case FLOAT_TYPE_INFO => "-1.0f" +case DOUBLE_TYPE_INFO => "-1.0d" +case BOOLEAN_TYPE_INFO => "false" +case STRING_TYPE_INFO => "\"\"" +case CHAR_TYPE_INFO => "'\\0'" +case _ => "null" + } + + def requireNumeric(genExpr: GeneratedExpression) = genExpr.resultType match { +case nti:
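The unboxing comment in `primitiveTypeTermForTypeInfo` can be verified with plain Java: a boxed value cannot be narrowed directly, so generated casts must go through the primitive type first. A standalone sketch (not Flink code):

```java
// Demonstrates why generated code casts boxed values through the
// primitive type before narrowing.
public class BoxedCast {
    public static void main(String[] args) {
        float primitive = 1.0f;
        byte a = (byte) primitive;        // narrowing a primitive works directly

        Float boxed = 1.0f;
        // byte b = (byte) boxed;         // does not compile: Float is not convertible to byte
        byte b = (byte) (float) boxed;    // unbox to float first, then narrow

        System.out.println(a + " " + b);  // prints "1 1"
    }
}
```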
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293343 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenUtils.scala --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import java.util.concurrent.atomic.AtomicInteger + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo._ +import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo._ +import org.apache.flink.api.common.typeinfo.{NumericTypeInfo, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +object CodeGenUtils { + + private val nameCounter = new AtomicInteger + + def newName(name: String): String = { +s"$name$$${nameCounter.getAndIncrement}" + } + + // when casting we first need to unbox Primitives, for example, + // float a = 1.0f; + // byte b = (byte) a; + // works, but for boxed types we need this: + // Float a = 1.0f; + // Byte b = (byte)(float) a; + def primitiveTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "int" +case LONG_TYPE_INFO => "long" +case SHORT_TYPE_INFO => "short" +case BYTE_TYPE_INFO => "byte" +case FLOAT_TYPE_INFO => "float" +case DOUBLE_TYPE_INFO => "double" +case BOOLEAN_TYPE_INFO => "boolean" +case CHAR_TYPE_INFO => "char" + +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. 
+case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def boxedTypeTermForTypeInfo(tpe: TypeInformation[_]): String = tpe match { +// From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections +// does not seem to like this, so we manually give the correct type here. +case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]" +case LONG_PRIMITIVE_ARRAY_TYPE_INFO => "long[]" +case SHORT_PRIMITIVE_ARRAY_TYPE_INFO => "short[]" +case BYTE_PRIMITIVE_ARRAY_TYPE_INFO => "byte[]" +case FLOAT_PRIMITIVE_ARRAY_TYPE_INFO => "float[]" +case DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO => "double[]" +case BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO => "boolean[]" +case CHAR_PRIMITIVE_ARRAY_TYPE_INFO => "char[]" + +case _ => + tpe.getTypeClass.getCanonicalName + } + + def primitiveDefaultValue(tpe: TypeInformation[_]): String = tpe match { +case INT_TYPE_INFO => "-1" +case LONG_TYPE_INFO => "-1" +case SHORT_TYPE_INFO => "-1" +case BYTE_TYPE_INFO => "-1" +case FLOAT_TYPE_INFO => "-1.0f" +case DOUBLE_TYPE_INFO => "-1.0d" +case BOOLEAN_TYPE_INFO => "false" +case STRING_TYPE_INFO => "\"\"" +case CHAR_TYPE_INFO => "'\\0'" +case _ => "null" + } + + def requireNumeric(genExpr: GeneratedExpression) = genExpr.resultType match { +case nti: NumericTypeInfo[_] => // ok +case _ => throw new CodeGenException("Numeric expression type expected.") + } + + def requireString(genExpr: GeneratedExpression) = genExpr.resultType match { +case STRING_TYPE_INFO =>
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138570#comment-15138570 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281721 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. --- End diff -- "...the DataSet on the **an** extracted key...", remove "an" > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
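For intuition, the operation requested in the issue — sorting each partition's elements on a key produced by a selector function — can be mimicked outside Flink with a plain comparator. A minimal sketch (class and method names here are illustrative, not the actual DataSet API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class KeySelectorSortSketch {

    // Plain-Java analogue of a KeySelector-based sortPartition:
    // sort one partition's elements on an extracted key, in either order.
    static <T, K extends Comparable<K>> void sortPartition(
            List<T> partition, Function<T, K> keySelector, boolean ascending) {
        Comparator<T> cmp = Comparator.comparing(keySelector);
        partition.sort(ascending ? cmp : cmp.reversed());
    }

    public static void main(String[] args) {
        List<String> partition = new ArrayList<>(Arrays.asList("flink", "is", "fast"));

        sortPartition(partition, String::length, true);   // ascending by length
        System.out.println(partition);                    // [is, fast, flink]

        sortPartition(partition, String::length, false);  // descending by length
        System.out.println(partition);                    // [flink, fast, is]
    }
}
```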
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282118 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -36,27 +40,58 @@ */ public class SortPartitionOperator<T> extends SingleInputOperator<T, T, SortPartitionOperator<T>> { - private int[] sortKeyPositions; + private List<Keys<T>> keys; - private Order[] sortOrders; + private List<Order> orders; private final String sortLocationName; + private boolean useKeySelector; - public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + private SortPartitionOperator(DataSet<T> dataSet, String sortLocationName) { super(dataSet, dataSet.getType()); + + keys = new ArrayList<>(); + orders = new ArrayList<>(); this.sortLocationName = sortLocationName; + } + + + public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); - int[] flatOrderKeys = getFlatFields(sortField); - this.appendSorting(flatOrderKeys, sortOrder); + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); } public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrder, String sortLocationName) { - super(dataSet, dataSet.getType()); - this.sortLocationName = sortLocationName; + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); + + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); + } + + public SortPartitionOperator(DataSet<T> dataSet, Keys<T> sortKey, Order sortOrder, String sortLocationName) { --- End diff -- Change the `sortKey` parameter type to `SelectorFunctionKeys` (or accept the `KeySelector` and create the `SelectorFunctionKeys` in the constructor).
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138775#comment-15138775 ] ASF GitHub Bot commented on FLINK-3355: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1608 > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads.
[GitHub] flink pull request: [FLINK-3371] [api breaking] Move TriggerResult...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181813971 Exactly, the `AlignedTrigger` will have an `AlignedTriggerContext` without Key/Value state.
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1608
[jira] [Commented] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docs
[ https://issues.apache.org/jira/browse/FLINK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138781#comment-15138781 ] Stephan Ewen commented on FLINK-3376: - Some of that information is available, but hidden deep in the Streaming API guide. I vote to link this page on the same level as the Batch and Streaming API guide > Add an illustration of Event Time and Watermarks to the docs > > > Key: FLINK-3376 > URL: https://issues.apache.org/jira/browse/FLINK-3376 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Priority: Critical > Fix For: 1.0.0 > > > Users seem to get confused about how event time and watermarks work. > We need to add documentation with two sections: > 1. Event time and watermark progress in general > - Watermarks are generated at the sources > - How Watermarks progress through the streaming data flow > 2. Ways that users can generate watermarks > - EventTimeSourceFunctions > - AscendingTimestampExtractor > - TimestampExtractor general case
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138784#comment-15138784 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293004 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ --- End diff -- Intellij tells me this import is unused > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). 
The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). > - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52293004 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ --- End diff -- Intellij tells me this import is unused
[jira] [Assigned] (FLINK-3359) Make RocksDB file copies asynchronous
[ https://issues.apache.org/jira/browse/FLINK-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek reassigned FLINK-3359: --- Assignee: Aljoscha Krettek > Make RocksDB file copies asynchronous > - > > Key: FLINK-3359 > URL: https://issues.apache.org/jira/browse/FLINK-3359 > Project: Flink > Issue Type: Bug > Components: state backends >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Aljoscha Krettek > Fix For: 1.0.0 > > > While the incremental backup of the RocksDB files needs to be synchronous, > the copying of that file to the backup file system can be fully asynchronous.
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282391 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SortPartitionITCase.scala --- @@ -166,6 +167,58 @@ class SortPartitionITCase(mode: TestExecutionMode) extends MultipleProgramsTestB TestBaseUtils.compareResultAsText(result.asJava, expected) } + @Test + def testSortPartitionWithKeySelector1(): Unit = { +val env = ExecutionEnvironment.getExecutionEnvironment +env.setParallelism(4) +val ds = CollectionDataSets.get3TupleDataSet(env) + +val result = ds + .map { x => x }.setParallelism(4) + .sortPartition(_._2, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test).
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138594#comment-15138594 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282336 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SortPartitionITCase.java --- @@ -197,6 +198,58 @@ public void testSortPartitionParallelismChange() throws Exception { compareResultAsText(result, expected); } + @Test + public void testSortPartitionWithKeySelector1() throws Exception { + /* +* Test sort partition on an extracted key +*/ + + final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + env.setParallelism(4); + + DataSet<Tuple3<Integer, Long, String>> ds = CollectionDataSets.get3TupleDataSet(env); + List<Tuple3<Integer, Long, String>> result = ds + .map(new IdMapper<Tuple3<Integer, Long, String>>()).setParallelism(4) // parallelize input + .sortPartition(new KeySelector<Tuple3<Integer, Long, String>, Long>() { + @Override + public Long getKey(Tuple3<Integer, Long, String> value) throws Exception { + return value.f1; + } + }, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test). > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138593#comment-15138593 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181772422 That said, just for comparison purposes, Spark has its own model export and import feature, along with PMML export. Hoping to fully support PMML import in a framework like Flink or Spark is a next to impossible thing which would require changes to the entire way our pipelines and datasets are represented. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138591#comment-15138591 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282275 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala --- @@ -1508,6 +1508,31 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) { new SortPartitionOperator[T](javaSet, field, order, getCallLocationName())) } + /** +* Locally sorts the partitions of the DataSet on the specified field in the specified order. +* The DataSet can be sorted on multiple fields by chaining sortPartition() calls. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the +* partition by multiple values using KeySelector, the KeySelector must return a tuple +* consisting of the values. +*/ + def sortPartition[K: TypeInformation](fun: T => K, order: Order): DataSet[T] ={ --- End diff -- Copy the method docs from the `DataSet.java`. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > DataSet<MyObject> data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3243] Fix Interplay of TimeCharacterist...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181777122 You're right, I'm changing it. But it was also me who didn't notice when we put it in initially :sweat_smile:
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181790876 I agree with @sachingoel0101 on the import complexity but, from our point of view, Flink is the perfect platform to evaluate models in streaming and we are using it that way in our architecture. Why do you think it wouldn't be suitable?
[jira] [Commented] (FLINK-3366) Rename @Experimental annotation to @PublicEvolving
[ https://issues.apache.org/jira/browse/FLINK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138701#comment-15138701 ] ASF GitHub Bot commented on FLINK-3366: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1599#issuecomment-181794025 Can you update the JavaDoc of the `@PublicEvolving` annotation? Otherwise, good to merge... > Rename @Experimental annotation to @PublicEvolving > -- > > Key: FLINK-3366 > URL: https://issues.apache.org/jira/browse/FLINK-3366 > Project: Flink > Issue Type: Task >Affects Versions: 1.0.0 >Reporter: Fabian Hueske >Assignee: Fabian Hueske >Priority: Blocker > Fix For: 1.0.0 > > > As per discussion on the dev ML, rename the @Experimental annotation to > @PublicEvolving. > Experimental might suggest instable / unreliable functionality which is not > the intended meaning of this annotation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
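For context, a stability-marker annotation of this kind is only a few lines of plain Java. The sketch below is illustrative and not Flink's actual source; it shows the meta-annotations such a marker typically carries:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class PublicEvolvingSketch {

    /**
     * Marks API as public and supported, but still free to evolve between
     * minor releases. Sketch only; not Flink's actual source.
     */
    @Documented
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
    @Retention(RetentionPolicy.RUNTIME)
    public @interface PublicEvolving {}

    // RUNTIME retention keeps the marker visible to reflection-based API audits.
    @PublicEvolving
    public static class EvolvingApi {}

    public static void main(String[] args) {
        System.out.println(EvolvingApi.class.isAnnotationPresent(PublicEvolving.class));
    }
}
```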
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52291201 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala --- @@ -0,0 +1,661 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.api.table.codegen + +import org.apache.calcite.rex._ +import org.apache.calcite.sql.`type`.SqlTypeName._ +import org.apache.calcite.sql.fun.SqlStdOperatorTable._ +import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction} +import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation} +import org.apache.flink.api.common.typeutils.CompositeType +import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo +import org.apache.flink.api.table.TableConfig +import org.apache.flink.api.table.codegen.CodeGenUtils._ +import org.apache.flink.api.table.codegen.Indenter._ +import org.apache.flink.api.table.codegen.OperatorCodeGen._ +import org.apache.flink.api.table.plan.TypeConverter.sqlTypeToTypeInfo +import org.apache.flink.api.table.typeinfo.RowTypeInfo + +import scala.collection.JavaConversions._ +import scala.collection.mutable + +class CodeGenerator( +config: TableConfig, +input1: TypeInformation[Any], +input2: Option[TypeInformation[Any]] = None) + extends RexVisitor[GeneratedExpression] { + + // set of member statements that will be added only once + // we use a LinkedHashSet to keep the insertion order + private val reusableMemberStatements = mutable.LinkedHashSet[String]() + + // set of constructor statements that will be added only once + // we use a LinkedHashSet to keep the insertion order + private val reusableInitStatements = mutable.LinkedHashSet[String]() + + // map of initial input unboxing expressions that will be added only once + // (inputTerm, index) -> expr + private val reusableInputUnboxingExprs = mutable.Map[(String, Int), GeneratedExpression]() + + def reuseMemberCode(): String = { +reusableMemberStatements.mkString("", "\n", "\n") + } + + def reuseInitCode(): String = { +reusableInitStatements.mkString("", "\n", "\n") + } + + def reuseInputUnboxingCode(): String = { 
+reusableInputUnboxingExprs.values.map(_.code).mkString("", "\n", "\n") + } + + def input1Term = "in1" + + def input2Term = "in2" + + def collectorTerm = "c" + + def outRecordTerm = "out" + + def nullCheck: Boolean = config.getNullCheck + + def generateExpression(rex: RexNode): GeneratedExpression = { +rex.accept(this) + } + + def generateFunction[T <: Function]( + name: String, + clazz: Class[T], + bodyCode: String, + returnType: TypeInformation[Any]) +: GeneratedFunction[T] = { +val funcName = newName(name) + +// Janino does not support generics, that's why we need +// manual casting here +val samHeader = + if (clazz == classOf[FlatMapFunction[_,_]]) { +val inputTypeTerm = boxedTypeTermForTypeInfo(input1) +(s"void flatMap(Object _in1, org.apache.flink.util.Collector $collectorTerm)", + s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;") + } else if (clazz == classOf[MapFunction[_,_]]) { +val inputTypeTerm = boxedTypeTermForTypeInfo(input1) +("Object map(Object _in1)", + s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;") + } else { +// TODO more functions +throw new CodeGenException("Unsupported Function.") + } + +val funcCode = j""" + public class $funcName + implements ${clazz.getCanonicalName} { + +${reuseMemberCode()} + +public $funcName() { +
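The `reusableMemberStatements` pattern quoted above — add each generated member statement once, keep insertion order, then join them into the emitted class body — can be reproduced with a plain `LinkedHashSet`. A minimal Java sketch with illustrative names:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ReusableStatementsSketch {

    // A LinkedHashSet adds each generated statement only once while keeping
    // first-insertion order, exactly the property the code generator relies on.
    private final Set<String> reusableMemberStatements = new LinkedHashSet<>();

    public void addReusableMember(String statement) {
        reusableMemberStatements.add(statement); // no-op for duplicates
    }

    public String reuseMemberCode() {
        return String.join("\n", reusableMemberStatements) + "\n";
    }

    public static void main(String[] args) {
        ReusableStatementsSketch gen = new ReusableStatementsSketch();
        gen.addReusableMember("transient int in1;");
        gen.addReusableMember("transient int in2;");
        gen.addReusableMember("transient int in1;"); // already present, kept once
        System.out.print(gen.reuseMemberCode());
    }
}
```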
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138768#comment-15138768 ] ASF GitHub Bot commented on FLINK-3243: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181810307 I rebased it to master and updated. > Fix Interplay of TimeCharacteristic and Time Windows > > > Key: FLINK-3243 > URL: https://issues.apache.org/jira/browse/FLINK-3243 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.0.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek >Priority: Blocker > > As per the discussion on the Dev ML: > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Time-Behavior-in-Streaming-Jobs-Event-time-processing-time-td9616.html. > The discussion seems to have converged on option 2): > - Add dedicated WindowAssigners for processing time and event time > - {{timeWindow()}} and {{timeWindowAll()}} respect the set > {{TimeCharacteristic}}. > This will make the easy stuff easy, i.e. using time windows and quickly > switching the time characteristic. Users will then have the flexibility to > mix different kinds of window assigners in their job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
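The dispatch that option 2 describes — `timeWindow()` resolving to a dedicated window assigner based on the configured `TimeCharacteristic` — boils down to a small switch. A hedged plain-Java sketch; the returned names are illustrative stand-ins, not guaranteed to match Flink's assigner classes:

```java
public class TimeWindowDispatchSketch {

    public enum TimeCharacteristic { ProcessingTime, IngestionTime, EventTime }

    // timeWindow() under option 2: pick the assigner from the environment's
    // time characteristic instead of a single generic "time window".
    public static String assignerFor(TimeCharacteristic characteristic) {
        switch (characteristic) {
            case ProcessingTime:
                return "TumblingProcessingTimeWindows";
            default:
                // Ingestion time behaves like event time with source-assigned timestamps.
                return "TumblingEventTimeWindows";
        }
    }

    public static void main(String[] args) {
        System.out.println(assignerFor(TimeCharacteristic.EventTime));
    }
}
```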
[GitHub] flink pull request: [FLINK-3226] Implement a CodeGenerator for an ...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1595#issuecomment-181824403 Hi Timo, the PR looks really good :-) I found the following issues / questions: - Accessing of POJO fields might not work. - Can you add method comments to the code generation methods in `CodeGenerator` and `CodeGenUtils`? - Would it make sense to separate the function and expression code gen, i.e., split the `CodeGenerator` class?
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138805#comment-15138805 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1595#issuecomment-181824403 Hi Timo, the PR looks really good :-) I found the following issues / questions: - Accessing of POJO fields might not work. - Can you add method comments to the code generation methods in `CodeGenerator` and `CodeGenUtils`? - Would it make sense to separate the function and expression code gen, i.e., split the `CodeGenerator` class? > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). 
> - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282174 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -79,58 +119,41 @@ public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrd * local partition sorting of the DataSet. * * @param field The field expression referring to the field of the additional sort order of -* the local partition sorting. -* @param order The order of the additional sort order of the local partition sorting. +* the local partition sorting. +* @param order The order of the additional sort order of the local partition sorting. * @return The DataSet with sorted local partitions. */ public SortPartitionOperator<T> sortPartition(String field, Order order) { - int[] flatOrderKeys = getFlatFields(field); - this.appendSorting(flatOrderKeys, order); + if (useKeySelector) { + throw new InvalidProgramException("Expression keys cannot be appended after a KeySelector"); + } + + ensureSortableKey(field); + keys.add(new Keys.ExpressionKeys<>(field, getType())); + orders.add(order); + return this; } - // - // Key Extraction - // - - private int[] getFlatFields(int field) { + public <K> SortPartitionOperator<T> sortPartition(KeySelector<T, K> keyExtractor, Order order) { + throw new InvalidProgramException("KeySelector cannot be chained."); + } - if (!Keys.ExpressionKeys.isSortKey(field, super.getType())) { + private void ensureSortableKey(int field) throws InvalidProgramException { + if (!Keys.ExpressionKeys.isSortKey(field, getType())) { throw new InvalidProgramException("Selected sort key is not a sortable type"); } - - Keys.ExpressionKeys<T> ek = new Keys.ExpressionKeys<>(field, super.getType()); - return ek.computeLogicalKeyPositions(); } - private int[] getFlatFields(String fields) { - - if (!Keys.ExpressionKeys.isSortKey(fields, super.getType())) { + private void ensureSortableKey(String field) throws InvalidProgramException { + 
if (!Keys.ExpressionKeys.isSortKey(field, getType())) { throw new InvalidProgramException("Selected sort key is not a sortable type"); } - - Keys.ExpressionKeys<T> ek = new Keys.ExpressionKeys<>(fields, super.getType()); - return ek.computeLogicalKeyPositions(); } - private void appendSorting(int[] flatOrderFields, Order order) { - - if(this.sortKeyPositions == null) { - // set sorting info - this.sortKeyPositions = flatOrderFields; - this.sortOrders = new Order[flatOrderFields.length]; - Arrays.fill(this.sortOrders, order); - } else { - // append sorting info to existing info - int oldLength = this.sortKeyPositions.length; - int newLength = oldLength + flatOrderFields.length; - this.sortKeyPositions = Arrays.copyOf(this.sortKeyPositions, newLength); - this.sortOrders = Arrays.copyOf(this.sortOrders, newLength); - - for(int i=0; i
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138577#comment-15138577 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282118 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java --- @@ -36,27 +40,58 @@ */ public class SortPartitionOperator<T> extends SingleInputOperator<T, T, SortPartitionOperator<T>> { - private int[] sortKeyPositions; + private List<Keys<T>> keys; - private Order[] sortOrders; + private List<Order> orders; private final String sortLocationName; + private boolean useKeySelector; - public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + private SortPartitionOperator(DataSet<T> dataSet, String sortLocationName) { super(dataSet, dataSet.getType()); + + keys = new ArrayList<>(); + orders = new ArrayList<>(); this.sortLocationName = sortLocationName; + } + + + public SortPartitionOperator(DataSet<T> dataSet, int sortField, Order sortOrder, String sortLocationName) { + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); - int[] flatOrderKeys = getFlatFields(sortField); - this.appendSorting(flatOrderKeys, sortOrder); + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); } public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrder, String sortLocationName) { - super(dataSet, dataSet.getType()); - this.sortLocationName = sortLocationName; + this(dataSet, sortLocationName); + this.useKeySelector = false; + + ensureSortableKey(sortField); + + keys.add(new Keys.ExpressionKeys<>(sortField, getType())); + orders.add(sortOrder); + } + + public SortPartitionOperator(DataSet<T> dataSet, Keys<T> sortKey, Order sortOrder, String sortLocationName) { --- End diff -- Change the `sortKey` parameter type to `SelectorFunctionKeys` (or accept the
`KeySelector` and create the `SelectorFunctionKeys` in the constructor. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > DataSet<MyObject> data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
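The refactor under review replaces the old `int[] sortKeyPositions` / `Order[] sortOrders` arrays with parallel `keys` / `orders` lists and rejects chaining expression keys after a `KeySelector`. A minimal plain-Java sketch of that bookkeeping, with Flink's key and order types replaced by strings for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class SortKeyAppendSketch {

    // Parallel lists stand in for the keys/orders lists of the reviewed diff;
    // both grow together as sortPartition() calls are chained.
    private final List<String> keys = new ArrayList<>();
    private final List<String> orders = new ArrayList<>();
    private final boolean useKeySelector;

    public SortKeyAppendSketch(String firstKey, String firstOrder, boolean useKeySelector) {
        this.useKeySelector = useKeySelector;
        keys.add(firstKey);
        orders.add(firstOrder);
    }

    // Appending an expression key after a key-selector key is rejected,
    // mirroring the guard in the reviewed diff.
    public SortKeyAppendSketch sortPartition(String field, String order) {
        if (useKeySelector) {
            throw new IllegalStateException("Expression keys cannot be appended after a KeySelector");
        }
        keys.add(field);
        orders.add(order);
        return this;
    }

    public int keyCount() {
        return keys.size();
    }
}
```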
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138576#comment-15138576 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181771679 As the original author of this PR, I'd say this: I tried implementing the import features but they aren't worth it. You have to discard most of the valid PMML models because they don't fit in with the Flink framework. Further, in my opinion, the use of Flink is to train the model. Once we export that model in PMML, you can use it pretty much anywhere, say R or MATLAB, which support complete PMML import and export functionality. The exported model is in most cases going to be used for testing, evaluation and prediction purposes, for which Flink isn't a good platform to use anyway. This can be accomplished anywhere. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138575#comment-15138575 ] Till Rohrmann commented on FLINK-3374: -- Is this reproducible? Looks to me as if the testing file could not be created by the test. This might be simply a problem of the Travis machine. > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > 
/tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at >
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181771679 As the original author of this PR, I'd say this: I tried implementing the import features, but they aren't worth it. You have to discard most of the valid PMML models because they don't fit in with the Flink framework. Further, in my opinion, the purpose of Flink is to train the model. Once we export that model in PMML, you can use it pretty much anywhere, say R or MATLAB, which support complete PMML import and export functionality. The exported model is in most cases going to be used for testing, evaluation, and prediction purposes, for which Flink isn't a good platform anyway. This can be accomplished anywhere. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
[ https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138691#comment-15138691 ] Robert Metzger commented on FLINK-2832: --- Another instance: https://s3.amazonaws.com/archive.travis-ci.org/jobs/107973266/log.txt {code} Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.914 sec <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest) Time elapsed: 1.282 sec <<< FAILURE! java.lang.AssertionError: KS test result with p value(0.034000), d value(0.032600) at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330) at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289) at org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:194) {code} > Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement > --- > > Key: FLINK-2832 > URL: https://issues.apache.org/jira/browse/FLINK-2832 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 0.10.0 >Reporter: Vasia Kalavri >Priority: Critical > Labels: test-stability > Fix For: 1.0.0 > > > Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec > <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest > testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest) > Time elapsed: 2.534 sec <<< FAILURE! 
> java.lang.AssertionError: KS test result with p value(0.11), d > value(0.103090) > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289) > at > org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192) > Results : > Failed tests: > > RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342 > KS test result with p value(0.11), d value(0.103090) > Full log [here|https://travis-ci.org/apache/flink/jobs/84120131]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138728#comment-15138728 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181799783 @sachingoel0101 I agree. Nonetheless, an easy way to store and move a model generated in batch to a streaming environment would be a really useful feature, and we come back to what @chiwanpark was saying about a custom format internal to Flink. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to, and import models from, other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181799783 @sachingoel0101 I agree. Nonetheless, an easy way to store and move a model generated in batch to a streaming environment would be a really useful feature, and we come back to what @chiwanpark was saying about a custom format internal to Flink.
[jira] [Created] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docsq
Stephan Ewen created FLINK-3376: --- Summary: Add an illustration of Event Time and Watermarks to the docsq Key: FLINK-3376 URL: https://issues.apache.org/jira/browse/FLINK-3376 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 0.10.1 Reporter: Stephan Ewen Priority: Critical Fix For: 1.0.0 Users seem to get confused about how event time and watermarks work. We need to add documentation with two sections: 1. Event time and watermark progress in general - Watermarks are generated at the sources - How Watermarks progress through the streaming data flow 2. Ways that users can generate watermarks - EventTimeSourceFunctions - AscendingTimestampExtractor - TimestampExtractor general case -- This message was sent by Atlassian JIRA (v6.3.4#6332)
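The second section of the proposed documentation lists ways users can generate watermarks, including an ascending timestamp extractor. As a rough illustration of that idea (plain Java, not Flink's actual `AscendingTimestampExtractor` API; class and method names here are invented for the sketch): for a stream whose event timestamps only increase, the watermark can simply trail the largest timestamp seen so far.

```java
// Illustrative sketch of ascending-timestamp watermark generation.
// Not Flink code: names and signatures are hypothetical.
class AscendingWatermarkSketch {
    private long currentMaxTimestamp = Long.MIN_VALUE;

    /** Records an element's event timestamp and returns it unchanged. */
    long extractTimestamp(long elementTimestamp) {
        currentMaxTimestamp = Math.max(currentMaxTimestamp, elementTimestamp);
        return elementTimestamp;
    }

    /**
     * The watermark trails the maximum timestamp by one, i.e. it promises
     * that no element with a smaller or equal timestamp will arrive.
     */
    long getCurrentWatermark() {
        return currentMaxTimestamp == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : currentMaxTimestamp - 1;
    }
}
```

Because timestamps are assumed ascending, the watermark never has to retreat; late elements only occur if the ascending assumption is violated.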
[jira] [Updated] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docs
[ https://issues.apache.org/jira/browse/FLINK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen updated FLINK-3376: Summary: Add an illustration of Event Time and Watermarks to the docs (was: Add an illustration of Event Time and Watermarks to the docsq) > Add an illustration of Event Time and Watermarks to the docs > > > Key: FLINK-3376 > URL: https://issues.apache.org/jira/browse/FLINK-3376 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Priority: Critical > Fix For: 1.0.0 > > > Users seem to get confused about how event time and watermarks work. > We need to add documentation with two sections: > 1. Event time and watermark progress in general > - Watermarks are generated at the sources > - How Watermarks progress through the streaming data flow > 2. Ways that users can generate watermarks > - EventTimeSourceFunctions > - AscendingTimestampExtractor > - TimestampExtractor general case -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3286) Remove JDEB Debian Package code from flink-dist
[ https://issues.apache.org/jira/browse/FLINK-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138794#comment-15138794 ] Robert Metzger commented on FLINK-3286: --- Removed the files as well: http://git-wip-us.apache.org/repos/asf/flink/commit/a4f0692e > Remove JDEB Debian Package code from flink-dist > --- > > Key: FLINK-3286 > URL: https://issues.apache.org/jira/browse/FLINK-3286 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 0.10.1 >Reporter: Stephan Ewen >Assignee: Stephan Ewen >Priority: Blocker > Fix For: 1.0.0 > > > There is currently code in the {{flink-dist}} project to create a Debian > package for Flink. This was added by a contributor quite a while back and has > never been maintained (probably also never used). > I vote to remove it. It is out of date with paths and filenames, and there > seems to be no interest in maintaining it so far. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281965 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the --- End diff -- "Note that any key extraction methods cannot be ..." -> "Note that no additional sort keys can be appended to a KeySelector."
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138573#comment-15138573 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281965 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. +* +* Note that any key extraction methods cannot be chained with the KeySelector. To sort the --- End diff -- "Note that any key extraction methods cannot be ..." -> "Note that no additional sort keys can be appended to a KeySelector." > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet data = ... > DataSet data.sortPartition( > new KeySelector() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138571#comment-15138571 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52281750 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java --- @@ -1377,6 +1377,24 @@ public long count() throws Exception { return new SortPartitionOperator<>(this, field, order, Utils.getCallLocationName()); } + /** +* Locally sorts the partitions of the DataSet on the an extracted key in the specified order. +* DataSet can be sorted on multiple values by returning a tuple from the KeySelector. --- End diff -- "The DataSet can be ...", add "The" > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet data = ... > DataSet data.sortPartition( > new KeySelector() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
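The feature FLINK-3234 asks for amounts to ordering each partition's elements by a key extracted via a user function. As a minimal plain-Java sketch of that semantics (not the Flink `sortPartition` API; the helper below is hypothetical and uses `java.util.function.Function` in place of Flink's `KeySelector`):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

class KeySelectorSortSketch {
    /**
     * Sorts one partition's elements in ascending order of a key extracted
     * from each element, mirroring what a KeySelector-based
     * sortPartition(keySelector, Order.ASCENDING) would compute locally.
     */
    static <T, K extends Comparable<K>> List<T> sortPartition(
            List<T> partition, Function<T, K> keySelector) {
        List<T> sorted = new ArrayList<>(partition);
        sorted.sort(Comparator.comparing(keySelector::apply));
        return sorted;
    }
}
```

Sorting on multiple values, as the Javadoc in the diff notes, corresponds to extracting a composite (tuple) key whose comparison is lexicographic.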
[GitHub] flink pull request: [FLINK-3243] Fix Interplay of TimeCharacterist...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181776065 For new classes, it makes sense. It was a mistake on my end to name them like this in the first place. But in my experience, users that adopted this draw more satisfaction from stable code than from a style nuance.
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138626#comment-15138626 ] Till Rohrmann commented on FLINK-3374: -- Probably because it uses {{WriteMode.OVERWRITE}}. > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Assignee: Till Rohrmann >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at 
java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at >
[jira] [Assigned] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-3374: Assignee: Till Rohrmann > CEPITCase testSimplePatternEventTime fails > -- > > Key: FLINK-3374 > URL: https://issues.apache.org/jira/browse/FLINK-3374 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: Ufuk Celebi >Assignee: Till Rohrmann >Priority: Minor > > {code} > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< ERROR! > org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at 
java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) > at > org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) > at > org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) > at > org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) > at > org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > at java.lang.Thread.run(Thread.java:745) > testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: > 1.68 sec <<< FAILURE! > java.lang.AssertionError: Different number of lines in expected and obtained > result. 
expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) > at > org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) > at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) > {code} > https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz > {code} > 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor >- 02/09/2016 04:53:46Map -> Sink: Unnamed(2/4) switched to FAILED > java.io.FileNotFoundException: > /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.flink.core.fs.local.LocalDataOutputStream.(LocalDataOutputStream.java:56) > at >
[jira] [Commented] (FLINK-3066) Kafka producer fails on leader change
[ https://issues.apache.org/jira/browse/FLINK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138677#comment-15138677 ] Gyula Fora commented on FLINK-3066: --- Thank you Robert for the help, it is a good catch :) > Kafka producer fails on leader change > - > > Key: FLINK-3066 > URL: https://issues.apache.org/jira/browse/FLINK-3066 > Project: Flink > Issue Type: Bug > Components: Kafka Connector, Streaming Connectors >Affects Versions: 0.10.0, 1.0.0 >Reporter: Gyula Fora > > I got the following exception during my streaming job: > {code} > 16:44:50,637 INFO org.apache.flink.runtime.jobmanager.JobManager >- Status of job 4d3f9443df4822e875f1400244a6e8dd (deduplo!) changed to > FAILING. > java.lang.Exception: Failed to send data to Kafka: This server is not the > leader for that topic-partition. > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.checkErroneous(FlinkKafkaProducer.java:275) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.invoke(FlinkKafkaProducer.java:246) > at > org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:37) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:166) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:221) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:588) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.kafka.common.errors.NotLeaderForPartitionException: > This server is not the leader for that topic-partition. > {code} > And then the job crashed and recovered. This should probably be something > that we handle without crashing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181798643 That is a good point. In a streaming setting, it does indeed make sense for the model to be available. However, in my opinion, it would then make sense to just use JPMML and import the object, followed by extracting the model parameters. Granted, it is an added effort on the user side, but I still think it beats the complexity introduced by supporting imports directly. Furthermore, it would be a bad design to have to reject valid PMML models just because a minor thing isn't supported in Flink.
[jira] [Commented] (FLINK-3371) Move TriggerCotext and TriggerResult to their own classes
[ https://issues.apache.org/jira/browse/FLINK-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138754#comment-15138754 ] ASF GitHub Bot commented on FLINK-3371: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181807335 You are not moving `TriggerContext` because it is specific to Trigger, correct? Otherwise it looks good to merge. :+1: > Move TriggerCotext and TriggerResult to their own classes > - > > Key: FLINK-3371 > URL: https://issues.apache.org/jira/browse/FLINK-3371 > Project: Flink > Issue Type: Sub-task > Components: Windowing Operators >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.0.0 > > > As part of adding aligned window operators, we need aligned trigger classes. > To make the {{TriggerResult}} and {{TriggerContext}} accessible to them, they > should move to their own classes, from currently being internal classes of > {{Trigger}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181807481 :+1:
[GitHub] flink pull request: [FLINK-3371] [api breaking] Move TriggerResult...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181807335 You are not moving `TriggerContext` because it is specific to Trigger, correct? Otherwise it looks good to merge. :+1:
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138755#comment-15138755 ] ASF GitHub Bot commented on FLINK-3355: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181807481 :+1: > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3375) Allow Watermark Generation in the Kafka Source
Stephan Ewen created FLINK-3375: --- Summary: Allow Watermark Generation in the Kafka Source Key: FLINK-3375 URL: https://issues.apache.org/jira/browse/FLINK-3375 Project: Flink Issue Type: Improvement Components: Kafka Connector Affects Versions: 1.0.0 Reporter: Stephan Ewen Fix For: 1.0.0 It is a common case that event timestamps are ascending inside one Kafka partition. Ascending timestamps are easy for users, because they are handled by ascending timestamp extraction. If the Kafka source has multiple partitions per source task, then the records become out of order before timestamps can be extracted and watermarks can be generated. If we make the FlinkKafkaConsumer an event time source function, it can generate watermarks itself. It would internally implement the same logic as the regular operators that merge streams, keeping track of event time progress per partition and generating watermarks based on the current guaranteed event time progress.
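The merge logic described in the issue can be sketched as follows: track the highest seen timestamp per partition and only advance the overall watermark to the minimum across all partitions, so one slow partition holds the watermark back. Class and method names here are illustrative, not the actual FlinkKafkaConsumer API.

```java
import java.util.Arrays;

// Minimal sketch of per-partition event-time tracking for a multi-partition
// source task, as proposed in FLINK-3375 (assumed names, not Flink code).
class PerPartitionWatermarks {
    private final long[] partitionWatermarks;
    private long lastEmitted = Long.MIN_VALUE;

    PerPartitionWatermarks(int numPartitions) {
        partitionWatermarks = new long[numPartitions];
        Arrays.fill(partitionWatermarks, Long.MIN_VALUE);
    }

    /**
     * Record a new timestamp for one partition. Returns the new global
     * watermark if it advanced, or null if it is still held back.
     */
    Long onTimestamp(int partition, long timestamp) {
        partitionWatermarks[partition] =
                Math.max(partitionWatermarks[partition], timestamp);
        // The guaranteed event-time progress is the minimum over partitions.
        long min = Arrays.stream(partitionWatermarks).min().getAsLong();
        if (min > lastEmitted) {
            lastEmitted = min;
            return min;
        }
        return null;
    }
}
```

With two partitions, no watermark is emitted until both partitions have reported a timestamp, and afterwards the watermark follows the slower partition.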
[jira] [Created] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
Ufuk Celebi created FLINK-3374: -- Summary: CEPITCase testSimplePatternEventTime fails Key: FLINK-3374 URL: https://issues.apache.org/jira/browse/FLINK-3374 Project: Flink Issue Type: Test Components: Tests Reporter: Ufuk Celebi Priority: Minor {code} testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: 1.68 sec <<< ERROR! org.apache.flink.runtime.client.JobExecutionException: Job execution failed. at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.io.FileNotFoundException: /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or directory) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.<init>(FileOutputStream.java:213) at java.io.FileOutputStream.<init>(FileOutputStream.java:162) at org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56) at
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) at org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) at java.lang.Thread.run(Thread.java:745) testSimplePatternEventTime(org.apache.flink.cep.CEPITCase) Time elapsed: 1.68 sec <<< FAILURE! java.lang.AssertionError: Different number of lines in expected and obtained result. 
expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292) at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56) {code} https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz {code} 04:53:46,840 INFO org.apache.flink.runtime.client.JobClientActor - 02/09/2016 04:53:46 Map -> Sink: Unnamed(2/4) switched to FAILED java.io.FileNotFoundException: /tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or directory) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.<init>(FileOutputStream.java:213) at java.io.FileOutputStream.<init>(FileOutputStream.java:162) at org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256) at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263) at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248) at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76) at
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138616#comment-15138616 ] ASF GitHub Bot commented on FLINK-3243: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181777122 You're right, I'm changing it. But it was also me who didn't notice when we put it in initially :sweat_smile: > Fix Interplay of TimeCharacteristic and Time Windows > > > Key: FLINK-3243 > URL: https://issues.apache.org/jira/browse/FLINK-3243 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.0.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek >Priority: Blocker > > As per the discussion on the Dev ML: > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Time-Behavior-in-Streaming-Jobs-Event-time-processing-time-td9616.html. > The discussion seems to have converged on option 2): > - Add dedicated WindowAssigners for processing time and event time > - {{timeWindow()}} and {{timeWindowAll()}} respect the set > {{TimeCharacteristic}}. > This will make the easy stuff easy, i.e. using time windows and quickly > switching the time characteristic. Users will then have the flexibility to > mix different kinds of window assigners in their job.
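Option 2) quoted above boils down to a small dispatch: {{timeWindow()}} inspects the stream's TimeCharacteristic and picks a dedicated window assigner. The sketch below illustrates that dispatch only; the enum and assigner names loosely mirror the DataStream API but are not the actual implementation.

```java
// Illustrative dispatch for FLINK-3243 option 2): processing time gets its
// own assigner, event and ingestion time share the event-time assigner.
// Names are assumptions for illustration, not Flink's real classes.
enum TimeCharacteristic { ProcessingTime, IngestionTime, EventTime }

class TimeWindowDispatch {
    /** Returns the name of the assigner timeWindow(size) would select. */
    static String assignerFor(TimeCharacteristic characteristic) {
        if (characteristic == TimeCharacteristic.ProcessingTime) {
            return "TumblingProcessingTimeWindows";
        }
        return "TumblingEventTimeWindows";
    }
}
```

This keeps the easy case easy: switching the time characteristic changes the window semantics without touching the windowing code, while users can still pass an explicit assigner to mix both kinds in one job.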
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138617#comment-15138617 ] Till Rohrmann commented on FLINK-3374: -- Hmm, I just saw that the CEPITCase creates a file under the parent path instead of a folder. I'm wondering how this could pass before.
[GitHub] flink pull request: [FLINK-3226] Translate logical aggregations to...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1600#issuecomment-181794348 Thanks for the feedback @tillrohrmann, @twalthr! I've moved the classes to `org.apache.flink.api.table.runtime` and tried to shorten the aggregates code using Numerics. I only left `AvgAggregate` as is, because integer average and float/double average are computed differently. We can always replace it with code generation later as @twalthr suggested.
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138705#comment-15138705 ] ASF GitHub Bot commented on FLINK-3226: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1600#issuecomment-181794348 Thanks for the feedback @tillrohrmann, @twalthr! I've moved the classes to `org.apache.flink.api.table.runtime` and tried to shorten the aggregates code using Numerics. I only left `AvgAggregate` as is, because integer average and float/double average are computed differently. We can always replace it with code generation later as @twalthr suggested. > Translate optimized logical Table API plans into physical plans representing > DataSet programs > - > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Fabian Hueske >Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). 
> - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process
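The straight-forward 1-to-1 mapping described in the issue can be sketched as a rule that turns an optimized logical node into a physical "Flink RelNode" already carrying everything the DataSet operator needs (keys, parallelism). All class names below are invented for illustration; the real rules plug into Calcite's rule framework.

```java
import java.util.List;

// Hypothetical stand-ins for an optimized logical join node and its
// physical counterpart; one FlinkJoinRel corresponds to one DataSet join.
class LogicalJoinNode {
    final List<Integer> leftKeys;
    final List<Integer> rightKeys;
    LogicalJoinNode(List<Integer> leftKeys, List<Integer> rightKeys) {
        this.leftKeys = leftKeys;
        this.rightKeys = rightKeys;
    }
}

class FlinkJoinRel {
    final List<Integer> leftKeys;
    final List<Integer> rightKeys;
    final int parallelism;
    FlinkJoinRel(List<Integer> leftKeys, List<Integer> rightKeys, int parallelism) {
        this.leftKeys = leftKeys;
        this.rightKeys = rightKeys;
        this.parallelism = parallelism;
    }
}

class FlinkJoinRule {
    /** Straight-forward 1-to-1 translation; merging rules would come later. */
    static FlinkJoinRel translate(LogicalJoinNode join, int parallelism) {
        // All join keys and execution hints are copied onto the physical node
        // so the DataSet operator can be built from it directly.
        return new FlinkJoinRel(join.leftKeys, join.rightKeys, parallelism);
    }
}
```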
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181803637 I'm all for that. Flink's models should be transferable at least across Flink. But that should be part of a separate PR, and not block this one as it has been for far too long. It should be pretty easy to accomplish.
[jira] [Commented] (FLINK-3371) Move TriggerContext and TriggerResult to their own classes
[ https://issues.apache.org/jira/browse/FLINK-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138777#comment-15138777 ] ASF GitHub Bot commented on FLINK-3371: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1603#issuecomment-181813971 Exactly, the `AlignedTrigger` will have an `AlignedTriggerContext` without Key/Value state. > Move TriggerContext and TriggerResult to their own classes > - > > Key: FLINK-3371 > URL: https://issues.apache.org/jira/browse/FLINK-3371 > Project: Flink > Issue Type: Sub-task > Components: Windowing Operators >Affects Versions: 1.0.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.0.0 > > > As part of adding aligned window operators, we need aligned trigger classes. > To make the {{TriggerResult}} and {{TriggerContext}} accessible to them, they > should move to their own classes, from currently being internal classes of > {{Trigger}}.
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138528#comment-15138528 ] ASF GitHub Bot commented on FLINK-1966: --- Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181757426 Well, that wouldn't be a problem for the export: you will create, and therefore export, only models that have `double` as the datatype for parameters, but that's not an issue. It would be a problem for import, though, because PMML supports a wider set of data types and model types, and you can't really achieve any satisfying degree of support for PMML in a platform like Flink; that's why everyone uses JPMML for evaluation. You will only be able to import compatible models with compatible data fields. This would require a simple validation at runtime of the model type and of the fields' data types. > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it provides an easy way to export models to and > import models from other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user chobeat commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181757426 Well, that wouldn't be a problem for the export: you will create, and therefore export, only models that have `double` as the datatype for parameters, but that's not an issue. It would be a problem for import, though, because PMML supports a wider set of data types and model types, and you can't really achieve any satisfying degree of support for PMML in a platform like Flink; that's why everyone uses JPMML for evaluation. You will only be able to import compatible models with compatible data fields. This would require a simple validation at runtime of the model type and of the fields' data types.
[jira] [Created] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems
Jakob Sultan Ericsson created FLINK-3373: Summary: Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems Key: FLINK-3373 URL: https://issues.apache.org/jira/browse/FLINK-3373 Project: Flink Issue Type: Bug Environment: Latest Flink snapshot 1.0 Reporter: Jakob Sultan Ericsson When I try to use Apache HttpClient 4.5.1 in my Flink job, it crashes with NoClassDefFound. This is because it loads some classes from the provided httpclient 4.2.5/6 in core Flink. {noformat} 17:05:56,193 INFO org.apache.flink.runtime.taskmanager.Task - DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with exception. java.lang.NoSuchFieldError: INSTANCE at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144) at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109) at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116) ... at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89) at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:745) {noformat} The constructor of SSLConnectionSocketFactory finds an earlier version of AllowAllHostnameVerifier that does not have the INSTANCE field (the field was probably added in 4.3). {noformat} jar tvf lib/flink-dist-1.0-SNAPSHOT.jar |grep AllowAllHostnameVerifier 791 Thu Dec 17 09:55:46 CET 2015 org/apache/http/conn/ssl/AllowAllHostnameVerifier.class {noformat} Solutions would be: - Fix the classloader so that my custom job does not conflict with internal flink-core classes... pretty hard - Remove the dependency somehow.
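A quick way to confirm the conflict described above is to ask the JVM where a class was actually loaded from. Running something like the following inside the job, with org.apache.http.conn.ssl.AllowAllHostnameVerifier substituted for the JDK class used here, shows whether the class resolves to the flink-dist jar or to the job's own httpclient dependency.

```java
// Diagnostic sketch: report the code source (jar) a class was loaded from.
class WhichJar {
    static String locationOf(Class<?> clazz) {
        java.security.CodeSource src =
                clazz.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader report no code source.
        return src == null ? "(bootstrap)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Substitute the conflicting HttpClient class when run in the job.
        System.out.println(locationOf(WhichJar.class));
    }
}
```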
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138598#comment-15138598 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1585#issuecomment-181773503 The refactoring looks good, @chiwanpark. I have just a few minor remarks. The PR can be resolved after these have been addressed. > SortPartition does not support KeySelectorFunctions > --- > > Key: FLINK-3234 > URL: https://issues.apache.org/jira/browse/FLINK-3234 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0, 0.10.1 >Reporter: Fabian Hueske >Assignee: Chiwan Park > Fix For: 1.0.0 > > > The following is not supported by the DataSet API: > {code} > DataSet<MyObject> data = ... > data.sortPartition( > new KeySelector<MyObject, Long>() { > public Long getKey(MyObject v) { > ... > } > }, > Order.ASCENDING); > {code}
[jira] [Commented] (FLINK-3374) CEPITCase testSimplePatternEventTime fails
[ https://issues.apache.org/jira/browse/FLINK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138600#comment-15138600 ] Stephan Ewen commented on FLINK-3374: - My guess is that the parent path does not exist. Maybe an issue in the FileOutputFormat.
[jira] [Commented] (FLINK-3234) SortPartition does not support KeySelectorFunctions
[ https://issues.apache.org/jira/browse/FLINK-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138595#comment-15138595 ] ASF GitHub Bot commented on FLINK-3234: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1585#discussion_r52282391 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SortPartitionITCase.scala --- @@ -166,6 +167,58 @@ class SortPartitionITCase(mode: TestExecutionMode) extends MultipleProgramsTestB TestBaseUtils.compareResultAsText(result.asJava, expected) } + @Test + def testSortPartitionWithKeySelector1(): Unit = { +val env = ExecutionEnvironment.getExecutionEnvironment +env.setParallelism(4) +val ds = CollectionDataSets.get3TupleDataSet(env) + +val result = ds + .map { x => x }.setParallelism(4) + .sortPartition(_._2, Order.DESCENDING) --- End diff -- Change sort order to `ASCENDING` (or in the other test).
[GitHub] flink pull request: [FLINK-3234] [dataSet] Add KeySelector support...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1585#issuecomment-181773503 The refactoring looks good, @chiwanpark. I have just a few minor remarks. The PR can be resolved after these have been addressed.
[jira] [Commented] (FLINK-3243) Fix Interplay of TimeCharacteristic and Time Windows
[ https://issues.apache.org/jira/browse/FLINK-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138610#comment-15138610 ] ASF GitHub Bot commented on FLINK-3243: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1513#issuecomment-181776065 For new classes, it makes sense. It was a mistake on my end to name them like this in the first place. But in my experience, users that adopted this draw more satisfaction from stable code than from a style nuance.
[GitHub] flink pull request: [FLINK-3355] [rocksdb backend] Allow passing o...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181800676 Looks good :+1:
[jira] [Commented] (FLINK-3355) Allow passing RocksDB Option to RocksDBStateBackend
[ https://issues.apache.org/jira/browse/FLINK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138734#comment-15138734 ] ASF GitHub Bot commented on FLINK-3355: --- Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1608#issuecomment-181800676 Looks good :+1: > Allow passing RocksDB Option to RocksDBStateBackend > --- > > Key: FLINK-3355 > URL: https://issues.apache.org/jira/browse/FLINK-3355 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Gyula Fora >Assignee: Stephan Ewen >Priority: Critical > > Currently the RocksDB state backend does not allow users to set the > parameters of the created store which might lead to suboptimal performance on > some workloads.
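The improvement amounts to a user-supplied hook that tunes the options of the created store. A minimal sketch of that shape in plain Java — the interface, method names, and option keys here are assumptions for illustration, not Flink's or RocksDB's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of letting users tune the store behind a state backend via a
// factory hook, as the issue requests. All names are hypothetical.
public class OptionsSketch {

    /** User-implementable hook that adjusts the backend's default options. */
    interface OptionsFactory {
        Map<String, String> createOptions(Map<String, String> currentOptions);
    }

    /** Backend stand-in: apply the user factory (if any) over the defaults. */
    static Map<String, String> effectiveOptions(OptionsFactory factory) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("write_buffer_size", "4194304"); // illustrative default
        return factory == null ? defaults : factory.createOptions(defaults);
    }

    public static void main(String[] args) {
        // A workload-specific tweak: enlarge the write buffer.
        Map<String, String> opts = effectiveOptions(current -> {
            current.put("write_buffer_size", "67108864");
            return current;
        });
        System.out.println(opts.get("write_buffer_size"));
    }
}
```

The factory style (rather than passing a ready-made options object) matters in Flink because the configuration has to be serializable and shipped to the task managers, while native option objects generally are not.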
[jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
[ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138751#comment-15138751 ] ASF GitHub Bot commented on FLINK-1966: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181803637 I'm all for that. Flink's models should be transferable at least across Flink. But that should be part of a separate PR, and not block this one as it has been for far too long. It should be pretty easy to accomplish > Add support for predictive model markup language (PMML) > --- > > Key: FLINK-1966 > URL: https://issues.apache.org/jira/browse/FLINK-1966 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Sachin Goel >Priority: Minor > Labels: ML > > The predictive model markup language (PMML) [1] is a widely used language to > describe predictive and descriptive models as well as pre- and > post-processing steps. That way it allows an easy way to export models to, and > import models from, other ML tools. > Resources: > [1] > http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf
[jira] [Commented] (FLINK-3226) Translate optimized logical Table API plans into physical plans representing DataSet programs
[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138764#comment-15138764 ] ASF GitHub Bot commented on FLINK-3226: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1595#discussion_r52291201
--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala ---
@@ -0,0 +1,661 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.codegen
+
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.`type`.SqlTypeName._
+import org.apache.calcite.sql.fun.SqlStdOperatorTable._
+import org.apache.flink.api.common.functions.{FlatMapFunction, Function, MapFunction}
+import org.apache.flink.api.common.typeinfo.{AtomicType, TypeInformation}
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.java.typeutils.{PojoTypeInfo, TupleTypeInfo}
+import org.apache.flink.api.scala.typeutils.CaseClassTypeInfo
+import org.apache.flink.api.table.TableConfig
+import org.apache.flink.api.table.codegen.CodeGenUtils._
+import org.apache.flink.api.table.codegen.Indenter._
+import org.apache.flink.api.table.codegen.OperatorCodeGen._
+import org.apache.flink.api.table.plan.TypeConverter.sqlTypeToTypeInfo
+import org.apache.flink.api.table.typeinfo.RowTypeInfo
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+
+class CodeGenerator(
+    config: TableConfig,
+    input1: TypeInformation[Any],
+    input2: Option[TypeInformation[Any]] = None)
+  extends RexVisitor[GeneratedExpression] {
+
+  // set of member statements that will be added only once
+  // we use a LinkedHashSet to keep the insertion order
+  private val reusableMemberStatements = mutable.LinkedHashSet[String]()
+
+  // set of constructor statements that will be added only once
+  // we use a LinkedHashSet to keep the insertion order
+  private val reusableInitStatements = mutable.LinkedHashSet[String]()
+
+  // map of initial input unboxing expressions that will be added only once
+  // (inputTerm, index) -> expr
+  private val reusableInputUnboxingExprs = mutable.Map[(String, Int), GeneratedExpression]()
+
+  def reuseMemberCode(): String = {
+    reusableMemberStatements.mkString("", "\n", "\n")
+  }
+
+  def reuseInitCode(): String = {
+    reusableInitStatements.mkString("", "\n", "\n")
+  }
+
+  def reuseInputUnboxingCode(): String = {
+    reusableInputUnboxingExprs.values.map(_.code).mkString("", "\n", "\n")
+  }
+
+  def input1Term = "in1"
+
+  def input2Term = "in2"
+
+  def collectorTerm = "c"
+
+  def outRecordTerm = "out"
+
+  def nullCheck: Boolean = config.getNullCheck
+
+  def generateExpression(rex: RexNode): GeneratedExpression = {
+    rex.accept(this)
+  }
+
+  def generateFunction[T <: Function](
+      name: String,
+      clazz: Class[T],
+      bodyCode: String,
+      returnType: TypeInformation[Any])
+    : GeneratedFunction[T] = {
+    val funcName = newName(name)
+
+    // Janino does not support generics, that's why we need
+    // manual casting here
+    val samHeader =
+      if (clazz == classOf[FlatMapFunction[_,_]]) {
+        val inputTypeTerm = boxedTypeTermForTypeInfo(input1)
+        (s"void flatMap(Object _in1, org.apache.flink.util.Collector $collectorTerm)",
+          s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;")
+      } else if (clazz == classOf[MapFunction[_,_]]) {
+        val inputTypeTerm = boxedTypeTermForTypeInfo(input1)
+        ("Object map(Object _in1)",
+          s"$inputTypeTerm $input1Term = ($inputTypeTerm) _in1;")
+      } else {
+        // TODO more functions
+        throw new
[jira] [Commented] (FLINK-3377) Remove final flag from ResultPartitionWriter class
[ https://issues.apache.org/jira/browse/FLINK-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139016#comment-15139016 ] ASF GitHub Bot commented on FLINK-3377: --- GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/1609 [FLINK-3377] Remove final flag from ResultPartitionWriter class You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/flink 3377_partitionwriter_final Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1609.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1609 commit 47670f5812c255a3cb992a0a2f396330ed5c519d Author: zentol Date: 2016-02-09T14:51:11Z [FLINK-3377] Remove final flag from ResultPartitionWriter class > Remove final flag from ResultPartitionWriter class > -- > > Key: FLINK-3377 > URL: https://issues.apache.org/jira/browse/FLINK-3377 > Project: Flink > Issue Type: Wish > Components: Distributed Runtime >Affects Versions: 0.10.1 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.00 > > > The final flag on the > org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is > causing issues for me. > The flag requires me to run a test I'm working on with a > @RunWith(PowerMockRunner.class) annotation so that i can use > @PrepareForTest({ResultPartitionWriter.class}). > But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that > there also was a classloader issue, but i could resolve that) > To me these seem like unnecessary problems, as such i propose removing the > final flag. > The
[jira] [Created] (FLINK-3378) Consolidate TestingCluster and ForkableFlinkMiniCluster
Maximilian Michels created FLINK-3378: - Summary: Consolidate TestingCluster and ForkableFlinkMiniCluster Key: FLINK-3378 URL: https://issues.apache.org/jira/browse/FLINK-3378 Project: Flink Issue Type: Improvement Components: Tests Affects Versions: 1.0.0 Reporter: Maximilian Michels Fix For: 1.0.0 {{TestingCluster}} appears to be outdated and should be replaced by or consolidated with the {{ForkableFlinkMiniCluster}}. Both clusters start the testing actors. Additionally, the ForkableFlinkMiniCluster has support for forking, HA, and restarting actors. As of now it looks like the use of both is arbitrary. The TestingCluster may produce test failures because multiple forked test instances could be trying to bind to the same free port. It looks like the ForkableFlinkMiniCluster should also inherit from FlinkMiniCluster instead of LocalFlinkMiniCluster because it overwrites all inherited implementations of LocalFlinkMiniCluster.
[jira] [Updated] (FLINK-3377) Remove final flag from ResultPartitionWriter class
[ https://issues.apache.org/jira/browse/FLINK-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-3377: Summary: Remove final flag from ResultPartitionWriter class (was: Remove final flag from ResultPatitionWriter class) > Remove final flag from ResultPartitionWriter class > -- > > Key: FLINK-3377 > URL: https://issues.apache.org/jira/browse/FLINK-3377 > Project: Flink > Issue Type: Wish > Components: Distributed Runtime >Affects Versions: 0.10.1 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.00 > > > The final flag on the > org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is > causing issues for me. > The flag requires me to run a test I'm working on with a > @RunWith(PowerMockRunner.class) annotation so that i can use > @PrepareForTest({ResultPartitionWriter.class}). > But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that > there also was a classloader issue, but i could resolve that) > To me these seem like unnecessary problems, as such i propose removing the > final flag. > The
[jira] [Created] (FLINK-3377) Remove final flag from ResultPatitionWriter class
Chesnay Schepler created FLINK-3377: --- Summary: Remove final flag from ResultPatitionWriter class Key: FLINK-3377 URL: https://issues.apache.org/jira/browse/FLINK-3377 Project: Flink Issue Type: Wish Components: Distributed Runtime Affects Versions: 0.10.1 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 1.00 The final flag on the org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter class is causing issues for me. The flag requires me to run a test I'm working on with a @RunWith(PowerMockRunner.class) annotation so that i can use @PrepareForTest({ResultPartitionWriter.class}). But it breaks my TemporaryFolder annotated with @ClassRule. (apart from that there also was a classloader issue, but i could resolve that) To me these seem like unnecessary problems, as such i propose removing the final flag. The
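The friction described above comes from subclass-based mocking: an ordinary mocking library generates a subclass of the target class at runtime, which is impossible when the class is final, forcing the bytecode-manipulating PowerMock runner (and its conflicts with @ClassRule). A self-contained illustration of the underlying constraint — the class names here are made up, not Flink's:

```java
import java.lang.reflect.Modifier;

// Why a final class forces PowerMock: ordinary mocks are dynamically
// generated subclasses, and a final class cannot be subclassed.
// FinalWriter/OpenWriter are made-up stand-ins, not Flink classes.
public class FinalFlagSketch {

    static final class FinalWriter { }  // like the final ResultPartitionWriter
    static class OpenWriter { }         // the same class with the flag removed

    // A subclass-based mocking tool performs exactly this check before
    // attempting to proxy a class.
    static boolean mockableBySubclassing(Class<?> clazz) {
        return !Modifier.isFinal(clazz.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(mockableBySubclassing(FinalWriter.class)); // false
        System.out.println(mockableBySubclassing(OpenWriter.class));  // true
    }
}
```

With the final flag removed, a plain mock runner suffices, so the test can keep its @ClassRule-annotated TemporaryFolder.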
[jira] [Commented] (FLINK-2237) Add hash-based Aggregation
[ https://issues.apache.org/jira/browse/FLINK-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139041#comment-15139041 ] ASF GitHub Bot commented on FLINK-2237: --- Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-181907722 What I'm not sure about is the `closeUserCode` call in `ChainedReduceCombineDriver.closeTask`. Those other chained drivers that have a `running` flag for indicating canceling, make this call only when the driver was not canceled. But those other chained drivers where there is no `running` flag seem to make this call also when they were canceled. What is the reasoning behind this situation? > Add hash-based Aggregation > -- > > Key: FLINK-2237 > URL: https://issues.apache.org/jira/browse/FLINK-2237 > Project: Flink > Issue Type: New Feature >Reporter: Rafiullah Momand >Assignee: Gabor Gevay >Priority: Minor > > Aggregation functions at the moment are implemented in a sort-based way. > How can we implement hash based Aggregation for Flink? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2237] [runtime] Add hash-based combiner...
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-181907722 What I'm not sure about is the `closeUserCode` call in `ChainedReduceCombineDriver.closeTask`. Those other chained drivers that have a `running` flag for indicating canceling, make this call only when the driver was not canceled. But those other chained drivers where there is no `running` flag seem to make this call also when they were canceled. What is the reasoning behind this situation?
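The `running`-flag convention the comment asks about can be sketched in plain Java. This is a simplified stand-in for the pattern, not Flink's actual chained-driver classes:

```java
// Simplified stand-in for the chained-driver pattern: a volatile running
// flag cleared by the cancel path, and closeTask() invoking the user-code
// close hook only when the driver was not cancelled. Not Flink's classes.
public class ChainedDriverSketch {

    private volatile boolean running = true;
    boolean userCodeClosed = false;

    void cancelTask() {
        running = false; // cancellation path: skip the user-code close later
    }

    void closeTask() {
        if (running) {
            closeUserCode();
        }
    }

    private void closeUserCode() {
        // In Flink this would call the user function's close() lifecycle hook.
        userCodeClosed = true;
    }

    public static void main(String[] args) {
        ChainedDriverSketch cancelled = new ChainedDriverSketch();
        cancelled.cancelTask();
        cancelled.closeTask();
        System.out.println(cancelled.userCodeClosed); // false: skipped after cancel

        ChainedDriverSketch completed = new ChainedDriverSketch();
        completed.closeTask();
        System.out.println(completed.userCodeClosed); // true: normal shutdown
    }
}
```

Drivers without the flag call the close hook unconditionally, which is exactly the inconsistency the comment questions: after cancellation, user-code close may run on a half-torn-down task.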
[jira] [Commented] (FLINK-2055) Implement Streaming HBaseSink
[ https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139040#comment-15139040 ] Chesnay Schepler commented on FLINK-2055: - What's the status on this one? > Implement Streaming HBaseSink > - > > Key: FLINK-2055 > URL: https://issues.apache.org/jira/browse/FLINK-2055 > Project: Flink > Issue Type: New Feature > Components: Streaming, Streaming Connectors >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Hilmi Yildirim > > As per : > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html
[GitHub] flink pull request: [FLINK-3107] [runtime] Start checkpoint ID cou...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/1610 [FLINK-3107] [runtime] Start checkpoint ID counter with periodic scheduler Problem: The job manager enables checkpoints during submission of streaming programs. This can lead to a call to `ZooKeeperCheckpointIDCounter.start()`, which communicates with ZooKeeper. This can block the job manager actor. Solution: Start the counter in the `CheckpointCoordinatorDeActivator`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/flink 3107-counter_start Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1610.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1610 commit d70bc79e48dddb658c2240350837000ce9f1f0fe Author: Ufuk Celebi Date: 2016-02-09T15:06:46Z [FLINK-3107] [runtime] Start checkpoint ID counter with periodic scheduler Problem: The job manager enables checkpoints during submission of streaming programs. This can lead to a call to `ZooKeeperCheckpointIDCounter.start()`, which communicates with ZooKeeper. This can block the job manager actor. Solution: Start the counter in the `CheckpointCoordinatorDeActivator`.
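The fix's shape — moving a potentially blocking `start()` call off the actor's message-processing thread and into a component that runs outside it — can be sketched in plain Java with an executor. The interface and method names are illustrative, not Flink's actual classes:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of the fix's shape: the counter that talks to ZooKeeper is no
// longer started inside the actor (blocking it); the start is handed to a
// thread outside the actor. Names are illustrative, not Flink's classes.
public class CounterStartSketch {

    interface CheckpointIDCounter {
        void start() throws Exception; // may block while talking to ZooKeeper
    }

    /** Stand-in for the component that owns the start outside the actor. */
    static void startOutsideActor(CheckpointIDCounter counter,
                                  ExecutorService executor,
                                  CountDownLatch started) {
        executor.submit(() -> {
            counter.start();      // the blocking call no longer stalls the actor
            started.countDown();
            return null;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        // A slow start (stand-in for a ZooKeeper round trip)...
        startOutsideActor(() -> Thread.sleep(50), executor, started);
        // ...yet the "actor" thread reaches this point immediately.
        boolean done = started.await(5, TimeUnit.SECONDS);
        System.out.println(done);
        executor.shutdown();
    }
}
```

The key property is that job submission returns promptly even when ZooKeeper is slow, because the actor never waits on the counter's start.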