[GitHub] flink pull request: [FLINK-1565][FLINK-2078] Document ExecutionCon...

2015-06-05 Thread flinkqa
Github user flinkqa commented on the pull request:

https://github.com/apache/flink/pull/781#issuecomment-109172997
  
Tested pull request. Result:
fatal: No such remote 'totest'
fatal: 'totest' does not appear to be a git repository
fatal: The remote end hung up unexpectedly
error: pathspec 'totest/flink1565' did not match any file(s) known to git.
Running ./tools/qa-check.sh
Computing Flink QA-Check results (please be patient).
:+1: The number of javadoc errors was 174 and is now 174
:-1: The change increases the number of compiler warnings from 634 to 647
```diff
First 100 warnings:
diff: standard output: Broken pipe
1,139c1,149
 [WARNING] bootstrap class path not set in conjunction with -source 1.6
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/base/CollectorMapOperatorBase.java:[24,45]
 org.apache.flink.api.common.functions.GenericCollectorMap in 
org.apache.flink.api.common.functions has been deprecated
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java:[968,38]
 sun.misc.Unsafe is internal proprietary API and may be removed in a future 
release
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/typeinfo/BasicTypeInfo.java:[56,8]
 serializable class org.apache.flink.api.common.typeinfo.BasicTypeInfo has no 
definition of serialVersionUID
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/typeinfo/NumericTypeInfo.java:[27,8]
 serializable class org.apache.flink.api.common.typeinfo.NumericTypeInfo has no 
definition of serialVersionUID
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/typeinfo/IntegerTypeInfo.java:[27,8]
 serializable class org.apache.flink.api.common.typeinfo.IntegerTypeInfo has no 
definition of serialVersionUID
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java:[579,25]
 found raw type: org.apache.flink.api.common.ExecutionConfig.Entry
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java:[613,23]
 serializable class 
org.apache.flink.api.common.ExecutionConfig.GlobalJobParameters has no 
definition of serialVersionUID
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/core/memory/MemoryUtils.java:[33,37]
 sun.misc.Unsafe is internal proprietary API and may be removed in a future 
release
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/core/memory/MemoryUtils.java:[42,32]
 sun.misc.Unsafe is internal proprietary API and may be removed in a future 
release
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/core/memory/MemoryUtils.java:[44,53]
 sun.misc.Unsafe is internal proprietary API and may be removed in a future 
release
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/core/memory/MemoryUtils.java:[46,41]
 sun.misc.Unsafe is internal proprietary API and may be removed in a future 
release
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/Operator.java:[231,81]
 found raw type: org.apache.flink.api.common.operators.Operator
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSourceBase.java:[49,17]
 found raw type: 
org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSourceBase.java:[184,28]
 unchecked conversion
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/Ordering.java:[116,47]
 found raw type: java.lang.Class
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/GenericDataSinkBase.java:[154,97]
 found raw type: org.apache.flink.api.common.operators.Operator
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/AbstractUdfOperator.java:[142,40]
 found raw type: java.lang.Class
 [WARNING] 
/home/robert/qa-bot/flink/tools/_qa_workdir/flink/flink-core/src/main/java/org/apache/flink/api/common/operators/AbstractUdfOperator.java:[154,40]
 found raw type: java.lang.Class
```

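Many of the warnings above are of the "serializable class ... has no definition of serialVersionUID" kind. A minimal, hypothetical sketch of the usual fix (class name invented for illustration; not the actual Flink code):

```java
import java.io.Serializable;

// Declaring an explicit serialVersionUID removes the javac -Xlint:serial
// warning and keeps the serialized form stable across recompilations.
class ExampleTypeInfo implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String typeName;

    ExampleTypeInfo(String typeName) {
        this.typeName = typeName;
    }

    String getTypeName() {
        return typeName;
    }
}
```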
[GitHub] flink pull request: [FLINK-2155] Enforce import restriction on usa...

2015-06-05 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/790#issuecomment-109192811
  
Thank you for the contribution.

+1 to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574082#comment-14574082
 ] 

ASF GitHub Bot commented on FLINK-2155:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/790#issuecomment-109192811
  
Thank you for the contribution.

+1 to merge.


 Add an additional checkstyle validation for illegal imports
 ---

 Key: FLINK-2155
 URL: https://issues.apache.org/jira/browse/FLINK-2155
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Lokesh Rajaram
Assignee: Lokesh Rajaram

 Add an additional check-style validation for illegal imports.
 To begin with, the following two package imports are marked as illegal:
  1. org.apache.commons.lang3.Validate
  2. org.apache.flink.shaded.*
 Implementation based on: 
 http://checkstyle.sourceforge.net/config_imports.html#IllegalImport
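Based on the linked Checkstyle documentation, the check could be configured roughly as follows (a sketch only; the exact property values in the pull request may differ):

```xml
<!-- Inside the Checkstyle configuration's TreeWalker module. -->
<module name="IllegalImport">
  <!-- Flags imports of org.apache.commons.lang3.Validate and anything
       under org.apache.flink.shaded. -->
  <property name="illegalPkgs"
            value="org.apache.commons.lang3.Validate, org.apache.flink.shaded"/>
</module>
```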



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request:

2015-06-05 Thread thvasilo
Github user thvasilo commented on the pull request:


https://github.com/apache/flink/commit/27487ec6089adbea77266f194582ae476e50e928#commitcomment-11537565
  
In docs/libs/ml/quickstart.md:
In docs/libs/ml/quickstart.md on line 82:
TODO: Need to transform the tuples to LabeledVector




[GitHub] flink pull request: [FLINK-2139] [streaming] Streaming outputforma...

2015-06-05 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109218281
  
Both are good points. 

As for the socket test I can do one of the following:
   * Do not test it
   * Test it with smaller data
   * Mock the socket with some collection

I am not really happy with any of these options, to be honest, but maybe 
I am just missing something.
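The "mock the socket with some collection" option could look roughly like this (a hypothetical sketch; the interface and class names are invented, not the code under review): the format under test writes through a tiny interface, so a test can substitute an in-memory writer for the socket-backed one and assert on the collected records.

```java
import java.util.ArrayList;
import java.util.List;

class CollectingSocketMock {

    // Abstraction the output format would write through.
    interface RecordWriter {
        void write(String record);
    }

    // Test double: collects records in memory instead of using a socket,
    // so the test cannot lose messages to network flakiness.
    static class CollectingWriter implements RecordWriter {
        final List<String> records = new ArrayList<>();

        @Override
        public void write(String record) {
            records.add(record);
        }
    }
}
```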

As for the ITCases, you are right, I have to admit that I was looking at 
the 
[AvroOutputFormatTest](https://github.com/apache/flink/blob/master/flink-staging/flink-avro/src/test/java/org/apache/flink/api/avro/AvroOutputFormatTest.java)
 and named my tests accordingly. So you would suggest renaming that too then?




[jira] [Commented] (FLINK-2000) Add SQL-style aggregations for Table API

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574161#comment-14574161
 ] 

ASF GitHub Bot commented on FLINK-2000:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/782#issuecomment-109211373
  
This looks good. :+1: Any objections to me merging this later?


 Add SQL-style aggregations for Table API
 

 Key: FLINK-2000
 URL: https://issues.apache.org/jira/browse/FLINK-2000
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Reporter: Aljoscha Krettek
Assignee: Cheng Hao
Priority: Minor

 Right now, the syntax for aggregations is a.count, a.min and so on. We 
 could in addition offer COUNT(a), MIN(a) and so on.
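The two styles under discussion would contrast roughly as follows (hypothetical expression strings, shown only to illustrate the proposal, not actual Table API code):

```
// Current suffix-style aggregations:
table.groupBy("word").select("word, frequency.sum, frequency.count")

// Proposed SQL-style additions, per this issue:
table.groupBy("word").select("word, SUM(frequency), COUNT(frequency)")
```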





[jira] [Commented] (FLINK-2153) Exclude dependency on hbase annotations module

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574189#comment-14574189
 ] 

Stephan Ewen commented on FLINK-2153:
-

Any updates on that? We would like to have this in the 0.9 release, as we would 
like to get the first release candidate out any time now.

 Exclude dependency on hbase annotations module
 --

 Key: FLINK-2153
 URL: https://issues.apache.org/jira/browse/FLINK-2153
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Lokesh Rajaram

 [ERROR] Failed to execute goal on project flink-hbase: Could not resolve
 dependencies for project org.apache.flink:flink-hbase:jar:0.9-SNAPSHOT:
 Could not find artifact jdk.tools:jdk.tools:jar:1.7 at specified path
 /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/../lib/tools.jar
 There is a Spark issue for this [1] with a solution [2].
 [1] https://issues.apache.org/jira/browse/SPARK-4455
 [2] https://github.com/apache/spark/pull/3286/files
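A sketch of the kind of exclusion the issue title suggests, modeled on the linked Spark fix (the artifact names and the surrounding pom structure are assumptions; the actual flink-hbase pom may differ):

```xml
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>${hbase.version}</version>
  <exclusions>
    <!-- hbase-annotations pulls in jdk.tools:jdk.tools, which cannot be
         resolved on some JDK layouts (e.g. the Apple JDK path above). -->
    <exclusion>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-annotations</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```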





[jira] [Commented] (FLINK-2139) Test Streaming Outputformats

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574196#comment-14574196
 ] 

ASF GitHub Bot commented on FLINK-2139:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109216797
  
I think we need to name the tests ITCase. In my understanding the tests 
that bring up a test cluster and execute a program are ITCases since they take 
longer to run than simple tests. I might be wrong, though. Anyone else have an 
opinion?


 Test Streaming Outputformats
 

 Key: FLINK-2139
 URL: https://issues.apache.org/jira/browse/FLINK-2139
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Márton Balassi
 Fix For: 0.9


 Currently the only tested streaming core output is writeAsText, and that 
 is only tested indirectly in integration tests. 





[jira] [Commented] (FLINK-2002) Iterative test fails when ran with other tests in the same environment

2015-06-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574199#comment-14574199
 ] 

Márton Balassi commented on FLINK-2002:
---

[~gyfora] has already fixed this, I am adding a commit re-enabling the test 
that revealed the issue.

 Iterative test fails when ran with other tests in the same environment
 --

 Key: FLINK-2002
 URL: https://issues.apache.org/jira/browse/FLINK-2002
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Péter Szabó
Assignee: Márton Balassi

 I run tests in the same StreamExecutionEnvironment with 
 MultipleProgramsTestBase. One of the tests uses an iterative data stream. It 
 fails, as do all tests after it. (When I put the iterative test in a 
 separate environment, all tests pass.) It seems to me that this is a 
 state-related issue, but there is also some problem with the broker slots.
 The iterative test throws:
 java.lang.Exception: TaskManager sent illegal state update: CANCELING
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:618)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:222)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:221)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:221)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
   at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:314)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:95)
   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: 
 org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
 Not enough free slots available to run the job. You can decrease the operator 
 parallelism or increase the number of slots per TaskManager in the 
 configuration. Task to schedule:  Attempt #0 (GroupedActiveDiscretizer 
 (2/4)) @ (unassigned) - [SCHEDULED]  with groupID  
 e8f7c9c85e64403962648bc7e2aead8b  in sharing group  SlotSharingGroup 
 [5e62f1cc5cae2c088430ef935470a8d5, 5bc227941969d1daa1ebb1ba070b55ce, 
 d999ee6c10730775a8fef1c6f1af1dbd, 45b73caa75424d84adbb7bb92671591d, 
 5c94c54d9316b827c6eba6c721329549, 794d6c56bee347dcdd62ffdf189de267, 
 4c3b72e17a4acecde4241fe6e63355b8, f6a6028c224a7b81e4802eeaf9c8487e, 
 

[jira] [Commented] (FLINK-2139) Test Streaming Outputformats

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574207#comment-14574207
 ] 

ASF GitHub Bot commented on FLINK-2139:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109218281
  
Both are good points. 

As for the socket test I can do one of the following:
   * Do not test it
   * Test it with smaller data
   * Mock the socket with some collection

I am not really happy with any of these options, to be honest, but maybe 
I am just missing something.

As for the ITCases, you are right, I have to admit that I was looking at 
the 
[AvroOutputFormatTest](https://github.com/apache/flink/blob/master/flink-staging/flink-avro/src/test/java/org/apache/flink/api/avro/AvroOutputFormatTest.java)
 and named my tests accordingly. So you would suggest renaming that too then?







[GitHub] flink pull request: [FLINK-2123] Fix log4j warnings on CliFrontend...

2015-06-05 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/783#issuecomment-109199667
  
Merging ...




[GitHub] flink pull request: [FLINK-2000] [table] Add sql style aggregation...

2015-06-05 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/782#issuecomment-109214447
  
LGTM




[jira] [Commented] (FLINK-2000) Add SQL-style aggregations for Table API

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574178#comment-14574178
 ] 

ASF GitHub Bot commented on FLINK-2000:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/782#issuecomment-109214447
  
LGTM







[GitHub] flink pull request: Added member currentSplit for FileInputFormat....

2015-06-05 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/791#issuecomment-109205245
  
Thank you for the contribution.

+1 to merge.




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574191#comment-14574191
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/729#discussion_r31798470
  
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/functions/SemanticPropUtil.java ---
@@ -309,15 +308,20 @@ public static DualInputSemanticProperties getSemanticPropsDual(
 		getSemanticPropsDualFromString(result, forwardedFirst, forwardedSecond,
 			nonForwardedFirst, nonForwardedSecond, readFirst, readSecond, inType1, inType2, outType);
 		return result;
-	} else {
-		return new DualInputSemanticProperties();
 	}
+	return null;
+	}
+
+	public static void getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
+		String[] forwarded, String[] nonForwarded, String[] readSet,
+		TypeInformation<?> inType, TypeInformation<?> outType) {
+		getSemanticPropsSingleFromString(result, forwarded, nonForwarded, readSet, inType, outType, false);
+	}

 	public static void getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
 		String[] forwarded, String[] nonForwarded, String[] readSet,
-		TypeInformation<?> inType, TypeInformation<?> outType)
-	{
+		TypeInformation<?> inType, TypeInformation<?> outType,
+		boolean skipIncompatibleTypes) {
--- End diff --

Sometimes the analyzer works better than required. E.g. the analyzer 
outputs @ForwardedFields("*->record.customer.name"), but if customer is a 
GenericType output type, the types are incompatible. I thought it better to 
reuse the type compatibility checking of the PropUtil than to reimplement 
everything, and to skip incompatible types without throwing exceptions.
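The "skip instead of throw" behavior described above can be illustrated with a small, hypothetical sketch (names invented; not the actual SemanticPropUtil code): with the flag set, incompatible entries are silently dropped instead of raising an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SkipIncompatibleExample {

    // Keeps fields that pass the compatibility check. With
    // skipIncompatibleTypes set, incompatible fields are ignored;
    // otherwise they cause an exception, as before.
    static List<String> filterFields(List<String> fields,
                                     Predicate<String> compatible,
                                     boolean skipIncompatibleTypes) {
        List<String> accepted = new ArrayList<>();
        for (String field : fields) {
            if (compatible.test(field)) {
                accepted.add(field);
            } else if (!skipIncompatibleTypes) {
                throw new IllegalArgumentException("Incompatible field: " + field);
            }
            // else: skip the incompatible field without throwing
        }
        return accepted;
    }
}
```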


 Add static code analysis for UDFs
 -

 Key: FLINK-1319
 URL: https://issues.apache.org/jira/browse/FLINK-1319
 Project: Flink
  Issue Type: New Feature
  Components: Java API, Scala API
Reporter: Stephan Ewen
Assignee: Timo Walther
Priority: Minor

 Flink's Optimizer takes information that tells it, for UDFs, which fields of 
 the input elements are accessed, modified, or forwarded/copied. This 
 information frequently helps to reuse partitionings, sorts, etc. It may speed 
 up programs significantly, as it can frequently eliminate sorts and shuffles, 
 which are costly.
 Right now, users can add lightweight annotations to UDFs to provide this 
 information (such as adding {{@ConstantFields(0->3, 1, 2->1)}}).
 We worked with static code analysis of UDFs before, to determine this 
 information automatically. This is an incredible feature, as it magically 
 makes programs faster.
 For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
 works surprisingly well in many cases. We used the Soot toolkit for the 
 static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
 not include any of the code so far.
 I propose to add this functionality to Flink, in the form of a drop-in 
 addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
 simply download a special flink-code-analysis.jar and drop it into the 
 lib folder to enable this functionality. We may even add a script to 
 tools that downloads that library automatically into the lib folder. This 
 should be legally fine, since we do not redistribute LGPL code and only 
 dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
 patentability, if I remember correctly).
 Prior work on this has been done by [~aljoscha] and [~skunert], which could 
 provide a code base to start with.
 *Appendix*
 Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
 Papers on static analysis and for optimization: 
 http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
 http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
 Quick introduction to the Optimizer: 
 http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
 (Section 6)
 Optimizer for Iterations: 
 http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
 (Sections 4.3 and 5.3)





[jira] [Commented] (FLINK-2139) Test Streaming Outputformats

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574097#comment-14574097
 ] 

ASF GitHub Bot commented on FLINK-2139:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109196655
  
The socket test failed once in ten runs on Travis; some messages were lost 
while writing to and reading from the socket. I do not think that we can avoid 
that completely. I can feed in less input data or remove the test.







[GitHub] flink pull request: [FLINK-2139] [streaming] Streaming outputforma...

2015-06-05 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109196655
  
The socket test failed once in ten runs on Travis; some messages were lost 
while writing to and reading from the socket. I do not think that we can avoid 
that completely. I can feed in less input data or remove the test.




[jira] [Commented] (FLINK-2002) Iterative test fails when ran with other tests in the same environment

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574153#comment-14574153
 ] 

Stephan Ewen commented on FLINK-2002:
-

This seems to be related to improper cleanup after a job.

 

[GitHub] flink pull request: [FLINK-2072] [ml] [docs] Add a quickstart guid...

2015-06-05 Thread thvasilo
GitHub user thvasilo opened a pull request:

https://github.com/apache/flink/pull/792

[FLINK-2072] [ml]  [docs] Add a quickstart guide for FlinkML

This is an initial version of the quickstart guide. There are some issues 
that still need to be addressed, such as the validity of standardizing the data, 
and whether the complete code example should be included in an examples package 
for FlinkML.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thvasilo/flink quickstart-ml

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #792


commit 27487ec6089adbea77266f194582ae476e50e928
Author: Theodore Vasiloudis t...@sics.se
Date:   2015-06-05T09:09:11Z

Initial version of quickstart guide




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2163) VertexCentricConfigurationITCase sometimes fails on Travis

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574184#comment-14574184
 ] 

Stephan Ewen commented on FLINK-2163:
-

It is a strong suspicion. Other file-based tests have failed with a similar 
error before. Collect()-based tests do not seem to exhibit that issue.
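For context, the in-memory comparison these tests rely on works roughly like {{TestBaseUtils.compareResultsByLinesInMemory}}: both sides are split into lines, sorted, and compared pairwise, so a partially written output file surfaces as a line-count mismatch. A rough stand-in sketch (plain Python, not the actual Flink test utility):

```python
def compare_results_by_lines(expected: str, actual: str) -> None:
    # Split into lines, sort, and compare pairwise -- order-insensitive,
    # which is how unordered results are checked by the test harness.
    exp = sorted(expected.strip().splitlines())
    act = sorted(actual.strip().splitlines())
    if len(exp) != len(act):
        raise AssertionError(
            "Different number of lines in expected and obtained result. "
            "expected:<%d> but was:<%d>" % (len(exp), len(act)))
    for e, a in zip(exp, act):
        if e != a:
            raise AssertionError("expected:<%s> but was:<%s>" % (e, a))

# A partially written result file trips the length check first:
try:
    compare_results_by_lines("1\n2\n3\n4\n5", "1\n2")
except AssertionError as err:
    print(err)
    # -> Different number of lines in expected and obtained result.
    #    expected:<5> but was:<2>
```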

 VertexCentricConfigurationITCase sometimes fails on Travis
 --

 Key: FLINK-2163
 URL: https://issues.apache.org/jira/browse/FLINK-2163
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Reporter: Aljoscha Krettek

 This is the relevant output from the log:
 {code}
 testIterationINDirection[Execution mode = 
 CLUSTER](org.apache.flink.graph.test.VertexCentricConfigurationITCase)  Time 
 elapsed: 0.587 sec  <<< FAILURE!
 java.lang.AssertionError: Different number of lines in expected and obtained 
 result. expected:<5> but was:<2>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at 
 org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:270)
   at 
 org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:256)
   at 
 org.apache.flink.graph.test.VertexCentricConfigurationITCase.after(VertexCentricConfigurationITCase.java:70)
 Results :
 Failed tests: 
   
 VertexCentricConfigurationITCase.after:70-TestBaseUtils.compareResultsByLinesInMemory:256-TestBaseUtils.compareResultsByLinesInMemory:270
  Different number of lines in expected and obtained result. expected:<5> but 
 was:<2>
 {code}
 https://travis-ci.org/aljoscha/flink/jobs/65403502



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request:

2015-06-05 Thread thvasilo
Github user thvasilo commented on the pull request:


https://github.com/apache/flink/commit/27487ec6089adbea77266f194582ae476e50e928#commitcomment-11537784
  
In docs/libs/ml/quickstart.md:
In docs/libs/ml/quickstart.md on line 69:
This way of reading in CSVs can get unwieldy fast. We need a more concise 
way to do this.
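For illustration, the verbose pattern being criticized boils down to mapping each CSV row to a label plus a feature vector. A plain-Python stand-in (hypothetical helper, not the FlinkML `readCsvFile` API) looks like:

```python
import csv
import io

def read_labeled_vectors(csv_text, label_col=0):
    """Parse CSV rows into (label, [features]) pairs.

    A plain-Python stand-in for the verbose CSV -> LabeledVector
    mapping discussed in the quickstart; not the FlinkML API.
    """
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        values = [float(v) for v in row]
        label = values.pop(label_col)
        rows.append((label, values))
    return rows

data = read_labeled_vectors("1,0.5,0.2\n-1,0.1,0.9")
print(data)  # [(1.0, [0.5, 0.2]), (-1.0, [0.1, 0.9])]
```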




[GitHub] flink pull request: [FLINK-2160] Change Streaming Source Interface...

2015-06-05 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/785#issuecomment-109173098
  
@mbalassi What's the reason for not wanting emit and keeping collect? I am 
aware that other parts of the system have a Collector with a collect method but 
the sources are somewhat special, I think. Also, collect() is a legacy from 
Hadoop MapReduce and I'm not sure it's a good name in the first place. :smile: 




[GitHub] flink pull request: [FLINK-2000] [table] Add sql style aggregation...

2015-06-05 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/782#issuecomment-109211373
  
This looks good. :+1: Any objections to me merging this later?




[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-06-05 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/729#discussion_r31798470
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/functions/SemanticPropUtil.java
 ---
@@ -309,15 +308,20 @@ public static DualInputSemanticProperties 
getSemanticPropsDual(
getSemanticPropsDualFromString(result, forwardedFirst, 
forwardedSecond,
nonForwardedFirst, nonForwardedSecond, 
readFirst, readSecond, inType1, inType2, outType);
return result;
-   } else {
-   return new DualInputSemanticProperties();
}
+   return null;
+   }
+
+   public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
+   
String[] forwarded, String[] nonForwarded, 
String[] readSet,
+   
TypeInformation<?> inType, TypeInformation<?> 
outType) {
+   getSemanticPropsSingleFromString(result, forwarded, 
nonForwarded, readSet, inType, outType, false);
}
 
public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
String[] forwarded, String[] nonForwarded, String[] 
readSet,
-   TypeInformation<?> inType, TypeInformation<?> outType)
-   {
+   TypeInformation<?> inType, TypeInformation<?> outType,
+   boolean skipIncompatibleTypes) {
--- End diff --

Sometimes the analyzer works better than required. E.g. the analyzer 
outputs @ForwardedFields("*->record.customer.name"), but if customer is a 
GenericType output type, the types are incompatible. I thought it better to 
reuse the type compatibility checking of the PropUtil than to reimplement 
everything, and to skip incompatible types without throwing exceptions.




[jira] [Created] (FLINK-2164) Document batch and streaming startup modes

2015-06-05 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2164:
-

 Summary: Document batch and streaming startup modes
 Key: FLINK-2164
 URL: https://issues.apache.org/jira/browse/FLINK-2164
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger








[jira] [Commented] (FLINK-1892) Local job execution does not exit.

2015-06-05 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574222#comment-14574222
 ] 

Robert Metzger commented on FLINK-1892:
---

[~ktzoumas], I'll take over the issue, ok?

 Local job execution does not exit.
 --

 Key: FLINK-1892
 URL: https://issues.apache.org/jira/browse/FLINK-1892
 Project: Flink
  Issue Type: Bug
  Components: Flink on Tez
Reporter: Kostas Tzoumas
Assignee: Kostas Tzoumas

 When using the LocalTezEnvironment to run a job from the IDE the job fails to 
 exit after producing data. The following thread seems to run and not allow 
 the job to exit:
 Thread-31 #46 prio=5 os_prio=31 tid=0x7fb5d2c43000 nid=0x5507 runnable 
 [0x000127319000]
java.lang.Thread.State: RUNNABLE
   at java.lang.Throwable.fillInStackTrace(Native Method)
   at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
   - locked 0x00076dfda130 (a java.lang.InterruptedException)
   at java.lang.Throwable.init(Throwable.java:250)
   at java.lang.Exception.init(Exception.java:54)
   at java.lang.InterruptedException.init(InterruptedException.java:57)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
   at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
   at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:545)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:322)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:316)
   at java.lang.Thread.run(Thread.java:745)





[jira] [Assigned] (FLINK-1892) Local job execution does not exit.

2015-06-05 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1892:
-

Assignee: Robert Metzger  (was: Kostas Tzoumas)

 Local job execution does not exit.
 --

 Key: FLINK-1892
 URL: https://issues.apache.org/jira/browse/FLINK-1892
 Project: Flink
  Issue Type: Bug
  Components: Flink on Tez
Reporter: Kostas Tzoumas
Assignee: Robert Metzger

 When using the LocalTezEnvironment to run a job from the IDE the job fails to 
 exit after producing data. The following thread seems to run and not allow 
 the job to exit:
 Thread-31 #46 prio=5 os_prio=31 tid=0x7fb5d2c43000 nid=0x5507 runnable 
 [0x000127319000]
java.lang.Thread.State: RUNNABLE
   at java.lang.Throwable.fillInStackTrace(Native Method)
   at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
   - locked 0x00076dfda130 (a java.lang.InterruptedException)
   at java.lang.Throwable.init(Throwable.java:250)
   at java.lang.Exception.init(Exception.java:54)
   at java.lang.InterruptedException.init(InterruptedException.java:57)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
   at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
   at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:545)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:322)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:316)
   at java.lang.Thread.run(Thread.java:745)





[jira] [Created] (FLINK-2165) Rename Table conversion methods in TableEnvironment

2015-06-05 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2165:


 Summary: Rename Table conversion methods in TableEnvironment
 Key: FLINK-2165
 URL: https://issues.apache.org/jira/browse/FLINK-2165
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
 Fix For: 0.9


The {{TableEnvironment}} provides methods to convert DataSets and DataStreams 
into Tables and back. These methods are called {{toTable()}}, {{toSet()}}, and 
{{toStream()}}.

I propose to rename the methods to {{fromDataSet()}}, {{fromDataStream()}}, 
{{toDataSet()}}, and {{toDataStream()}} for the following reasons:

- {{fromDataSet()}} and {{fromDataStream()}} are closer to the SQL FROM expression
- It allows adding methods such as {{fromCSV()}}, {{fromHCat()}}, 
{{fromParquet()}}, and so on to the {{TableEnvironment}}
- {{toSet()}} and {{toStream()}} should be renamed for consistency.
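A toy stand-in (plain Python, hypothetical, not the actual Table API) illustrating the from/to symmetry the renaming aims for:

```python
class Table:
    """Minimal placeholder for a Table: just holds rows."""
    def __init__(self, rows):
        self.rows = rows

class TableEnvironment:
    # Proposed symmetric naming: from* enters the Table world
    # (mirroring SQL's FROM clause), to* leaves it.
    def from_dataset(self, dataset):
        return Table(list(dataset))

    def to_dataset(self, table):
        return list(table.rows)

env = TableEnvironment()
t = env.from_dataset([(1, "a"), (2, "b")])
print(env.to_dataset(t))  # [(1, 'a'), (2, 'b')]
```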





[jira] [Commented] (FLINK-1940) StockPrice example cannot be visualized

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574157#comment-14574157
 ] 

Stephan Ewen commented on FLINK-1940:
-

I think Marton removed that example, so it is not relevant to any packaged 
example any more.

 StockPrice example cannot be visualized
 ---

 Key: FLINK-1940
 URL: https://issues.apache.org/jira/browse/FLINK-1940
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Gyula Fora

 The planvisualizer fails on the JSON generated by the StockPrice example





[jira] [Commented] (FLINK-1892) Local job execution does not exit.

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574167#comment-14574167
 ] 

Stephan Ewen commented on FLINK-1892:
-

Let's bump the Tez version to 0.6.1 to get the fix in.

 Local job execution does not exit.
 --

 Key: FLINK-1892
 URL: https://issues.apache.org/jira/browse/FLINK-1892
 Project: Flink
  Issue Type: Bug
  Components: Flink on Tez
Reporter: Kostas Tzoumas
Assignee: Kostas Tzoumas

 When using the LocalTezEnvironment to run a job from the IDE the job fails to 
 exit after producing data. The following thread seems to run and not allow 
 the job to exit:
 Thread-31 #46 prio=5 os_prio=31 tid=0x7fb5d2c43000 nid=0x5507 runnable 
 [0x000127319000]
java.lang.Thread.State: RUNNABLE
   at java.lang.Throwable.fillInStackTrace(Native Method)
   at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
   - locked 0x00076dfda130 (a java.lang.InterruptedException)
   at java.lang.Throwable.init(Throwable.java:250)
   at java.lang.Exception.init(Exception.java:54)
   at java.lang.InterruptedException.init(InterruptedException.java:57)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
   at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
   at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:545)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:322)
   at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:316)
   at java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (FLINK-1844) Add Normaliser to ML library

2015-06-05 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574166#comment-14574166
 ] 

Theodore Vasiloudis commented on FLINK-1844:


No worries [~fobeligi], thank you for your contribution. Keep us updated.

 Add Normaliser to ML library
 

 Key: FLINK-1844
 URL: https://issues.apache.org/jira/browse/FLINK-1844
 Project: Flink
  Issue Type: Improvement
  Components: Machine Learning Library
Reporter: Faye Beligianni
Assignee: Faye Beligianni
Priority: Minor
  Labels: ML, Starter

 In many algorithms in ML, the features' values would be better to lie between 
 a given range of values, usually in the range (0,1) [1]. Therefore, a 
 {{Transformer}} could be implemented to achieve that normalisation.
 Resources: 
 [1][http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html]
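A minimal sketch of such a min-max normaliser (plain Python, not the FlinkML {{Transformer}} interface):

```python
def min_max_scale(values, lo=0.0, hi=1.0):
    """Rescale a feature column linearly into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Degenerate column (constant feature): map everything to lo.
        return [lo for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]

print(min_max_scale([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```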





[jira] [Assigned] (FLINK-2002) Iterative test fails when ran with other tests in the same environment

2015-06-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi reassigned FLINK-2002:
-

Assignee: Márton Balassi

 Iterative test fails when ran with other tests in the same environment
 --

 Key: FLINK-2002
 URL: https://issues.apache.org/jira/browse/FLINK-2002
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Péter Szabó
Assignee: Márton Balassi

 I run tests in the same StreamExecutionEnvironment with 
 MultipleProgramsTestBase. One of the tests uses an iterative data stream. It 
 fails as well as all tests after that. (When I put the iterative test in a 
 separate environment, all tests pass.) For me it seems that it is a 
 state-related issue but there is also some problem with the broker slots.
 The iterative test throws:
 java.lang.Exception: TaskManager sent illegal state update: CANCELING
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:618)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:222)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:221)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:221)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
   at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:314)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:95)
   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: 
 org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
 Not enough free slots available to run the job. You can decrease the operator 
 parallelism or increase the number of slots per TaskManager in the 
 configuration. Task to schedule:  Attempt #0 (GroupedActiveDiscretizer 
 (2/4)) @ (unassigned) - [SCHEDULED]  with groupID  
 e8f7c9c85e64403962648bc7e2aead8b  in sharing group  SlotSharingGroup 
 [5e62f1cc5cae2c088430ef935470a8d5, 5bc227941969d1daa1ebb1ba070b55ce, 
 d999ee6c10730775a8fef1c6f1af1dbd, 45b73caa75424d84adbb7bb92671591d, 
 5c94c54d9316b827c6eba6c721329549, 794d6c56bee347dcdd62ffdf189de267, 
 4c3b72e17a4acecde4241fe6e63355b8, f6a6028c224a7b81e4802eeaf9c8487e, 
 989c68790fc7c5e2f8b8c150a33fef89, db93daa1f9e5194f0079df2629b08efb, 
 bf7dbb1fd756ce322249eb973844b375, 

[GitHub] flink pull request: [FLINK-2139] [streaming] Streaming outputforma...

2015-06-05 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/789#issuecomment-109216797
  
I think we need to name the tests ITCase. In my understanding the tests 
that bring up a test cluster and execute a program are ITCases since they take 
longer to run than simple tests. I might be wrong, though. Anyone else have an 
opinion?




[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-06-05 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/729#discussion_r31799272
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/functions/SemanticPropUtil.java
 ---
@@ -309,15 +308,20 @@ public static DualInputSemanticProperties 
getSemanticPropsDual(
getSemanticPropsDualFromString(result, forwardedFirst, 
forwardedSecond,
nonForwardedFirst, nonForwardedSecond, 
readFirst, readSecond, inType1, inType2, outType);
return result;
-   } else {
-   return new DualInputSemanticProperties();
}
+   return null;
+   }
+
+   public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
+   
String[] forwarded, String[] nonForwarded, 
String[] readSet,
+   
TypeInformation<?> inType, TypeInformation<?> 
outType) {
+   getSemanticPropsSingleFromString(result, forwarded, 
nonForwarded, readSet, inType, outType, false);
}
 
public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
String[] forwarded, String[] nonForwarded, String[] 
readSet,
-   TypeInformation<?> inType, TypeInformation<?> outType)
-   {
+   TypeInformation<?> inType, TypeInformation<?> outType,
+   boolean skipIncompatibleTypes) {
--- End diff --

OK, got it. Thanks for explaining. :-) 




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574204#comment-14574204
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/729#discussion_r31799272
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/functions/SemanticPropUtil.java
 ---
@@ -309,15 +308,20 @@ public static DualInputSemanticProperties 
getSemanticPropsDual(
getSemanticPropsDualFromString(result, forwardedFirst, 
forwardedSecond,
nonForwardedFirst, nonForwardedSecond, 
readFirst, readSecond, inType1, inType2, outType);
return result;
-   } else {
-   return new DualInputSemanticProperties();
}
+   return null;
+   }
+
+   public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
+   
String[] forwarded, String[] nonForwarded, 
String[] readSet,
+   
TypeInformation<?> inType, TypeInformation<?> 
outType) {
+   getSemanticPropsSingleFromString(result, forwarded, 
nonForwarded, readSet, inType, outType, false);
}
 
public static void 
getSemanticPropsSingleFromString(SingleInputSemanticProperties result,
String[] forwarded, String[] nonForwarded, String[] 
readSet,
-   TypeInformation<?> inType, TypeInformation<?> outType)
-   {
+   TypeInformation<?> inType, TypeInformation<?> outType,
+   boolean skipIncompatibleTypes) {
--- End diff --

OK, got it. Thanks for explaining. :-) 


 Add static code analysis for UDFs
 -

 Key: FLINK-1319
 URL: https://issues.apache.org/jira/browse/FLINK-1319
 Project: Flink
  Issue Type: New Feature
  Components: Java API, Scala API
Reporter: Stephan Ewen
Assignee: Timo Walther
Priority: Minor

 Flink's Optimizer takes information that tells it for UDFs which fields of 
 the input elements are accessed, modified, or forwarded/copied. This 
 information frequently helps to reuse partitionings, sorts, etc. It may speed 
 up programs significantly, as it can frequently eliminate sorts and shuffles, 
 which are costly.
 Right now, users can add lightweight annotations to UDFs to provide this 
 information (such as adding {{@ConstantFields(0->3, 1, 2->1)}}).
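The meaning of such an annotation can be checked empirically: field {{i}} is forwarded to output position {{j}} if the UDF always copies it unchanged. A toy checker (plain Python, illustrating the annotation semantics, not the Soot-based analyzer):

```python
def check_forwarded(udf, samples, mapping):
    """Verify that udf copies input field i unchanged to output field j
    for every (i, j) pair in mapping, over the given sample tuples."""
    for record in samples:
        out = udf(record)
        for i, j in mapping:
            if out[j] != record[i]:
                return False
    return True

# This UDF forwards field 0 to position 2 and keeps field 1 in place,
# i.e. it satisfies the annotation "0->2; 1".
udf = lambda t: (t[0] * 2, t[1], t[0])
print(check_forwarded(udf, [(1, "a"), (5, "b")], [(0, 2), (1, 1)]))  # True
```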
 We worked with static code analysis of UDFs before, to determine this 
 information automatically. This is an incredible feature, as it magically 
 makes programs faster.
 For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
 works surprisingly well in many cases. We used the Soot toolkit for the 
 static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
 not include any of the code so far.
 I propose to add this functionality to Flink, in the form of a drop-in 
 addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
 simply download a special flink-code-analysis.jar and drop it into the 
 lib folder to enable this functionality. We may even add a script to 
 tools that downloads that library automatically into the lib folder. This 
 should be legally fine, since we do not redistribute LGPL code and only 
 dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
 patentability, if I remember correctly).
 Prior work on this has been done by [~aljoscha] and [~skunert], which could 
 provide a code base to start with.
 *Appendix*
 Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
 Papers on static analysis and for optimization: 
 http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
 http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
 Quick introduction to the Optimizer: 
 http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
 (Section 6)
 Optimizer for Iterations: 
 http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
 (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-2165] [TableAPI] Renamed table conversi...

2015-06-05 Thread fhueske
GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/793

[FLINK-2165] [TableAPI] Renamed table conversion functions in 
TableEnvironment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink fromDataSet

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #793


commit 9dd5ca596282786c0ae4e2b104e4f36de48554f6
Author: Fabian Hueske fhue...@apache.org
Date:   2015-06-05T09:37:39Z

[FLINK-2165] [TableAPI] Renamed table conversion functions in 
TableEnvironment






[jira] [Resolved] (FLINK-2065) Cancelled jobs finish with final state FAILED

2015-06-05 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-2065.

   Resolution: Fixed
Fix Version/s: 0.9

Fixed in 5a7ceda61227336115723da969ee649202a8dbb6.

 Cancelled jobs finish with final state FAILED
 -

 Key: FLINK-2065
 URL: https://issues.apache.org/jira/browse/FLINK-2065
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Robert Metzger
 Fix For: 0.9

 Attachments: failed.tgz


 While running some experiments, I've noticed that jobs sometimes finish in 
 FAILED, even though I've cancelled them.
 The reported error is
 {code}
 hdp22-kafka-w-0.c.astral-sorter-757.internal
 Error: java.lang.IllegalStateException: Buffer has already been recycled.
 at 
 org.apache.flink.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:173)
 at 
 org.apache.flink.runtime.io.network.buffer.Buffer.ensureNotRecycled(Buffer.java:142)
 at 
 org.apache.flink.runtime.io.network.buffer.Buffer.getMemorySegment(Buffer.java:78)
 at 
 org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.setNextBuffer(SpillingAdaptiveSpanningRecordDeserializer.java:72)
 at 
 org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:80)
 at 
 org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34)
 at 
 org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
 at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:96)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 The logs:
 {code}
 16:29:37,212 INFO  org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 
- Trying to cancel job with ID ecccf02327c70c9e35770c6da37638e1.
 16:29:37,214 INFO  org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 
- Status of job ecccf02327c70c9e35770c6da37638e1 (Simple big union) 
 changed to CANCELLING .
 16:31:15,581 INFO  org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 
- Status of job ecccf02327c70c9e35770c6da37638e1 (Simple big union) 
 changed to FAILING Buffer has already been recycled..
 {code}
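The "Buffer has already been recycled" error is a use-after-release on a reference-counted buffer, apparently racing with cancellation. A minimal sketch of the guard involved (plain Python, not Flink's actual {{Buffer}} class):

```python
class Buffer:
    """Reference-counted wrapper around a memory segment."""
    def __init__(self, segment):
        self._segment = segment
        self._refs = 1  # one reference held by the creator

    def _ensure_not_recycled(self):
        if self._refs <= 0:
            raise RuntimeError("Buffer has already been recycled.")

    def retain(self):
        self._ensure_not_recycled()
        self._refs += 1

    def recycle(self):
        self._ensure_not_recycled()
        self._refs -= 1  # at zero, the segment would return to the pool

    def get_memory_segment(self):
        self._ensure_not_recycled()
        return self._segment

buf = Buffer(bytearray(32))
buf.recycle()  # last reference released (e.g. during cancellation)
try:
    buf.get_memory_segment()  # the race: a reader touches it afterwards
except RuntimeError as e:
    print(e)  # Buffer has already been recycled.
```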





[jira] [Created] (FLINK-2166) Add fromCsvFile() to TableEnvironment

2015-06-05 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2166:


 Summary: Add fromCsvFile() to TableEnvironment
 Key: FLINK-2166
 URL: https://issues.apache.org/jira/browse/FLINK-2166
 Project: Flink
  Issue Type: New Feature
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor


Add a {{fromCsvFile()}} method to the {{TableEnvironment}} to read a {{Table}} 
from a CSV file.

The implementation should reuse Flink's CsvInputFormat.





[jira] [Created] (FLINK-2167) Add fromHCat() to TableEnvironment

2015-06-05 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2167:


 Summary: Add fromHCat() to TableEnvironment
 Key: FLINK-2167
 URL: https://issues.apache.org/jira/browse/FLINK-2167
 Project: Flink
  Issue Type: New Feature
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor


Add a {{fromHCat()}} method to the {{TableEnvironment}} to read a {{Table}} 
from an HCatalog table.

The implementation could reuse Flink's HCatInputFormat.





[GitHub] flink pull request: Added member currentSplit for FileInputFormat....

2015-06-05 Thread peedeeX21
Github user peedeeX21 commented on the pull request:

https://github.com/apache/flink/pull/791#issuecomment-109249506
  
Travis fails executing this test: 
`KafkaITCase.testPersistentSourceWithOffsetUpdates`. Unfortunately I have no 
idea what the problem is. I am not even sure if this is related to my commit.
Can someone have a look at it? Thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1863) RemoteInputChannelTest.testConcurrentOnBufferAndRelease fails on travis

2015-06-05 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574312#comment-14574312
 ] 

Ufuk Celebi commented on FLINK-1863:


Not occurring any more. I tried to debug this, but could not reproduce it with 
many (20+) Travis runs and didn't notice anything in the code.

 RemoteInputChannelTest.testConcurrentOnBufferAndRelease fails on travis
 ---

 Key: FLINK-1863
 URL: https://issues.apache.org/jira/browse/FLINK-1863
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Ufuk Celebi

 {code}
 testConcurrentOnBufferAndRelease(org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest)
   Time elapsed: 120.022 sec   ERROR!
 java.lang.Exception: test timed out after 120000 milliseconds
   at sun.misc.Unsafe.park(Native Method)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
   at java.util.concurrent.FutureTask.get(FutureTask.java:111)
   at 
 org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest.testConcurrentOnBufferAndRelease(RemoteInputChannelTest.java:124)
 {code}
 This is the build: 
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57943450/log.txt





[jira] [Commented] (FLINK-2165) Rename Table conversion methods in TableEnvironment

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574267#comment-14574267
 ] 

ASF GitHub Bot commented on FLINK-2165:
---

GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/793

[FLINK-2165] [TableAPI] Renamed table conversion functions in 
TableEnvironment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink fromDataSet

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #793


commit 9dd5ca596282786c0ae4e2b104e4f36de48554f6
Author: Fabian Hueske fhue...@apache.org
Date:   2015-06-05T09:37:39Z

[FLINK-2165] [TableAPI] Renamed table conversion functions in 
TableEnvironment




 Rename Table conversion methods in TableEnvironment
 ---

 Key: FLINK-2165
 URL: https://issues.apache.org/jira/browse/FLINK-2165
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Minor
 Fix For: 0.9


 The {{TableEnvironment}} provides methods to convert DataSets and DataStreams 
 into Tables and back. These methods are called {{toTable()}}, {{toSet()}}, 
 and {{toStream()}}.
 I propose to rename the methods into {{fromDataSet()}}, {{fromDataStream()}}, 
 {{toDataSet()}}, and {{toDataStream()}} for the following reasons:
 - {{fromDataSet()}} and {{fromDataStream()}} are closer to the SQL FROM expression
 - It allows adding methods such as {{fromCSV()}}, {{fromHCat()}}, 
 {{fromParquet()}}, and so on to the {{TableEnvironment}}
 - {{toSet()}} and {{toStream()}} should be renamed for consistency.





[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-06-05 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574273#comment-14574273
 ] 

Peter Schrott commented on FLINK-1731:
--

[~till.rohrmann] I am not entirely sure we are talking about the same thing. In 
our opinion, the Travis failure is not related to our changes.
Or do you mean that I should force Travis to run over my repository to see 
whether the problem still exists?
If so, I just need to push something to my repository, right? But I don't have 
any changes to make.
- Thanks, Peter

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
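For context on the seeding idea mentioned above, here is a minimal, self-contained Python sketch of kMeans++-style D^2-weighted seeding (an illustrative toy on 1-D data, not flink-ml code; all names are hypothetical):

```python
import random

def kmeans_pp_seeds(points, k, rng=random.Random(0)):
    """Pick k initial centers with D^2 weighting (kMeans++-style seeding)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # squared distance of each point to its nearest chosen center
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        # sample the next center with probability proportional to d2
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

# two well-separated 1-D clusters: D^2 weighting makes it overwhelmingly
# likely that the seeds come from different clusters
data = [0.0, 0.1, 0.2, 100.0, 100.1, 100.2]
seeds = kmeans_pp_seeds(data, 2)
```

Uniform random seeding, by contrast, picks both seeds from the same cluster with sizable probability, which is exactly what the improved seeding phase avoids.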





[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-06-05 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-109264887
  
Good points. Perfectly fine for me ;)




[jira] [Commented] (FLINK-2133) Possible deadlock in ExecutionGraph

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575036#comment-14575036
 ] 

Stephan Ewen commented on FLINK-2133:
-

Okay, I found a plausible scenario for how this can happen (it is a super hard 
race):

  - During canceling, the {{ExecutionJobVertices}} cancel simultaneously 
(vertex 1 and vertex 2)
  - {{Vertex 1}} transitions into its final state
  - In the ExecutionGraph, it advances the counter of the next vertex to 
check/wait for to {{vertex 2}} and checks whether that one is finished already
  - {{Vertex 2}} has just finished canceling its final subtask and has reached 
the state where it increments the number of terminal subtasks 
(ExecutionJobVertex, after line 448, but before line 454)
  - The thread that finished {{vertex 1}} therefore considers {{vertex 2}} 
terminal as well and marks the job entirely as complete. It triggers the 
restart.
  - {{Vertex 2}} tries to tell the ExecutionGraph that it reached a terminal 
state, but can no longer acquire the lock it needs to learn that its 
transition to terminal has already been registered.

== Deadlock

There is a simple way to fix this, but I am not sure there is any reasonable 
way to test it: one would need to provoke exactly this insanely tightly timed 
race between the threads to reproduce the situation.
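The concrete fix is not spelled out here; as a general illustration, the classic remedy for this kind of cycle (thread A holds the graph-level monitor and waits for the vertex-level one while thread B does the opposite) is to acquire both monitors in one globally consistent order on every code path. A minimal Python sketch (lock and function names are illustrative stand-ins, not Flink's actual fields):

```python
import threading

# stand-ins for the ExecutionGraph-level and ExecutionJobVertex-level
# SerializableObject monitors from the thread dump below
graph_lock = threading.Lock()
vertex_lock = threading.Lock()

def restart_path():
    # Both paths take graph_lock BEFORE vertex_lock. The deadlock arises
    # when one path takes them in the opposite order; a consistent global
    # order makes the wait-for cycle impossible.
    with graph_lock:
        with vertex_lock:
            return "restarted"

def subtask_final_path():
    with graph_lock:  # same order as restart_path, never the reverse
        with vertex_lock:
            return "terminal state registered"

t1 = threading.Thread(target=restart_path)
t2 = threading.Thread(target=subtask_final_path)
t1.start(); t2.start(); t1.join(); t2.join()
```

This only sketches the lock-ordering discipline; whether the actual Flink fix reorders the monitors or restructures the state transition is not stated in the thread.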

 Possible deadlock in ExecutionGraph
 ---

 Key: FLINK-2133
 URL: https://issues.apache.org/jira/browse/FLINK-2133
 Project: Flink
  Issue Type: Bug
Reporter: Aljoscha Krettek

 I had the following output on Travis:
 {code}
 Found one Java-level deadlock:
 =
 ForkJoinPool-1-worker-3:
   waiting to lock monitor 0x7f1c54af7eb8 (object 0xd77fa8c0, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by flink-akka.actor.default-dispatcher-4
 flink-akka.actor.default-dispatcher-4:
   waiting to lock monitor 0x7f1c5486aca0 (object 0xd77fa218, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by ForkJoinPool-1-worker-3
 Java stack information for the threads listed above:
 ===
 ForkJoinPool-1-worker-3:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.resetForNewExecution(ExecutionJobVertex.java:338)
   - waiting to lock 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:595)
   - locked 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph$3.call(ExecutionGraph.java:733)
   at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at 
 scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 flink-akka.actor.default-dispatcher-4:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.jobVertexInFinalState(ExecutionGraph.java:683)
   - waiting to lock 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.subtaskInFinalState(ExecutionJobVertex.java:454)
   - locked 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.vertexCancelled(ExecutionJobVertex.java:426)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionVertex.executionCanceled(ExecutionVertex.java:565)
   at 
 org.apache.flink.runtime.executiongraph.Execution.cancelingComplete(Execution.java:653)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:784)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:220)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 

[jira] [Commented] (FLINK-2174) Allow comments in 'slaves' file

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574995#comment-14574995
 ] 

ASF GitHub Bot commented on FLINK-2174:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/flink/pull/796

[FLINK-2174] Allow comments in 'slaves' file

added skipping of #-comments in slaves file in start/stop scripts

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/flink slavesFile

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/796.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #796


commit f3d3bd7b5ab4241371642f5d0da3f06f7f56d92c
Author: mjsax mj...@informatik.hu-berlin.de
Date:   2015-06-05T18:35:45Z

added skipping of #-comments in slaves file




 Allow comments in 'slaves' file
 ---

 Key: FLINK-2174
 URL: https://issues.apache.org/jira/browse/FLINK-2174
 Project: Flink
  Issue Type: Improvement
  Components: Start-Stop Scripts
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Currently, each line in 'slaves' is interpreted as a host name. The scripts 
 should skip lines starting with '#', allow comments at the end of a line, 
 and skip empty lines.
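A sketch of the intended parsing behavior (written in Python for illustration; the actual change lives in the Bash start/stop scripts, and the helper name is hypothetical):

```python
def parse_slaves(text):
    """Parse a 'slaves' file: skip empty and comment-only lines,
    and allow a trailing '#' comment after a host name."""
    hosts = []
    for line in text.splitlines():
        host = line.split("#", 1)[0].strip()  # drop any trailing comment
        if host:                              # skip blank/comment-only lines
            hosts.append(host)
    return hosts

slaves = """# cluster workers
worker1
worker2   # rack 2

worker3
"""
```

Applying `parse_slaves` to the sample above yields just the three host names, with the header comment, the trailing comment, and the blank line ignored.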





[GitHub] flink pull request: [FLINK-2174] Allow comments in 'slaves' file

2015-06-05 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/flink/pull/796

[FLINK-2174] Allow comments in 'slaves' file

added skipping of #-comments in slaves file in start/stop scripts

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/flink slavesFile

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/796.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #796


commit f3d3bd7b5ab4241371642f5d0da3f06f7f56d92c
Author: mjsax mj...@informatik.hu-berlin.de
Date:   2015-06-05T18:35:45Z

added skipping of #-comments in slaves file






[jira] [Updated] (FLINK-2176) Add support for ProgramDescription interface in clients

2015-06-05 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated FLINK-2176:
---
Component/s: Examples

 Add support for ProgramDesctiption interface in clients
 ---

 Key: FLINK-2176
 URL: https://issues.apache.org/jira/browse/FLINK-2176
 Project: Flink
  Issue Type: Improvement
  Components: Examples, other, Webfrontend
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Extend the WebClient and the bin/flink client to show the information 
 provided via the ProgramDescription interface:
   - show as a tooltip in the WebClient
   - show on the command line in the info command





[jira] [Commented] (FLINK-2153) Exclude dependency on hbase annotations module

2015-06-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574736#comment-14574736
 ] 

Márton Balassi commented on FLINK-2153:
---

Thanks for the quick feedback.

 Exclude dependency on hbase annotations module
 --

 Key: FLINK-2153
 URL: https://issues.apache.org/jira/browse/FLINK-2153
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Lokesh Rajaram

 [ERROR] Failed to execute goal on project flink-hbase: Could not resolve
 dependencies for project org.apache.flink:flink-hbase:jar:0.9-SNAPSHOT:
 Could not find artifact jdk.tools:jdk.tools:jar:1.7 at specified path
 /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/../lib/tools.jar
 There is a Spark issue for this [1] with a solution [2].
 [1] https://issues.apache.org/jira/browse/SPARK-4455
 [2] https://github.com/apache/spark/pull/3286/files





[jira] [Commented] (FLINK-2173) Python uses different tmp file than Flink

2015-06-05 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574818#comment-14574818
 ] 

Chesnay Schepler commented on FLINK-2173:
-

Also, in case anyone is interested in tackling this issue: it should only be 
relevant during plan construction; at runtime all paths are supplied by the 
Java side in a safe manner.

One potential way to fix this earlier: instead of using tempfile.gettempdir(), 
infer the proper tmp directory by checking the location of the plan file, since 
it should reside in the Java tmp folder.
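A minimal sketch of that suggestion (the function name and the stand-in path are hypothetical, not the actual Python API code):

```python
import os

def infer_tmp_dir(plan_file_path):
    """Derive the temp directory from the plan file the Java side handed
    over, instead of trusting tempfile.gettempdir(), which may disagree
    with Java's java.io.tmpdir (e.g. /tmp/users/1000 vs /tmp)."""
    return os.path.dirname(os.path.abspath(plan_file_path))

# stand-in for the path the Java side passes to the Python process
plan_file = "/tmp/flink_plan"
```

Since the Java side wrote the plan file into its own `java.io.tmpdir`, the file's parent directory is by construction the directory Java uses, sidestepping the mismatch.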

 Python uses different tmp file than Flink
 -

 Key: FLINK-2173
 URL: https://issues.apache.org/jira/browse/FLINK-2173
 Project: Flink
  Issue Type: Bug
  Components: Python API
 Environment: Debian Linux
Reporter: Matthias J. Sax
Priority: Critical

 Flink gets the temp file path using System.getProperty("java.io.tmpdir"), 
 while Python uses the tempfile.gettempdir() method. However, the two can be 
 semantically different.
 On my system, Flink uses /tmp while Python uses /tmp/users/1000 (1000 is 
 my Linux user-id).
 This issue leads (at least) to failing tests.





[GitHub] flink pull request: [FLINK-2164] Document streaming and batch mode

2015-06-05 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/795

[FLINK-2164] Document streaming and batch mode



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink2164

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/795.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #795


commit 3996871b40d87408fcb024482430f1348c06bc13
Author: Robert Metzger rmetz...@apache.org
Date:   2015-06-05T16:59:34Z

[FLINK-2164] Document streaming and batch mode






[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574793#comment-14574793
 ] 

ASF GitHub Bot commented on FLINK-1635:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/794#issuecomment-109355770
  
+1 to merge like this.


 Remove Apache Thrift dependency from Flink
 --

 Key: FLINK-1635
 URL: https://issues.apache.org/jira/browse/FLINK-1635
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels

 I've added Thrift and Protobuf to Flink to support them out of the box with 
 Kryo.
 However, after trying to access an HCatalog/Hive table yesterday using Flink, 
 I found that there is a dependency conflict between Flink and Hive (on Thrift).
 Maybe it makes more sense to properly document our serialization framework 
 and provide a copy-paste solution on how to get Thrift/Protobuf et al. to 
 work with Flink.
 Please chime in if you are against removing the out-of-the-box support for 
 Protobuf and Thrift.





[GitHub] flink pull request: [FLINK-1635] remove Apache Thrift and Google P...

2015-06-05 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/794#issuecomment-109355770
  
+1 to merge like this.




[GitHub] flink pull request: [FLINK-1635] remove Apache Thrift and Google P...

2015-06-05 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/794#issuecomment-109355690
  
I also added the necessary Maven configuration for the examples.




[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574792#comment-14574792
 ] 

ASF GitHub Bot commented on FLINK-1635:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/794#issuecomment-109355690
  
I also added the necessary Maven configuration for the examples.


 Remove Apache Thrift dependency from Flink
 --

 Key: FLINK-1635
 URL: https://issues.apache.org/jira/browse/FLINK-1635
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels

 I've added Thrift and Protobuf to Flink to support them out of the box with 
 Kryo.
 However, after trying to access an HCatalog/Hive table yesterday using Flink, 
 I found that there is a dependency conflict between Flink and Hive (on Thrift).
 Maybe it makes more sense to properly document our serialization framework 
 and provide a copy-paste solution on how to get Thrift/Protobuf et al. to 
 work with Flink.
 Please chime in if you are against removing the out-of-the-box support for 
 Protobuf and Thrift.





[jira] [Commented] (FLINK-2136) Test the streaming scala API

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574718#comment-14574718
 ] 

ASF GitHub Bot commented on FLINK-2136:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/771#issuecomment-109338790
  
Let me merge this for the release now.


 Test the streaming scala API
 

 Key: FLINK-2136
 URL: https://issues.apache.org/jira/browse/FLINK-2136
 Project: Flink
  Issue Type: Test
  Components: Scala API, Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Gábor Hermann

 There are no test covering the streaming scala API. I would suggest to test 
 whether the StreamGraph created by a certain operation looks as expected. 
 Deeper layers and runtime should not be tested here, that is done in 
 streaming-core.





[jira] [Updated] (FLINK-2173) Python used different tmp file than Flink

2015-06-05 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated FLINK-2173:
---
Summary: Python used different tmp file than Flink  (was: Python used 
diffente tmp file than Flink)

 Python used different tmp file than Flink
 -

 Key: FLINK-2173
 URL: https://issues.apache.org/jira/browse/FLINK-2173
 Project: Flink
  Issue Type: Bug
  Components: Python API
 Environment: Debian Linux
Reporter: Matthias J. Sax
Priority: Critical

 Flink gets the temp file path using System.getProperty("java.io.tmpdir"), 
 while Python uses the tempfile.gettempdir() method. However, the two can be 
 semantically different.
 On my system, Flink uses /tmp while Python uses /tmp/users/1000 (1000 is 
 my Linux user-id).
 This issue leads (at least) to failing tests.





[jira] [Updated] (FLINK-2173) Python uses different tmp file than Flink

2015-06-05 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated FLINK-2173:
---
Summary: Python uses different tmp file than Flink  (was: Python used 
different tmp file than Flink)

 Python uses different tmp file than Flink
 -

 Key: FLINK-2173
 URL: https://issues.apache.org/jira/browse/FLINK-2173
 Project: Flink
  Issue Type: Bug
  Components: Python API
 Environment: Debian Linux
Reporter: Matthias J. Sax
Priority: Critical

 Flink gets the temp file path using System.getProperty("java.io.tmpdir"), 
 while Python uses the tempfile.gettempdir() method. However, the two can be 
 semantically different.
 On my system, Flink uses /tmp while Python uses /tmp/users/1000 (1000 is 
 my Linux user-id).
 This issue leads (at least) to failing tests.





[GitHub] flink pull request: [FLINK-2136] Adding DataStream tests for Scala...

2015-06-05 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/771#issuecomment-109338790
  
Let me merge this for the release now.




[GitHub] flink pull request: [FLINK-1635] remove Apache Thrift and Google P...

2015-06-05 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/794#discussion_r31827450
  
--- Diff: docs/apis/best_practices.md ---
--- @@ -155,3 +155,41 @@ public static final class Tokenizer extends 
RichFlatMapFunction<String, Tuple2<String, Integer>>
// .. do more ..
 {% endhighlight %}
 
+
+## Register a custom serializer for your Flink program
+
+If you use a custom type in your Flink program which cannot be serialized 
by the
+Flink type serializer, Flink falls back to using the generic Kryo
+serializer. You may register your own serializer or a serialization system 
like
+Google Protobuf or Apache Thrift with Kryo. To do that, simply register 
the type
+class and the serializer in the `ExecutionConfig` of your Flink program.
+
+
+{% highlight java %}
+final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+// register the class of the serializer as serializer for a type
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
MyCustomSerializer.class);
+
+// register an instance as serializer for a type
+MySerializer mySerializer = new MySerializer();
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
mySerializer);
+{% endhighlight %}
+
+Note that your custom serializer has to extend Kryo's Serializer class. In 
the
+case of Google Protobuf or Apache Thrift, this has already been done for
+you:
+
+{% highlight java %}
+
+final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+// register the Google Protobuf serializer with Kryo
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
ProtobufSerializer.class);
+
+// register the serializer included with Apache Thrift as the standard 
serializer
+// TBaseSerializer states it should be initialized as a default Kryo 
serializer
+env.getConfig().addDefaultKryoSerializer(MyCustomType.class, 
TBaseSerializer.class);
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
TMessage.class);
--- End diff --

this seems incorrect




[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574774#comment-14574774
 ] 

ASF GitHub Bot commented on FLINK-1635:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/794#discussion_r31827450
  
--- Diff: docs/apis/best_practices.md ---
--- @@ -155,3 +155,41 @@ public static final class Tokenizer extends 
RichFlatMapFunction<String, Tuple2<String, Integer>>
// .. do more ..
 {% endhighlight %}
 
+
+## Register a custom serializer for your Flink program
+
+If you use a custom type in your Flink program which cannot be serialized 
by the
+Flink type serializer, Flink falls back to using the generic Kryo
+serializer. You may register your own serializer or a serialization system 
like
+Google Protobuf or Apache Thrift with Kryo. To do that, simply register 
the type
+class and the serializer in the `ExecutionConfig` of your Flink program.
+
+
+{% highlight java %}
+final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+// register the class of the serializer as serializer for a type
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
MyCustomSerializer.class);
+
+// register an instance as serializer for a type
+MySerializer mySerializer = new MySerializer();
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
mySerializer);
+{% endhighlight %}
+
+Note that your custom serializer has to extend Kryo's Serializer class. In 
the
+case of Google Protobuf or Apache Thrift, this has already been done for
+you:
+
+{% highlight java %}
+
+final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+// register the Google Protobuf serializer with Kryo
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
ProtobufSerializer.class);
+
+// register the serializer included with Apache Thrift as the standard 
serializer
+// TBaseSerializer states it should be initialized as a default Kryo 
serializer
+env.getConfig().addDefaultKryoSerializer(MyCustomType.class, 
TBaseSerializer.class);
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, 
TMessage.class);
--- End diff --

this seems incorrect


 Remove Apache Thrift dependency from Flink
 --

 Key: FLINK-1635
 URL: https://issues.apache.org/jira/browse/FLINK-1635
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels

 I've added Thrift and Protobuf to Flink to support it out of the box with 
 Kryo.
 However, after trying to access a HCatalog/Hive table yesterday using Flink I 
 found that there is a dependency conflict between Flink and Hive (on thrift).
 Maybe it makes more sense to properly document our serialization framework 
 and provide a copy-paste solution on how to get Thrift/Protobuf et al. to 
 work with Flink.
 Please chime in if you are against removing the out of the box support for 
 protobuf and kryo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1126) Add suggestion for using large TupleX types

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574851#comment-14574851
 ] 

ASF GitHub Bot commented on FLINK-1126:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/786#issuecomment-109367152
  
Merging ...


 Add suggestion for using large TupleX types
 ---

 Key: FLINK-1126
 URL: https://issues.apache.org/jira/browse/FLINK-1126
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Ufuk Celebi
Assignee: Robert Metzger
Priority: Minor

 Instead of
 {code}
 Tuple11<String, String, ..., String> var = new ...;
 {code}
 I would like to add a hint to use custom types like:
 {code}
 CustomType var = new ...;
 public static class CustomType extends Tuple11<String, String, ..., String> {
 // constructor matching super
 }
 {code}
 I saw a couple of users sticking to the large TupleX types instead of doing 
 this, which leads to very clumsy user code.





[jira] [Created] (FLINK-2172) Stabilize SocketOutputFormatTest

2015-06-05 Thread JIRA
Márton Balassi created FLINK-2172:
-

 Summary: Stabilize SocketOutputFormatTest
 Key: FLINK-2172
 URL: https://issues.apache.org/jira/browse/FLINK-2172
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi


As part of resolving FLINK-2139, I am adding tests for the core streaming 
output formats. I also added a skeleton for the socket output, but found that 
it was unstable and have disabled it for now.





[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574789#comment-14574789
 ] 

ASF GitHub Bot commented on FLINK-1635:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/794#discussion_r31828328
  
--- Diff: docs/apis/best_practices.md ---
@@ -155,3 +155,41 @@ public static final class Tokenizer extends RichFlatMapFunctionString, Tuple2S
// .. do more ..
 {% endhighlight %}
 
+
+## Register a custom serializer for your Flink program
+
+If you use a custom type in your Flink program which cannot be serialized by
+the Flink type serializer, Flink falls back to the generic Kryo serializer.
+You may register your own serializer, or a serialization system such as
+Google Protobuf or Apache Thrift, with Kryo. To do that, register the type
+class and the serializer in the `ExecutionConfig` of your Flink program.
+
+
+{% highlight java %}
+final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+// register the class of the serializer as serializer for a type
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, MyCustomSerializer.class);
+
+// register an instance as serializer for a type
+MySerializer mySerializer = new MySerializer();
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, mySerializer);
+{% endhighlight %}
+
+Note that your custom serializer has to extend Kryo's Serializer class. In the
+case of Google Protobuf or Apache Thrift, this has already been done for you:
+
+{% highlight java %}
+
+final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+// register the Google Protobuf serializer with Kryo
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, ProtobufSerializer.class);
+
+// register the serializer included with Apache Thrift as the standard serializer
+// TBaseSerializer states it should be initialized as a default Kryo serializer
+env.getConfig().addDefaultKryoSerializer(MyCustomType.class, TBaseSerializer.class);
+env.getConfig().registerTypeWithKryoSerializer(MyCustomType.class, TMessage.class);
--- End diff --

yes, sorry. copy/paste error. fixed.


 Remove Apache Thrift dependency from Flink
 --

 Key: FLINK-1635
 URL: https://issues.apache.org/jira/browse/FLINK-1635
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels

 I've added Thrift and Protobuf to Flink to support it out of the box with 
 Kryo.
 However, after trying to access a HCatalog/Hive table yesterday using Flink I 
 found that there is a dependency conflict between Flink and Hive (on thrift).
 Maybe it makes more sense to properly document our serialization framework 
 and provide a copy-paste solution on how to get Thrift/Protobuf et al. to 
 work with Flink.
 Please chime in if you are against removing the out of the box support for 
 protobuf and kryo.





[GitHub] flink pull request: [FLINK-1565][FLINK-2078] Document ExecutionCon...

2015-06-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/781


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2173) Python uses different tmp file than Flink

2015-06-05 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574804#comment-14574804
 ] 

Chesnay Schepler commented on FLINK-2173:
-

Small addition from the mailing list: this does not affect all users.

It will furthermore be trivial to fix once FLINK-1927 is resolved, which 
admittedly could take a while.

 Python uses different tmp file than Flink
 -

 Key: FLINK-2173
 URL: https://issues.apache.org/jira/browse/FLINK-2173
 Project: Flink
  Issue Type: Bug
  Components: Python API
 Environment: Debian Linux
Reporter: Matthias J. Sax
Priority: Critical

 Flink gets the temp file path using System.getProperty("java.io.tmpdir"), 
 while Python uses the tempfile.gettempdir() method. However, the two can be 
 semantically different.
 On my system, Flink uses /tmp while Python used /tmp/users/1000 (1000 is 
 my Linux user-id).
 This issue leads (at least) to failing tests.
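The mismatch can be demonstrated with a small sketch; the class name `TmpDirMismatch` is made up for illustration, and the claim about Python's lookup order is paraphrased from the `tempfile` behavior described above:

```java
// Sketch of the mismatch described above: the JVM resolves its temp
// directory from the java.io.tmpdir system property (default /tmp on
// Linux) and ignores the TMPDIR environment variable, while Python's
// tempfile.gettempdir() consults TMPDIR, TEMP and TMP first.
public class TmpDirMismatch {
    public static void main(String[] args) {
        String jvmTmp = System.getProperty("java.io.tmpdir");
        String envTmp = System.getenv("TMPDIR"); // what Python would prefer; may be null
        System.out.println("JVM java.io.tmpdir: " + jvmTmp);
        System.out.println("env TMPDIR        : " + envTmp);
        // If TMPDIR is set to e.g. /tmp/users/1000, the two runtimes only
        // agree on a temp directory by accident.
    }
}
```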





[jira] [Commented] (FLINK-1565) Document object reuse behavior

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574856#comment-14574856
 ] 

ASF GitHub Bot commented on FLINK-1565:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/781#issuecomment-109367441
  
Merging ...


 Document object reuse behavior
 --

 Key: FLINK-1565
 URL: https://issues.apache.org/jira/browse/FLINK-1565
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Fabian Hueske
Assignee: Robert Metzger
 Fix For: 0.9


 The documentation needs to be extended to describe the object reuse behavior 
 of Flink and its implications for how to implement functions.
 The documentation must at least cover the default reuse mode:
 * new objects through iterators and in reduce functions
 * chaining behavior (objects are passed on to the next function, which might 
 modify them)
 Optionally, the documentation could describe the object reuse switch 
 introduced by FLINK-1137.





[GitHub] flink pull request: [FLINK-1565][FLINK-2078] Document ExecutionCon...

2015-06-05 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/781#issuecomment-109367441
  
Merging ...




[GitHub] flink pull request: [FLINK-1126][docs] Best practice: named TupleX...

2015-06-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/786




[jira] [Created] (FLINK-2170) Add fromOrcFile() to TableEnvironment

2015-06-05 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2170:


 Summary: Add fromOrcFile() to TableEnvironment
 Key: FLINK-2170
 URL: https://issues.apache.org/jira/browse/FLINK-2170
 Project: Flink
  Issue Type: New Feature
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor


Add a {{fromOrcFile()}} method to the {{TableEnvironment}} to read a {{Table}} 
from an ORC file.





[jira] [Assigned] (FLINK-2165) Rename Table conversion methods in TableEnvironment

2015-06-05 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-2165:


Assignee: Fabian Hueske

 Rename Table conversion methods in TableEnvironment
 ---

 Key: FLINK-2165
 URL: https://issues.apache.org/jira/browse/FLINK-2165
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Minor
 Fix For: 0.9


 The {{TableEnvironment}} provides methods to convert DataSets and DataStreams 
 into Tables and back. These methods are called {{toTable()}}, {{toSet()}}, 
 and {{toStream()}}.
 I propose to rename the methods into {{fromDataSet()}}, {{fromDataStream()}}, 
 {{toDataSet()}}, and {{toDataStream()}} for the following reasons:
 - {{fromDataSet()}}, {{fromDataStream()}} is closer to the SQL FROM expression
 - It allows adding methods such as {{fromCSV()}}, {{fromHCat()}}, 
 {{fromParquet()}}, and so on to the {{TableEnvironment}}
 - {{toSet()}} and {{toStream()}} should be renamed for consistency.





[jira] [Created] (FLINK-2171) Add instruction to build Flink with Scala 2.11

2015-06-05 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2171:


 Summary: Add instruction to build Flink with Scala 2.11
 Key: FLINK-2171
 URL: https://issues.apache.org/jira/browse/FLINK-2171
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor


Flink can be built for Scala 2.11. However, the build documentation does not 
include instructions for that.





[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574679#comment-14574679
 ] 

ASF GitHub Bot commented on FLINK-1635:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/794#issuecomment-109328706
  
It would be great if you could add some documentation that explains how 
users can register the serializers at the ExecutionConfig.

I've actually added support for these two frameworks because users needed 
this. It would be nice if your docs explained how to use Flink with 
Thrift/Protobuf types (that's also needed for formats like Parquet).


 Remove Apache Thrift dependency from Flink
 --

 Key: FLINK-1635
 URL: https://issues.apache.org/jira/browse/FLINK-1635
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels

 I've added Thrift and Protobuf to Flink to support it out of the box with 
 Kryo.
 However, after trying to access a HCatalog/Hive table yesterday using Flink I 
 found that there is a dependency conflict between Flink and Hive (on thrift).
 Maybe it makes more sense to properly document our serialization framework 
 and provide a copy-paste solution on how to get Thrift/Protobuf et al. to 
 work with Flink.
 Please chime in if you are against removing the out of the box support for 
 protobuf and kryo.





[jira] [Updated] (FLINK-2177) NullPointer in task resource release

2015-06-05 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-2177:

Summary: NullPointer in task resource release  (was: NillPointer in task 
resource release)

 NullPointer in task resource release
 

 Key: FLINK-2177
 URL: https://issues.apache.org/jira/browse/FLINK-2177
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
Priority: Blocker
 Fix For: 0.9


 {code}
 ==
 ==  FATAL  ===
 ==
 A fatal error occurred, forcing the TaskManager to shut down: FATAL - 
 exception in task resource cleanup
 java.lang.NullPointerException
   at 
 org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.cancelRequestFor(PartitionRequestClientHandler.java:89)
   at 
 org.apache.flink.runtime.io.network.netty.PartitionRequestClient.close(PartitionRequestClient.java:182)
   at 
 org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.releaseAllResources(RemoteInputChannel.java:199)
   at 
 org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.releaseAllResources(SingleInputGate.java:332)
   at 
 org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:368)
   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:650)
   at java.lang.Thread.run(Thread.java:701)
 {code}





[GitHub] flink pull request: [FLINK-2133] [jobmanager] Fix possible deadloc...

2015-06-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/797




[jira] [Commented] (FLINK-2177) NillPointer in task resource release

2015-06-05 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575213#comment-14575213
 ] 

Stephan Ewen commented on FLINK-2177:
-

Here are the tests where this occurred (need to download the logs to examine 
the TaskManager output)

https://travis-ci.org/StephanEwen/incubator-flink/builds/65540688

 NillPointer in task resource release
 

 Key: FLINK-2177
 URL: https://issues.apache.org/jira/browse/FLINK-2177
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
Priority: Blocker
 Fix For: 0.9


 {code}
 ==
 ==  FATAL  ===
 ==
 A fatal error occurred, forcing the TaskManager to shut down: FATAL - 
 exception in task resource cleanup
 java.lang.NullPointerException
   at 
 org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.cancelRequestFor(PartitionRequestClientHandler.java:89)
   at 
 org.apache.flink.runtime.io.network.netty.PartitionRequestClient.close(PartitionRequestClient.java:182)
   at 
 org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.releaseAllResources(RemoteInputChannel.java:199)
   at 
 org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.releaseAllResources(SingleInputGate.java:332)
   at 
 org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:368)
   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:650)
   at java.lang.Thread.run(Thread.java:701)
 {code}





[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575234#comment-14575234
 ] 

ASF GitHub Bot commented on FLINK-2098:
---

Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/755


 Checkpoint barrier initiation at source is not aligned with snapshotting
 

 Key: FLINK-2098
 URL: https://issues.apache.org/jira/browse/FLINK-2098
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Aljoscha Krettek
Priority: Blocker
 Fix For: 0.9


 The stream source does not properly align the emission of checkpoint barriers 
 with the drawing of snapshots.





[jira] [Commented] (FLINK-2164) Document batch and streaming startup modes

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575228#comment-14575228
 ] 

ASF GitHub Bot commented on FLINK-2164:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/795#issuecomment-109441645
  
Good addition, +1


 Document batch and streaming startup modes
 --

 Key: FLINK-2164
 URL: https://issues.apache.org/jira/browse/FLINK-2164
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger







[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575233#comment-14575233
 ] 

ASF GitHub Bot commented on FLINK-2098:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/755#issuecomment-109442172
  
merged as part of #742


 Checkpoint barrier initiation at source is not aligned with snapshotting
 

 Key: FLINK-2098
 URL: https://issues.apache.org/jira/browse/FLINK-2098
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Aljoscha Krettek
Priority: Blocker
 Fix For: 0.9


 The stream source does not properly align the emission of checkpoint barriers 
 with the drawing of snapshots.





[GitHub] flink pull request: [FLINK-2164] Document streaming and batch mode

2015-06-05 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/795#issuecomment-109441645
  
Good addition, +1




[GitHub] flink pull request: [FLINK-2098] Improvements on checkpoint-aligne...

2015-06-05 Thread StephanEwen
Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/755




[jira] [Commented] (FLINK-2175) Allow multiple jobs in single jar file

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575279#comment-14575279
 ] 

ASF GitHub Bot commented on FLINK-2175:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/707#issuecomment-109447546
  
My Travis is green (https://travis-ci.org/mjsax/flink/builds/65609726). 
From my point of view, this PR can be merged. Let me know if you would like 
any changes.


 Allow multiple jobs in single jar file
 --

 Key: FLINK-2175
 URL: https://issues.apache.org/jira/browse/FLINK-2175
 Project: Flink
  Issue Type: Improvement
  Components: Examples, other, Webfrontend
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Allow packaging multiple jobs into a single jar.
   - extend WebClient to display all available jobs
   - extend WebClient to display the plan of and submit each job





[GitHub] flink pull request: [FLINK-2133] [jobmanager] Fix possible deadloc...

2015-06-05 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/797

[FLINK-2133] [jobmanager] Fix possible deadlock when vertices transition to 
final state



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
final_state_deadlock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/797.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #797


commit 2298cfedbf880b3a6065a307224c5f3e9e326a0b
Author: Stephan Ewen se...@apache.org
Date:   2015-06-05T20:39:29Z

[FLINK-2133] [jobmanager] Fix possible deadlock when vertices transition to 
final state.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2133) Possible deadlock in ExecutionGraph

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575226#comment-14575226
 ] 

ASF GitHub Bot commented on FLINK-2133:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/797

[FLINK-2133] [jobmanager] Fix possible deadlock when vertices transition to 
final state



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
final_state_deadlock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/797.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #797


commit 2298cfedbf880b3a6065a307224c5f3e9e326a0b
Author: Stephan Ewen se...@apache.org
Date:   2015-06-05T20:39:29Z

[FLINK-2133] [jobmanager] Fix possible deadlock when vertices transition to 
final state.




 Possible deadlock in ExecutionGraph
 ---

 Key: FLINK-2133
 URL: https://issues.apache.org/jira/browse/FLINK-2133
 Project: Flink
  Issue Type: Bug
Reporter: Aljoscha Krettek

 I had the following output on Travis:
 {code}
 Found one Java-level deadlock:
 =
 ForkJoinPool-1-worker-3:
   waiting to lock monitor 0x7f1c54af7eb8 (object 0xd77fa8c0, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by flink-akka.actor.default-dispatcher-4
 flink-akka.actor.default-dispatcher-4:
   waiting to lock monitor 0x7f1c5486aca0 (object 0xd77fa218, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by ForkJoinPool-1-worker-3
 Java stack information for the threads listed above:
 ===
 ForkJoinPool-1-worker-3:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.resetForNewExecution(ExecutionJobVertex.java:338)
   - waiting to lock 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:595)
   - locked 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph$3.call(ExecutionGraph.java:733)
   at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at 
 scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 flink-akka.actor.default-dispatcher-4:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.jobVertexInFinalState(ExecutionGraph.java:683)
   - waiting to lock 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.subtaskInFinalState(ExecutionJobVertex.java:454)
   - locked 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.vertexCancelled(ExecutionJobVertex.java:426)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionVertex.executionCanceled(ExecutionVertex.java:565)
   at 
 org.apache.flink.runtime.executiongraph.Execution.cancelingComplete(Execution.java:653)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:784)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:220)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
   at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
   at 

[jira] [Commented] (FLINK-2133) Possible deadlock in ExecutionGraph

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575246#comment-14575246
 ] 

ASF GitHub Bot commented on FLINK-2133:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/797


 Possible deadlock in ExecutionGraph
 ---

 Key: FLINK-2133
 URL: https://issues.apache.org/jira/browse/FLINK-2133
 Project: Flink
  Issue Type: Bug
Reporter: Aljoscha Krettek

 I had the following output on Travis:
 {code}
 Found one Java-level deadlock:
 =
 ForkJoinPool-1-worker-3:
   waiting to lock monitor 0x7f1c54af7eb8 (object 0xd77fa8c0, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by flink-akka.actor.default-dispatcher-4
 flink-akka.actor.default-dispatcher-4:
   waiting to lock monitor 0x7f1c5486aca0 (object 0xd77fa218, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by ForkJoinPool-1-worker-3
 Java stack information for the threads listed above:
 ===
 ForkJoinPool-1-worker-3:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.resetForNewExecution(ExecutionJobVertex.java:338)
   - waiting to lock 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:595)
   - locked 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph$3.call(ExecutionGraph.java:733)
   at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at 
 scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 flink-akka.actor.default-dispatcher-4:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.jobVertexInFinalState(ExecutionGraph.java:683)
   - waiting to lock 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.subtaskInFinalState(ExecutionJobVertex.java:454)
   - locked 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.vertexCancelled(ExecutionJobVertex.java:426)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionVertex.executionCanceled(ExecutionVertex.java:565)
   at 
 org.apache.flink.runtime.executiongraph.Execution.cancelingComplete(Execution.java:653)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:784)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:220)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
   at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Found 1 deadlock.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2133) Possible deadlock in ExecutionGraph

2015-06-05 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2133.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Stephan Ewen

Fixed via 2298cfedbf880b3a6065a307224c5f3e9e326a0b

 Possible deadlock in ExecutionGraph
 ---

 Key: FLINK-2133
 URL: https://issues.apache.org/jira/browse/FLINK-2133
 Project: Flink
  Issue Type: Bug
Reporter: Aljoscha Krettek
Assignee: Stephan Ewen
 Fix For: 0.9


 I had the following output on Travis:
 {code}
 Found one Java-level deadlock:
 =
 ForkJoinPool-1-worker-3:
   waiting to lock monitor 0x7f1c54af7eb8 (object 0xd77fa8c0, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by flink-akka.actor.default-dispatcher-4
 flink-akka.actor.default-dispatcher-4:
   waiting to lock monitor 0x7f1c5486aca0 (object 0xd77fa218, a 
 org.apache.flink.runtime.util.SerializableObject),
   which is held by ForkJoinPool-1-worker-3
 Java stack information for the threads listed above:
 ===
 ForkJoinPool-1-worker-3:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.resetForNewExecution(ExecutionJobVertex.java:338)
   - waiting to lock 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:595)
   - locked 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph$3.call(ExecutionGraph.java:733)
   at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at 
 scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 flink-akka.actor.default-dispatcher-4:
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.jobVertexInFinalState(ExecutionGraph.java:683)
   - waiting to lock 0xd77fa218 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.subtaskInFinalState(ExecutionJobVertex.java:454)
   - locked 0xd77fa8c0 (a 
 org.apache.flink.runtime.util.SerializableObject)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionJobVertex.vertexCancelled(ExecutionJobVertex.java:426)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionVertex.executionCanceled(ExecutionVertex.java:565)
   at 
 org.apache.flink.runtime.executiongraph.Execution.cancelingComplete(Execution.java:653)
   at 
 org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:784)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:220)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$2.apply(JobManager.scala:219)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
   at 
 scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
   at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Found 1 deadlock.
 {code}
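The trace above is a classic lock-order inversion: the restart path locks the graph-level monitor (0xd77fa218) and then waits for the vertex-level one (0xd77fa8c0), while the state-update path acquires them in the opposite order. A minimal, hypothetical sketch of the standard fix (all names ours, not Flink's) is to impose one global acquisition order on both paths:

```java
// Hypothetical illustration of consistent lock ordering; graphLock/vertexLock
// stand in for the two SerializableObject monitors in the trace above.
public class LockOrderSketch {
    private final Object graphLock = new Object();
    private final Object vertexLock = new Object();
    int completed = 0;

    void restartPath() {
        synchronized (graphLock) {        // always graphLock first
            synchronized (vertexLock) {
                completed++;              // reset vertices for a new execution
            }
        }
    }

    void stateUpdatePath() {
        synchronized (graphLock) {        // same order as restartPath: no cycle
            synchronized (vertexLock) {
                completed++;              // mark subtask finished / cancelled
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrderSketch s = new LockOrderSketch();
        Thread a = new Thread(s::restartPath);
        Thread b = new Thread(s::stateUpdatePath);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("completed=" + s.completed); // completed=2
    }
}
```

Because both threads now request the monitors in the same order, the circular wait seen in the thread dump cannot form.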





[jira] [Closed] (FLINK-2133) Possible deadlock in ExecutionGraph

2015-06-05 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2133.
---

 Possible deadlock in ExecutionGraph
 ---

 Key: FLINK-2133
 URL: https://issues.apache.org/jira/browse/FLINK-2133
 Project: Flink
  Issue Type: Bug
Reporter: Aljoscha Krettek
Assignee: Stephan Ewen
 Fix For: 0.9







[GitHub] flink pull request: [FLINK-2175] Allow multiple jobs in single jar...

2015-06-05 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/707#issuecomment-109447546
  
My Travis is green (https://travis-ci.org/mjsax/flink/builds/65609726). 
From my point of view, this PR can be merged. Let me know if you request any 
changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1297] Added OperatorStatsAccumulator fo...

2015-06-05 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/605#issuecomment-109446516
  
I am merging this for the next version.
Very nice addition, sorry for the delay.




[jira] [Commented] (FLINK-1297) Add support for tracking statistics of intermediate results

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575263#comment-14575263
 ] 

ASF GitHub Bot commented on FLINK-1297:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/605#issuecomment-109446516
  
I am merging this for the next version.
Very nice addition, sorry for the delay.


 Add support for tracking statistics of intermediate results
 ---

 Key: FLINK-1297
 URL: https://issues.apache.org/jira/browse/FLINK-1297
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Reporter: Alexander Alexandrov
Assignee: Alexander Alexandrov
 Fix For: 0.9

   Original Estimate: 1,008h
  Remaining Estimate: 1,008h

 One of the major problems related to the optimizer at the moment is the lack 
 of proper statistics.
 With the introduction of staged execution, it is possible to instrument the 
 runtime code with a statistics facility that collects the required 
 information for optimizing the next execution stage.
 I would therefore like to contribute code that can be used to gather basic 
 statistics for the (intermediate) result of dataflows (e.g. min, max, count, 
 count distinct) and make them available to the job manager.
 Before I start, I would like to hear some feedback from the other users.
 In particular, to handle skew (e.g. on grouping) it might be good to have 
 some sort of detailed sketch about the key distribution of an intermediate 
 result. I am not sure whether a simple histogram is the most effective way to 
 go. Maybe somebody would propose another lightweight sketch that provides 
 better accuracy.
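The basic statistics the issue proposes (min, max, count, count distinct) can be gathered with a small per-partition collector; this hypothetical sketch (names ours) uses an exact HashSet for distinct counting, where a production version would substitute a bounded-memory sketch such as HyperLogLog:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-partition statistics collector along the lines the issue
// describes; partial results from each partition would be merged at the
// job manager.
public class BasicStats {
    long count = 0;
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    final Set<Double> distinct = new HashSet<>(); // exact; swap for a sketch

    void add(double v) {
        count++;
        min = Math.min(min, v);
        max = Math.max(max, v);
        distinct.add(v);
    }

    public static void main(String[] args) {
        BasicStats s = new BasicStats();
        for (double v : new double[]{3, 1, 4, 1, 5}) s.add(v);
        System.out.println(s.count + " " + s.min + " " + s.max + " "
                + s.distinct.size()); // 5 1.0 5.0 4
    }
}
```

For the skew question, a count-min sketch or a sampled histogram of keys would serve the same merge-friendly role as the HashSet here, with accuracy traded for fixed memory.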





[GitHub] flink pull request: [FLINK-1297] Added OperatorStatsAccumulator fo...

2015-06-05 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/605#issuecomment-109490490
  
The tests seem to be non-deterministic and fail frequently.

Check out this build: 
https://travis-ci.org/StephanEwen/incubator-flink/jobs/65634990

The tests need to be more stable before we can add this to the codebase.




[GitHub] flink pull request: [Flink 1844] Add Normaliser to ML library

2015-06-05 Thread fobeligi
GitHub user fobeligi opened a pull request:

https://github.com/apache/flink/pull/798

[Flink 1844] Add Normaliser to ML library

Adds a MinMaxScaler to the ML preprocessing package. The MinMaxScaler scales 
the values to a user-specified range.
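For reference, min-max scaling maps each value linearly from the observed range [min, max] onto a target range [lo, hi] via x' = (x - min) / (max - min) * (hi - lo) + lo. A hypothetical standalone sketch of that formula (class and method names are ours, not the API this PR introduces):

```java
import java.util.Arrays;

// Hypothetical illustration of min-max scaling to a target range [lo, hi].
public class MinMaxDemo {
    static double[] minMaxScale(double[] xs, double lo, double hi) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double x : xs) { min = Math.min(min, x); max = Math.max(max, x); }
        double range = max - min;
        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            // linear map from [min, max] onto [lo, hi]; constant input maps to lo
            out[i] = range == 0 ? lo : (xs[i] - min) / range * (hi - lo) + lo;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            minMaxScale(new double[]{0, 5, 10}, 0.0, 1.0))); // [0.0, 0.5, 1.0]
    }
}
```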

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fobeligi/incubator-flink FLINK-1844

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/798.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #798


commit 802b9da07a2c3f7c055b4c024aaecbbe647db1cd
Author: fobeligi faybeligia...@gmail.com
Date:   2015-06-05T21:12:43Z

[FLINK-1844] Add MinMaxScaler implementation in the preprocessing package, 
tests for the corresponding functionality, and documentation.

commit e639185108f9bda253e296bae4c6c4269a30d1d0
Author: fobeligi faybeligia...@gmail.com
Date:   2015-06-05T22:12:33Z

[FLINK-1844] Change second test to use LabeledVectors instead of Vectors






[jira] [Commented] (FLINK-2174) Allow comments in 'slaves' file

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575392#comment-14575392
 ] 

ASF GitHub Bot commented on FLINK-2174:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/796#issuecomment-109480303
  
My Travis is green (https://travis-ci.org/mjsax/flink/builds/65612778). Any 
comments? Can be merged from my point of view.


 Allow comments in 'slaves' file
 ---

 Key: FLINK-2174
 URL: https://issues.apache.org/jira/browse/FLINK-2174
 Project: Flink
  Issue Type: Improvement
  Components: Start-Stop Scripts
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial

 Currently, each line in 'slaves' is interpreted as a host name. Scripts should 
 skip lines starting with '#'. Also allow for comments at the end of a line 
 and skip empty lines.
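The parsing rule the issue asks for (drop whole-line and trailing '#' comments, skip blanks) can be sketched as follows; the start-stop scripts themselves are shell, so this is only a hypothetical illustration of the rule, with all names ours:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of slaves-file parsing: strip '#' comments
// (whole-line and trailing) and skip empty lines.
public class SlavesParser {
    static List<String> parseHosts(List<String> lines) {
        List<String> hosts = new ArrayList<>();
        for (String line : lines) {
            int hash = line.indexOf('#');
            if (hash >= 0) line = line.substring(0, hash); // drop comment
            line = line.trim();
            if (!line.isEmpty()) hosts.add(line);          // skip empty lines
        }
        return hosts;
    }

    public static void main(String[] args) {
        List<String> hosts = parseHosts(List.of(
            "# cluster workers", "node1", "node2  # rack 2", "", "node3"));
        System.out.println(hosts); // [node1, node2, node3]
    }
}
```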





[GitHub] flink pull request: [FLINK-2174] Allow comments in 'slaves' file

2015-06-05 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/796#issuecomment-109480303
  
My Travis is green (https://travis-ci.org/mjsax/flink/builds/65612778). Any 
comments? Can be merged from my point of view.




[jira] [Updated] (FLINK-2174) Allow comments in 'slaves' file

2015-06-05 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated FLINK-2174:
---
Priority: Trivial  (was: Minor)

 Allow comments in 'slaves' file
 ---

 Key: FLINK-2174
 URL: https://issues.apache.org/jira/browse/FLINK-2174
 Project: Flink
  Issue Type: Improvement
  Components: Start-Stop Scripts
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial






[jira] [Commented] (FLINK-1297) Add support for tracking statistics of intermediate results

2015-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575462#comment-14575462
 ] 

ASF GitHub Bot commented on FLINK-1297:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/605#issuecomment-109490490
  
The tests seem to be non-deterministic and fail frequently.

Check out this build: 
https://travis-ci.org/StephanEwen/incubator-flink/jobs/65634990

The tests need to be more stable before we can add this to the codebase.


 Add support for tracking statistics of intermediate results
 ---

 Key: FLINK-1297
 URL: https://issues.apache.org/jira/browse/FLINK-1297
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Reporter: Alexander Alexandrov
Assignee: Alexander Alexandrov
 Fix For: 0.9

   Original Estimate: 1,008h
  Remaining Estimate: 1,008h



