[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308834#comment-14308834
 ] 

ASF GitHub Bot commented on FLINK-1484:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/368#discussion_r24227786
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -125,6 +126,10 @@ Actor with ActorLogMessages with ActorLogging {
   override def postStop(): Unit = {
 log.info(s"Stopping job manager ${self.path}.")
 
+// disconnect the registered task managers
+instanceManager.getAllRegisteredInstances.asScala.foreach{
+  _.getTaskManager ! Disconnected("JobManager is stopping")}
+
 for((e,_) <- currentJobs.values){
   e.fail(new Exception("The JobManager is shutting down."))
--- End diff --

Thanks Henry for spotting it. I corrected it.


 JobManager restart does not notify the TaskManager
 --

 Key: FLINK-1484
 URL: https://issues.apache.org/jira/browse/FLINK-1484
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann

 An uncaught exception can cause the JobManager to be restarted. However, 
 connected TaskManagers are not informed about the disconnection and continue 
 sending messages to a JobManager with a reset state. 
 TaskManagers should be informed about a possible restart and clean up their 
 own state in such a case. Afterwards, they can try to reconnect to the 
 restarted JobManager.
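 
 For illustration only, here is a minimal Java sketch of the cleanup-then-reconnect idea (the real TaskManager is a Scala actor; the {{Disconnected}} message is taken from the pull request discussed above, while the registration message and the JobManager path are made-up placeholders):
 {code}
 import akka.actor.ActorSelection;
 import akka.actor.UntypedActor;
 
 public class TaskManagerSketch extends UntypedActor {
 
     // message sent by the JobManager on shutdown/restart (see the diff in the comment above)
     public static class Disconnected {
         public final String reason;
         public Disconnected(String reason) { this.reason = reason; }
     }
 
     // hypothetical registration message, used here only for illustration
     public static class RegisterAtJobManager {}
 
     private final String jobManagerPath;
 
     public TaskManagerSketch(String jobManagerPath) {
         this.jobManagerPath = jobManagerPath;
     }
 
     @Override
     public void onReceive(Object message) throws Exception {
         if (message instanceof Disconnected) {
             // the JobManager announced a restart: drop all JobManager-related state
             cancelTasksAndClearState();
             // then try to register at the (re)started JobManager again
             ActorSelection jobManager = getContext().actorSelection(jobManagerPath);
             jobManager.tell(new RegisterAtJobManager(), getSelf());
         } else {
             unhandled(message);
         }
     }
 
     private void cancelTasksAndClearState() {
         // fail running tasks, release slots, clear caches, ...
     }
 }
 {code}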



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...

2015-02-06 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/368#discussion_r24227786
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -125,6 +126,10 @@ Actor with ActorLogMessages with ActorLogging {
   override def postStop(): Unit = {
 log.info(s"Stopping job manager ${self.path}.")
 
+// disconnect the registered task managers
+instanceManager.getAllRegisteredInstances.asScala.foreach{
+  _.getTaskManager ! Disconnected("JobManager is stopping")}
+
 for((e,_) <- currentJobs.values){
   e.fail(new Exception("The JobManager is shutting down."))
--- End diff --

Thanks Henry for spotting it. I corrected it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1476) Flink VS Spark on loop test

2015-02-06 Thread xuhong (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310584#comment-14310584
 ] 

xuhong commented on FLINK-1476:
---

Hi Fabian,
   I am very grateful for your advice. Do you mean I should set 
taskmanager.heap.mb to a larger value? I will try it, thank you!
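
For reference, a minimal flink-conf.yaml change along those lines (the values are only an example and would need tuning for the 3 x 32 GB machines described below):

{noformat}
# give each TaskManager a larger JVM heap (example value)
taskmanager.heap.mb: 8192
# fraction of that heap reserved for Flink's managed memory
taskmanager.memory.fraction: 0.7
{noformat}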

 Flink VS Spark on loop test
 ---

 Key: FLINK-1476
 URL: https://issues.apache.org/jira/browse/FLINK-1476
 Project: Flink
  Issue Type: Test
Affects Versions: 0.7.0-incubating, 0.8
 Environment: 3 machines, each with 24 CPU cores; 16 cores are allocated 
 for the tests. Memory: 3 * 32 GB
Reporter: xuhong
Priority: Critical

 In the last few days, I did some tests on Flink and Spark. The test results 
 show that Flink does better on many operations, such as GroupBy, Join and 
 some complex jobs. But when I ran the KMeans, LinearRegression and other loop 
 tests, I found that Flink is no better than Spark. I want to know whether 
 Flink is as well suited to loop (iterative) jobs as Spark.
 I added the code env.setDegreeOfParallelism(16) in each test to allocate the 
 same number of CPU cores as in the Spark tests.
 My English is not good, I hope you guys can understand me!
 The following is some of the config of my Flink:
 jobmanager.rpc.port: 6123
 jobmanager.heap.mb: 2048
 taskmanager.heap.mb: 2048
 taskmanager.numberOfTaskSlots: 24
 parallelization.degree.default: 72
 jobmanager.web.port: 8081
 webclient.port: 8085
 fs.overwrite-files: true
 taskmanager.memory.fraction: 0.8
 taskmanager.network.numberofBuffers: 7



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1476) Flink VS Spark on loop test

2015-02-06 Thread xuhong (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310579#comment-14310579
 ] 

xuhong commented on FLINK-1476:
---

Thanks very much, I will try your advice and keep paying attention 
to your work!

 Flink VS Spark on loop test
 ---

 Key: FLINK-1476
 URL: https://issues.apache.org/jira/browse/FLINK-1476
 Project: Flink
  Issue Type: Test
Affects Versions: 0.7.0-incubating, 0.8
 Environment: 3 machines, each with 24 CPU cores; 16 cores are allocated 
 for the tests. Memory: 3 * 32 GB
Reporter: xuhong
Priority: Critical

 In the last few days, I did some tests on Flink and Spark. The test results 
 show that Flink does better on many operations, such as GroupBy, Join and 
 some complex jobs. But when I ran the KMeans, LinearRegression and other loop 
 tests, I found that Flink is no better than Spark. I want to know whether 
 Flink is as well suited to loop (iterative) jobs as Spark.
 I added the code env.setDegreeOfParallelism(16) in each test to allocate the 
 same number of CPU cores as in the Spark tests.
 My English is not good, I hope you guys can understand me!
 The following is some of the config of my Flink:
 jobmanager.rpc.port: 6123
 jobmanager.heap.mb: 2048
 taskmanager.heap.mb: 2048
 taskmanager.numberOfTaskSlots: 24
 parallelization.degree.default: 72
 jobmanager.web.port: 8081
 webclient.port: 8085
 fs.overwrite-files: true
 taskmanager.memory.fraction: 0.8
 taskmanager.network.numberofBuffers: 7



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1445) Add support to enforce local input split assignment

2015-02-06 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1445.
--
Resolution: Duplicate

Duplicate of FLINK-1478

 Add support to enforce local input split assignment
 ---

 Key: FLINK-1445
 URL: https://issues.apache.org/jira/browse/FLINK-1445
 Project: Flink
  Issue Type: New Feature
  Components: Java API, JobManager
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Minor

 In some scenarios, data sources cannot read data remotely, for example in 
 distributed cluster setups where each machine stores data on its local FS 
 which cannot be read remotely.
 In order to enable such use cases with Flink, we need to 
 1) add support for enforcing local input split reading, and
 2) ensure that each input split can be read locally by at least one data 
 source task, which means influencing the scheduling of data source tasks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309033#comment-14309033
 ] 

ASF GitHub Bot commented on FLINK-1484:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/368#discussion_r24236048
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/messages/TaskManagerMessages.scala
 ---
@@ -129,6 +129,13 @@ object TaskManagerMessages {
* @param cause reason for the external failure
*/
   case class FailTask(executionID: ExecutionAttemptID, cause: Throwable)
+
+  /**
+   * Makes the TaskManager to disconnect from the registered JobManager
--- End diff --

You're right. Thanks, I changed it.


 JobManager restart does not notify the TaskManager
 --

 Key: FLINK-1484
 URL: https://issues.apache.org/jira/browse/FLINK-1484
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann

 An uncaught exception can cause the JobManager to be restarted. However, 
 connected TaskManagers are not informed about the disconnection and continue 
 sending messages to a JobManager with a reset state. 
 TaskManagers should be informed about a possible restart and clean up their 
 own state in such a case. Afterwards, they can try to reconnect to the 
 restarted JobManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1388) POJO support for writeAsCsv

2015-02-06 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308934#comment-14308934
 ] 

Fabian Hueske commented on FLINK-1388:
--

Hi,

the PojoTypeInfo contains all information you need. Get the PojoFields like 

{code}
int numFields = pType.getArity();
for(int i = 0; i < numFields; i++) {
  PojoField pField = pType.getPojoFieldAt(i);
}
{code}

The PojoField contains the reflection field {{pField.field}} which can be used 
to extract the field like {{pFieldValue = pField.field.get(myPojo)}}.
Have a look at {{PojoSerializer.serialize()}} and {{PojoTypeInfo}}.
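
As a rough, untested sketch of how a CSV line could be assembled from those reflection fields (separator and null handling are just placeholders; {{PojoTypeInfo}} and {{PojoField}} are assumed to come from {{org.apache.flink.api.java.typeutils}}):

{code}
import java.lang.reflect.Field;

import org.apache.flink.api.java.typeutils.PojoField;
import org.apache.flink.api.java.typeutils.PojoTypeInfo;

public class PojoCsvSketch {

    public static <T> String toCsvLine(PojoTypeInfo<T> pType, T pojo) throws IllegalAccessException {
        StringBuilder line = new StringBuilder();
        int numFields = pType.getArity();
        for (int i = 0; i < numFields; i++) {
            PojoField pField = pType.getPojoFieldAt(i);
            Field field = pField.field;        // the reflection field mentioned above
            field.setAccessible(true);
            Object value = field.get(pojo);    // extract the field value from the POJO
            if (i > 0) {
                line.append(',');
            }
            line.append(value == null ? "" : value.toString());
        }
        return line.toString();
    }
}
{code}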

 POJO support for writeAsCsv
 ---

 Key: FLINK-1388
 URL: https://issues.apache.org/jira/browse/FLINK-1388
 Project: Flink
  Issue Type: New Feature
  Components: Java API
Reporter: Timo Walther
Assignee: Adnan Khan
Priority: Minor

 It would be great if one could simply write out POJOs in CSV format.
 {code}
 public class MyPojo {
String a;
int b;
 }
 {code}
 to:
 {code}
 # CSV file of org.apache.flink.MyPojo: String a, int b
 Hello World, 42
 Hello World 2, 47
 ...
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-06 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1489:


 Summary: Failing JobManager due to blocking calls in 
Execution.scheduleOrUpdateConsumers
 Key: FLINK-1489
 URL: https://issues.apache.org/jira/browse/FLINK-1489
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann


[~Zentol] reported that the JobManager failed to execute his Python job. The 
reason is that the JobManager executes blocking calls in the actor thread 
in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
{{ScheduleOrUpdateConsumers}} message. 

Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} message to the 
JobManager to notify the consumers about available data. The JobManager then 
sends the respective update call, {{Execution.sendUpdateTaskRpcCall}}, to each 
TaskManager. By blocking the actor thread, we effectively execute the update 
calls sequentially. Due to the ever-accumulating delay, some of the initial 
timeouts on the TaskManager side in 
{{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, the 
execution of the respective tasks fails.

A solution would be to make the call non-blocking.
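
For illustration only, a non-blocking variant could look roughly like the following Java sketch (the Akka ask/OnComplete callback pattern is standard, but the class, method and timeout shown here are assumptions rather than the actual Flink code):

{code}
import akka.actor.ActorRef;
import akka.dispatch.OnComplete;
import akka.pattern.Patterns;

import scala.concurrent.ExecutionContext;
import scala.concurrent.Future;

public class NonBlockingUpdateSketch {

    // Instead of blocking the actor thread with Await.result(ask(...), timeout),
    // register a callback that runs once the TaskManager answers.
    public static void sendUpdateTaskRpcCall(ActorRef taskManager, Object updateMsg,
                                             ExecutionContext executionContext) {
        Future<Object> response = Patterns.ask(taskManager, updateMsg, 10000); // 10 s timeout
        response.onComplete(new OnComplete<Object>() {
            @Override
            public void onComplete(Throwable failure, Object success) {
                if (failure != null) {
                    // the update did not arrive in time: fail or retry the consumer update
                }
            }
        }, executionContext);
    }
}
{code}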

A general caveat for actor programming: we should never block the actor 
thread, otherwise we seriously jeopardize the scalability of the system or, 
even worse, the system simply fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1470) JobManagerITCase fails on Travis

2015-02-06 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-1470.

Resolution: Fixed

Fixed with 0ccd1fd5b90dbdacb579e329016a0819cd301d43

 JobManagerITCase fails on Travis
 

 Key: FLINK-1470
 URL: https://issues.apache.org/jira/browse/FLINK-1470
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor

 The {{handle job with an occasionally failing sender vertex}} test of the 
 {{JobManagerITCase}} failed once during one of my Travis builds with the 
 following error:
 {code}
 Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1,220.566 
 sec <<< FAILURE! - in org.apache.flink.runtime.jobmanager.JobManagerITCase
 The JobManager actor must handle job with an occasionally failing sender 
 vertex(org.apache.flink.runtime.jobmanager.JobManagerITCase)  Time elapsed: 
 1,200.298 sec <<< FAILURE!
 java.lang.AssertionError: assertion failed: timeout (1199987201586 
 nanoseconds) during expectMsgClass waiting for class 
 org.apache.flink.runtime.messages.JobManagerMessages$JobResultFailed
   at scala.Predef$.assert(Predef.scala:179)
   at 
 akka.testkit.TestKitBase$class.expectMsgClass_internal(TestKit.scala:412)
   at akka.testkit.TestKitBase$class.expectMsgType(TestKit.scala:385)
   at akka.testkit.TestKit.expectMsgType(TestKit.scala:707)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16$$anonfun$apply$mcV$sp$26.apply(JobManagerITCase.scala:389)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16$$anonfun$apply$mcV$sp$26.apply(JobManagerITCase.scala:386)
   at akka.testkit.TestKitBase$class.within(TestKit.scala:295)
   at akka.testkit.TestKit.within(TestKit.scala:707)
   at akka.testkit.TestKitBase$class.within(TestKit.scala:309)
   at akka.testkit.TestKit.within(TestKit.scala:707)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply$mcV$sp(JobManagerITCase.scala:386)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply(JobManagerITCase.scala:362)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply(JobManagerITCase.scala:362)
   at 
 org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
   at org.scalatest.Transformer.apply(Transformer.scala:22)
   at org.scalatest.Transformer.apply(Transformer.scala:20)
   at org.scalatest.WordSpecLike$$anon$1.apply(WordSpecLike.scala:953)
   at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerITCase.withFixture(JobManagerITCase.scala:37)
 Results :
 Failed tests: 
   
 JobManagerITCase.run:37-org$scalatest$BeforeAndAfterAll$$super$run:37-org$scalatest$WordSpecLike$$super$run:37-runTests:37-runTest:37-withFixture:37-TestKit.within:707-TestKit.within:707-TestKit.expectMsgType:707
  assertion failed: timeout (1199987201586 nanoseconds) during expectMsgClass 
 waiting for class 
 org.apache.flink.runtime.messages.JobManagerMessages$JobResultFailed
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-06 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1492:
---

 Summary: Exceptions on shutdown concerning BLOB store cleanup
 Key: FLINK-1492
 URL: https://issues.apache.org/jira/browse/FLINK-1492
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
 Fix For: 0.9


The following stack traces do not occur every time, but they occur frequently.

{code}
java.lang.IllegalArgumentException: 
/tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
at 
org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
at 
org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
at 
org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
at 
org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
at 
akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at 
akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
at 
akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

15:16:15,350 ERROR 
org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
  - LibraryCacheManager did not shutdown properly.
java.io.IOException: Unable to delete file: 
/tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
at 
org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
at 
org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
at 
akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at 
akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
at akka.actor.ActorCell.terminate(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

15:16:15,345 ERROR org.apache.flink.runtime.blob.BlobCache  
 - Error deleting directory /tmp/blobStore-4313349e-8a58-4683-9fd0-3d2c52be1864 
during JVM shutdown: 

[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-06 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309212#comment-14309212
 ] 

Stephan Ewen commented on FLINK-1492:
-

The current solution is a bit hacky. Right now, we see multiple cleanups: I 
think one happens on graceful shutdown of the TaskManager (through observation 
of the JobManager's death) and another one through the shutdown hook.

I think the right solution is to not simply let the shutdown hook delete the 
directory, but to have the shutdown hook trigger a shutdown of the BLOB 
manager.
The shutdown should also make sure it occurs only once, so it does not happen 
through both the TaskManager shutdown and the shutdown hook.

It is also good practice for the BLOB manager to remove the shutdown hook 
once shutdown is called, to prevent resource leaks.
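
Sketched in Java, the "shut down exactly once, then unregister the hook" pattern could look like this (a sketch of the pattern only, not the actual BlobServer/BlobCache code; class and field names are made up):

{code}
import java.io.File;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.commons.io.FileUtils;

public class BlobStoreSketch {

    private final File storageDir;
    private final Thread shutdownHook;
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    public BlobStoreSketch(File storageDir) {
        this.storageDir = storageDir;
        // the hook only triggers the regular shutdown instead of deleting files itself
        this.shutdownHook = new Thread(new Runnable() {
            @Override
            public void run() {
                shutdown();
            }
        }, "blob-store-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(this.shutdownHook);
    }

    public void shutdown() {
        // make sure the cleanup runs at most once, no matter who triggers it
        if (shutdownRequested.compareAndSet(false, true)) {
            try {
                FileUtils.deleteDirectory(storageDir);
            } catch (Exception e) {
                // log and continue; do not fail the shutdown
            }
            try {
                // unregister the hook when shutdown was triggered by the component itself
                Runtime.getRuntime().removeShutdownHook(shutdownHook);
            } catch (IllegalStateException e) {
                // the JVM is already shutting down; the hook is running anyway
            }
        }
    }
}
{code}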


 Exceptions on shutdown concerning BLOB store cleanup
 

 Key: FLINK-1492
 URL: https://issues.apache.org/jira/browse/FLINK-1492
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
 Fix For: 0.9


 The following stack traces do not occur every time, but they occur frequently.
 {code}
 java.lang.IllegalArgumentException: 
 /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at 
 org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
   at 
 org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
   at 
 akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
   at 
 akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
   at 
 akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 15:16:15,350 ERROR 
 org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
   - LibraryCacheManager did not shutdown properly.
 java.io.IOException: Unable to delete file: 
 /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
   at 
 org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
   at 
 org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
   at 
 org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
   at 
 akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
   at 
 akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
   at 

[GitHub] flink pull request: Remove 'incubator-' prefix from README.md.

2015-02-06 Thread aalexandrov
GitHub user aalexandrov opened a pull request:

https://github.com/apache/flink/pull/371

Remove 'incubator-' prefix from README.md.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aalexandrov/flink patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/371.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #371


commit f8da7b89b4fa74c621983c7d0273d840349cf095
Author: Alexander Alexandrov alexander.alexand...@tu-berlin.de
Date:   2015-02-06T14:23:02Z

Remove 'incubator-' prefix from README.md.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1491) Inconsistent logging between Akka and other components

2015-02-06 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1491:
---

 Summary: Inconsistent logging between Akka and other components
 Key: FLINK-1491
 URL: https://issues.apache.org/jira/browse/FLINK-1491
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.9
Reporter: Stephan Ewen


When running examples in the IDE, the output contains Akka debug messages, 
while the regular logging is disabled.

Both actor logging and other logging should be consistent.

To reproduce, simply run an example job in the IDE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-1478) Add strictly local input split assignment

2015-02-06 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-1478:


Assignee: Fabian Hueske

 Add strictly local input split assignment
 -

 Key: FLINK-1478
 URL: https://issues.apache.org/jira/browse/FLINK-1478
 Project: Flink
  Issue Type: New Feature
  Components: JobManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Fabian Hueske
 Fix For: 0.9






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread Max Michels (JIRA)
Max Michels created FLINK-1486:
--

 Summary: Add a string to the print method to identify output
 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Priority: Minor


The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
purposes. Currently, it is difficult to identify the output.

I would suggest adding another {{print(String str)}} method which allows the 
user to supply a String to identify the output. This could be a prefix before 
the actual output or a format string (which might be overkill).

{code}
DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
{code}

For example, {{data.print("MyDataSet: ")}} would print

{noformat}
MyDataSet: 1
MyDataSet: 2
...
{noformat}
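
Until such a method exists, the prefix can be emulated with a simple map before the print, roughly like this (a sketch; assumes {{org.apache.flink.api.common.functions.MapFunction}}):

{code}
DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);

// prepend an identifying prefix before printing (workaround for the missing print(String))
data.map(new MapFunction<Integer, String>() {
    @Override
    public String map(Integer value) {
        return "MyDataSet: " + value;
    }
}).print();
{code}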



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1418) Make 'print()' output on the client command line, rather than on the task manager sysout

2015-02-06 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309101#comment-14309101
 ] 

Stephan Ewen commented on FLINK-1418:
-

If we ever break it, we should do it now.

 Make 'print()' output on the client command line, rather than on the task 
 manager sysout
 

 Key: FLINK-1418
 URL: https://issues.apache.org/jira/browse/FLINK-1418
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Stephan Ewen

 Right now, the {{print()}} command prints inside the data sinks where the 
 code runs. It should pull data back to the client and print it there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-785] Chained AllReduce

2015-02-06 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/370

[FLINK-785] Chained AllReduce

This is a preliminary PR to see whether I'm on the right track.

I'm wondering whether this would be everything needed to add a chained 
AllReduce, before I continue with this issue. I tried it out and it appears to 
work, but I wanted to make sure.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/incubator-flink chained_all_reduce

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/370.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #370


commit 37b77670b517995498f83df452ed5b20754fc63e
Author: zentol s.mo...@web.de
Date:   2015-02-06T13:38:05Z

[FLINK-785] Chained AllReduce




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-785) Add Chained operators for AllReduce and AllGroupReduce

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309122#comment-14309122
 ] 

ASF GitHub Bot commented on FLINK-785:
--

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/370

[FLINK-785] Chained AllReduce

This is a preliminary PR to see whether I'm on the right track.

I'm wondering whether this would be everything needed to add a chained 
AllReduce, before I continue with this issue. I tried it out and it appears to 
work, but I wanted to make sure.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/incubator-flink chained_all_reduce

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/370.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #370


commit 37b77670b517995498f83df452ed5b20754fc63e
Author: zentol s.mo...@web.de
Date:   2015-02-06T13:38:05Z

[FLINK-785] Chained AllReduce




 Add Chained operators for AllReduce and AllGroupReduce
 --

 Key: FLINK-785
 URL: https://issues.apache.org/jira/browse/FLINK-785
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
  Labels: github-import
 Fix For: pre-apache


 Because the operators `AllReduce` and `AllGroupReduce` are used both for the 
 pre-reduce (combiner side) and the final reduce, they would greatly benefit 
 from a chained version.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/785
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: runtime, 
 Milestone: Release 0.6 (unplanned)
 Created at: Sun May 11 17:41:12 CEST 2014
 State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1488) JobManager web interface logfile access broken

2015-02-06 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1488:
-

 Summary: JobManager web interface logfile access broken
 Key: FLINK-1488
 URL: https://issues.apache.org/jira/browse/FLINK-1488
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


In 0.9 the logfile access dies with a null pointer exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread Max Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309355#comment-14309355
 ] 

Max Michels commented on FLINK-1486:


Thanks for the code hint [~vkalavri]. I've integrated the printing into the 
existing {{PrintingOutputFormat}}.

 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

2015-02-06 Thread mxm
GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/372

[FLINK-1486] add print method for prefixing a user defined string

- extend API to include a `print(String message)` method
- change `PrintingOutputformat` to include a message

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink flink-1486

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #372


commit 62c7520d821f8a0718fe7fc3daca6fea09546c2b
Author: Max m...@posteo.de
Date:   2015-02-06T15:55:29Z

[FLINK-1486] add print method for prefixing a user defined string

- extend API to include a print(String message) method
- change PrintingOutputformat to include a message




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309354#comment-14309354
 ] 

ASF GitHub Bot commented on FLINK-1486:
---

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/372

[FLINK-1486] add print method for prefixing a user defined string

- extend API to include a `print(String message)` method
- change `PrintingOutputformat` to include a message

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink flink-1486

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #372


commit 62c7520d821f8a0718fe7fc3daca6fea09546c2b
Author: Max m...@posteo.de
Date:   2015-02-06T15:55:29Z

[FLINK-1486] add print method for prefixing a user defined string

- extend API to include a print(String message) method
- change PrintingOutputformat to include a message




 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1418) Make 'print()' output on the client command line, rather than on the task manager sysout

2015-02-06 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309085#comment-14309085
 ] 

Chesnay Schepler commented on FLINK-1418:
-

Is this not doable without breaking programs?

I'm thinking of something like this:

Whenever a print sink is created, an accumulator is created that stores the 
output. The name of the accumulator is stored in a list in the ExEnv.

Within env.execute(), after everything else has run (i.e., before we return the 
JobExecutionResult), iterate through the stored accumulator entries and print 
their contents.
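
A rough user-level sketch of that idea (assuming a list-style accumulator such as {{ListAccumulator}} and a {{DiscardingOutputFormat}} sink are available; the accumulator name and the wiring are made up for illustration):

{code}
// record the output in an accumulator instead of printing on the TaskManager
final String accName = "print-sink-1";   // the ExEnv would remember this name

data.flatMap(new RichFlatMapFunction<Integer, Integer>() {
    private final ListAccumulator<String> lines = new ListAccumulator<String>();

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext().addAccumulator(accName, lines);
    }

    @Override
    public void flatMap(Integer value, Collector<Integer> out) {
        lines.add(String.valueOf(value));   // store instead of System.out on the worker
    }
}).output(new DiscardingOutputFormat<Integer>());

JobExecutionResult result = env.execute();

// back on the client: fetch the recorded output and print it there
List<String> recorded = result.getAccumulatorResult(accName);
for (String line : recorded) {
    System.out.println(line);
}
{code}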

 Make 'print()' output on the client command line, rather than on the task 
 manager sysout
 

 Key: FLINK-1418
 URL: https://issues.apache.org/jira/browse/FLINK-1418
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Stephan Ewen

 Right now, the {{print()}} command prints inside the data sinks where the 
 code runs. It should pull data back to the client and print it there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1488) JobManager web interface logfile access broken

2015-02-06 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1488.
---
Resolution: Fixed

Resolved in https://issues.apache.org/jira/browse/FLINK-1488

 JobManager web interface logfile access broken
 --

 Key: FLINK-1488
 URL: https://issues.apache.org/jira/browse/FLINK-1488
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 In 0.9 the logfile access dies with a null pointer exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...

2015-02-06 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/369#issuecomment-73251849
  
Definitely!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309286#comment-14309286
 ] 

ASF GitHub Bot commented on FLINK-1485:
---

Github user jkirsch commented on the pull request:

https://github.com/apache/flink/pull/369#issuecomment-73251735
  
No problem .. small but equally important :smiley:


 Typo in Documentation - Join with Join-Function
 ---

 Key: FLINK-1485
 URL: https://issues.apache.org/jira/browse/FLINK-1485
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.8
Reporter: Johannes
Assignee: Johannes
Priority: Trivial
 Fix For: 0.9


 Small typo in documentation
 In the java example for Join with Join-Function



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1166) Add a QA bot to Flink that is testing pull requests

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309327#comment-14309327
 ] 

ASF GitHub Bot commented on FLINK-1166:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/366#issuecomment-73254306
  
I found out why the qa-check has more compile errors compared to the 
master: I've instructed the compiler to report them all ;)

I've addressed all comments in the pull request. I'm going to merge it soon 
if there are no other comments.


 Add a QA bot to Flink that is testing pull requests
 ---

 Key: FLINK-1166
 URL: https://issues.apache.org/jira/browse/FLINK-1166
 Project: Flink
  Issue Type: New Feature
  Components: Build System
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Minor
  Labels: starter

 We should have a QA bot (similar to Hadoop) that is checking incoming pull 
 requests for a few things:
 - Changes to user-facing APIs
 - More compiler warnings than before
 - more Javadoc warnings than before
 - change of the number of files in the lib/ directory.
 - unused dependencies
 - {{@author}} tag.
 - guava (and other shaded jars) in the lib/ directory.
 It should be somehow extensible to add new tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1166] Add qa-check.sh tool

2015-02-06 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/366#issuecomment-73254306
  
I found out why the qa-check has more compile errors compared to the 
master: I've instructed the compiler to report them all ;)

I've addressed all comments in the pull request. I'm going to merge it soon 
if there are no other comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-456) Optional runtime statistics / metrics collection

2015-02-06 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-456:
-
Summary: Optional runtime statistics / metrics collection  (was: Optional 
runtime statistics collection)

 Optional runtime statistics / metrics collection
 

 Key: FLINK-456
 URL: https://issues.apache.org/jira/browse/FLINK-456
 Project: Flink
  Issue Type: New Feature
  Components: JobManager, TaskManager
Reporter: Fabian Hueske
  Labels: github-import
 Fix For: pre-apache


 The engine should collect job execution statistics (e.g., via accumulators) 
 such as:
 - total number of input / output records per operator
 - histogram of input/output ratio of UDF calls
 - histogram of number of input records per reduce / cogroup UDF call
 - histogram of number of output records per UDF call
 - histogram of time spent in UDF calls
 - number of local and remote bytes read (not via accumulators)
 - ...
 These stats should be made available to the user after execution (via 
 webfrontend). The purpose of this feature is to ease performance debugging of 
 parallel jobs (e.g., to detect data skew).
 It should be possible to deactivate (or activate) the gathering of these 
 statistics.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/456
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, runtime, user satisfaction, 
 Created at: Tue Feb 04 20:32:49 CET 2014
 State: open
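 
 For instance, the "input records per reduce call" histogram above can already be approximated today with a user-level accumulator, roughly like this (a sketch using the existing {{Histogram}} accumulator, not the proposed engine-side collection):
 {code}
 public static class CountingReducer
         extends RichGroupReduceFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {
 
     private Histogram recordsPerCall;
 
     @Override
     public void open(Configuration parameters) {
         recordsPerCall = new Histogram();
         getRuntimeContext().addAccumulator("records-per-reduce-call", recordsPerCall);
     }
 
     @Override
     public void reduce(Iterable<Tuple2<String, Integer>> values, Collector<Tuple2<String, Integer>> out) {
         int count = 0;
         for (Tuple2<String, Integer> value : values) {
             count++;
             // ... actual reduce logic would go here ...
         }
         recordsPerCall.add(count);   // one histogram entry per UDF call
     }
 }
 {code}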



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309070#comment-14309070
 ] 

Chesnay Schepler commented on FLINK-1486:
-

+1, can see this being useful, and should be very easy to add.

 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309043#comment-14309043
 ] 

Vasia Kalavri commented on FLINK-1486:
--

+1, I've needed this feature many times! 
We implemented something similar for the graph api [here | 
http://github.com/project-flink/flink-graph/blob/master/src/main/java/flink/graphs/example/utils/ExampleUtils.java]
 (see method {{printResult}}).

 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1487) Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case

2015-02-06 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1487:


 Summary: Failing SchedulerIsolatedTasksTest.testScheduleQueueing 
test case
 Key: FLINK-1487
 URL: https://issues.apache.org/jira/browse/FLINK-1487
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann


I got the following failure on Travis:

{{SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:<107> but 
was:<106>}}

The failure does not occur consistently on Travis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread Max Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Michels reassigned FLINK-1486:
--

Assignee: Max Michels

 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309284#comment-14309284
 ] 

ASF GitHub Bot commented on FLINK-1485:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/369


 Typo in Documentation - Join with Join-Function
 ---

 Key: FLINK-1485
 URL: https://issues.apache.org/jira/browse/FLINK-1485
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.8
Reporter: Johannes
Assignee: Johannes
Priority: Trivial

 Small typo in documentation
 In the java example for Join with Join-Function



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...

2015-02-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/369


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309283#comment-14309283
 ] 

ASF GitHub Bot commented on FLINK-1485:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/369#issuecomment-73251309
  
Thanks for the fix!


 Typo in Documentation - Join with Join-Function
 ---

 Key: FLINK-1485
 URL: https://issues.apache.org/jira/browse/FLINK-1485
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.8
Reporter: Johannes
Assignee: Johannes
Priority: Trivial

 Small typo in documentation
 In the java example for Join with Join-Function



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1485) Typo in Documentation - Join with Join-Function

2015-02-06 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1485.
--
   Resolution: Fixed
Fix Version/s: 0.9

Fixed with 76dd725d7ce1757c778db58448fbac5036e626ef.

Thanks!

 Typo in Documentation - Join with Join-Function
 ---

 Key: FLINK-1485
 URL: https://issues.apache.org/jira/browse/FLINK-1485
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.8
Reporter: Johannes
Assignee: Johannes
Priority: Trivial
 Fix For: 0.9


 Small typo in documentation
 In the java example for Join with Join-Function



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309288#comment-14309288
 ] 

ASF GitHub Bot commented on FLINK-1485:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/369#issuecomment-73251849
  
Definitely!


 Typo in Documentation - Join with Join-Function
 ---

 Key: FLINK-1485
 URL: https://issues.apache.org/jira/browse/FLINK-1485
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.8
Reporter: Johannes
Assignee: Johannes
Priority: Trivial
 Fix For: 0.9


 Small typo in documentation
 In the java example for Join with Join-Function



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...

2015-02-06 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/369#issuecomment-73251309
  
Thanks for the fix!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Remove 'incubator-' prefix from README.md.

2015-02-06 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/371#issuecomment-73255136
  
Good catch, pushing it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-1438) ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode

2015-02-06 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1438.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Stephan Ewen

Fixed via a07d59d72fc059a600a3eb1f479b02964ca256f5

 ClassCastException for Custom InputSplit in local mode and invalid type code 
 in distributed mode
 

 Key: FLINK-1438
 URL: https://issues.apache.org/jira/browse/FLINK-1438
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.8, 0.9
Reporter: Fabian Hueske
Assignee: Stephan Ewen
Priority: Minor
 Fix For: 0.9


 Jobs with custom InputSplits fail with a ClassCastException such as 
 {{org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit 
 cannot be cast to 
 org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit}} 
 if executed on a local setup. 
 This issue is probably related to different ClassLoaders used by the 
 JobManager when InputSplits are generated and when they are handed to the 
 InputFormat by the TaskManager. Moving the class of the custom InputSplit 
 into the {{./lib}} folder and removing it from the job's jar makes the job work.
 To reproduce the bug, run the following job on a local setup. 
 {code}
 public class CustomSplitTestJob {
   public static void main(String[] args) throws Exception {
   ExecutionEnvironment env = 
 ExecutionEnvironment.getExecutionEnvironment();
   DataSet<String> x = env.createInput(new TestFileInputFormat());
   x.print();
   env.execute();
   }
   public static class TestFileInputFormat implements 
 InputFormat<String, TestFileInputSplit> {
   @Override
   public void configure(Configuration parameters) {
   }
   @Override
   public BaseStatistics getStatistics(BaseStatistics 
 cachedStatistics) throws IOException {
   return null;
   }
   @Override
   public TestFileInputSplit[] createInputSplits(int minNumSplits) 
 throws IOException {
   return new TestFileInputSplit[]{new 
 TestFileInputSplit()};
   }
   @Override
   public InputSplitAssigner 
 getInputSplitAssigner(TestFileInputSplit[] inputSplits) {
   return new LocatableInputSplitAssigner(inputSplits);
   }
   @Override
   public void open(TestFileInputSplit split) throws IOException {
   }
   @Override
   public boolean reachedEnd() throws IOException {
   return false;
   }
   @Override
   public String nextRecord(String reuse) throws IOException {
   return null;
   }
   @Override
   public void close() throws IOException {
   }
   }
   public static class TestFileInputSplit extends FileInputSplit {
   }
 }
 {code}
 The same happens in distributed mode, except that Akka terminates the 
 transmission of the input split with a meaningless {{invalid type code: 00}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Remove unused enum values from Aggregations en...

2015-02-06 Thread hsaputra
GitHub user hsaputra opened a pull request:

https://github.com/apache/flink/pull/373

Remove unused enum values from Aggregations enum.

Simple cleanup to remove unused enum values from the Aggregations enum.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsaputra/flink 
remove_unused_enumvalues_from_aggregations

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/373.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #373


commit a831e448c7d0558f0c239eab4a2b89b54facd7c2
Author: Henry Saputra henry.sapu...@gmail.com
Date:   2015-02-06T18:09:20Z

Remove unused enum values from Aggregations enum.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1166] Add qa-check.sh tool

2015-02-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/366


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1166) Add a QA bot to Flink that is testing pull requests

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309409#comment-14309409
 ] 

ASF GitHub Bot commented on FLINK-1166:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/366


 Add a QA bot to Flink that is testing pull requests
 ---

 Key: FLINK-1166
 URL: https://issues.apache.org/jira/browse/FLINK-1166
 Project: Flink
  Issue Type: New Feature
  Components: Build System
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Minor
  Labels: starter

 We should have a QA bot (similar to Hadoop) that is checking incoming pull 
 requests for a few things:
 - Changes to user-facing APIs
 - More compiler warnings than before
 - more Javadoc warnings than before
 - change of the number of files in the lib/ directory.
 - unused dependencies
 - {{@author}} tag.
 - guava (and other shaded jars) in the lib/ directory.
 It should be somehow extensible to add new tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309431#comment-14309431
 ] 

ASF GitHub Bot commented on FLINK-1486:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/372#issuecomment-73270614
  
Looks good.


 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)