[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308834#comment-14308834 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/368#discussion_r24227786 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala --- @@ -125,6 +126,10 @@ Actor with ActorLogMessages with ActorLogging { override def postStop(): Unit = { log.info(s"Stopping job manager ${self.path}.") + // disconnect the registered task managers + instanceManager.getAllRegisteredInstances.asScala.foreach { + _.getTaskManager ! Disconnected("JobManager is stopping") } + for ((e, _) <- currentJobs.values) { e.fail(new Exception("The JobManager is shutting down.")) --- End diff -- Thanks Henry for spotting it. I corrected it. JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann The JobManager can be restarted, for example after an uncaught exception. However, the connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/368#discussion_r24227786 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala --- @@ -125,6 +126,10 @@ Actor with ActorLogMessages with ActorLogging { override def postStop(): Unit = { log.info(s"Stopping job manager ${self.path}.") + // disconnect the registered task managers + instanceManager.getAllRegisteredInstances.asScala.foreach { + _.getTaskManager ! Disconnected("JobManager is stopping") } + for ((e, _) <- currentJobs.values) { e.fail(new Exception("The JobManager is shutting down.")) --- End diff -- Thanks Henry for spotting it. I corrected it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
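[Editorial note] The pattern in the diff above (on coordinator shutdown, explicitly telling every registered worker to disconnect so it can reset its state) can be sketched outside of Akka as follows. All names here (Coordinator, Worker, Disconnected) are illustrative stand-ins, not Flink's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the shutdown-notification pattern the patch applies:
// the coordinator's postStop() sends an explicit Disconnected message to every
// registered worker instead of letting them keep talking to a restarted,
// state-reset coordinator.
public class DisconnectSketch {

    static class Disconnected {
        final String reason;
        Disconnected(String reason) { this.reason = reason; }
    }

    static class Worker {
        Disconnected lastMessage; // records what the worker last received
        void tell(Disconnected msg) { lastMessage = msg; }
    }

    static class Coordinator {
        final List<Worker> registered = new ArrayList<>();

        void register(Worker w) { registered.add(w); }

        // mirrors JobManager.postStop(): notify all registered workers
        void postStop() {
            for (Worker w : registered) {
                w.tell(new Disconnected("JobManager is stopping"));
            }
        }
    }
}
```

On receiving such a message, a worker can discard its registration state and start a fresh reconnect loop, which is exactly the behaviour the JIRA description asks for.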
[jira] [Commented] (FLINK-1476) Flink VS Spark on loop test
[ https://issues.apache.org/jira/browse/FLINK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310584#comment-14310584 ] xuhong commented on FLINK-1476: --- Hi Fabian, I am very grateful for your advice. Do you mean I should set taskmanager.heap.mb to a larger value? I will try it, thank you! Flink VS Spark on loop test --- Key: FLINK-1476 URL: https://issues.apache.org/jira/browse/FLINK-1476 Project: Flink Issue Type: Test Affects Versions: 0.7.0-incubating, 0.8 Environment: 3 machines, each with 24 CPU cores; 16 CPU cores are allocated for the tests. The memory situation is: 3 * 32G Reporter: xuhong Priority: Critical In the last days I ran some tests on Flink and Spark. The results show that Flink does better on many operations, such as GroupBy, Join and some complex jobs. But when I ran KMeans, LinearRegression and other loop tests, I found that Flink is no better than Spark. I want to know whether Flink is as well suited for loop jobs as Spark. I added the code env.setDegreeOfParallelism(16) in each test to allocate the same number of CPU cores as in the Spark tests. My English is not good, I hope you can understand me! The following is some config of my Flink: jobmanager.rpc.port: 6123 jobmanager.heap.mb: 2048 taskmanager.heap.mb: 2048 taskmanager.numberOfTaskSlots: 24 parallelization.degree.default: 72 jobmanager.web.port: 8081 webclient.port: 8085 fs.overwrite-files: true taskmanager.memory.fraction: 0.8 taskmanager.network.numberofBuffers: 7
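[Editorial note] For context on the heap advice in this thread: assuming taskmanager.memory.fraction works as documented for this Flink version (the runtime claims that fraction of the TaskManager heap as managed memory, leaving the remainder for user code), the back-of-the-envelope arithmetic looks like this:

```java
// Rough, assumption-laden sketch of how the two reported settings interact:
// taskmanager.memory.fraction of taskmanager.heap.mb goes to Flink's managed
// memory; only the rest is available to user functions, e.g. the per-iteration
// state of KMeans or LinearRegression jobs.
public class ManagedMemorySketch {

    // heap in MB, fraction in [0,1]; returns the MB left for user code
    public static long userHeapMb(long taskManagerHeapMb, double fraction) {
        long managed = (long) (taskManagerHeapMb * fraction);
        return taskManagerHeapMb - managed;
    }
}
```

With the reported settings (heap 2048 MB, fraction 0.8) only about 410 MB remain for user code across 24 task slots, which is one plausible reason iterative jobs struggle; raising taskmanager.heap.mb grows both pools.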
[jira] [Commented] (FLINK-1476) Flink VS Spark on loop test
[ https://issues.apache.org/jira/browse/FLINK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310579#comment-14310579 ] xuhong commented on FLINK-1476: --- Thanks very much, I will try your advice and keep following your work! Flink VS Spark on loop test --- Key: FLINK-1476 URL: https://issues.apache.org/jira/browse/FLINK-1476 Project: Flink Issue Type: Test Affects Versions: 0.7.0-incubating, 0.8 Environment: 3 machines, each with 24 CPU cores; 16 CPU cores are allocated for the tests. The memory situation is: 3 * 32G Reporter: xuhong Priority: Critical In the last days I ran some tests on Flink and Spark. The results show that Flink does better on many operations, such as GroupBy, Join and some complex jobs. But when I ran KMeans, LinearRegression and other loop tests, I found that Flink is no better than Spark. I want to know whether Flink is as well suited for loop jobs as Spark. I added the code env.setDegreeOfParallelism(16) in each test to allocate the same number of CPU cores as in the Spark tests. My English is not good, I hope you can understand me! The following is some config of my Flink: jobmanager.rpc.port: 6123 jobmanager.heap.mb: 2048 taskmanager.heap.mb: 2048 taskmanager.numberOfTaskSlots: 24 parallelization.degree.default: 72 jobmanager.web.port: 8081 webclient.port: 8085 fs.overwrite-files: true taskmanager.memory.fraction: 0.8 taskmanager.network.numberofBuffers: 7
[jira] [Resolved] (FLINK-1445) Add support to enforce local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1445. -- Resolution: Duplicate Duplicate of FLINK-1478 Add support to enforce local input split assignment --- Key: FLINK-1445 URL: https://issues.apache.org/jira/browse/FLINK-1445 Project: Flink Issue Type: New Feature Components: Java API, JobManager Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor In some scenarios data sources cannot read data remotely, for example in distributed cluster setups where each machine stores data on its local FS which cannot be read remotely. In order to enable such use cases with Flink, we need to 1) add support for enforcing local input split reading, and 2) ensure that each input split can be read locally by at least one data source task, which means influencing the scheduling of data source tasks.
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309033#comment-14309033 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/368#discussion_r24236048 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/messages/TaskManagerMessages.scala --- @@ -129,6 +129,13 @@ object TaskManagerMessages { * @param cause reason for the external failure */ case class FailTask(executionID: ExecutionAttemptID, cause: Throwable) + + /** + * Makes the TaskManager to disconnect from the registered JobManager --- End diff -- You're right. Thanks, I changed it. JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann The JobManager can be restarted, for example after an uncaught exception. However, the connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager.
[jira] [Commented] (FLINK-1388) POJO support for writeAsCsv
[ https://issues.apache.org/jira/browse/FLINK-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308934#comment-14308934 ] Fabian Hueske commented on FLINK-1388: -- Hi, the PojoTypeInfo contains all information you need. Get the PojoFields like {code} int numFields = pType.getArity(); for (int i = 0; i < numFields; i++) { PojoField pField = pType.getPojoFieldAt(i); } {code} The PojoField contains the reflection field {{pField.field}} which can be used to extract the field value like {{pFieldValue = pField.field.get(myPojo)}}. Have a look at {{PojoSerializer.serialize()}} and {{PojoTypeInfo}}. POJO support for writeAsCsv --- Key: FLINK-1388 URL: https://issues.apache.org/jira/browse/FLINK-1388 Project: Flink Issue Type: New Feature Components: Java API Reporter: Timo Walther Assignee: Adnan Khan Priority: Minor It would be great if one could simply write out POJOs in CSV format. {code} public class MyPojo { String a; int b; } {code} to: {code} # CSV file of org.apache.flink.MyPojo: String a, int b Hello World, 42 Hello World 2, 47 ... {code}
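[Editorial note] Fabian's hint boils down to plain field reflection. A minimal, self-contained sketch of turning a POJO into one CSV line this way; the helper below is illustrative and not Flink's writeAsCsv (Flink's PojoField wraps exactly this kind of java.lang.reflect.Field access):

```java
import java.lang.reflect.Field;
import java.util.StringJoiner;

// Standalone sketch: iterate a POJO's public fields via reflection and join
// their values with commas, analogous to pField.field.get(myPojo) in the
// comment above. Not Flink API.
public class PojoCsvSketch {

    public static class MyPojo {
        public String a;
        public int b;
        public MyPojo(String a, int b) { this.a = a; this.b = b; }
    }

    public static String toCsvLine(Object pojo) {
        StringJoiner line = new StringJoiner(",");
        for (Field field : pojo.getClass().getFields()) { // public fields only
            try {
                line.add(String.valueOf(field.get(pojo)));
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e); // cannot happen for public fields
            }
        }
        return line.toString();
    }
}
```

Note that getFields() does not guarantee a field order, so a real implementation would order fields the way PojoTypeInfo does.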
[jira] [Created] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers
Till Rohrmann created FLINK-1489: Summary: Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers Key: FLINK-1489 URL: https://issues.apache.org/jira/browse/FLINK-1489 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann [~Zentol] reported that the JobManager failed to execute his Python job. The reason is that the JobManager executes blocking calls in the actor thread in the method {{Execution.sendUpdateTaskRpcCall}} in response to receiving a {{ScheduleOrUpdateConsumers}} message. Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} message to the JobManager to notify the consumers about available data. The JobManager then sends the respective update call {{Execution.sendUpdateTaskRpcCall}} to each TaskManager. By blocking the actor thread, we effectively execute the update calls sequentially. Due to the ever-accumulating delay, some of the initial timeouts on the TaskManager side in {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, the execution of the respective tasks fails. A solution would be to make the call non-blocking. A general caveat for actor programming is: we should never block the actor thread, otherwise we seriously jeopardize the scalability of the system. Or, even worse, the system simply fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
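[Editorial note] The direction of the fix described above, not waiting for remote acks on the actor thread, can be illustrated with plain CompletableFuture (names are made up for illustration; the real change is inside Execution.sendUpdateTaskRpcCall and uses Akka's futures):

```java
import java.util.concurrent.CompletableFuture;

// Sketch of blocking vs. non-blocking handling of a remote call's result.
// In the blocking style the caller thread (here: the actor thread) stalls on
// every ack in turn, serializing all update calls; in the non-blocking style
// a callback is registered and the thread returns to its mailbox immediately.
public class NonBlockingRpcSketch {

    // blocking style: this is what accumulates delay on the actor thread
    public static String blockingCall(CompletableFuture<String> rpc) throws Exception {
        return rpc.get(); // waits for the remote ack
    }

    // non-blocking style: attach a continuation, return at once
    public static CompletableFuture<String> asyncCall(CompletableFuture<String> rpc) {
        return rpc.thenApply(ack -> "handled:" + ack);
    }
}
```

With the non-blocking variant, many outstanding update calls can be in flight concurrently, so no single slow TaskManager delays the others past their timeout.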
[jira] [Closed] (FLINK-1470) JobManagerITCase fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-1470. Resolution: Fixed Fixed with 0ccd1fd5b90dbdacb579e329016a0819cd301d43 JobManagerITCase fails on Travis Key: FLINK-1470 URL: https://issues.apache.org/jira/browse/FLINK-1470 Project: Flink Issue Type: Bug Components: JobManager Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor The {{handle job with an occasionally failing sender vertex}} test of the {{JobManagerITCase}} failed once during one of my Travis builds with the following error: {code} Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1,220.566 sec FAILURE! - in org.apache.flink.runtime.jobmanager.JobManagerITCase The JobManager actor must handle job with an occasionally failing sender vertex(org.apache.flink.runtime.jobmanager.JobManagerITCase) Time elapsed: 1,200.298 sec FAILURE! java.lang.AssertionError: assertion failed: timeout (1199987201586 nanoseconds) during expectMsgClass waiting for class org.apache.flink.runtime.messages.JobManagerMessages$JobResultFailed at scala.Predef$.assert(Predef.scala:179) at akka.testkit.TestKitBase$class.expectMsgClass_internal(TestKit.scala:412) at akka.testkit.TestKitBase$class.expectMsgType(TestKit.scala:385) at akka.testkit.TestKit.expectMsgType(TestKit.scala:707) at org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16$$anonfun$apply$mcV$sp$26.apply(JobManagerITCase.scala:389) at org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16$$anonfun$apply$mcV$sp$26.apply(JobManagerITCase.scala:386) at akka.testkit.TestKitBase$class.within(TestKit.scala:295) at akka.testkit.TestKit.within(TestKit.scala:707) at akka.testkit.TestKitBase$class.within(TestKit.scala:309) at akka.testkit.TestKit.within(TestKit.scala:707) at 
org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply$mcV$sp(JobManagerITCase.scala:386) at org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply(JobManagerITCase.scala:362) at org.apache.flink.runtime.jobmanager.JobManagerITCase$$anonfun$1$$anonfun$apply$mcV$sp$16.apply(JobManagerITCase.scala:362) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.WordSpecLike$$anon$1.apply(WordSpecLike.scala:953) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.apache.flink.runtime.jobmanager.JobManagerITCase.withFixture(JobManagerITCase.scala:37) Results : Failed tests: JobManagerITCase.run:37-org$scalatest$BeforeAndAfterAll$$super$run:37-org$scalatest$WordSpecLike$$super$run:37-runTests:37-runTest:37-withFixture:37-TestKit.within:707-TestKit.within:707-TestKit.expectMsgType:707 assertion failed: timeout (1199987201586 nanoseconds) during expectMsgClass waiting for class org.apache.flink.runtime.messages.JobManagerMessages$JobResultFailed {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
Stephan Ewen created FLINK-1492: --- Summary: Exceptions on shutdown concerning BLOB store cleanup Key: FLINK-1492 URL: https://issues.apache.org/jira/browse/FLINK-1492 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Ufuk Celebi Fix For: 0.9 The following stack traces occur not every time, but frequently. {code} java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292) at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369) at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63) at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly. java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7 at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172) at akka.actor.ActorCell.terminate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 15:16:15,345 ERROR org.apache.flink.runtime.blob.BlobCache - Error deleting directory /tmp/blobStore-4313349e-8a58-4683-9fd0-3d2c52be1864 during JVM shutdown:
[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
[ https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309212#comment-14309212 ] Stephan Ewen commented on FLINK-1492: - The current solution is a bit hacky. Right now, we see multiple cleanups: I think one happens on graceful shutdown of the task manager (through observation of job manager death) and another through the shutdown hook. I think the right solution is to not simply let the shutdown hook delete the directory, but to have the shutdown hook call shutdown on the BLOB manager. The shutdown should also make sure it runs only once, so it does not happen through both the task manager shutdown and the shutdown hook. It is also good practice for the BLOB manager to remove the shutdown hook once shutdown is called, to prevent resource leaks. Exceptions on shutdown concerning BLOB store cleanup Key: FLINK-1492 URL: https://issues.apache.org/jira/browse/FLINK-1492 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Ufuk Celebi Fix For: 0.9 The following stack traces occur not every time, but frequently.
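[Editorial note] The "shut down only once, and deregister the hook" idea described in the comment above can be sketched as follows (hypothetical standalone class, not the actual BlobServer/BlobCache code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of an idempotent shutdown: cleanup may be reached twice -- once from
// the owning component and once from the JVM shutdown hook -- so an atomic
// flag guarantees it runs exactly once, and the hook is deregistered when
// shutdown happens first through the owner.
public class IdempotentShutdownSketch {

    private final AtomicBoolean shutDown = new AtomicBoolean(false);
    private Thread shutdownHook; // would be registered at construction time
    int cleanups = 0;            // exposed only so the example is checkable

    public void shutdown() {
        if (shutDown.compareAndSet(false, true)) {
            cleanups++; // delete the storage directory, close sockets, ...
            if (shutdownHook != null && shutdownHook != Thread.currentThread()) {
                try {
                    Runtime.getRuntime().removeShutdownHook(shutdownHook);
                } catch (IllegalStateException e) {
                    // JVM is already shutting down; nothing left to remove
                }
            }
        }
    }
}
```

The compareAndSet makes the directory deletion race-free between the two call paths, which is exactly the double-cleanup the stack traces in this issue point at.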
{code} java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292) at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369) at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63) at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly. 
java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7 at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172) at akka.actor.ActorCell.terminate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) at
[GitHub] flink pull request: Remove 'incubator-' prefix from README.md.
GitHub user aalexandrov opened a pull request: https://github.com/apache/flink/pull/371 Remove 'incubator-' prefix from README.md. You can merge this pull request into a Git repository by running: $ git pull https://github.com/aalexandrov/flink patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/371.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #371 commit f8da7b89b4fa74c621983c7d0273d840349cf095 Author: Alexander Alexandrov alexander.alexand...@tu-berlin.de Date: 2015-02-06T14:23:02Z Remove 'incubator-' prefix from README.md.
[jira] [Created] (FLINK-1491) Inconsistent logging between Akka and other components
Stephan Ewen created FLINK-1491: --- Summary: Inconsistent logging between Akka and other components Key: FLINK-1491 URL: https://issues.apache.org/jira/browse/FLINK-1491 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Stephan Ewen When running examples in the IDE, the output contains Akka debug messages, while the regular logging is disabled. Both actor logging and other logging should be consistent. To reproduce, simply run an example job in the IDE.
[jira] [Assigned] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske reassigned FLINK-1478: Assignee: Fabian Hueske Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9
[jira] [Created] (FLINK-1486) Add a string to the print method to identify output
Max Michels created FLINK-1486: -- Summary: Add a string to the print method to identify output Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Priority: Minor The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet data = env.fromElements(1,2,3,4,5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat}
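[Editorial note] The proposed overload amounts to prefixing each record's string representation before it is written. A standalone sketch of that behaviour (the real change would live in Flink's output format for print(); the helper name below is made up):

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative helper, not Flink API: produce one output line per record,
// each prefixed with the user-supplied identifier, as in the example above.
public class PrefixedPrintSketch {

    public static List<String> formatAll(String prefix, List<?> records) {
        return records.stream()
                .map(r -> prefix + r)
                .collect(Collectors.toList());
    }
}
```

A format-string variant (e.g. "element %d of MyDataSet") would work the same way, just with String.format per record.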
[jira] [Commented] (FLINK-1418) Make 'print()' output on the client command line, rather than on the task manager sysout
[ https://issues.apache.org/jira/browse/FLINK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309101#comment-14309101 ] Stephan Ewen commented on FLINK-1418: - If we ever break it, we should do it now. Make 'print()' output on the client command line, rather than on the task manager sysout Key: FLINK-1418 URL: https://issues.apache.org/jira/browse/FLINK-1418 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 0.9 Reporter: Stephan Ewen Right now, the {{print()}} command prints inside the data sinks where the code runs. It should pull the data back to the client and print it there.
[GitHub] flink pull request: [FLINK-785] Chained AllReduce
GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/370 [FLINK-785] Chained AllReduce This is a preliminary PR to see whether I'm on the right track. I'm wondering whether this would be everything needed to add a chained AllReduce before I continue with this issue. I tried it out and it appears to work, but wanted to make sure. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/incubator-flink chained_all_reduce Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/370.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #370 commit 37b77670b517995498f83df452ed5b20754fc63e Author: zentol s.mo...@web.de Date: 2015-02-06T13:38:05Z [FLINK-785] Chained AllReduce
[jira] [Commented] (FLINK-785) Add Chained operators for AllReduce and AllGroupReduce
[ https://issues.apache.org/jira/browse/FLINK-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309122#comment-14309122 ] ASF GitHub Bot commented on FLINK-785: -- GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/370 [FLINK-785] Chained AllReduce This is a preliminary PR to see whether I'm on the right track. I'm wondering whether this would be everything needed to add a chained AllReduce before I continue with this issue. I tried it out and it appears to work, but wanted to make sure. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/incubator-flink chained_all_reduce Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/370.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #370 commit 37b77670b517995498f83df452ed5b20754fc63e Author: zentol s.mo...@web.de Date: 2015-02-06T13:38:05Z [FLINK-785] Chained AllReduce Add Chained operators for AllReduce and AllGroupReduce -- Key: FLINK-785 URL: https://issues.apache.org/jira/browse/FLINK-785 Project: Flink Issue Type: Improvement Reporter: GitHub Import Labels: github-import Fix For: pre-apache Because the operators `AllReduce` and `AllGroupReduce` are used both for the pre-reduce (combiner side) and the final reduce, they would greatly benefit from a chained version. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/785 Created by: [StephanEwen|https://github.com/StephanEwen] Labels: runtime, Milestone: Release 0.6 (unplanned) Created at: Sun May 11 17:41:12 CEST 2014 State: open
[jira] [Created] (FLINK-1488) JobManager web interface logfile access broken
Robert Metzger created FLINK-1488: - Summary: JobManager web interface logfile access broken Key: FLINK-1488 URL: https://issues.apache.org/jira/browse/FLINK-1488 Project: Flink Issue Type: Bug Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.9 In 0.9 the logfile access dies with a NullPointerException.
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309355#comment-14309355 ] Max Michels commented on FLINK-1486: Thanks for the code hint [~vkalavri]. I've integrated the printing into the existing {{PrintingOutputFormat}}. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Assignee: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet data = env.fromElements(1,2,3,4,5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat}
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
GitHub user mxm opened a pull request: https://github.com/apache/flink/pull/372 [FLINK-1486] add print method for prefixing a user defined string - extend API to include a `print(String message)` method - change `PrintingOutputformat` to include a message You can merge this pull request into a Git repository by running: $ git pull https://github.com/mxm/flink flink-1486 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/372.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #372 commit 62c7520d821f8a0718fe7fc3daca6fea09546c2b Author: Max m...@posteo.de Date: 2015-02-06T15:55:29Z [FLINK-1486] add print method for prefixing a user defined string - extend API to include a print(String message) method - change PrintingOutputformat to include a message --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309354#comment-14309354 ] ASF GitHub Bot commented on FLINK-1486: --- GitHub user mxm opened a pull request: https://github.com/apache/flink/pull/372 [FLINK-1486] add print method for prefixing a user defined string - extend API to include a `print(String message)` method - change `PrintingOutputformat` to include a message You can merge this pull request into a Git repository by running: $ git pull https://github.com/mxm/flink flink-1486 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/372.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #372 commit 62c7520d821f8a0718fe7fc3daca6fea09546c2b Author: Max m...@posteo.de Date: 2015-02-06T15:55:29Z [FLINK-1486] add print method for prefixing a user defined string - extend API to include a print(String message) method - change PrintingOutputformat to include a message Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Assignee: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1418) Make 'print()' output on the client command line, rather than on the task manager sysout
[ https://issues.apache.org/jira/browse/FLINK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309085#comment-14309085 ] Chesnay Schepler commented on FLINK-1418: - Is this not doable without breaking programs? I'm thinking of something like this: whenever a print sink is created, an accumulator is created that stores the output. The name of the accumulator is stored in a list in the ExEnv. Within env.execute(), after everything else ran (as in, before we return the JobExecutionResult), iterate through the stored accumulator entries and print their contents. Make 'print()' output on the client command line, rather than on the task manager sysout Key: FLINK-1418 URL: https://issues.apache.org/jira/browse/FLINK-1418 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 0.9 Reporter: Stephan Ewen Right now, the {{print()}} command prints inside the data sinks where the code runs. It should pull data back to the client and print it there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
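The accumulator idea above can be modeled in a few lines of plain Java. Note that the accumulator and execution-environment types here are only imitations for illustration, not Flink's actual Accumulator or ExecutionEnvironment APIs:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the proposal: each print sink registers a named list
// accumulator; after the job ran, the environment drains the accumulators
// and prints their contents on the client side.
class PrintAccumulatorModel {
    private final Map<String, List<String>> accumulators = new HashMap<>();
    private final List<String> printSinks = new ArrayList<>();

    // Called when a print sink is created: register its accumulator name.
    String registerPrintSink(String name) {
        accumulators.put(name, new ArrayList<>());
        printSinks.add(name);
        return name;
    }

    // Called by the sink at runtime for every record it receives.
    void collect(String sinkName, Object record) {
        accumulators.get(sinkName).add(String.valueOf(record));
    }

    // Called at the end of execute(), before the JobExecutionResult is
    // returned: gather all stored output in sink-registration order.
    List<String> drainToClient() {
        List<String> lines = new ArrayList<>();
        for (String sink : printSinks) {
            lines.addAll(accumulators.get(sink));
        }
        return lines;
    }
}
```

The design keeps existing programs working because the sink API is unchanged; only the place where the collected strings are finally printed moves from the task manager to the client.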
[jira] [Resolved] (FLINK-1488) JobManager web interface logfile access broken
[ https://issues.apache.org/jira/browse/FLINK-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1488. --- Resolution: Fixed Resolved in https://issues.apache.org/jira/browse/FLINK-1488 JobManager web interface logfile access broken -- Key: FLINK-1488 URL: https://issues.apache.org/jira/browse/FLINK-1488 Project: Flink Issue Type: Bug Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.9 In 0.9 the logfile access dies with a null pointer exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/369#issuecomment-73251849 Definitely! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function
[ https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309286#comment-14309286 ] ASF GitHub Bot commented on FLINK-1485: --- Github user jkirsch commented on the pull request: https://github.com/apache/flink/pull/369#issuecomment-73251735 No problem .. small but equally important :smiley: Typo in Documentation - Join with Join-Function --- Key: FLINK-1485 URL: https://issues.apache.org/jira/browse/FLINK-1485 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.8 Reporter: Johannes Assignee: Johannes Priority: Trivial Fix For: 0.9 Small typo in documentation In the java example for Join with Join-Function -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1166) Add a QA bot to Flink that is testing pull requests
[ https://issues.apache.org/jira/browse/FLINK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309327#comment-14309327 ] ASF GitHub Bot commented on FLINK-1166: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/366#issuecomment-73254306 I found out why the qa-check has more compile errors compared to the master: I've instructed the compiler to report them all ;) I've addressed all comments in the pull request. I'm going to merge it soon if there are no other comments. Add a QA bot to Flink that is testing pull requests --- Key: FLINK-1166 URL: https://issues.apache.org/jira/browse/FLINK-1166 Project: Flink Issue Type: New Feature Components: Build System Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Labels: starter We should have a QA bot (similar to Hadoop) that is checking incoming pull requests for a few things: - Changes to user-facing APIs - More compiler warnings than before - more Javadoc warnings than before - change of the number of files in the lib/ directory. - unused dependencies - {{@author}} tag. - guava (and other shaded jars) in the lib/ directory. It should be somehow extensible to add new tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
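One of the listed checks — "more compiler warnings than before" — boils down to comparing two build logs. A toy Java version of that comparison (the actual qa-check.sh is a shell tool whose logic is only assumed here):

```java
import java.util.List;

// Toy version of the "more compiler warnings than before" QA check: count
// Maven-style "[WARNING]" lines in the master build log and in the pull
// request's build log, and flag any increase. Illustrative only.
class WarningCountCheck {
    static long countWarnings(List<String> buildLog) {
        return buildLog.stream().filter(line -> line.contains("[WARNING]")).count();
    }

    static boolean regressed(List<String> masterLog, List<String> prLog) {
        return countWarnings(prLog) > countWarnings(masterLog);
    }
}
```

This also illustrates why the qa-check reported more compile warnings than the master branch in the comment above: if the compiler on the QA side is configured to report all warnings while the baseline build is not, the counts are not comparable.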
[GitHub] flink pull request: [FLINK-1166] Add qa-check.sh tool
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/366#issuecomment-73254306 I found out why the qa-check has more compile errors compared to the master: I've instructed the compiler to report them all ;) I've addressed all comments in the pull request. I'm going to merge it soon if there are no other comments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-456) Optional runtime statistics / metrics collection
[ https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-456: - Summary: Optional runtime statistics / metrics collection (was: Optional runtime statistics collection) Optional runtime statistics / metrics collection Key: FLINK-456 URL: https://issues.apache.org/jira/browse/FLINK-456 Project: Flink Issue Type: New Feature Components: JobManager, TaskManager Reporter: Fabian Hueske Labels: github-import Fix For: pre-apache The engine should collect job execution statistics (e.g., via accumulators) such as: - total number of input / output records per operator - histogram of input/output ratio of UDF calls - histogram of number of input records per reduce / cogroup UDF call - histogram of number of output records per UDF call - histogram of time spend in UDF calls - number of local and remote bytes read (not via accumulators) - ... These stats should be made available to the user after execution (via webfrontend). The purpose of this feature is to ease performance debugging of parallel jobs (e.g., to detect data skew). It should be possible to deactivate (or activate) the gathering of these statistics. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/456 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, runtime, user satisfaction, Created at: Tue Feb 04 20:32:49 CET 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
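Several of the statistics listed above are histograms that must be mergeable across parallel subtasks before they can be shown in the web frontend. A sketch of such an accumulator — names and API are illustrative, not Flink's actual Accumulator interface:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative mergeable histogram, e.g. for "number of input records per
// reduce UDF call": each subtask counts locally, and the partial histograms
// are merged centrally after (or during) execution.
class RecordsPerCallHistogram {
    private final TreeMap<Integer, Long> counts = new TreeMap<>();

    // Record one observation, e.g. the record count of a single UDF call.
    void add(int recordsInCall) {
        counts.merge(recordsInCall, 1L, Long::sum);
    }

    // Merge a partial histogram reported by another subtask.
    void merge(RecordsPerCallHistogram other) {
        other.counts.forEach((bucket, n) -> counts.merge(bucket, n, Long::sum));
    }

    Map<Integer, Long> get() {
        return counts;
    }
}
```

A strongly skewed key distribution would show up directly in such a histogram as a few buckets with very large record counts, which is the performance-debugging use case the issue describes.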
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309070#comment-14309070 ] Chesnay Schepler commented on FLINK-1486: - +1, I can see this being useful, and it should be very easy to add. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309043#comment-14309043 ] Vasia Kalavri commented on FLINK-1486: -- +1, I've needed this feature many times! We implemented something similar for the graph api [here | http://github.com/project-flink/flink-graph/blob/master/src/main/java/flink/graphs/example/utils/ExampleUtils.java] (see method {{printResult}}). Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1487) Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case
Till Rohrmann created FLINK-1487: Summary: Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case Key: FLINK-1487 URL: https://issues.apache.org/jira/browse/FLINK-1487 Project: Flink Issue Type: Bug Reporter: Till Rohrmann I got the following failure on Travis: {{SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:<107> but was:<106>}} The failure does not occur consistently on Travis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Michels reassigned FLINK-1486: -- Assignee: Max Michels Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Assignee: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function
[ https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309284#comment-14309284 ] ASF GitHub Bot commented on FLINK-1485: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/369 Typo in Documentation - Join with Join-Function --- Key: FLINK-1485 URL: https://issues.apache.org/jira/browse/FLINK-1485 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.8 Reporter: Johannes Assignee: Johannes Priority: Trivial Small typo in documentation In the java example for Join with Join-Function -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/369 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function
[ https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309283#comment-14309283 ] ASF GitHub Bot commented on FLINK-1485: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/369#issuecomment-73251309 Thanks for the fix! Typo in Documentation - Join with Join-Function --- Key: FLINK-1485 URL: https://issues.apache.org/jira/browse/FLINK-1485 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.8 Reporter: Johannes Assignee: Johannes Priority: Trivial Small typo in documentation In the java example for Join with Join-Function -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1485) Typo in Documentation - Join with Join-Function
[ https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1485. -- Resolution: Fixed Fix Version/s: 0.9 Fixed with 76dd725d7ce1757c778db58448fbac5036e626ef. Thanks! Typo in Documentation - Join with Join-Function --- Key: FLINK-1485 URL: https://issues.apache.org/jira/browse/FLINK-1485 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.8 Reporter: Johannes Assignee: Johannes Priority: Trivial Fix For: 0.9 Small typo in documentation In the java example for Join with Join-Function -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1485) Typo in Documentation - Join with Join-Function
[ https://issues.apache.org/jira/browse/FLINK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309288#comment-14309288 ] ASF GitHub Bot commented on FLINK-1485: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/369#issuecomment-73251849 Definitely! Typo in Documentation - Join with Join-Function --- Key: FLINK-1485 URL: https://issues.apache.org/jira/browse/FLINK-1485 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.8 Reporter: Johannes Assignee: Johannes Priority: Trivial Fix For: 0.9 Small typo in documentation In the java example for Join with Join-Function -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1485] Typo in Documentation - Join with...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/369#issuecomment-73251309 Thanks for the fix! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Remove 'incubator-' prefix from README.md.
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/371#issuecomment-73255136 Good catch, pushing it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (FLINK-1438) ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode
[ https://issues.apache.org/jira/browse/FLINK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1438. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Stephan Ewen Fixed via a07d59d72fc059a600a3eb1f479b02964ca256f5 ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode Key: FLINK-1438 URL: https://issues.apache.org/jira/browse/FLINK-1438 Project: Flink Issue Type: Bug Components: JobManager Affects Versions: 0.8, 0.9 Reporter: Fabian Hueske Assignee: Stephan Ewen Priority: Minor Fix For: 0.9 Jobs with custom InputSplits fail with a ClassCastException such as {{org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit cannot be cast to org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit}} if executed on a local setup. This issue is probably related to different ClassLoaders used by the JobManager when InputSplits are generated and when they are handed to the InputFormat by the TaskManager. Moving the class of the custom InputSplit into the {{./lib}} folder and removing it from the job's jar makes the job work. To reproduce the bug, run the following job on a local setup. 
{code}
public class CustomSplitTestJob {

  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    DataSet<String> x = env.createInput(new TestFileInputFormat());
    x.print();

    env.execute();
  }

  public static class TestFileInputFormat implements InputFormat<String, TestFileInputSplit> {

    @Override
    public void configure(Configuration parameters) {}

    @Override
    public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException {
      return null;
    }

    @Override
    public TestFileInputSplit[] createInputSplits(int minNumSplits) throws IOException {
      return new TestFileInputSplit[]{new TestFileInputSplit()};
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(TestFileInputSplit[] inputSplits) {
      return new LocatableInputSplitAssigner(inputSplits);
    }

    @Override
    public void open(TestFileInputSplit split) throws IOException {}

    @Override
    public boolean reachedEnd() throws IOException {
      return false;
    }

    @Override
    public String nextRecord(String reuse) throws IOException {
      return null;
    }

    @Override
    public void close() throws IOException {}
  }

  public static class TestFileInputSplit extends FileInputSplit {
  }
}
{code}
The same happens in distributed mode, except that Akka terminates the transmission of the input split with a meaningless {{invalid type code: 00}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Remove unused enum values from Aggregations en...
GitHub user hsaputra opened a pull request: https://github.com/apache/flink/pull/373 Remove unused enum values from Aggregations enum. SImple cleanup to remove unused enum values from Aggregations enum. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hsaputra/flink remove_unused_enumvalues_from_aggregations Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/373.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #373 commit a831e448c7d0558f0c239eab4a2b89b54facd7c2 Author: Henry Saputra henry.sapu...@gmail.com Date: 2015-02-06T18:09:20Z Remove unused enum values from Aggregations enum. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1166] Add qa-check.sh tool
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/366 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1166) Add a QA bot to Flink that is testing pull requests
[ https://issues.apache.org/jira/browse/FLINK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309409#comment-14309409 ] ASF GitHub Bot commented on FLINK-1166: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/366 Add a QA bot to Flink that is testing pull requests --- Key: FLINK-1166 URL: https://issues.apache.org/jira/browse/FLINK-1166 Project: Flink Issue Type: New Feature Components: Build System Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Labels: starter We should have a QA bot (similar to Hadoop) that is checking incoming pull requests for a few things: - Changes to user-facing APIs - More compiler warnings than before - more Javadoc warnings than before - change of the number of files in the lib/ directory. - unused dependencies - {{@author}} tag. - guava (and other shaded jars) in the lib/ directory. It should be somehow extensible to add new tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309431#comment-14309431 ] ASF GitHub Bot commented on FLINK-1486: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-73270614 Looks good. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Assignee: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)