[GitHub] flink pull request: [FLINK-1478] Add support for strictly local in...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/375#issuecomment-73477321 Looks good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311945#comment-14311945 ] ASF GitHub Bot commented on FLINK-1478: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/375#issuecomment-73477321 Looks good. Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Remove unused enum values from Aggregations en...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/373#issuecomment-73477963 Good to merge.
[GitHub] flink pull request: [FLINK-1463] Fix stateful/stateless Serializer...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/353#issuecomment-73478102 +1 to merge
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-73478622 My bad, there is a ticket already. Can you squash the commits then and add the ticket tag? Otherwise, good to merge!
[GitHub] flink pull request: [FLINK-1179] Add button to JobManager web inte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/374#issuecomment-73477910 Very nice work. I have one comment inline, otherwise +1 to go!
[GitHub] flink pull request: [FLINK-1179] Add button to JobManager web inte...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/374#discussion_r24315701
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/web/SetupInfoServlet.java ---
@@ -127,25 +134,42 @@ private void writeTaskmanagers(HttpServletResponse resp) throws IOException {
 objInner.put("physicalMemory", instance.getResources().getSizeOfPhysicalMemory() >> 20);
 objInner.put("freeMemory", instance.getResources().getSizeOfJvmHeap() >> 20);
 objInner.put("managedMemory", instance.getResources().getSizeOfManagedMemory() >> 20);
+objInner.put("instanceID", instance.getId());
 array.put(objInner);
 } catch (JSONException e) {
 LOG.warn("Json object creation failed", e);
 }
-
+
 }
 try {
 obj.put("taskmanagers", array);
 } catch (JSONException e) {
 LOG.warn("Json object creation failed", e);
 }
-
+
 PrintWriter w = resp.getWriter();
 w.write(obj.toString());
 }
-
+
+private void writeStackTraceOfTaskManager(String instanceIdStr, HttpServletResponse resp) throws IOException {
--- End diff --
The `RequestStackTrace` message may fail if the task manager is not reachable. I suggest surrounding this block with try / catch(Throwable) and forwarding the error message to the web client. The response JSON may then have two fields: errorMessage and stackTrace. If errorMessage is defined, display the message, otherwise print the stack trace.
[jira] [Commented] (FLINK-1179) Add button to JobManager web interface to request stack trace of a TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311954#comment-14311954 ] ASF GitHub Bot commented on FLINK-1179: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/374#discussion_r24315701
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/web/SetupInfoServlet.java ---
@@ -127,25 +134,42 @@ private void writeTaskmanagers(HttpServletResponse resp) throws IOException {
 objInner.put("physicalMemory", instance.getResources().getSizeOfPhysicalMemory() >> 20);
 objInner.put("freeMemory", instance.getResources().getSizeOfJvmHeap() >> 20);
 objInner.put("managedMemory", instance.getResources().getSizeOfManagedMemory() >> 20);
+objInner.put("instanceID", instance.getId());
 array.put(objInner);
 } catch (JSONException e) {
 LOG.warn("Json object creation failed", e);
 }
-
+
 }
 try {
 obj.put("taskmanagers", array);
 } catch (JSONException e) {
 LOG.warn("Json object creation failed", e);
 }
-
+
 PrintWriter w = resp.getWriter();
 w.write(obj.toString());
 }
-
+
+private void writeStackTraceOfTaskManager(String instanceIdStr, HttpServletResponse resp) throws IOException {
--- End diff --
The `RequestStackTrace` message may fail if the task manager is not reachable. I suggest surrounding this block with try / catch(Throwable) and forwarding the error message to the web client. The response JSON may then have two fields: errorMessage and stackTrace. If errorMessage is defined, display the message, otherwise print the stack trace. Add button to JobManager web interface to request stack trace of a TaskManager -- Key: FLINK-1179 URL: https://issues.apache.org/jira/browse/FLINK-1179 Project: Flink Issue Type: New Feature Components: JobManager Reporter: Robert Metzger Assignee: Chiwan Park Priority: Minor Labels: starter This is something I do quite often manually and I think it might be helpful for users as well.
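Stephan's suggestion above (try/catch around the stack trace request, a response JSON with errorMessage and stackTrace fields) can be sketched as follows. This is a hypothetical, stripped-down illustration: no servlet or Flink dependencies, JSON assembled by hand, and `buildJson` with its `fetcher` callback are invented names standing in for the `RequestStackTrace` round-trip.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the suggested error handling: catch any Throwable
// from the stack trace request and report it via an "errorMessage" field,
// otherwise return the trace in a "stackTrace" field.
public class StackTraceResponse {

    // 'fetcher' stands in for the RequestStackTrace message to the TaskManager.
    public static String buildJson(Callable<String> fetcher) {
        try {
            String trace = fetcher.call();
            return "{\"stackTrace\": \"" + escape(trace) + "\"}";
        } catch (Throwable t) {
            // TaskManager unreachable (or any other failure): forward the message.
            return "{\"errorMessage\": \"" + escape(String.valueOf(t.getMessage())) + "\"}";
        }
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }
}
```

The web client then only has to check which of the two fields is present, exactly as described in the review comment.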
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311958#comment-14311958 ] ASF GitHub Bot commented on FLINK-1484: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-73478074 Looks good. Since this is a behavior change, can you file a ticket for this, Till? JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann A JobManager restart can happen, e.g., due to an uncaught exception. However, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager.
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-73478074 Looks good. Since this is a behavior change, can you file a ticket for this, Till?
[jira] [Commented] (FLINK-1179) Add button to JobManager web interface to request stack trace of a TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311967#comment-14311967 ] ASF GitHub Bot commented on FLINK-1179: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/374#issuecomment-73478511 I've tried it out locally. Looks very nice. Thank you. +1 to merge. Add button to JobManager web interface to request stack trace of a TaskManager -- Key: FLINK-1179 URL: https://issues.apache.org/jira/browse/FLINK-1179 Project: Flink Issue Type: New Feature Components: JobManager Reporter: Robert Metzger Assignee: Chiwan Park Priority: Minor Labels: starter This is something I do quite often manually and I think it might be helpful for users as well.
[GitHub] flink pull request: [FLINK-1179] Add button to JobManager web inte...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/374#issuecomment-73478511 I've tried it out locally. Looks very nice. Thank you. +1 to merge.
[jira] [Commented] (FLINK-1396) Add hadoop input formats directly to the user API.
[ https://issues.apache.org/jira/browse/FLINK-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311965#comment-14311965 ] ASF GitHub Bot commented on FLINK-1396: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/363#issuecomment-73478491 Looks good. We are getting into very long package names here ;-) `org.apache.flink.api.java.hadoop.mapred.wrapper.*` Add hadoop input formats directly to the user API. -- Key: FLINK-1396 URL: https://issues.apache.org/jira/browse/FLINK-1396 Project: Flink Issue Type: Bug Reporter: Robert Metzger Assignee: Aljoscha Krettek Priority: Minor
[jira] [Commented] (FLINK-1493) Support for streaming jobs preserving global ordering of records
[ https://issues.apache.org/jira/browse/FLINK-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311962#comment-14311962 ] Márton Balassi commented on FLINK-1493: --- Hey Matthias, Thanks for looking into this. The basic model you described seems appealing to me as it avoids deadlocks, but it might result in blowing up the buffers. This morning I overheard a discussion between [~gyfora] and [~StephanEwen] on this issue in terms of fault tolerance, so I'd hand this over to them. As for your questions, I can answer the second one: StreamRecord is serialized through the StreamRecordSerializer, modeled after the TupleSerializer and the TypeSerializer in general. Compared to the old Record/Value types, this separates the data type from its serialization. Support for streaming jobs preserving global ordering of records Key: FLINK-1493 URL: https://issues.apache.org/jira/browse/FLINK-1493 Project: Flink Issue Type: New Feature Components: Streaming Reporter: Márton Balassi Distributed streaming jobs do not give total, global ordering guarantees for records; only partial ordering is provided by the system: records travelling on the same exact route of the physical plan are ordered, but records on different routes are not ordered with respect to each other. It turns out that this feature, although it can only be implemented by merge sorting the input buffers on a timestamp field and thus creates substantial latency, is still desired for a number of applications. Just a heads-up for the implementation: the sorting introduces back pressure in the buffers and might cause deadlocks.
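The merge-sort idea from the issue description can be illustrated with a small, self-contained sketch (invented names, not Flink code): k channels, each already ordered by timestamp, are merged into one globally ordered sequence via a priority queue. It only terminates cleanly here because all input is already available; on live streams an empty channel stalls the merge, which is exactly the back-pressure/deadlock caveat mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch (not Flink code): merge records from several channels,
// each already ordered by timestamp, into one globally ordered sequence.
public class TimestampMerge {

    // Merge k timestamp-sorted lists with a priority queue (k-way merge).
    public static List<Long> merge(List<List<Long>> channels) {
        // Heap entries: {timestamp, channelIndex, positionInChannel}
        PriorityQueue<long[]> heap = new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
        for (int c = 0; c < channels.size(); c++) {
            if (!channels.get(c).isEmpty()) {
                heap.add(new long[]{channels.get(c).get(0), c, 0});
            }
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            long[] e = heap.poll();
            out.add(e[0]); // smallest pending timestamp across all channels
            int c = (int) e[1], pos = (int) e[2] + 1;
            if (pos < channels.get(c).size()) {
                heap.add(new long[]{channels.get(c).get(pos), c, pos});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
                Arrays.asList(1L, 4L, 7L),
                Arrays.asList(2L, 3L, 9L)))); // [1, 2, 3, 4, 7, 9]
    }
}
```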
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-73479833 Yes, I'll do it.
[GitHub] flink pull request: [FLINK-1495][yarn] Make Akka timeout configura...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/377#issuecomment-73530921 Thank you. I'll merge it once the master is building again ;)
[jira] [Created] (FLINK-1498) Spurious failures on Travis for I/O heavy tasks
Stephan Ewen created FLINK-1498: --- Summary: Spurious failures on Travis for I/O heavy tasks Key: FLINK-1498 URL: https://issues.apache.org/jira/browse/FLINK-1498 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Priority: Minor Fix For: 0.9 The symptom is missing memory in the Java NIO classes:
{code}
Caused by: java.io.IOException: Cannot allocate memory
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:210)
	at org.apache.flink.runtime.io.disk.iomanager.SegmentWriteRequest.write(AsynchronousFileIOChannel.java:267)
	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$WriterThread.run(IOManagerAsync.java:440)
{code}
From a quick check, it seems you can fix this by increasing the minimal JVM memory. I will try to add {{-Xms256}}.
[jira] [Commented] (FLINK-1201) Graph API for Flink
[ https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312477#comment-14312477 ] ASF GitHub Bot commented on FLINK-1201: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/335#issuecomment-73547684 Simple renaming didn't seem to keep the history, so I did the filtering again :) Didn't you have the same problem when moving flink-addons to flink-staging? Let me know if it's fine now. Thanks! Graph API for Flink Key: FLINK-1201 URL: https://issues.apache.org/jira/browse/FLINK-1201 Project: Flink Issue Type: New Feature Reporter: Kostas Tzoumas Assignee: Vasia Kalavri This issue tracks the development of a Graph API/DSL for Flink. Until the code is pushed to the Flink repository, collaboration is happening here: https://github.com/project-flink/flink-graph
[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/335#issuecomment-73547684 Simple renaming didn't seem to keep the history, so I did the filtering again :) Didn't you have the same problem when moving flink-addons to flink-staging? Let me know if it's fine now. Thanks!
[jira] [Closed] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-1376. Resolution: Fixed Fixed with db1b8b993c12f2e74b6cc9a48414265666dc0e69 in 0.9 Fixed with 91382bb8c1f63dde0b11cc6f4dc9c18f29731cdd in 0.8 SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state Key: FLINK-1376 URL: https://issues.apache.org/jira/browse/FLINK-1376 Project: Flink Issue Type: Bug Affects Versions: 0.8, 0.9 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.8.1 In case the TaskManager fatally fails and some of the failing node's slots are SharedSlots, the slots are not properly released by the JobManager. As a result, the corresponding job is not properly failed, leaving the system in a corrupted state. The reason is that the AllocatedSlot is not aware of being treated as a SharedSlot and thus it cannot release the associated SubSlots.
[jira] [Commented] (FLINK-1495) Make Akka timeout configurable in YARN client.
[ https://issues.apache.org/jira/browse/FLINK-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312368#comment-14312368 ] ASF GitHub Bot commented on FLINK-1495: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/377#issuecomment-73530921 Thank you. I'll merge it once the master is building again ;) Make Akka timeout configurable in YARN client. -- Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor
[jira] [Created] (FLINK-1499) Make TaskManager disconnect from the JobManager in case of a restart
Till Rohrmann created FLINK-1499: Summary: Make TaskManager disconnect from the JobManager in case of a restart Key: FLINK-1499 URL: https://issues.apache.org/jira/browse/FLINK-1499 Project: Flink Issue Type: Bug Reporter: Till Rohrmann In case of a TaskManager restart, the TaskManager does not unregister from the JobManager. However, it tries to reconnect once the restart has finished. In order to maintain a consistent state, the TaskManager should disconnect from the JobManager upon restart or termination.
[jira] [Commented] (FLINK-1495) Make Akka timeout configurable in YARN client.
[ https://issues.apache.org/jira/browse/FLINK-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312361#comment-14312361 ] ASF GitHub Bot commented on FLINK-1495: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/377#issuecomment-73529074 LGTM. Make Akka timeout configurable in YARN client. -- Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor
[GitHub] flink pull request: [FLINK-1495][yarn] Make Akka timeout configura...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/377#issuecomment-73529074 LGTM.
[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/378 [FLINK-1489] Fixes blocking scheduleOrUpdateConsumers message calls Replaces the blocking calls with futures which, in case of an exception, let the respective task fail. Furthermore, the PartitionInfos are buffered on the JobManager in case some of the consumers are not yet scheduled. Once the state of the consumers has switched to running, all buffered partition infos are sent to the consumers. You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/tillrohrmann/flink fixScheduleOrUpdateConsumers
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/378.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #378
commit d17f15ac966d59444aed86ed7d1c9cc1a2716152
Author: Till Rohrmann trohrm...@apache.org
Date: 2015-02-06T14:13:28Z
[FLINK-1489] Replaces blocking scheduleOrUpdateConsumers message calls with asynchronous futures. Buffers PartitionInfos at the JobManager in case the respective consumer has not been scheduled.
Conflicts: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers
[ https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312430#comment-14312430 ] ASF GitHub Bot commented on FLINK-1489: --- GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/378 [FLINK-1489] Fixes blocking scheduleOrUpdateConsumers message calls Replaces the blocking calls with futures which, in case of an exception, let the respective task fail. Furthermore, the PartitionInfos are buffered on the JobManager in case some of the consumers are not yet scheduled. Once the state of the consumers has switched to running, all buffered partition infos are sent to the consumers. You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/tillrohrmann/flink fixScheduleOrUpdateConsumers
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/378.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #378
commit d17f15ac966d59444aed86ed7d1c9cc1a2716152
Author: Till Rohrmann trohrm...@apache.org
Date: 2015-02-06T14:13:28Z
[FLINK-1489] Replaces blocking scheduleOrUpdateConsumers message calls with asynchronous futures. Buffers PartitionInfos at the JobManager in case the respective consumer has not been scheduled.
Conflicts: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers --- Key: FLINK-1489 URL: https://issues.apache.org/jira/browse/FLINK-1489 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann [~Zentol] reported that the JobManager failed to execute his python job.
The reason is that the JobManager executes blocking calls in the actor thread in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a {{ScheduleOrUpdateConsumers}} message. Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} message to the JobManager to notify the consumers about available data. The JobManager then sends each TaskManager the respective update call {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we effectively execute the update calls sequentially. Due to the ever-accumulating delay, some of the initial timeouts on the TaskManager side in {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, the execution of the respective Tasks fails. A solution would be to make the call non-blocking. A general caveat for actor programming is: we should never block the actor thread, otherwise we seriously jeopardize the scalability of the system. Or even worse, the system simply fails.
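The fix described in the pull request, replacing the blocking call with a future so the actor thread never waits and only the affected task fails, can be sketched with `java.util.concurrent.CompletableFuture` (the actual change presumably uses Akka futures in Scala; all names below are invented for illustration):

```java
import java.util.concurrent.CompletableFuture;

// Sketch of the non-blocking pattern: instead of blocking the actor thread on
// each update RPC, fire an asynchronous call and attach a failure handler.
public class NonBlockingUpdate {

    // Stand-in for the sendUpdateTaskRpcCall to one TaskManager.
    static CompletableFuture<String> sendUpdateAsync(String task, boolean reachable) {
        return CompletableFuture.supplyAsync(() -> {
            if (!reachable) {
                throw new IllegalStateException("TaskManager for " + task + " unreachable");
            }
            return "ack:" + task;
        });
    }

    // Returns immediately; on error only the respective task is marked failed,
    // instead of delaying all subsequent update calls.
    public static CompletableFuture<String> scheduleOrUpdateConsumer(String task, boolean reachable) {
        return sendUpdateAsync(task, reachable)
                .exceptionally(t -> "failed:" + task); // fail the task, don't block others
    }

    public static void main(String[] args) {
        System.out.println(scheduleOrUpdateConsumer("task-1", true).join());  // ack:task-1
        System.out.println(scheduleOrUpdateConsumer("task-2", false).join()); // failed:task-2
    }
}
```

Because nothing blocks between the two calls, an unreachable TaskManager no longer delays updates to the others, which is the sequential-delay problem described above.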
[jira] [Created] (FLINK-1500) exampleScalaPrograms.EnumTriangleOptITCase does not finish on Travis
Till Rohrmann created FLINK-1500: Summary: exampleScalaPrograms.EnumTriangleOptITCase does not finish on Travis Key: FLINK-1500 URL: https://issues.apache.org/jira/browse/FLINK-1500 Project: Flink Issue Type: Bug Reporter: Till Rohrmann The test case org.apache.flink.test.exampleScalaPrograms.EnumTriangleOptITCase does not finish on Travis. This problem is non-deterministic.
[jira] [Created] (FLINK-1502) Expose metrics to graphite, ganglia and JMX.
Robert Metzger created FLINK-1502: - Summary: Expose metrics to graphite, ganglia and JMX. Key: FLINK-1502 URL: https://issues.apache.org/jira/browse/FLINK-1502 Project: Flink Issue Type: Sub-task Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor The metrics library makes it easy to expose collected metrics to other systems such as graphite, ganglia or JMX (VisualVM).
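Of those three targets, JMX needs no extra library, so a JDK-only sketch may help make the idea concrete (illustrative names, not the eventual Flink API; Graphite and Ganglia would instead go through the metrics library's reporters):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Minimal JMX example: register a counter MBean that tools like VisualVM
// or jconsole can read under the given ObjectName.
public class JmxMetricsSketch {

    // Standard MBean convention: interface name = implementation class name + "MBean".
    public interface RecordCounterMBean {
        long getRecordsProcessed();
    }

    public static class RecordCounter implements RecordCounterMBean {
        private volatile long records;
        public void inc() { records++; }
        @Override public long getRecordsProcessed() { return records; }
    }

    public static void main(String[] args) throws Exception {
        RecordCounter counter = new RecordCounter();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("flink.sample:type=RecordCounter");
        server.registerMBean(counter, name);

        counter.inc();
        counter.inc();
        // Read the attribute back through JMX, as a monitoring tool would.
        System.out.println(server.getAttribute(name, "RecordsProcessed"));
    }
}
```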
[GitHub] flink pull request: Remove unused enum values from Aggregations en...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/373
[jira] [Created] (FLINK-1504) Add support for accessing secured HDFS clusters in standalone mode
Robert Metzger created FLINK-1504: - Summary: Add support for accessing secured HDFS clusters in standalone mode Key: FLINK-1504 URL: https://issues.apache.org/jira/browse/FLINK-1504 Project: Flink Issue Type: Improvement Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Only for a single user: the user who starts Flink has the Kerberos credentials.
[jira] [Commented] (FLINK-1495) Make Akka timeout configurable in YARN client.
[ https://issues.apache.org/jira/browse/FLINK-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312546#comment-14312546 ] ASF GitHub Bot commented on FLINK-1495: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/377 Make Akka timeout configurable in YARN client. -- Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Fix For: 0.9
[jira] [Resolved] (FLINK-1495) Make Akka timeout configurable in YARN client.
[ https://issues.apache.org/jira/browse/FLINK-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1495. --- Resolution: Fixed Fix Version/s: 0.9 Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/46e05261. Make Akka timeout configurable in YARN client. -- Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Fix For: 0.9
[GitHub] flink pull request: [FLINK-1495][yarn] Make Akka timeout configura...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/377
[jira] [Resolved] (FLINK-1498) Spurious failures on Travis for I/O heavy tasks
[ https://issues.apache.org/jira/browse/FLINK-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1498. - Resolution: Fixed Fixed in 52d9806baaff1689f21962febb7dc73d68572289 Spurious failures on Travis for I/O heavy tasks --- Key: FLINK-1498 URL: https://issues.apache.org/jira/browse/FLINK-1498 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Priority: Minor Fix For: 0.9 The symptom is missing memory in the Java NIO classes:
{code}
Caused by: java.io.IOException: Cannot allocate memory
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:210)
	at org.apache.flink.runtime.io.disk.iomanager.SegmentWriteRequest.write(AsynchronousFileIOChannel.java:267)
	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$WriterThread.run(IOManagerAsync.java:440)
{code}
From a quick check, it seems you can fix this by increasing the minimal JVM memory. I will try to add {{-Xms256}}.
[jira] [Updated] (FLINK-456) Integrate runtime metrics / statistics
[ https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-456: - Summary: Integrate runtime metrics / statistics (was: Optional runtime statistics / metrics collection) Integrate runtime metrics / statistics -- Key: FLINK-456 URL: https://issues.apache.org/jira/browse/FLINK-456 Project: Flink Issue Type: New Feature Components: JobManager, TaskManager Reporter: Fabian Hueske Assignee: Robert Metzger Labels: github-import Fix For: pre-apache The engine should collect job execution statistics (e.g., via accumulators) such as:
- total number of input / output records per operator
- histogram of input/output ratio of UDF calls
- histogram of number of input records per reduce / cogroup UDF call
- histogram of number of output records per UDF call
- histogram of time spent in UDF calls
- number of local and remote bytes read (not via accumulators)
- ...
These stats should be made available to the user after execution (via the web frontend). The purpose of this feature is to ease performance debugging of parallel jobs (e.g., to detect data skew). It should be possible to deactivate (or activate) the gathering of these statistics. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/456 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, runtime, user satisfaction, Created at: Tue Feb 04 20:32:49 CET 2014 State: open
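One of the statistics listed above, the histogram of input records per reduce/cogroup call, can be sketched as a tiny accumulator in plain Java (invented names; the real feature would plug into Flink's accumulator and reporting infrastructure):

```java
import java.util.Map;
import java.util.TreeMap;

// Toy accumulator: counts how many UDF invocations saw a given number of
// input records, i.e. a histogram of records-per-call.
public class RecordsPerCallHistogram {
    private final Map<Long, Long> counts = new TreeMap<>();

    // Record that one reduce/cogroup call consumed this many input records.
    public void add(long recordsInCall) {
        counts.merge(recordsInCall, 1L, Long::sum);
    }

    public Map<Long, Long> snapshot() {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        RecordsPerCallHistogram h = new RecordsPerCallHistogram();
        // e.g. three reduce calls with 5 records each, one with 1000 (data skew)
        h.add(5); h.add(5); h.add(5); h.add(1000);
        System.out.println(h.snapshot()); // {5=3, 1000=1}
    }
}
```

A heavily skewed key shows up as an outlier bucket (1000 above), which is the kind of performance-debugging signal the issue is after.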
[jira] [Created] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
Robert Metzger created FLINK-1501: - Summary: Integrate metrics library and report basic metrics to JobManager web interface Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger As discussed on the mailing list, the library to integrate is: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1502) Expose metrics to graphite, ganglia and JMX.
[ https://issues.apache.org/jira/browse/FLINK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312849#comment-14312849 ] Henry Saputra commented on FLINK-1502: -- Should this be a subtask of FLINK-1501? Expose metrics to graphite, ganglia and JMX. Key: FLINK-1502 URL: https://issues.apache.org/jira/browse/FLINK-1502 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Fix For: pre-apache The metrics library allows exposing collected metrics easily to other systems such as graphite, ganglia or Java's JVM (VisualVM). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1475) Minimize log output of yarn test cases
[ https://issues.apache.org/jira/browse/FLINK-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1475. --- Resolution: Fixed Fix Version/s: 0.9 Minimize log output of yarn test cases -- Key: FLINK-1475 URL: https://issues.apache.org/jira/browse/FLINK-1475 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Robert Metzger Priority: Minor Fix For: 0.9 The new yarn test cases are quite verbose. Maybe we could increase the log level for these tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
[ https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312094#comment-14312094 ] ASF GitHub Bot commented on FLINK-1492: --- GitHub user uce opened a pull request: https://github.com/apache/flink/pull/376 [FLINK-1492] Fix exceptions on blob store shutdown You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1492-proper_shutdown_hook Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/376.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #376 commit f29132ca9fb549d47a29322695f467996ed727a4 Author: Ufuk Celebi u...@apache.org Date: 2015-02-09T10:44:47Z [FLINK-1492] Fix exceptions on blob store shutdown Exceptions on shutdown concerning BLOB store cleanup Key: FLINK-1492 URL: https://issues.apache.org/jira/browse/FLINK-1492 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Ufuk Celebi Fix For: 0.9 The following stack traces occur not every time, but frequently. 
{code}
java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
	at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
	at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
	at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
	at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
	at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
	at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
	at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
	at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
	at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
	at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
	at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly.
java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
	at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
	at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
	at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
	at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
	at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
	at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
	at akka.actor.ActorCell.terminate(ActorCell.scala:369)
[jira] [Created] (FLINK-1494) Build fails on BlobCacheTest
Fabian Hueske created FLINK-1494: Summary: Build fails on BlobCacheTest Key: FLINK-1494 URL: https://issues.apache.org/jira/browse/FLINK-1494 Project: Flink Issue Type: Bug Components: Local Runtime, TaskManager Environment: Apache Maven 3.0.5 Maven home: /usr/share/maven Java version: 1.7.0_65, vendor: Oracle Corporation Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre Default locale: en_US, platform encoding: UTF-8 OS name: linux, version: 3.16.0-4-amd64, arch: amd64, family: unix Reporter: Fabian Hueske Building Flink with Maven repeatedly fails with the following error:
{code}
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.283 sec <<< FAILURE! - in org.apache.flink.runtime.blob.BlobCacheTest
testBlobCache(org.apache.flink.runtime.blob.BlobCacheTest) Time elapsed: 127.282 sec <<< FAILURE!
java.lang.AssertionError: Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:56760
	at org.junit.Assert.fail(Assert.java:88)
	at org.apache.flink.runtime.blob.BlobCacheTest.testBlobCache(BlobCacheTest.java:109)

java.io.IOException: Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:52657
	at org.apache.flink.runtime.blob.BlobClient.<init>(BlobClient.java:61)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest.testLibraryCacheManagerCleanup(BlobLibraryCacheManagerTest.java:56)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: java.net.ConnectException: Connection timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at java.net.Socket.connect(Socket.java:538)
	at org.apache.flink.runtime.blob.BlobClient.<init>(BlobClient.java:59)
	... 24 more

Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.299 sec <<< FAILURE! - in org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest
testLibraryCacheManagerCleanup(org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest) Time elapsed: 127.298 sec <<< FAILURE!
java.lang.AssertionError: Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:52657
	at org.junit.Assert.fail(Assert.java:88)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest.testLibraryCacheManagerCleanup(BlobLibraryCacheManagerTest.java:108)

Results :

Failed tests:
  BlobCacheTest.testBlobCache:109 Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:56760
  BlobLibraryCacheManagerTest.testLibraryCacheManagerCleanup:108 Could not connect to BlobServer at address
[jira] [Updated] (FLINK-1391) Kryo fails to properly serialize avro collection types
[ https://issues.apache.org/jira/browse/FLINK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1391: -- Fix Version/s: 0.8.1 Kryo fails to properly serialize avro collection types -- Key: FLINK-1391 URL: https://issues.apache.org/jira/browse/FLINK-1391 Project: Flink Issue Type: Sub-task Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.8.1 Before FLINK-610, Avro was the default generic serializer. Now, special types coming from Avro are handled by Kryo .. which seems to cause errors like:
{code}
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.lang.NullPointerException
	at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
	at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143)
	at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148)
	at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244)
	at org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56)
	at org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71)
	at org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189)
	at org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176)
	at org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51)
	at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53)
	at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
	at java.lang.Thread.run(Thread.java:744)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1478] Add support for strictly local in...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/375#discussion_r24329907
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionJobVertex.java ---
@@ -260,15 +260,49 @@ public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult>
 	public void scheduleAll(Scheduler scheduler, boolean queued) throws NoResourceAvailableException {
-//		ExecutionVertex[] vertices = this.taskVertices;
-//
-//		for (int i = 0; i < vertices.length; i++) {
-//			ExecutionVertex v = vertices[i];
-//
-//			if (v.get
-//		}
+		ExecutionVertex[] vertices = this.taskVertices;
-		for (ExecutionVertex ev : getTaskVertices()) {
+		// check if we need to do pre-assignment of tasks
+		if (inputSplitsPerSubtask != null) {
+
+			final Map<String, List<Instance>> instances = scheduler.getInstancesByHost();
+			final Map<String, Integer> assignments = new HashMap<String, Integer>();
+
+			for (int i = 0; i < vertices.length; i++) {
+				List<LocatableInputSplit> splitsForHost = inputSplitsPerSubtask[i];
+				if (splitsForHost == null || splitsForHost.isEmpty()) {
+					continue;
+				}
+
+				String[] hostNames = splitsForHost.get(0).getHostnames();
+				if (hostNames == null || hostNames.length == 0 || hostNames[0] == null) {
+					continue;
+				}
+
+				String host = hostNames[0];
+				ExecutionVertex v = vertices[i];
+
+				List<Instance> instancesOnHost = instances.get(host);
+
+				if (instancesOnHost == null || instancesOnHost.isEmpty()) {
+					throw new NoResourceAvailableException("Cannot schedule a strictly local task to host " + host +
+							". No TaskManager available on that host.");
+				}
+
+				Integer pos = assignments.get(host);
+				if (pos == null) {
+					pos = 0;
+					assignments.put(host, 0);
+				} else {
+					assignments.put(host, pos + 1 % instancesOnHost.size());
--- End diff --
It should be possible that multiple subtasks go to the same instance. If there are too many, it would fail in the scheduler, yes. We can check that the number of subtasks on the instance does not exceed the number of slots. This seems to me like a workaround solution anyway (until we can tie splits to tasks), so it might be okay. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
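The per-host counter logic under review can be isolated into a few lines. The following is a self-contained sketch (hypothetical names, instances reduced to plain strings; not the actual scheduler code) of round-robin picking an instance for each strictly local subtask. One detail worth noting: in Java, `%` binds tighter than `+`, so the `pos + 1 % instancesOnHost.size()` in the diff evaluates as `pos + (1 % size)`, i.e. the stored counter simply increments; the sketch below instead wraps explicitly at lookup time.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the round-robin host assignment discussed in the diff above.
// "Instance" is reduced to a String name here; names are hypothetical.
public class LocalAssignment {
    private final Map<String, Integer> assignments = new HashMap<>();

    // Returns the instance on `host` for the next strictly local subtask,
    // cycling through the instances available on that host.
    public String pick(String host, List<String> instancesOnHost) {
        if (instancesOnHost == null || instancesOnHost.isEmpty()) {
            throw new IllegalStateException("No TaskManager available on host " + host);
        }
        // merge(...) returns the updated counter; subtract 1 so the first call yields 0
        int pos = assignments.merge(host, 1, Integer::sum) - 1;
        return instancesOnHost.get(pos % instancesOnHost.size());
    }
}
```

With two instances on a host, the third subtask wraps back to the first instance, which is the "multiple subtasks go to the same instance" case described in the comment.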
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312264#comment-14312264 ] ASF GitHub Bot commented on FLINK-1478: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/375#discussion_r24329907
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionJobVertex.java ---
@@ -260,15 +260,49 @@ public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult>
 	public void scheduleAll(Scheduler scheduler, boolean queued) throws NoResourceAvailableException {
-//		ExecutionVertex[] vertices = this.taskVertices;
-//
-//		for (int i = 0; i < vertices.length; i++) {
-//			ExecutionVertex v = vertices[i];
-//
-//			if (v.get
-//		}
+		ExecutionVertex[] vertices = this.taskVertices;
-		for (ExecutionVertex ev : getTaskVertices()) {
+		// check if we need to do pre-assignment of tasks
+		if (inputSplitsPerSubtask != null) {
+
+			final Map<String, List<Instance>> instances = scheduler.getInstancesByHost();
+			final Map<String, Integer> assignments = new HashMap<String, Integer>();
+
+			for (int i = 0; i < vertices.length; i++) {
+				List<LocatableInputSplit> splitsForHost = inputSplitsPerSubtask[i];
+				if (splitsForHost == null || splitsForHost.isEmpty()) {
+					continue;
+				}
+
+				String[] hostNames = splitsForHost.get(0).getHostnames();
+				if (hostNames == null || hostNames.length == 0 || hostNames[0] == null) {
+					continue;
+				}
+
+				String host = hostNames[0];
+				ExecutionVertex v = vertices[i];
+
+				List<Instance> instancesOnHost = instances.get(host);
+
+				if (instancesOnHost == null || instancesOnHost.isEmpty()) {
+					throw new NoResourceAvailableException("Cannot schedule a strictly local task to host " + host +
+							". No TaskManager available on that host.");
+				}
+
+				Integer pos = assignments.get(host);
+				if (pos == null) {
+					pos = 0;
+					assignments.put(host, 0);
+				} else {
+					assignments.put(host, pos + 1 % instancesOnHost.size());
--- End diff --
It should be possible that multiple subtasks go to the same instance. If there are too many, it would fail in the scheduler, yes. We can check that the number of subtasks on the instance does not exceed the number of slots. This seems to me like a workaround solution anyway (until we can tie splits to tasks), so it might be okay. Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1496) Events at uninitialized input channels are lost
Ufuk Celebi created FLINK-1496: -- Summary: Events at uninitialized input channels are lost Key: FLINK-1496 URL: https://issues.apache.org/jira/browse/FLINK-1496 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi If a program sends an event backwards to the producer task, it might happen that some of its input channels have not been initialized yet (UnknownInputChannel). In that case, the events are lost and will never be received at the producer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312271#comment-14312271 ] ASF GitHub Bot commented on FLINK-1478: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/375#discussion_r24330528
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionJobVertex.java ---
@@ -260,15 +260,49 @@ public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult>
 	public void scheduleAll(Scheduler scheduler, boolean queued) throws NoResourceAvailableException {
-//		ExecutionVertex[] vertices = this.taskVertices;
-//
-//		for (int i = 0; i < vertices.length; i++) {
-//			ExecutionVertex v = vertices[i];
-//
-//			if (v.get
-//		}
+		ExecutionVertex[] vertices = this.taskVertices;
-		for (ExecutionVertex ev : getTaskVertices()) {
+		// check if we need to do pre-assignment of tasks
+		if (inputSplitsPerSubtask != null) {
+
+			final Map<String, List<Instance>> instances = scheduler.getInstancesByHost();
+			final Map<String, Integer> assignments = new HashMap<String, Integer>();
+
+			for (int i = 0; i < vertices.length; i++) {
+				List<LocatableInputSplit> splitsForHost = inputSplitsPerSubtask[i];
+				if (splitsForHost == null || splitsForHost.isEmpty()) {
+					continue;
+				}
+
+				String[] hostNames = splitsForHost.get(0).getHostnames();
+				if (hostNames == null || hostNames.length == 0 || hostNames[0] == null) {
+					continue;
+				}
+
+				String host = hostNames[0];
+				ExecutionVertex v = vertices[i];
+
+				List<Instance> instancesOnHost = instances.get(host);
+
+				if (instancesOnHost == null || instancesOnHost.isEmpty()) {
+					throw new NoResourceAvailableException("Cannot schedule a strictly local task to host " + host +
+							". No TaskManager available on that host.");
+				}
+
+				Integer pos = assignments.get(host);
+				if (pos == null) {
+					pos = 0;
+					assignments.put(host, 0);
+				} else {
+					assignments.put(host, pos + 1 % instancesOnHost.size());
--- End diff --
Ah, yes sure. I confused instances and slots... Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1475) Minimize log output of yarn test cases
[ https://issues.apache.org/jira/browse/FLINK-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312093#comment-14312093 ] Robert Metzger commented on FLINK-1475: --- Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/2f16ca2d. Minimize log output of yarn test cases -- Key: FLINK-1475 URL: https://issues.apache.org/jira/browse/FLINK-1475 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Robert Metzger Priority: Minor Fix For: 0.9 The new yarn test cases are quite verbose. Maybe we could increase the log level for these tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/376 [FLINK-1492] Fix exceptions on blob store shutdown You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1492-proper_shutdown_hook Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/376.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #376 commit f29132ca9fb549d47a29322695f467996ed727a4 Author: Ufuk Celebi u...@apache.org Date: 2015-02-09T10:44:47Z [FLINK-1492] Fix exceptions on blob store shutdown
[jira] [Commented] (FLINK-1444) Add data properties for data sources
[ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312977#comment-14312977 ] ASF GitHub Bot commented on FLINK-1444: --- GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/379 [FLINK-1444][api-extending] Add support for split data properties on data sources This pull request adds support for declaring global and local properties for input splits. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink sourceProperties Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/379.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #379 commit c1f77c1293320094b8d07b357a40960f4a657cf0 Author: Fabian Hueske fhue...@apache.org Date: 2015-02-06T13:28:00Z [FLINK-1444][api-extending] Add support for attaching data properties to data sources Add data properties for data sources Key: FLINK-1444 URL: https://issues.apache.org/jira/browse/FLINK-1444 Project: Flink Issue Type: New Feature Components: Java API, JobManager, Optimizer Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits. Possible properties are:
- partitioning across splits: all elements of the same key (combination) are contained in one split
- sorting / grouping within splits: elements are sorted or grouped on certain keys within a split
- key uniqueness: a certain key (combination) is unique for all elements of the data source. This property is not defined wrt. input splits.
The optimizer can leverage this information to generate more efficient execution plans. The InputFormat will be responsible for generating input splits such that the promised data properties actually hold. Otherwise, the program will produce invalid results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
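The contract stated above — the InputFormat must actually deliver the properties it promises — can be made concrete with a small check. Below is a self-contained sketch (a hypothetical helper, not part of the API added by this pull request) that verifies the "partitioning across splits" property, i.e. that no key occurs in more than one split.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the invariant behind "partitioning across splits": every key
// must be confined to a single split. Hypothetical helper, not Flink API.
public class SplitPropertyCheck {
    // splits: the list of keys contained in each input split.
    public static boolean isPartitionedByKey(List<List<String>> splits) {
        Map<String, Integer> owner = new HashMap<>();
        for (int split = 0; split < splits.size(); split++) {
            for (String key : splits.get(split)) {
                Integer seenIn = owner.putIfAbsent(key, split);
                if (seenIn != null && seenIn != split) {
                    return false; // key spans two splits: property violated
                }
            }
        }
        return true;
    }
}
```

If such a check fails at runtime, the declared property was a lie, and — as the issue warns — any plan the optimizer built on it would produce invalid results.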
[GitHub] flink pull request: [FLINK-1444][api-extending] Add support for sp...
GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/379 [FLINK-1444][api-extending] Add support for split data properties on data sources This pull request adds support for declaring global and local properties for input splits. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink sourceProperties Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/379.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #379 commit c1f77c1293320094b8d07b357a40960f4a657cf0 Author: Fabian Hueske fhue...@apache.org Date: 2015-02-06T13:28:00Z [FLINK-1444][api-extending] Add support for attaching data properties to data sources
[jira] [Commented] (FLINK-1479) The spawned threads in the sorter have no context class loader
[ https://issues.apache.org/jira/browse/FLINK-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312188#comment-14312188 ] Robert Metzger commented on FLINK-1479: --- Fixed in release-0.8 by [~StephanEwen] in http://git-wip-us.apache.org/repos/asf/flink/commit/44b799d6. The spawned threads in the sorter have no context class loader -- Key: FLINK-1479 URL: https://issues.apache.org/jira/browse/FLINK-1479 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9, 0.8.1 The context class loader of task threads is the user code class loader that has access to the libraries of the user program. The sorter spawns extra threads (reading, sorting, spilling) without setting the context class loader on those threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1479) The spawned threads in the sorter have no context class loader
[ https://issues.apache.org/jira/browse/FLINK-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1479: -- Fix Version/s: 0.8.1 The spawned threads in the sorter have no context class loader -- Key: FLINK-1479 URL: https://issues.apache.org/jira/browse/FLINK-1479 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9, 0.8.1 The context class loader of task threads is the user code class loader that has access to the libraries of the user program. The sorter spawns extra threads (reading, sorting, spilling) without setting the context class loader on those threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1376: -- Fix Version/s: 0.8.1 SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state Key: FLINK-1376 URL: https://issues.apache.org/jira/browse/FLINK-1376 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.8.1 If a TaskManager fatally fails and some of the failing node's slots are SharedSlots, then the slots are not properly released by the JobManager. As a result, the corresponding job is not properly failed, leaving the system in a corrupted state. The reason is that the AllocatedSlot is not aware of being treated as a SharedSlot and thus cannot release the associated SubSlots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-73504724 To make it a bit more explicit which is the sink identifier and which is the task identifier (especially when just one of the two is printed), I prefixed the sink identifier with sinkId and the task identifier with taskId.
[jira] [Updated] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1376: -- Affects Version/s: 0.9 0.8 SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state Key: FLINK-1376 URL: https://issues.apache.org/jira/browse/FLINK-1376 Project: Flink Issue Type: Bug Affects Versions: 0.8, 0.9 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.8.1 If a TaskManager fatally fails and some of the failing node's slots are SharedSlots, then the slots are not properly released by the JobManager. As a result, the corresponding job is not properly failed, leaving the system in a corrupted state. The reason is that the AllocatedSlot is not aware of being treated as a SharedSlot and thus cannot release the associated SubSlots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312194#comment-14312194 ] ASF GitHub Bot commented on FLINK-1486: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-73504724 To make it a bit more explicit what is the sink identifier and what is the task identifier (especially when just one of the two is printed), I prefixed the sink identifier with sinkId and the task identifier with taskId. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Max Michels Assignee: Max Michels Priority: Minor Labels: usability The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat}
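The proposed {{print(String str)}} semantics can be sketched outside Flink as a plain prefix wrapper. The class and method names below are illustrative only, not the actual DataSet API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the behavior proposed in FLINK-1486: each element of a
// data set is rendered with a user-supplied identifier as a prefix.
// PrefixPrint and withPrefix are hypothetical names; Flink's real
// DataSet API is not used here.
public class PrefixPrint {

    // Render each element as "<prefix><element>", the format suggested in the issue.
    static List<String> withPrefix(String prefix, List<Integer> data) {
        List<String> lines = new ArrayList<>();
        for (Integer value : data) {
            lines.add(prefix + value);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        // data.print("MyDataSet: ") would emit one prefixed line per element:
        for (String line : withPrefix("MyDataSet: ", data)) {
            System.out.println(line); // MyDataSet: 1, MyDataSet: 2, ...
        }
    }
}
```

This also illustrates why mxm's sinkId/taskId prefixes above are useful: with a bare prefix alone, output from different sinks or parallel tasks is indistinguishable.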
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312202#comment-14312202 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73505531 @qmlmoon has provided TPCH Query 3 / 10 and WebLogAnalysis examples Create a general purpose framework for language bindings Key: FLINK-377 URL: https://issues.apache.org/jira/browse/FLINK-377 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chesnay Schepler Labels: github-import Fix For: pre-apache A general purpose API to run operators with arbitrary binaries. This will allow to run Stratosphere programs written in Python, JavaScript, Ruby, Go or whatever you like. We suggest using Google Protocol Buffers for data serialization. This is the list of languages that currently support ProtoBuf: https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns Very early prototype with python: https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing protobuf) For Ruby: https://github.com/infochimps-labs/wukong Two new students working at Stratosphere (@skunert and @filiphaase) are working on this. The reference binding language will be for Python, but other bindings are very welcome. The best name for this so far is stratosphere-lang-bindings. I created this issue to track the progress (and give everybody a chance to comment on this) Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/377 Created by: [rmetzger|https://github.com/rmetzger] Labels: enhancement, Assignee: [filiphaase|https://github.com/filiphaase] Created at: Tue Jan 07 19:47:20 CET 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1376] [runtime] Add proper shared slot ...
Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/318
[GitHub] flink pull request: [FLINK-1478] Add support for strictly local in...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/375#discussion_r24327047

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionJobVertex.java ---
@@ -260,15 +260,49 @@ public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult>

     public void scheduleAll(Scheduler scheduler, boolean queued) throws NoResourceAvailableException {
-//      ExecutionVertex[] vertices = this.taskVertices;
-//
-//      for (int i = 0; i < vertices.length; i++) {
-//          ExecutionVertex v = vertices[i];
-//
-//          if (v.get
-//      }
+       ExecutionVertex[] vertices = this.taskVertices;
-       for (ExecutionVertex ev : getTaskVertices()) {
+       // check if we need to do pre-assignment of tasks
+       if (inputSplitsPerSubtask != null) {
+
+           final Map<String, List<Instance>> instances = scheduler.getInstancesByHost();
+           final Map<String, Integer> assignments = new HashMap<String, Integer>();
+
+           for (int i = 0; i < vertices.length; i++) {
+               List<LocatableInputSplit> splitsForHost = inputSplitsPerSubtask[i];
+               if (splitsForHost == null || splitsForHost.isEmpty()) {
+                   continue;
+               }
+
+               String[] hostNames = splitsForHost.get(0).getHostnames();
+               if (hostNames == null || hostNames.length == 0 || hostNames[0] == null) {
+                   continue;
+               }
+
+               String host = hostNames[0];
+               ExecutionVertex v = vertices[i];
+
+               List<Instance> instancesOnHost = instances.get(host);
+
+               if (instancesOnHost == null || instancesOnHost.isEmpty()) {
+                   throw new NoResourceAvailableException("Cannot schedule a strictly local task to host " + host
+                       + ". No TaskManager available on that host.");
+               }
+
+               Integer pos = assignments.get(host);
+               if (pos == null) {
+                   pos = 0;
+                   assignments.put(host, 0);
+               } else {
+                   assignments.put(host, pos + 1 % instancesOnHost.size());
--- End diff --

Doesn't this potentially cause multiple subtasks being assigned to the same instance? I guess that would fail in the scheduler. Shouldn't we catch the case here and return a more detailed exception why the scheduling constraint could not be fulfilled?
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312214#comment-14312214 ] ASF GitHub Bot commented on FLINK-1478: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/375#discussion_r24327047

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionJobVertex.java ---
@@ -260,15 +260,49 @@ public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult>

     public void scheduleAll(Scheduler scheduler, boolean queued) throws NoResourceAvailableException {
-//      ExecutionVertex[] vertices = this.taskVertices;
-//
-//      for (int i = 0; i < vertices.length; i++) {
-//          ExecutionVertex v = vertices[i];
-//
-//          if (v.get
-//      }
+       ExecutionVertex[] vertices = this.taskVertices;
-       for (ExecutionVertex ev : getTaskVertices()) {
+       // check if we need to do pre-assignment of tasks
+       if (inputSplitsPerSubtask != null) {
+
+           final Map<String, List<Instance>> instances = scheduler.getInstancesByHost();
+           final Map<String, Integer> assignments = new HashMap<String, Integer>();
+
+           for (int i = 0; i < vertices.length; i++) {
+               List<LocatableInputSplit> splitsForHost = inputSplitsPerSubtask[i];
+               if (splitsForHost == null || splitsForHost.isEmpty()) {
+                   continue;
+               }
+
+               String[] hostNames = splitsForHost.get(0).getHostnames();
+               if (hostNames == null || hostNames.length == 0 || hostNames[0] == null) {
+                   continue;
+               }
+
+               String host = hostNames[0];
+               ExecutionVertex v = vertices[i];
+
+               List<Instance> instancesOnHost = instances.get(host);
+
+               if (instancesOnHost == null || instancesOnHost.isEmpty()) {
+                   throw new NoResourceAvailableException("Cannot schedule a strictly local task to host " + host
+                       + ". No TaskManager available on that host.");
+               }
+
+               Integer pos = assignments.get(host);
+               if (pos == null) {
+                   pos = 0;
+                   assignments.put(host, 0);
+               } else {
+                   assignments.put(host, pos + 1 % instancesOnHost.size());
--- End diff --

Doesn't this potentially cause multiple subtasks being assigned to the same instance? I guess that would fail in the scheduler. Shouldn't we catch the case here and return a more detailed exception why the scheduling constraint could not be fulfilled?

Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9
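The review comment above flags a Java operator-precedence pitfall: `%` binds tighter than `+`, so `pos + 1 % instancesOnHost.size()` evaluates as `pos + (1 % size)` and the counter never wraps. A small standalone check (the variable names mirror the diff, but the harness itself is illustrative):

```java
// Demonstrates the precedence issue flagged in the review of FLINK-1478:
// `pos + 1 % size` parses as `pos + (1 % size)`, so for size > 1 the counter
// simply increments and eventually indexes past the instance list,
// instead of wrapping round-robin over the TaskManagers on the host.
public class RoundRobinPrecedence {

    static int asWritten(int pos, int size) {
        return pos + 1 % size;        // buggy: equals pos + 1 whenever size > 1
    }

    static int parenthesized(int pos, int size) {
        return (pos + 1) % size;      // intended: wraps into [0, size)
    }

    public static void main(String[] args) {
        int size = 2;                 // e.g. two TaskManager instances on the host
        System.out.println(asWritten(1, size));      // 2 -- out of range for 2 instances
        System.out.println(parenthesized(1, size));  // 0 -- wraps back to the first instance
    }
}
```

With the parenthesized form, consecutive subtasks pinned to the same host cycle over that host's instances, which is presumably the intended round-robin assignment.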
[jira] [Updated] (FLINK-1392) Serializing Protobuf - issue 1
[ https://issues.apache.org/jira/browse/FLINK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1392: -- Fix Version/s: 0.8.1 Serializing Protobuf - issue 1 -- Key: FLINK-1392 URL: https://issues.apache.org/jira/browse/FLINK-1392 Project: Flink Issue Type: Sub-task Affects Versions: 0.8, 0.9 Reporter: Felix Neutatz Assignee: Robert Metzger Priority: Minor Fix For: 0.8.1 Hi, I started to experiment with Parquet using Protobuf. When I use the standard Protobuf class: com.twitter.data.proto.tutorial.AddressBookProtos The code which I run, can be found here: [https://github.com/FelixNeutatz/incubator-flink/blob/ParquetAtFlink/flink-addons/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapreduce/example/ParquetProtobufOutput.java] I get the following exception: {code:xml} Exception in thread main java.lang.Exception: Deserializing the InputFormat (org.apache.flink.api.java.io.CollectionInputFormat) failed: Could not read the user code wrapper: Error while deserializing element from collection at org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:60) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:179) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:172) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:172) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:34) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:27) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:27) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:52) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not read the user code wrapper: Error while deserializing element from collection at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:285) at org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:57) ... 
25 more Caused by: java.io.IOException: Error while deserializing element from collection at org.apache.flink.api.java.io.CollectionInputFormat.readObject(CollectionInputFormat.java:108) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
[GitHub] flink pull request: [FLINK-1495][yarn] Make Akka timeout configura...
GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/377 [FLINK-1495][yarn] Make Akka timeout configurable in YARN client. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink yarn_timeouts Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/377.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #377 commit aedc5ae56d4dd94a20a3b000412ef011f370d24f Author: Robert Metzger rmetz...@apache.org Date: 2015-02-09T13:45:56Z [FLINK-1495][yarn] Make Akka timeout configurable in YARN client.
[GitHub] flink pull request: [FLINK-1179] Add button to JobManager web inte...
Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/374#issuecomment-73509674 @StephanEwen Thanks for your advice! I fixed it.
[jira] [Commented] (FLINK-1179) Add button to JobManager web interface to request stack trace of a TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312231#comment-14312231 ] ASF GitHub Bot commented on FLINK-1179: --- Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/374#issuecomment-73509674 @StephanEwen Thanks for your advice! I fixed it. Add button to JobManager web interface to request stack trace of a TaskManager -- Key: FLINK-1179 URL: https://issues.apache.org/jira/browse/FLINK-1179 Project: Flink Issue Type: New Feature Components: JobManager Reporter: Robert Metzger Assignee: Chiwan Park Priority: Minor Labels: starter This is something I do quite often manually and I think it might be helpful for users as well.
[jira] [Commented] (FLINK-1396) Add hadoop input formats directly to the user API.
[ https://issues.apache.org/jira/browse/FLINK-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312233#comment-14312233 ] ASF GitHub Bot commented on FLINK-1396: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/363 Add hadoop input formats directly to the user API. -- Key: FLINK-1396 URL: https://issues.apache.org/jira/browse/FLINK-1396 Project: Flink Issue Type: Bug Reporter: Robert Metzger Assignee: Aljoscha Krettek Priority: Minor
[jira] [Resolved] (FLINK-1303) HadoopInputFormat does not work with Scala API
[ https://issues.apache.org/jira/browse/FLINK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek resolved FLINK-1303. - Resolution: Fixed Fix Version/s: 0.8.1 Resolved in https://github.com/apache/flink/commit/8b3805ba5905c3d84f3e0631bc6090a618df8e90 HadoopInputFormat does not work with Scala API -- Key: FLINK-1303 URL: https://issues.apache.org/jira/browse/FLINK-1303 Project: Flink Issue Type: Sub-task Components: Scala API Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek Fix For: 0.9, 0.8.1 It fails because the HadoopInputFormat uses the Flink Tuple2 type. For this, type extraction fails at runtime.
[jira] [Resolved] (FLINK-1396) Add hadoop input formats directly to the user API.
[ https://issues.apache.org/jira/browse/FLINK-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek resolved FLINK-1396. - Resolution: Fixed Fix Version/s: 0.8.1 0.9 Resolved in https://github.com/apache/flink/commit/8b3805ba5905c3d84f3e0631bc6090a618df8e90 Add hadoop input formats directly to the user API. -- Key: FLINK-1396 URL: https://issues.apache.org/jira/browse/FLINK-1396 Project: Flink Issue Type: Bug Reporter: Robert Metzger Assignee: Aljoscha Krettek Priority: Minor Fix For: 0.9, 0.8.1
[jira] [Created] (FLINK-1495) Make Akka timeout configurable in YARN client.
Robert Metzger created FLINK-1495: - Summary: Make Akka timeout configurable in YARN client. Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor
[GitHub] flink pull request: Port FLINK-1391 and FLINK-1392 to release-0.8...
Github user rmetzger closed the pull request at: https://github.com/apache/flink/pull/364
[GitHub] flink pull request: Port FLINK-1391 and FLINK-1392 to release-0.8...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/364#issuecomment-73510972 Merged.
[jira] [Commented] (FLINK-1391) Kryo fails to properly serialize avro collection types
[ https://issues.apache.org/jira/browse/FLINK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312239#comment-14312239 ] Robert Metzger commented on FLINK-1391: --- Pushed fix into release-0.8: 84c4998125b175ee524cec4292ab29060784861c Kryo fails to properly serialize avro collection types -- Key: FLINK-1391 URL: https://issues.apache.org/jira/browse/FLINK-1391 Project: Flink Issue Type: Sub-task Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.8.1 Before FLINK-610, Avro was the default generic serializer. Now, special types coming from Avro are handled by Kryo .. which seems to cause errors like: {code} Exception in thread main org.apache.flink.runtime.client.JobExecutionException: java.lang.NullPointerException at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148) at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244) at org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56) at org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71) at org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189) at org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176) at org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51) at 
org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53) at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:744) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1391) Kryo fails to properly serialize avro collection types
[ https://issues.apache.org/jira/browse/FLINK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312240#comment-14312240 ] ASF GitHub Bot commented on FLINK-1391: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/364#issuecomment-73510972 Merged. Kryo fails to properly serialize avro collection types -- Key: FLINK-1391 URL: https://issues.apache.org/jira/browse/FLINK-1391 Project: Flink Issue Type: Sub-task Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.8.1 Before FLINK-610, Avro was the default generic serializer. Now, special types coming from Avro are handled by Kryo .. which seems to cause errors like: {code} Exception in thread main org.apache.flink.runtime.client.JobExecutionException: java.lang.NullPointerException at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148) at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244) at org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56) at org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71) at org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189) at org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176) at 
org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53) at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:744) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1495) Make Akka timeout configurable in YARN client.
[ https://issues.apache.org/jira/browse/FLINK-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312246#comment-14312246 ] ASF GitHub Bot commented on FLINK-1495: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/377 [FLINK-1495][yarn] Make Akka timeout configurable in YARN client. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink yarn_timeouts Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/377.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #377 commit aedc5ae56d4dd94a20a3b000412ef011f370d24f Author: Robert Metzger rmetz...@apache.org Date: 2015-02-09T13:45:56Z [FLINK-1495][yarn] Make Akka timeout configurable in YARN client. Make Akka timeout configurable in YARN client. -- Key: FLINK-1495 URL: https://issues.apache.org/jira/browse/FLINK-1495 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
[ https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312172#comment-14312172 ] ASF GitHub Bot commented on FLINK-1492: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/376#issuecomment-73500870 This change is also in 0.8 so do we need to apply the fix there as well for the upcoming 0.8.1 release? Exceptions on shutdown concerning BLOB store cleanup Key: FLINK-1492 URL: https://issues.apache.org/jira/browse/FLINK-1492 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Ufuk Celebi Fix For: 0.9 The following stack traces occur not every time, but frequently. {code} java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292) at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369) at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63) at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at 
akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly. java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7 at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172) at akka.actor.ActorCell.terminate(ActorCell.scala:369) at 
akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at
[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/376#issuecomment-73500870 This change is also in 0.8, so do we need to apply the fix there as well for the upcoming 0.8.1 release?
[GitHub] flink pull request: [FLINK-1463] Fix stateful/stateless Serializer...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/353#issuecomment-73501310 Manually merged.
[jira] [Resolved] (FLINK-1463) RuntimeStatefulSerializerFactory declares ClassLoader as transient but later tries to use it
[ https://issues.apache.org/jira/browse/FLINK-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek resolved FLINK-1463. - Resolution: Fixed Resolved in https://github.com/apache/flink/commit/7bc78cbf97d341ebfed32fdfe20f21e4d146a869 RuntimeStatefulSerializerFactory declares ClassLoader as transient but later tries to use it Key: FLINK-1463 URL: https://issues.apache.org/jira/browse/FLINK-1463 Project: Flink Issue Type: Bug Affects Versions: 0.8, 0.9 Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek Priority: Blocker Fix For: 0.8.1 At least one user has seen an exception because of this. In theory, the ClassLoader is set again in readParametersFromConfig. But the way it is used in TupleComparatorBase, this method is never called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73505531 @qmlmoon has provided TPCH Query 3 / 10 and WebLogAnalysis examples
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312200#comment-14312200 ] ASF GitHub Bot commented on FLINK-1478: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/375#issuecomment-73505358 Only minor remarks. Looks good otherwise. Add strictly local input split assignment - Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Fabian Hueske Fix For: 0.9
[jira] [Commented] (FLINK-1487) Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case
[ https://issues.apache.org/jira/browse/FLINK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312156#comment-14312156 ] Robert Metzger commented on FLINK-1487: --- I had another instance of this issue, in this case: {code} Failed tests: SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:102 but was:101 {code} https://travis-ci.org/apache/flink/jobs/50039482 Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case - Key: FLINK-1487 URL: https://issues.apache.org/jira/browse/FLINK-1487 Project: Flink Issue Type: Bug Reporter: Till Rohrmann I got the following failure on travis: {{SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:107 but was:106}} The failure does not occur consistently on travis.
[jira] [Updated] (FLINK-1342) Quickstart's assembly can possibly filter out user's code
[ https://issues.apache.org/jira/browse/FLINK-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1342: -- Priority: Critical (was: Major) Quickstart's assembly can possibly filter out user's code - Key: FLINK-1342 URL: https://issues.apache.org/jira/browse/FLINK-1342 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Márton Balassi Priority: Critical Fix For: 0.9, 0.8.1 I've added a quick solution for [1] for the time being. The assembly still filters out everything from the org.apache.flink namespace, so any user code placed there will be missing from the fat jar. If we do not use filtering at all the size of the jar goes up to almost 100 MB. [1] https://issues.apache.org/jira/browse/FLINK-1225
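The trade-off in FLINK-1342 (excluding the org.apache.flink namespace keeps the fat jar small, but also silently drops user classes placed there) corresponds to an artifact exclusion in the quickstart's packaging. The fragment below is an illustrative maven-shade-plugin sketch of that pattern only; it is not the actual quickstart pom:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <excludes>
        <!-- Keeps Flink's own dependencies out of the fat jar (otherwise
             it grows to almost 100 MB), but any user code that happens to
             live under org.apache.flink is filtered out as well. -->
        <exclude>org.apache.flink:*</exclude>
      </excludes>
    </artifactSet>
  </configuration>
</plugin>
```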
[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
[ https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312173#comment-14312173 ] ASF GitHub Bot commented on FLINK-1492: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/376#issuecomment-73500986 Yes, if it is finally OK. Exceptions on shutdown concerning BLOB store cleanup Key: FLINK-1492 URL: https://issues.apache.org/jira/browse/FLINK-1492 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Ufuk Celebi Fix For: 0.9 The following stack traces occur not every time, but frequently. {code} java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292) at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369) at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63) at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at 
akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly. java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7 at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171) at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173) at akka.actor.Actor$class.aroundPostStop(Actor.scala:475) at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86) at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210) at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172) at akka.actor.ActorCell.terminate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) at 
akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at
[jira] [Commented] (FLINK-1494) Build fails on BlobCacheTest
[ https://issues.apache.org/jira/browse/FLINK-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312177#comment-14312177 ] Ufuk Celebi commented on FLINK-1494: Repeatedly as in sometimes or always? ;) The only thing that touched the blob store recently were the shutdown hooks. Does reverting commits e766dba and 8803304 solve this issue? If yes, it seems to be related to the hooks. :( Build fails on BlobCacheTest Key: FLINK-1494 URL: https://issues.apache.org/jira/browse/FLINK-1494 Project: Flink Issue Type: Bug Components: Local Runtime, TaskManager Environment: Apache Maven 3.0.5 Maven home: /usr/share/maven Java version: 1.7.0_65, vendor: Oracle Corporation Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre Default locale: en_US, platform encoding: UTF-8 OS name: linux, version: 3.16.0-4-amd64, arch: amd64, family: unix Reporter: Fabian Hueske Building Flink with Maven repeatedly fails with the following error: {code} Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.283 sec FAILURE! - in org.apache.flink.runtime.blob.BlobCacheTest testBlobCache(org.apache.flink.runtime.blob.BlobCacheTest) Time elapsed: 127.282 sec FAILURE! 
java.lang.AssertionError: Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:56760 at org.junit.Assert.fail(Assert.java:88) at org.apache.flink.runtime.blob.BlobCacheTest.testBlobCache(BlobCacheTest.java:109) java.io.IOException: Could not connect to BlobServer at address 0.0.0.0/0.0.0.0:52657 at org.apache.flink.runtime.blob.BlobClient.init(BlobClient.java:61) at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest.testLibraryCacheManagerCleanup(BlobLibraryCacheManagerTest.java:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) Caused by: java.net.ConnectException: Connection timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at java.net.Socket.connect(Socket.java:538) at org.apache.flink.runtime.blob.BlobClient.init(BlobClient.java:59) ... 24 more Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.299 sec FAILURE! - in org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest testLibraryCacheManagerCleanup(org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManagerTest) Time elapsed: 127.298 sec FAILURE! java.lang.AssertionError: Could not
[jira] [Commented] (FLINK-1463) RuntimeStatefulSerializerFactory declares ClassLoader as transient but later tries to use it
[ https://issues.apache.org/jira/browse/FLINK-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312176#comment-14312176 ] ASF GitHub Bot commented on FLINK-1463: --- Github user aljoscha closed the pull request at: https://github.com/apache/flink/pull/353
[jira] [Closed] (FLINK-1494) Build fails on BlobCacheTest
[ https://issues.apache.org/jira/browse/FLINK-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske closed FLINK-1494. Resolution: Not a Problem The build failure was caused by stale build artifacts, i.e., Maven's {{clean}} target had not been run. Building with {{mvn clean test}} succeeds.
[jira] [Commented] (FLINK-1493) Support for streaming jobs preserving global ordering of records
[ https://issues.apache.org/jira/browse/FLINK-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312182#comment-14312182 ] Gyula Fora commented on FLINK-1493: --- Hey, this is actually quite tricky to make robust and still performant. Give me a day and I will implement something like this, because I am rewriting the readers for our fault-tolerance ideas to also mind the superstep barriers. I will get back to you when it is done, and it should be trivial to implement what you want on top of it. To the generics questions: on the first you are right, I fixed it :) On the second, we use serialization delegates to handle serialization. A delegate is a wrapper that contains a type serializer; this just allows more convenient implementations. Support for streaming jobs preserving global ordering of records Key: FLINK-1493 URL: https://issues.apache.org/jira/browse/FLINK-1493 Project: Flink Issue Type: New Feature Components: Streaming Reporter: Márton Balassi Distributed streaming jobs do not give total, global ordering guarantees for records; only partial ordering is provided by the system: records travelling on the same exact route of the physical plan are ordered, but records on different routes are not. It turns out that this feature, although it can only be implemented by merge-sorting the input buffers on a timestamp field and thus introduces substantial latency, is still desired for a number of applications. Just a heads-up for the implementation: the sorting introduces back pressure in the buffers and might cause deadlocks.
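The merge-sorting idea discussed in FLINK-1493 can be illustrated outside Flink with plain Java: given per-channel buffers that are each already ordered by timestamp (the partial ordering the system guarantees), a priority queue over the channel heads restores a single global order. Event and the buffer layout are hypothetical, and this ignores the buffering/back-pressure concerns the comment raises:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class OrderedMerge {
    // Hypothetical timestamped record.
    static class Event {
        final long timestamp;
        final String payload;
        Event(long timestamp, String payload) { this.timestamp = timestamp; this.payload = payload; }
    }

    // k-way merge of per-channel buffers, each assumed sorted by timestamp.
    static List<Event> merge(List<List<Event>> channels) {
        // Each queue entry is {channel index, position}; ordered by the
        // timestamp of the event it points at.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
            Comparator.comparingLong((int[] h) -> channels.get(h[0]).get(h[1]).timestamp));
        for (int c = 0; c < channels.size(); c++) {
            if (!channels.get(c).isEmpty()) heads.add(new int[]{c, 0});
        }
        List<Event> out = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] h = heads.poll();
            List<Event> channel = channels.get(h[0]);
            out.add(channel.get(h[1]));
            // Advance this channel's cursor if it has more events.
            if (h[1] + 1 < channel.size()) heads.add(new int[]{h[0], h[1] + 1});
        }
        return out;
    }
}
```

Note that a real streaming implementation cannot drain finite lists: if one channel's buffer is empty, the merge must block until an event (or a watermark-like bound) arrives on it, which is exactly where the back-pressure and deadlock concerns come from.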
[jira] [Commented] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312192#comment-14312192 ] ASF GitHub Bot commented on FLINK-1376: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/318#issuecomment-73504622 I've merged it to the 0.8 branch. I forgot to close the PR, could you do it manually? Thank you. SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state Key: FLINK-1376 URL: https://issues.apache.org/jira/browse/FLINK-1376 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.8.1 In case that the TaskManager fatally fails and some of the failing node's slots are SharedSlots, then the slots are not properly released by the JobManager. This causes that the corresponding job will not be properly failed, leaving the system in a corrupted state. The reason for that is that the AllocatedSlot is not aware of being treated as a SharedSlot and thus he cannot release the associated SubSlots.
[jira] [Commented] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312191#comment-14312191 ] Robert Metzger commented on FLINK-1376: --- Resolved for release-0.8 with target 0.8.1 in http://git-wip-us.apache.org/repos/asf/flink/commit/91382bb8.
[GitHub] flink pull request: [FLINK-1376] [runtime] Add proper shared slot ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/318#issuecomment-73504622 I've merged it to the 0.8 branch. I forgot to close the PR, could you do it manually? Thank you.
[GitHub] flink pull request: [FLINK-1478] Add support for strictly local in...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/375#issuecomment-73505358 Only minor remarks. Looks good otherwise.
[jira] [Commented] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
[ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312203#comment-14312203 ] ASF GitHub Bot commented on FLINK-1376: --- Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/318 SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state Key: FLINK-1376 URL: https://issues.apache.org/jira/browse/FLINK-1376 Project: Flink Issue Type: Bug Affects Versions: 0.8, 0.9 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.8.1 In case that the TaskManager fatally fails and some of the failing node's slots are SharedSlots, then the slots are not properly released by the JobManager. This causes that the corresponding job will not be properly failed, leaving the system in a corrupted state. The reason for that is that the AllocatedSlot is not aware of being treated as a SharedSlot and thus he cannot release the associated SubSlots.
[jira] [Commented] (FLINK-703) Use complete element as join key.
[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313163#comment-14313163 ] Fabian Hueske commented on FLINK-703: - I propose wildcard field expressions ( {{*}} and {{_}} ) to define full elements (of any type) as join, group, or cogroup keys (see the Define keys using Field Expressions section in the [programming guide|https://github.com/apache/flink/blob/master/docs/programming_guide.md]). Use complete element as join key. - Key: FLINK-703 URL: https://issues.apache.org/jira/browse/FLINK-703 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chiwan Park Priority: Trivial Labels: github-import Fix For: pre-apache In some situations such as semi-joins it could make sense to use a complete element as join key. Currently this can be done using a key-selector function, but we could offer a shortcut for that. This is not an urgent issue, but might be helpful. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/703 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, java api, user satisfaction, Milestone: Release 0.6 (unplanned) Created at: Thu Apr 17 23:40:00 CEST 2014 State: open
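The semantics proposed in FLINK-703, using the complete element as the key (i.e., matching on the element's own equality rather than on a field), can be illustrated outside the Flink API with a plain-Java semi-join sketch; class and method names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiJoinDemo {
    // Keep every left element that also appears, as a whole, on the right
    // side. The "key" is the element itself via its equals()/hashCode(),
    // which is what a wildcard key expression (or an identity key selector)
    // would express.
    static <T> List<T> semiJoin(List<T> left, List<T> right) {
        Set<T> rightKeys = new HashSet<>(right);
        List<T> out = new ArrayList<>();
        for (T element : left) {
            if (rightKeys.contains(element)) out.add(element);
        }
        return out;
    }
}
```

The wildcard expression would simply be a shortcut for an identity key selector, so any type with consistent equals()/hashCode() could be used directly as a join, group, or cogroup key.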
[GitHub] flink pull request: [FLINK-1478] Add support for strictly local in...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/375
[jira] [Commented] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312311#comment-14312311 ] ASF GitHub Bot commented on FLINK-1478: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/375
[jira] [Resolved] (FLINK-1478) Add strictly local input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1478. - Resolution: Fixed Fixed in 4386620c06e94c9f4e3030ea7ae0f480845e2969