[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/570#issuecomment-214833775 Sorry for not driving this further. I think this pull request is now way too outdated to have any chance of rebasing it to the current master, therefore I will close it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...
Github user markus-h closed the pull request at: https://github.com/apache/flink/pull/570
[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...
Github user markus-h closed the pull request at: https://github.com/apache/flink/pull/598
[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/598#issuecomment-93987347 There is no specific use case, but when you try to process big graphs locally you often run out of memory with delta iterations. The reason I needed this change is a different one, though. I am doing research on failure recovery methods in graph analysis. Most Pregel-like systems simply checkpoint all vertices in full. In Flink this was far easier to implement with a bulk iteration than with delta iterations, so I decided to provide Gelly with this mode.
[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/598#issuecomment-94008273 Hi @vasia, thanks for your comments! I thought about this extension in a different way. Whenever you have a graph that is too big to process with a delta iteration, you could just turn on bulk mode to get the computation done. It will be a lot slower, but sometimes this might be better than not getting any results. I don't think a dedicated bulk operator would be very useful. People can just use plain Flink if they don't need the Pregel abstraction, and in most cases it would be much slower than the current solution. You know Gelly and its use cases a lot better than I do. If you don't think that a mode like this would be useful, I am totally fine with that. It is a very small change anyway.
[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...
GitHub user markus-h opened a pull request:

    https://github.com/apache/flink/pull/598

    [FLINK-1885] [gelly] Added bulk execution mode to gellys vertex centric iterations

    See https://issues.apache.org/jira/browse/FLINK-1885

    I essentially exchanged the delta iteration with a bulk iteration and made the coGroup of the VertexUpdateUdf behave like an outer join, so that vertices that are not changed in one superstep are kept around in the next one.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markus-h/incubator-flink gellyBulkMode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/598.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #598

commit e3641c88ea260dbb533015adfb6ef44272a2e615
Author: Markus Holzemer markus.holze...@gmx.de
Date:   2015-04-13T15:55:03Z

    Added bulk execution mode to gellys vertex centric iterations
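The difference described in the pull request above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Gelly or Flink code from the patch: the class `BulkSuperstepSketch`, the methods `deltaStep` and `bulkStep`, and the "take the minimum" update rule are all hypothetical. The delta-style step emits only the vertices that changed, while the bulk-style step acts like an outer join on the vertex side and carries untouched vertices into the next superstep.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration in plain Java; this is not Gelly or Flink code. */
final class BulkSuperstepSketch {

    /** Assumed update rule for the example: keep the smaller of the current value and the best message. */
    static long update(long current, List<Long> messages) {
        return Math.min(current, Collections.min(messages));
    }

    /** Delta-style step: the result holds only the vertices whose value changed. */
    static Map<Long, Long> deltaStep(Map<Long, Long> vertices, Map<Long, List<Long>> messages) {
        Map<Long, Long> changed = new HashMap<>();
        for (Map.Entry<Long, List<Long>> entry : messages.entrySet()) {
            Long current = vertices.get(entry.getKey());
            if (current == null || entry.getValue().isEmpty()) {
                continue; // no such vertex, or nothing to apply
            }
            long updated = update(current, entry.getValue());
            if (updated != current) {
                changed.put(entry.getKey(), updated);
            }
        }
        return changed;
    }

    /** Bulk-style step: every vertex is emitted, with or without messages (outer join on the vertex side). */
    static Map<Long, Long> bulkStep(Map<Long, Long> vertices, Map<Long, List<Long>> messages) {
        Map<Long, Long> next = new HashMap<>();
        for (Map.Entry<Long, Long> vertex : vertices.entrySet()) {
            List<Long> inbox = messages.get(vertex.getKey());
            if (inbox == null || inbox.isEmpty()) {
                next.put(vertex.getKey(), vertex.getValue()); // unchanged vertex is kept around
            } else {
                next.put(vertex.getKey(), update(vertex.getValue(), inbox));
            }
        }
        return next;
    }
}
```

In this sketch the bulk step rebuilds the complete vertex set on every superstep, whereas the delta step returns only the changed entries and relies on state kept elsewhere for the rest.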
[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/570#issuecomment-90694363 There seems to be a race condition somewhere in my code, but I have trouble finding it since I cannot reproduce it locally. I thought my last change would fix it, but it didn't. So if somebody has some free time and knows a bit about race conditions, feel free to help me :-)
[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/570#issuecomment-89849730 Thanks for your comments! I will try to revert my formatting changes. I am used to pressing Ctrl+F while programming, which probably changed the formatting. I also got rid of the Thread.sleep(). The problem was actually a different one. Akka delivered the same response object to threads on the same machine, but I had assumed they would receive copies. Now I make a hard copy of the response, and that seems to fix the problem. I am not sure how to test the interaction between IterationHead and JM though. Is there some similar test case that I could use as a basis?
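The aliasing problem described in the comment above can be illustrated without Akka. The following is a minimal plain-Java sketch under assumptions: `Response`, `copy`, and `consume` are hypothetical names, not types from the pull request. It shows that a receiver mutating a shared object affects every other holder of the same reference, and that handing out a copy avoids this.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; none of these types come from Flink or Akka. */
final class SharedResponseSketch {

    /** A mutable response, standing in for whatever object the receivers are handed. */
    static final class Response {
        final Map<String, Long> aggregates;

        Response(Map<String, Long> aggregates) {
            this.aggregates = aggregates;
        }

        /** Returns an independent copy; copying the map suffices because keys and values are immutable. */
        Response copy() {
            return new Response(new HashMap<>(aggregates));
        }
    }

    /** Simulates a receiver that modifies the response it was given. */
    static void consume(Response response) {
        response.aggregates.put("count", 0L);
    }
}
```

Passing `response.copy()` to each receiver keeps the original intact, at the cost of one extra allocation per delivery. If the response held mutable values, the copy would have to go deeper than the top-level map.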
[GitHub] flink pull request: Reworking of Iteration Synchronization, Accumu...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/36#issuecomment-89529411 This change is continued in https://github.com/apache/flink/pull/570
[GitHub] flink pull request: Reworking of Iteration Synchronization, Accumu...
Github user markus-h closed the pull request at: https://github.com/apache/flink/pull/36
[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...
GitHub user markus-h opened a pull request:

    https://github.com/apache/flink/pull/570

    [FLINK-951] Reworking of Iteration Synchronization, Accumulators and Aggregators

    Iteration synchronization through JobManager
    Unification of Accumulators and Aggregators (removal of former Aggregators)
    Adjusted testcases accordingly

    I redid the work of my very old pull request https://github.com/apache/flink/pull/36
    A more detailed description can be found in JIRA: https://issues.apache.org/jira/browse/FLINK-951

    I came across some unexpected behaviour with Akka that made a small hack necessary. Perhaps somebody with more experience in Akka can find a better solution. See IterationHeadPactTask line 392.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markus-h/incubator-flink iterationsAndAccumulatorsRework2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/570.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #570

commit 5492487892ff99f10fccdb075404dedaa3371ff7
Author: Markus Holzemer markus.holze...@gmx.de
Date:   2015-04-02T15:56:19Z

    Iteration synchronization through JobManager
    Unification of Accumulators and Aggregators (removal of former Aggregators)
    Adjusted testcases accordingly
[GitHub] flink pull request: Reworking of Iteration Synchronization, Accumu...
Github user markus-h commented on the pull request: https://github.com/apache/flink/pull/36#issuecomment-88924335 I have continued working on this. I will try to integrate this change into the current master.