Haha, yes, actually I just confirmed! If I flip my args, I get the error you mention in the first e-mail. You were trying to generate a graph with the edge list given as the vertex list, and that is far too big a dataset for your memory settings (~15M edges vs. the actual ~400K vertices).
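Since both input files are whitespace-separated pairs of IDs, a swapped-argument mistake like this is easy to make and hard to spot. A minimal, Flink-free sketch of a sanity check (the `looksSwapped` helper is hypothetical, not part of the attached job): it simply compares line counts, relying on the fact that the edge list (~15M lines) dwarfs the vertex list (~400K lines).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ArgOrderCheck {

    /** Heuristic: the vertex file should never have more lines than the edge file. */
    static boolean looksSwapped(long edgeLines, long vertexLines) {
        return vertexLines > edgeLines;
    }

    public static void main(String[] args) throws IOException {
        // Tiny stand-ins for the real inputs: 3 edges, 2 vertices.
        Path edges = Files.createTempFile("edges", ".txt");
        Files.write(edges, List.of("0 1", "1 2", "2 0"));
        Path vertices = Files.createTempFile("vertices", ".txt");
        Files.write(vertices, List.of("0 0", "1 1"));

        long edgeLines = Files.lines(edges).count();
        long vertexLines = Files.lines(vertices).count();

        System.out.println(looksSwapped(edgeLines, vertexLines));   // correct order
        System.out.println(looksSwapped(vertexLines, edgeLines));   // flipped args
    }
}
```

A check like this before building the graph would turn the cryptic compaction failure into an immediate, readable error.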
I hope that clears everything up :-)

Cheers,
V.

On 18 March 2015 at 23:44, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Well, one thing I notice is that your vertices and edges args are flipped.
> Might be the source of the error :-)
>
> On 18 March 2015 at 23:04, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>
>> I'm also using 0 as sourceID. The exact program arguments:
>>
>> 0 /home/vieru/dev/flink-experiments/data/social_network.edgelist
>> /home/vieru/dev/flink-experiments/data/social_network.verticeslist
>> /home/vieru/dev/flink-experiments/sssp-output-higgstwitter 10
>>
>> And yes, I call both methods on the initialized Graph *mappedInput*.
>> I don't understand why the distances are computed correctly for the
>> small graph (also read from files) but not for the larger one.
>> The messages appear to be wrong in the latter case.
>>
>> On 18.03.2015 21:55, Vasiliki Kalavri wrote:
>>
>> hmm, I'm starting to run out of ideas...
>> What's your source ID parameter? I ran mine with 0.
>> About the result, you call both createVertexCentricIteration() and
>> runVertexCentricIteration() on the initialized graph, right?
>>
>> On 18 March 2015 at 22:33, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>
>>> Hi Vasia,
>>>
>>> yes, I am using the latest master. I just did a pull again and the
>>> problem persists. Perhaps Robert could confirm as well.
>>>
>>> I've set the solution set to unmanaged in SSSPUnweighted as Stephan
>>> proposed and the job finishes. So I am able to proceed using this
>>> workaround.
>>>
>>> An odd thing occurs now though. The distances aren't computed correctly
>>> for the SNAP graph and remain the ones set in InitVerticesMapper(). For
>>> the small graph in SSSPDataUnweighted they are OK. I'm currently
>>> investigating this behavior.
>>>
>>> Cheers,
>>> Mihail
>>>
>>> On 18.03.2015 20:55, Vasiliki Kalavri wrote:
>>>
>>> Hi Mihail,
>>>
>>> I used your code to generate the vertex file, then gave this and the
>>> edge list as input to your SSSP implementation and still couldn't
>>> reproduce the exception. I'm using the same local setup as I describe
>>> above.
>>> I'm not aware of any recent changes that might be relevant, but, just
>>> in case, are you using the latest master?
>>>
>>> Cheers,
>>> V.
>>>
>>> On 18 March 2015 at 19:21, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>
>>>> Hi Vasia,
>>>>
>>>> I have used a simple job (attached) to generate a file which looks
>>>> like this:
>>>>
>>>> 0 0
>>>> 1 1
>>>> 2 2
>>>> ...
>>>> 456629 456629
>>>> 456630 456630
>>>>
>>>> I need the vertices to be generated from a file for my future work.
>>>>
>>>> Cheers,
>>>> Mihail
>>>>
>>>> On 18.03.2015 17:04, Vasiliki Kalavri wrote:
>>>>
>>>> Hi Mihail, Robert,
>>>>
>>>> I've tried reproducing this, but I couldn't.
>>>> I'm using the same twitter input graph from SNAP that you link to and
>>>> also the Scala IDE.
>>>> The job finishes without a problem (both the SSSP example from Gelly
>>>> and the unweighted version).
>>>>
>>>> The only thing I changed to run your version was creating the graph
>>>> from the edge set only, i.e. like this:
>>>>
>>>> Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
>>>>     new MapFunction<Long, Long>() {
>>>>         public Long map(Long value) {
>>>>             return Long.MAX_VALUE;
>>>>         }
>>>>     }, env);
>>>>
>>>> Since the twitter input is an edge list, how do you generate the
>>>> vertex dataset in your case?
>>>>
>>>> Thanks,
>>>> -Vasia.
>>>>
>>>> On 18 March 2015 at 16:54, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> great! Thanks!
>>>>>
>>>>> I really need this bug fixed because I'm laying the groundwork for my
>>>>> Diplom thesis and I need to be sure that the Gelly API is reliable
>>>>> and can handle large datasets as intended.
>>>>>
>>>>> Cheers,
>>>>> Mihail
>>>>>
>>>>> On 18.03.2015 15:40, Robert Waury wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I managed to reproduce the behavior and as far as I can tell it seems
>>>>> to be a problem with the memory allocation.
>>>>>
>>>>> I have filed a bug report in JIRA to get the attention of somebody
>>>>> who knows the runtime better than I do.
>>>>>
>>>>> https://issues.apache.org/jira/browse/FLINK-1734
>>>>>
>>>>> Cheers,
>>>>> Robert
>>>>>
>>>>> On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>>>
>>>>>> Hi Robert,
>>>>>>
>>>>>> thank you for your reply.
>>>>>>
>>>>>> I'm starting the job from the Scala IDE. So only one JobManager and
>>>>>> one TaskManager in the same JVM.
>>>>>> I've doubled the memory in the eclipse.ini settings but I still get
>>>>>> the exception.
>>>>>>
>>>>>> -vmargs
>>>>>> -Xmx2048m
>>>>>> -Xms100m
>>>>>> -XX:MaxPermSize=512m
>>>>>>
>>>>>> Best,
>>>>>> Mihail
>>>>>>
>>>>>> On 17.03.2015 10:11, Robert Waury wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> can you tell me how much memory your job has and how many workers
>>>>>> you are running?
>>>>>>
>>>>>> From the trace it seems the internal hash table allocated only 7 MB
>>>>>> for the graph data and therefore runs out of memory pretty quickly.
>>>>>>
>>>>>> Skewed data could also be an issue, but with a minimum of 5 pages
>>>>>> and a maximum of 8 it seems to be distributed fairly evenly across
>>>>>> the different partitions.
>>>>>>
>>>>>> Cheers,
>>>>>> Robert
>>>>>>
>>>>>> On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>>>>
>>>>>>> And the correct SSSPUnweighted attached.
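As an aside on Vasiliki's edge-only construction quoted above: `Graph.fromDataSet(edges, mapper, env)` derives the vertex set from the edge endpoints and runs the `MapFunction` to produce each initial vertex value. A rough Flink-free sketch of that same idea, using plain Java collections instead of DataSets (the class and method names here are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EdgeOnlyVertices {

    /** Collect every vertex ID appearing in an edge and give it the
     *  initial value Long.MAX_VALUE ("distance unknown"), mirroring
     *  what the initializer MapFunction does in the quoted snippet. */
    static Map<Long, Long> initVerticesFromEdges(List<long[]> edges) {
        Map<Long, Long> vertices = new TreeMap<>();
        for (long[] e : edges) {
            vertices.put(e[0], Long.MAX_VALUE); // source endpoint
            vertices.put(e[1], Long.MAX_VALUE); // target endpoint
        }
        return vertices;
    }

    public static void main(String[] args) {
        List<long[]> edges = Arrays.asList(
                new long[]{0, 1}, new long[]{1, 2}, new long[]{2, 0});
        System.out.println(initVerticesFromEdges(edges).size()); // 3 distinct vertices
    }
}
```

This also shows why the separately generated vertex file in the thread is only needed when vertices must come from a file; otherwise the edge set alone determines them.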
>>>>>>>
>>>>>>> On 17.03.2015 01:23, Mihail Vieru wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm getting the following RuntimeException for an adaptation of
>>>>>>>> the SingleSourceShortestPaths example using the Gelly API (see
>>>>>>>> attachment). It's been adapted for unweighted graphs having
>>>>>>>> vertices with Long values.
>>>>>>>>
>>>>>>>> As an input graph I'm using the social network graph (~200MB
>>>>>>>> unpacked) from here:
>>>>>>>> https://snap.stanford.edu/data/higgs-twitter.html
>>>>>>>>
>>>>>>>> For the small SSSPDataUnweighted graph (also attached) it
>>>>>>>> terminates and computes the distances correctly.
>>>>>>>>
>>>>>>>> 03/16/2015 17:18:23 IterationHead(WorksetIteration
>>>>>>>> (Vertex-centric iteration
>>>>>>>> (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4
>>>>>>>> |
>>>>>>>> org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4)
>>>>>>>> switched to FAILED
>>>>>>>> java.lang.RuntimeException: Memory ran out. Compaction failed.
>>>>>>>> numPartitions: 32 minPartition: 5 maxPartition: 8 number of
>>>>>>>> overflow segments: 176 bucketSize: 217 Overall memory: 20316160
>>>>>>>> Partition memory: 7208960 Message: Index: 8, Size: 7
>>>>>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
>>>>>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
>>>>>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
>>>>>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
>>>>>>>>     at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
>>>>>>>>     at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Mihail
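For reference, the figures in the exception message line up with Robert's "only 7 MB" reading, assuming Flink's default 32 KB memory segment (page) size; a quick arithmetic check:

```java
public class TraceMath {
    public static void main(String[] args) {
        final long pageSize = 32 * 1024;       // Flink's default memory segment size
        final long overallMemory = 20_316_160; // values from the exception message
        final long partitionMemory = 7_208_960;
        final long numPartitions = 32;

        System.out.println(overallMemory / pageSize);    // pages of managed memory overall
        System.out.println(partitionMemory / pageSize);  // pages actually holding partition data
        System.out.println((double) (partitionMemory / pageSize) / numPartitions);
        // average pages per partition
    }
}
```

This prints 620, 220, and 6.875: the average of ~6.9 pages per partition sits between the reported minPartition of 5 and maxPartition of 8, supporting the conclusion that the data is spread fairly evenly and the failure is a genuine shortage of managed memory rather than skew.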