Re: Creating a representative streaming workload
Hi,

Sorry for the ultra-late reply. Another real-life streaming scenario would be the one I am working on:
- collecting data from telecom cells in real time,
- filtering out certain information, or enriching/correlating events (adding additional info based on the parameters received).
This is done in order to understand what is happening in the network and to ensure better quality of service.

As for Robert's proposal, I'd like to work on the stream generator if there is no time constraint, but first of all I'd like to hear more details. What kind of data are we generating? How many fields are there, and of what type? Ideally, the user calling this generator should be able to make this decision. Can we create a JIRA for this? That way, it would be easier to start working on the task.

Thanks!
Andra

On Wed, Nov 18, 2015 at 12:14 PM, Robert Metzger wrote:
> Hey Vasia,
>
> I think a very common workload would be an event stream from web servers
> of an online shop. Usually, these shops have multiple servers, so events
> arrive out of order.
> I think there are plenty of different use cases that you can build around
> that data:
> - Users perform different actions that a streaming system could track
> (analysis of click-paths),
> - some simple statistics using windows (items sold in the last 10 minutes, ...),
> - maybe fraud detection would be another use case,
> - often, there also needs to be a sink to HDFS or another file system for
> a long-term archive.
>
> I would love to see such an event generator in Flink's contrib module. I
> think that's something the entire streaming space could use.
>
> On Mon, Nov 16, 2015 at 8:22 PM, Nick Dimiduk wrote:
>> All those should apply for streaming too...
>>
>> On Mon, Nov 16, 2015 at 11:06 AM, Vasiliki Kalavri <
>> vasilikikala...@gmail.com> wrote:
>>> Hi,
>>>
>>> thanks Nick and Ovidiu for the links!
>>>
>>> Just to clarify, we're not looking into creating a generic streaming
>>> benchmark.
>>> We have quite limited time and resources for this project. What
>>> we want is to decide on a set of 3-4 _common_ streaming applications. To
>>> give you an idea, for the batch workload, we will pick something like a
>>> grep, one relational application, a graph algorithm, and an ML algorithm.
>>>
>>> Cheers,
>>> -Vasia.
>>>
>>> On 16 November 2015 at 19:25, Ovidiu-Cristian MARCU <
>>> ovidiu-cristian.ma...@inria.fr> wrote:

Regarding Flink vs Spark/Storm, you can check here:
http://www.sparkbigdata.com/102-spark-blog-slim-baltagi/14-results-of-a-benchmark-between-apache-flink-and-apache-spark

Best regards,
Ovidiu

On 16 Nov 2015, at 15:21, Vasiliki Kalavri wrote:

Hello squirrels,

with some colleagues and students here at KTH, we have started two projects to evaluate (1) performance and (2) behavior in the presence of memory interference in cloud environments, for Flink and other systems. We want to provide our students with a workload of representative applications for testing. While for batch applications it is quite clear to us what classes of applications are widely used and how to create a workload of different types of applications, we are not quite sure about the streaming workload. That's why we'd like your opinions! If you're using Flink streaming in your company or your project, we'd love your input even more :-)

What kind of applications would you consider "representative" of a streaming workload? Have you run any experiments to evaluate Flink versus Spark, Storm, etc.? If yes, would you mind sharing your code with us? We will of course be happy to share our results with everyone after we have completed our study.

Thanks a lot!
-Vasia.
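Since the thread asks what a configurable stream generator might look like (the caller decides how many fields each event has, and of what type), here is a minimal plain-Java sketch of that idea. All names here are hypothetical illustrations, not an existing Flink API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a configurable event generator:
// the user supplies the schema (field types) at construction time.
public class EventGenerator {
    public enum FieldType { LONG, DOUBLE, STRING }

    private final Random rnd;
    private final FieldType[] schema;

    public EventGenerator(long seed, FieldType... schema) {
        this.rnd = new Random(seed);
        this.schema = schema;
    }

    // Produce one event: a list of field values matching the schema.
    public List<Object> next() {
        List<Object> event = new ArrayList<>(schema.length);
        for (FieldType t : schema) {
            switch (t) {
                case LONG:   event.add(rnd.nextLong()); break;
                case DOUBLE: event.add(rnd.nextDouble()); break;
                case STRING: event.add("user-" + rnd.nextInt(1000)); break;
            }
        }
        return event;
    }
}
```

A real contrib-module generator would additionally have to emit timestamps and deliberately deliver some events out of order, as Robert describes for the multi-server web-shop scenario.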
Re: Creating Graphs from DataStream in Flink Gelly
Hi,

There is a separate project related to graph streaming, called gelly-streaming. If you look here:
https://github.com/vasia/gelly-streaming/blob/master/src/main/java/org/apache/flink/graph/streaming/GraphStream.java
you can find a constructor which creates a graph from a DataStream of edges. Gelly itself cannot do that, for now.

Hope this helps!
Andra

On Mon, Nov 2, 2015 at 3:35 PM, UFUOMA IGHOROJE wrote:
> Is there some support for creating Graphs from DataStream in Flink Gelly?
> My use case is to create dynamic graphs, where the graph topology can be
> updated from a stream of data. From the API, all I can find is creating
> graphs from DataSets.
>
> Best,
>
> Ufuoma
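For intuition, the core idea behind building a graph whose topology is updated from a stream of edge events can be sketched with plain Java collections. This is an illustration of the concept only, not the gelly-streaming API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: an undirected graph maintained incrementally,
// one edge event at a time, instead of being built from a static DataSet.
public class IncrementalGraph {
    private final Map<Long, Set<Long>> adjacency = new HashMap<>();

    // Called once per edge event arriving from the stream.
    public void addEdge(long src, long trg) {
        adjacency.computeIfAbsent(src, k -> new HashSet<>()).add(trg);
        adjacency.computeIfAbsent(trg, k -> new HashSet<>()).add(src);
    }

    public int numVertices() {
        return adjacency.size();
    }

    public int degree(long vertex) {
        Set<Long> neighbors = adjacency.get(vertex);
        return neighbors == null ? 0 : neighbors.size();
    }
}
```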
Re: Gelly ran out of memory
Hi Flavio,

These kinds of exceptions generally arise from the fact that you ran out of user memory. You can try to increase that a bit. In your flink-conf.yaml, try adding:

# The fraction of memory allocated to the system (the rest goes to user code)
taskmanager.memory.fraction: 0.4

This will give 0.6 of the memory to the user and 0.4 to the system. Tell me if that helped.

Andra

On Thu, Aug 20, 2015 at 12:02 PM, Flavio Pompermaier pomperma...@okkam.it wrote:

Hi to all,

I tried to run my Gelly job on Flink 0.9-SNAPSHOT and I was getting an EOFException, so I tried 0.10-SNAPSHOT and now I have the following error:

Caused by: java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 73 maxPartition: 80 number of overflow segments: 0 bucketSize: 570 Overall memory: 102367232 Partition memory: 81100800 Message: null
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertRecordIntoPartition(CompactingHashTable.java:465)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:414)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:325)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:211)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:272)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
    at java.lang.Thread.run(Thread.java:745)

Probably I'm doing something wrong, but I can't understand how to estimate the required memory for my Gelly job.

Best,
Flavio
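To make the fraction arithmetic concrete, here is the split it implies, sketched in plain Java. This is a simplification: the actual runtime subtracts network buffers and other overhead before applying the fraction.

```java
// Simplified illustration of the split implied by taskmanager.memory.fraction:
// the fraction goes to Flink's managed (system) memory, the rest to user code.
public class MemorySplit {
    public static long systemBytes(long totalBytes, double fraction) {
        return (long) (totalBytes * fraction);
    }

    public static long userBytes(long totalBytes, double fraction) {
        return totalBytes - systemBytes(totalBytes, fraction);
    }
}
```

With a fraction of 0.4, a 1000-byte budget splits into 400 bytes of system (managed) memory and 600 bytes of user memory, matching the 0.4/0.6 split described above.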
Re: Too few memory segments provided exception
I also questioned the vertex-centric approach before. The exact computation does not throw this exception, so I guess adapting the approximate version will do the trick (I also suggested, offline, improving the algorithm to use fewer operators). However, the issue still persists; we saw it in Affinity Propagation as well. So even if the problem disappears for this example, I am curious how we should handle it in the future.

On Mon, Jul 20, 2015 at 3:15 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote:

Hi Shivani,

why are you using a vertex-centric iteration to compute the approximate Adamic-Adar? It's not an iterative computation :) In fact, it should be as complex (in terms of operators) as the exact Adamic-Adar, only more efficient because of the different neighborhood representation. Are you having the same problem with the exact computation?

Cheers,
Vasia.

On 20 July 2015 at 14:41, Maximilian Michels m...@apache.org wrote:

Hi Shivani,

The issue is that by the time the hash join is executed, the MutableHashTable cannot allocate enough memory segments. That means that your other operators are occupying them. It is expected that this also occurs on Travis, because the workers there have limited memory as well.

Till suggested changing the memory fraction through the ExecutionEnvironment. Can you try that?

Cheers,
Max

On Mon, Jul 20, 2015 at 2:23 PM, Shivani Ghatge shgha...@gmail.com wrote:

Hello Maximilian,

Thanks for the suggestion. I will use it to check the program. But when I am creating a PR for the same implementation with a test, I am getting the same error even on the Travis build. So what would be the solution for that?

Here is my PR: https://github.com/apache/flink/pull/923
And here is the Travis build status: https://travis-ci.org/apache/flink/builds/71695078

Also, in the IDE it works fine in collection execution mode.
Thanks and Regards,
Shivani

On Mon, Jul 20, 2015 at 2:14 PM, Maximilian Michels m...@apache.org wrote:

Hi Shivani,

Flink doesn't have enough memory to perform a hash join. You need to provide Flink with more memory. You can either increase the taskmanager.heap.mb config variable or set taskmanager.memory.fraction to some value greater than 0.7 and smaller than 1.0. The first config variable allocates more overall memory for Flink; the latter changes the ratio between Flink-managed memory (e.g. for the hash join) and user memory (for your functions and Gelly's code).

If you run this inside an IDE, the memory is configured automatically and you don't have control over that at the moment. You could, however, start a local cluster (./bin/start-local) after you adjusted your flink-conf.yaml, and run your programs against that configured cluster. You can do that either through your IDE using a RemoteEnvironment, or by submitting the packaged JAR to the local cluster using the command-line tool (./bin/flink).

Hope that helps.

Cheers,
Max

On Mon, Jul 20, 2015 at 2:04 PM, Shivani Ghatge shgha...@gmail.com wrote:

Hello,

I am working on a problem which implements the Adamic-Adar algorithm using Gelly. I am running into this exception for all the joins (including the ones that are part of the reduceOnNeighbors function):

Too few memory segments provided. Hash Join needs at least 33 memory segments.

The problem persists even when I comment out some of the joins. Even after using

edg = edg.join(graph.getEdges(), JoinOperatorBase.JoinHint.BROADCAST_HASH_SECOND).where(0,1).equalTo(0,1).with(new JoinEdge());

as suggested by @AndraLungu, the problem persists.
The code is:

DataSet<Tuple2<Long, Long>> degrees = graph.getDegrees();

// get neighbors of each vertex in the HashSet for its value
computedNeighbors = graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);

// get vertices with updated values for the final Graph which will be used to get Adamic edges
Vertices = computedNeighbors.join(degrees, JoinOperatorBase.JoinHint.BROADCAST_HASH_FIRST)
        .where(0).equalTo(0).with(new JoinNeighborDegrees());

Graph<Long, Tuple3<Double, HashSet<Long>, List<Tuple3<Long, Long, Double>>>, Double> updatedGraph =
        Graph.fromDataSet(Vertices, edges, env);

// configure the vertex-centric iteration
VertexCentricConfiguration parameters = new VertexCentricConfiguration();
parameters.setName("Find Adamic Adar Edge Weights");
parameters.setDirection(EdgeDirection.ALL);

// run the vertex-centric iteration to get the Adamic-Adar edges into the vertex value
updatedGraph = updatedGraph.runVertexCentricIteration(
        new GetAdamicAdarEdges<Long>(), new NeighborsMessenger<Long>(), 1, parameters);

// extract vertices of the updated graph
DataSet<Vertex<Long, Tuple3<Double, HashSet<Long>, List<Tuple3<Long, Long, Double>>>>> vertices =
        updatedGraph.getVertices();

// extract the list of edges from the vertex values
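For reference, the two configuration knobs Max mentioned earlier (taskmanager.heap.mb and taskmanager.memory.fraction) go into flink-conf.yaml. The values below are illustrative only, not recommendations:

```yaml
# Total heap size per TaskManager, in megabytes (illustrative value).
taskmanager.heap.mb: 2048

# Fraction of memory handed to Flink's managed runtime (hash joins, sorting);
# the remainder is left to user functions and library code.
taskmanager.memory.fraction: 0.8
```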
Re: Gelly EOFException
For now, there is a validator that checks whether the vertex ids correspond to the target/source ids in the edges. If you want to check for vertex id uniqueness, you'll have to implement your own custom validator. I know people who hit the same error outside Gelly, so I doubt that the lack of unique ids triggered the exception :)

On Thu, Jul 16, 2015 at 6:14 PM, Flavio Pompermaier pomperma...@okkam.it wrote:

I thought a bit about this error. In my job I was generating multiple vertices with the same id. Could this cause such errors? Maybe there could be a check for such situations in Gelly.

On Tue, Jul 14, 2015 at 10:00 PM, Andra Lungu lungu.an...@gmail.com wrote:

Hello,

Sorry for the delay. The bug is not in Gelly but, as hinted in the exception and as can be seen in the logs, in Flink's runtime. Mihail may actually be on to something: the bug is very similar to the one described in FLINK-1916. However, as can be seen in the discussion thread there, it's a bit difficult to fix without some steps to reproduce. I managed to reproduce it and have opened a JIRA: FLINK-2360 https://issues.apache.org/jira/browse/FLINK-2360. It's a similar delta iteration setting. Hope we can get some help with this.

Thanks!
Andra

On Tue, Jul 14, 2015 at 2:12 PM, Mihail Vieru vi...@informatik.hu-berlin.de wrote:

Hi,

looks very similar to this bug: https://issues.apache.org/jira/browse/FLINK-1916

Best,
Mihail

On 14.07.2015 14:09, Andra Lungu wrote:

Hi Flavio,
Could you also show us a code snippet?

On Tue, Jul 14, 2015 at 2:06 PM, Flavio Pompermaier pomperma...@okkam.it wrote:

Hi to all,
in my vertex-centric iteration I get the following exception; am I doing something wrong or is it a bug in Gelly?

starting iteration [1]: CoGroup (Messaging) (6/8)
IterationHead(WorksetIteration (Vertex-centric iteration (test.gelly.functions.VUpdateFunction@1814786f | test.gelly.functions.VMessagingFunction@67eecbc2))) (4/8) switched to FAILED with exception.
java.io.EOFException
    at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.nextSegment(InMemoryPartition.java:333)
    at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.advance(AbstractPagedOutputView.java:140)
    at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.write(AbstractPagedOutputView.java:201)
    at org.apache.flink.api.java.typeutils.runtime.DataOutputViewStream.write(DataOutputViewStream.java:39)
    at com.esotericsoftware.kryo.io.Output.flush(Output.java:163)
    at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.serialize(KryoSerializer.java:187)
    at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.serialize(PojoSerializer.java:372)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:116)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:30)
    at org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:219)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536)
    at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209)
    at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
    at java.lang.Thread.run(Thread.java:745)

Best,
Flavio
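Regarding Flavio's duplicate-vertex-id question above: a uniqueness pre-check could be run on the vertex ids before building the graph. In Gelly this would go into a custom validator; the idea is sketched here with plain Java collections, and all names are illustrative:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative pre-check: collect every vertex id that appears more than once.
public class DuplicateIdCheck {
    public static Set<Long> duplicates(List<Long> vertexIds) {
        Set<Long> seen = new HashSet<>();
        Set<Long> dups = new HashSet<>();
        for (Long id : vertexIds) {
            // Set.add returns false when the id was already present.
            if (!seen.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }
}
```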
Re: The slot in which the task was scheduled has been killed (probably loss of TaskManager)
Something similar in flink-0.10-SNAPSHOT:

06/29/2015 10:33:46 CHAIN Join(Join at main(TriangleCount.java:79)) - Combine (Reduce at main(TriangleCount.java:79)) (222/224) switched to FAILED
java.lang.Exception: The slot in which the task was executed has been released. Probably loss of TaskManager 57c67d938c9144bec5ba798bb8ebe636 @ wally025 - 8 slots - URL: akka.tcp://flink@130.149.249.35:56135/user/taskmanager
    at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:151)
    at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:547)
    at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:119)
    at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:154)
    at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:182)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:421)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
    at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
    at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
    at akka.actor.ActorCell.invoke(ActorCell.scala:486)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

06/29/2015 10:33:46 Job execution switched to status FAILING.

On Mon, Jun 29, 2015 at 1:08 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote:

I witnessed a similar issue yesterday on a simple job (single task chain, no shuffles) with a release-0.9 based fork.

2015-04-15 14:59 GMT+02:00 Flavio Pompermaier pomperma...@okkam.it:

Yes, sorry for that. I found it somewhere in the logs. The problem was that the program didn't die immediately but was somehow hanging, and I discovered the source of the problem only by running the program on a subset of the data.

Thanks for the support,
Flavio

On Wed, Apr 15, 2015 at 2:56 PM, Stephan Ewen se...@apache.org wrote:

This means that the TaskManager was lost. The JobManager can no longer reach the TaskManager and considers all tasks executing on the TaskManager as failed. Have a look at the TaskManager log; it should describe why the TaskManager failed.

Am 15.04.2015 14:45 schrieb Flavio Pompermaier pomperma...@okkam.it:

Hi to all,
I have this strange error in my job and I don't know what's going on. What can I do? The full exception is:

The slot in which the task was scheduled has been killed (probably loss of TaskManager).
    at org.apache.flink.runtime.instance.SimpleSlot.cancel(SimpleSlot.java:98)
    at org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSimpleSlot(SlotSharingGroupAssignment.java:335)
    at org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:319)
    at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:106)
    at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:151)
    at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:182)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:435)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at
Re: Gelly available already?
Hi Sebastian,

For me it works just as described there, with 0.9, but there should be no problem for 0.8.1. Here is an example pom.xml: https://github.com/andralungu/gelly-partitioning/blob/first/pom.xml

Hope that helps!
Andra

On Mon, Mar 23, 2015 at 11:02 PM, Sebastian ssc.o...@googlemail.com wrote:

Hi,

Is gelly already usable in the 0.8.1 release? I tried adding

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly</artifactId>
    <version>0.8.1</version>
</dependency>

as described in [1], but my project fails to build.

Best,
Sebastian

[1] http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html
Re: Gelly available already?
For now it just works with the Java API.

On Mon, Mar 23, 2015 at 11:42 PM, Sebastian ssc.o...@googlemail.com wrote:

Is gelly supposed to be usable from Scala? It looks as if it is hardcoded to use the Java API.

Best,
Sebastian

On 23.03.2015 23:15, Robert Metzger wrote:

Hi,
Gelly is not part of any official Flink release. You have to use a snapshot version of Flink if you want to try it out.

Sent from my iPhone

On 23.03.2015, at 23:10, Andra Lungu lungu.an...@gmail.com wrote:

Hi Sebastian,

For me it works just as described there, with 0.9, but there should be no problem for 0.8.1. Here is an example pom.xml: https://github.com/andralungu/gelly-partitioning/blob/first/pom.xml

Hope that helps!
Andra

On Mon, Mar 23, 2015 at 11:02 PM, Sebastian ssc.o...@googlemail.com wrote:

Hi,

Is gelly already usable in the 0.8.1 release? I tried adding

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly</artifactId>
    <version>0.8.1</version>
</dependency>

as described in [1], but my project fails to build.

Best,
Sebastian

[1] http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html