Graph algorithms show the clear difference between the MapReduce and BSP models. Disk I/O overhead isn't the problem; the number of iterations needed is different (that's why Spark is doing a Pregel clone for graph-parallel computation).
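To make the iteration point concrete, here is a minimal sketch of a Pregel-style (vertex-centric BSP) connected-components computation. It is plain Python, not the Hama, Giraph, or Spark API; the function name `bsp_min_label` and the toy path graph are made up for illustration. Each pass of the loop stands for one superstep, and swapping the inbox stands in for the barrier synchronization.

```python
from collections import defaultdict


def bsp_min_label(edges, num_vertices):
    """Label every vertex with the smallest vertex id in its component.

    Hypothetical helper for illustration only. Returns (labels, supersteps).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    labels = {v: v for v in range(num_vertices)}

    # Superstep 0: every vertex sends its own label to its neighbors.
    inbox = defaultdict(list)
    for v in range(num_vertices):
        for n in neighbors[v]:
            inbox[n].append(labels[v])
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work,
    # and only vertices whose label improved send new messages.
    while inbox:
        outbox = defaultdict(list)
        for v, msgs in inbox.items():
            smallest = min(msgs)
            if smallest < labels[v]:
                labels[v] = smallest
                for n in neighbors[v]:
                    outbox[n].append(smallest)
        inbox = outbox  # barrier: messages become visible in the next superstep
        supersteps += 1

    return labels, supersteps


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 - 3 (assumed example data).
    labels, supersteps = bsp_min_label([(0, 1), (1, 2), (2, 3)], 4)
    print(labels, supersteps)
```

In this sketch the number of supersteps grows with the graph's diameter, and every message crosses from one vertex to another, so in a distributed run the cost depends heavily on which worker holds which vertex.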
Instead, graph partitioning is needed. That is why streaming graph analysis is VERY difficult. At a glance, it looks like an integration of Storm, Kafka (or a K/V store), and Giraph. But transferring vertices to the proper processor (or fetching vertices from the K/V store) is quite a tricky issue (I think it is almost impossible). Some incremental learning algorithms have the same issue.

However, we can do this in one unbroken line.

On Wed, May 14, 2014 at 7:44 PM, Tommaso Teofili <[email protected]> wrote:
> Hi Edward,
>
> it looks interesting, however I would need more information to completely
> understand what the job and data flow would be.
>
> Regards,
> Tommaso
>
>
> 2014-05-14 3:46 GMT+02:00 Edward J. Yoon <[email protected]>:
>
>> Hi,
>>
>> I've just drawn the diagram of multi-bsp job scenario. Does this make
>> sense to you?
>>
>> https://docs.google.com/drawings/d/1WpBEBzRz9zXn-G8-DWDE7O2JlhxkxT2mjTloMGSJuKM/edit?usp=sharing
>>
>> The differentiation is the direct connectivity between data processing
>> and advanced analytical computing applications.
>>
>> --
>> Best Regards, Edward J. Yoon
>> CEO at DataSayer Co., Ltd.
>>

--
Best Regards, Edward J. Yoon
CEO at DataSayer Co., Ltd.
