Thanks, Jun Ki, and good point, Hyunsung. I will create a Jira issue; please comment there with what you think. Let's focus on listing options first. It would be great if any of you could contribute to this.
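To seed the list, here is a rough Python sketch of how the options quoted below could fit together. It is only an illustration, not S2Graph code: `LabelOptions`, `planned_writes`, `should_publish`, and `kafka_sample_rate` are hypothetical names, and the sketch assumes that an edge insert normally stores the out edge, the reverse edge, the vertex, and the snapshotEdge.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelOptions:
    """Hypothetical per-label options; field names follow the list in this thread."""
    skip_reverse: bool = False              # skipReverse
    skip_store_vertex: bool = False         # skipStoreVertex
    skip_store_snapshot_edge: bool = False  # skipStoreSnapshotEdge
    kafka_sample_rate: float = 1.0          # assumed knob: 1.0 publishes all, 0.0 skips


def planned_writes(opts, consistency_level):
    """Return which components one edge insert would store under these options."""
    writes = ["edge:out"]
    if not opts.skip_reverse:
        writes.append("edge:in")
    if not opts.skip_store_vertex:
        writes.append("vertex")
    # Per the proposal, the snapshotEdge is skipped only when consistency is weak.
    if not (opts.skip_store_snapshot_edge and consistency_level == "weak"):
        writes.append("snapshotEdge")
    return writes


def should_publish(opts, rng=random):
    """Decide whether one edge is published to Kafka: skip, sample, or publish all."""
    if opts.kafka_sample_rate <= 0.0:
        return False
    if opts.kafka_sample_rate >= 1.0:
        return True
    return rng.random() < opts.kafka_sample_rate
```

For the 'user_url_click' case, `planned_writes(LabelOptions(skip_reverse=True), "strong")` leaves out the "edge:in" write, which is the one behind the frequent memstore flushes for popular URLs.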
On Wed, Apr 20, 2016 at 9:52 AM Hyunsung Jo <[email protected]> wrote:

> Hi all,
>
> I agree that providing options to selectively store graph components can be
> useful.
> That said, if S2Graph decides to do this, I feel the data model should
> be better documented so that a general user can fully understand what each
> option means.
>
> Thanks,
> Jo
>
> On Wed, Apr 20, 2016 at 9:23 AM Jun Ki Kim <[email protected]> wrote:
>
> > Sounds like a good idea!
> >
> > I especially love the optional Kafka publishing feature.
> > Kafka is a good distributed queue for massive data. That's why people
> > usually send their data to Kafka.
> > On the other hand, a single topic can hold too much data to handle or
> > select from. I ran into a situation where I needed to select and process
> > just one label from one topic, and I had to spend a lot of resources
> > filtering out edges that were not my own.
> > I am convinced your "optional" feature will be helpful to S2Graph.
> >
> > Thanks for your suggestion!
> >
> > On Wed, Apr 20, 2016 at 9:14 AM, DO YUNG YOON <[email protected]> wrote:
> >
> > > Here is a problem I encountered.
> > >
> > > I created a label 'user_url_click', which stores a click log specifying
> > > who clicked which URL.
> > > In many cases, the clicked URLs are very skewed and the number of edges
> > > for a very popular URL becomes very large, which causes memstore flushes
> > > too often.
> > >
> > > Actually, there is no need to store the reverse direction (which URL is
> > > clicked by whom) in my case, since there is no query traversing from a
> > > URL with direction 'in', but there is no way to skip this to avoid
> > > overly frequent memstore flushes.
> > >
> > > So I think it would be better to provide extra options on the label so
> > > users can avoid these problems if they know what they are doing.
> > >
> > > Here is a list of extra options I think might be helpful regarding
> > > storing edges.
> > >
> > > 1. skipReverse: skip storing the automatic reverse direction edge.
> > > 2. skipStoreVertex: skip storing the vertex when storing an edge.
> > > 3. skipStoreSnapshotEdge: skip storing the snapshotEdge when
> > > consistencyLevel is weak.
> > >
> > > Also, I think it would be good if we could provide options to control
> > > how edges are published into Kafka.
> > > There is only one flag, `isAsync`, on the label, which controls which
> > > Kafka topic edges with a specific label are published into.
> > > I think providing an option to skip or sample when publishing into
> > > Kafka would also be helpful.
> > >
> > > Wondering what other folks think.
> > >
> > > Best Regards.
> > > DOYUNG YOON
