Re: Provide optional options on label.

DO YUNG YOON Thu, 21 Apr 2016 07:44:02 -0700

Thanks, Jun Ki, and good point, Hyunsung.

I will create a Jira issue; please comment there with what you think. Let's
focus on listing the options first.
It would be great if any of you could contribute to this~
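
To make the listing a bit more concrete, here is a rough sketch of how the
options from the quoted message below could be modeled. This is not S2Graph
code: `LabelOptions`, `planned_writes`, `should_publish`, the
`kafka_sample_rate` field, and the write names ("edge(out)", "edge(in)",
and so on) are all hypothetical names chosen for illustration. Only
skipReverse, skipStoreVertex, skipStoreSnapshotEdge, and the idea of
skipping or sampling on Kafka publish come from this thread.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelOptions:
    """Hypothetical per-label options; field names mirror the list in this thread."""
    skip_reverse: bool = False              # skipReverse
    skip_store_vertex: bool = False         # skipStoreVertex
    skip_store_snapshot_edge: bool = False  # skipStoreSnapshotEdge
    kafka_sample_rate: float = 1.0          # assumed: 0.0 skips publishing, 1.0 publishes all


def planned_writes(opts, consistency_level="weak"):
    """Return the (illustrative) storage writes one edge insert would trigger."""
    writes = ["edge(out)"]
    if not opts.skip_reverse:
        writes.append("edge(in)")
    if not opts.skip_store_vertex:
        writes.append("vertex")
    # The snapshot edge may only be skipped when consistencyLevel is weak.
    if not (opts.skip_store_snapshot_edge and consistency_level == "weak"):
        writes.append("snapshotEdge")
    return writes


def should_publish(opts, rng=random):
    """Decide by random sampling whether one edge is published to Kafka."""
    # rng.random() is in [0.0, 1.0), so rate 0.0 never publishes and 1.0 always does.
    return rng.random() < opts.kafka_sample_rate
```

For the 'user_url_click' case, setting `skip_reverse=True` would leave only
the outgoing edge, the vertex, and the snapshotEdge in the planned writes,
which is the effect the skipReverse proposal is after.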



On Wed, Apr 20, 2016 at 9:52 AM Hyunsung Jo <[email protected]> wrote:

> Hi all,
>
> I agree that providing options to selectively store graph components can be
> useful.
> That said, if S2Graph decides to do this, I feel like the data model should
> be better documented so that a general user can fully understand what each
> option means.
>
> Thanks,
> Jo
>
> On Wed, Apr 20, 2016 at 9:23 AM Jun Ki Kim <[email protected]> wrote:
>
> > Sounds like a good idea!
> >
> > I especially love the optional Kafka publishing feature.
> > Kafka is a good distributed queue for massive data, which is why people
> > usually send their data to Kafka.
> > On the other hand, a single topic can carry too much data to handle or
> > filter. I ran into a situation where I only needed to select and process
> > one label from one topic, and I had to spend a lot of resources filtering
> > out edges that were not mine.
> > I am convinced your "optional" feature will be helpful to S2Graph.
> >
> > Thanks for your suggestion!
> >
> > On Wed, Apr 20, 2016 at 9:14 AM DO YUNG YOON <[email protected]> wrote:
> >
> > > Here is the problem I encountered.
> > >
> > > I created a label 'user_url_click' that stores click logs specifying who
> > > clicked which url.
> > > In many cases, clicked urls are very skewed, and the # of edges for a very
> > > popular url becomes very large, which causes memstore flushes too often.
> > >
> > > Actually, there is no need to store the reversed direction (which url is
> > > clicked by whom) in my case, since no query traverses from url with
> > > direction 'in'. But there is currently no way to skip it and avoid the
> > > too-frequent memstore flushes.
> > >
> > > So I think it would be better to provide extra options on label so users
> > > can avoid these problems if they know what they are doing.
> > >
> > > Here is a list of extra options I think might be helpful for storing
> > > edges.
> > >
> > > 1. skipReverse: skip storing the automatic reverse direction edge.
> > > 2. skipStoreVertex: skip storing the vertex when storing an edge.
> > > 3. skipStoreSnapshotEdge: skip storing the snapshotEdge when
> > > consistencyLevel is weak.
> > >
> > > Also, I think it would be good if we could provide options to control how
> > > edges are published into Kafka.
> > > There is only one flag, `isAsync`, on label, which controls which Kafka
> > > topic edges with a specific label are published into.
> > > I think an option to skip or sample when publishing into Kafka could also
> > > be helpful.
> > >
> > > Wondering what other folks think.
> > >
> > > Best Regards.
> > > DOYUNG YOON
> > >
> >
>
