Interesting, thanks Marko. I'm just reading up on Gremlin and Cypher, so
I'll try this out in a bit.
But, I just realized while re-reading my post that I can actually get my
answer in a single traverse, by flipping my logic. Instead of fetching the
followers of each person in the list, and matching it up against the people
I follow, I can just fetch the list of people each of {the people I follow}
follow.
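
Roughly, the flipped version could be sketched like this in Python over a plain in-memory adjacency map. The `follows` dict, the user names, and the `followed_by_people_i_follow` helper are all made up for illustration; against Neo4j this would presumably be one depth-2 traverse instead.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the graph: user -> set of users they follow.
follows = {
    "me": {"alice", "bob", "carol"},
    "alice": {"dave", "erin"},
    "bob": {"dave"},
    "carol": {"dave", "erin", "me"},
    "dave": set(),
    "erin": set(),
}


def followed_by_people_i_follow(follows, me):
    """Count, for every user, how many of the people `me` follows also follow them."""
    counts = Counter()
    for friend in follows.get(me, ()):          # depth 1: people I follow
        for target in follows.get(friend, ()):  # depth 2: people they follow
            counts[target] += 1
    return counts


counts = followed_by_people_i_follow(follows, "me")

# The list of people shown in a view is then just a lookup per person.
people_to_show = ["dave", "erin", "bob"]
for person in people_to_show:
    print(f"{person}: followed by {counts[person]} people you follow")
```

Each person I follow is visited once, so every (follower, followed) pair is counted once, and anyone not reached simply gets a count of 0.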
Btw, we run Node.js, so parallelization overhead from threads isn't an issue
for us. So I'm not sure if this will really have any significant perf
improvement, but I'll try both these ideas out.
Thanks! // Makes sense here ;)
Aseem
On Wed, Aug 24, 2011 at 2:34 PM, Marko Rodriguez <[email protected]> wrote:
> Hi,
>
> With Gremlin (and I believe Cypher), you can do these operations lazily and
> thus, be more memory efficient. Moreover, I don't know how much speedup you
> will get with parallelism, but I suspect the overhead of threads is going to
> slow down your query (as I've seen with 'small queries' using parallel
> branches). With large queries (touching millions of things), parallelism
> starts to show benefits.
>
> In Gremlin, this is how your query is represented, where vertex 1 is
> "I":
>
> m = [:]; x = [] as Set
> g.v(1).out('follows').aggregate(x).sideEffect{
>   m[it] = it.in('follows').retain(x).count()
> } >> -1
>
> m will have keys that are the vertices that you follow and values being the
> number of people you follow who also follow that vertex (?! loopy talk ?!).
> More specifically, it has the answers analogous to "vertex 2 is followed by
> 4 people that you follow."
>
> If you use Gremlin 1.2 (which I don't think Neo4j has released with their
> server yet), there is a more concise representation that I can show you.
> Also, Gremlin 1.3-SNAPSHOT will make this query ~twice as fast, but it's not
> released yet :(. I can show you a trick to make it twice as fast if you are
> interested.
>
> HTH, // Peter taught me what HTH means. It's a good salutation. I would
> previously have done "Thanks, Marko" but that doesn't really make much
> sense.
> Marko.
>
> http://markorodriguez.com
>
> On Aug 24, 2011, at 3:14 PM, Aseem Kishore wrote:
>
> > Hi guys,
> >
> > We're building a social network which has an asymmetrical follower model
> > like Twitter's: users "follow" each other.
> >
> > We have various views where we show a list of people. This could be e.g. all
> > people in the network, or it might be some user's followers, or it might be
> > a list of people that share interests, etc.
> >
> > In these views, it's easy to show how many followers each person has. But we
> > also want to show a message like "Followed by 4 people you follow" next to
> > each person. This helps show the trustworthiness/relevance of each person.
> >
> > We implemented that by logic like this:
> >
> > 1. Fetch the list of people that *I* follow.
> > 2. Given the list of people we want to show, for each person in parallel...
> > 3. ...Fetch the list of people that follow *that* person...
> > 4. ...And compare this list with the list of people that I follow.
> >
> > Each "fetch" is a traverse (breadth first, max depth 1). This requires O(n)
> > traverses, where "n" is the number of people we're showing in this view.
> >
> > (Assume that, generally, the number of people we're showing is smaller than
> > the number of people I potentially follow, but the logic could be reversed
> > if this is not the case: for each person I follow, fetch the list of people
> > that *they* follow.)
> >
> > I wanted to do a sanity check: is this the best way of answering this
> > question? Or is there a better way, e.g. via a single traverse somehow, or
> > via a Cypher or Gremlin query?
> >
> > Thanks much!
> >
> > Aseem
> > _______________________________________________
> > Neo4j mailing list
> > [email protected]
> > https://lists.neo4j.org/mailman/listinfo/user