Re: [Neo4j] Query/design question: trusted followers scenario

Marko Rodriguez Wed, 24 Aug 2011 14:34:51 -0700

Hi,

With Gremlin (and I believe Cypher), you can do these operations lazily and 
thus be more memory efficient. Moreover, I don't know how much speedup you 
will get with parallelism, but I suspect the overhead of threads is going to 
slow down your query --- (as I've seen with 'small queries' using parallel 
branches). With large queries (touching millions of things), parallelism starts 
to show benefits.


In Gremlin, this is how your query is represented, assuming vertex 1 is "I":

        // x: the set of vertices that vertex 1 follows; m: followed vertex -> count
        m = [:]; x = [] as Set
        g.v(1).out('follows').aggregate(x).sideEffect{m[it] = it.in('follows').retain(x).count()} >> -1

m will have keys that are the vertices that you follow, and values that are 
the number of people you follow who also follow that vertex (?! loopy talk 
?!). More specifically, it has the answers analogous to "vertex 2 is followed 
by 4 people that you follow."

If you use Gremlin 1.2 (which I don't think Neo4j has released with their 
server yet), there is a more concise representation that I can show you. Also, 
Gremlin 1.3-SNAPSHOT will make this query ~twice as fast, but it's not released 
yet :(. I can show you a trick to make it twice as fast if you are interested.

HTH, // Peter taught me what HTH means. It's a good salutation. I would 
previously have done "Thanks, Marko" but that doesn't really make much sense.
Marko.

http://markorodriguez.com

On Aug 24, 2011, at 3:14 PM, Aseem Kishore wrote:

> Hi guys,
> 
> We're building a social network which has an asymmetrical follower model
> like Twitter's: users "follow" each other.
> 
> We have various views where we show a list of people. This could be e.g. all
> people in the network, or it might be some user's followers, or it might be
> a list of people that share interests, etc.
> 
> In these views, it's easy to show how many followers each person has. But we
> also want to show a message like "Followed by 4 people you follow" next to
> each person. This helps show the trustworthiness/relevance of each person.
> 
> We implemented that by logic like this:
> 
> 1. Fetch the list of people that *I* follow.
> 2. Given the list of people we want to show, for each person in parallel...
> 3. ...Fetch the list of people that follow *that* person...
> 4. ...And compare this list with the list of people that I follow.
> 
> Each "fetch" is a traverse (breadth first, max depth 1). This requires O(n)
> traverses, where "n" is the number of people we're showing in this view.
> 
> (Assume that, generally, the number of people we're showing is smaller than
> the number of people I potentially follow, but the logic could be reversed
> if this is not the case: for each person I follow, fetch the list of people
> that *they* follow.)
> 
> I wanted to do a sanity check: is this the best way of answering this
> question? Or is there a better way, e.g. via a single traverse somehow, or
> via a Cypher or Gremlin query?
> 
> Thanks much!
> 
> Aseem
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user