Changed to Spark GraphX and it solved my problem perfectly. :)

On Thursday, July 10, 2014 9:05:44 AM UTC+8, Guitao Ding wrote:
>
> Hi Michael,
>
> The data is from one website. The user_id is the ID of each registered
> user, and every user_id is linked to one or more cookie_ids (the value of
> one cookie). Likewise, one cookie_id is linked to one or more user_ids, so
> user_id and cookie_id form a many-to-many mapping. My goal is to find all
> user_ids linked with each other (via cookie_ids, with no limit on the path
> length) and assign them one unique ID (here I used the smallest user_id in
> the path).
>
> For example, all user_ids (user_id1, user_id2, user_id3) in the following
> path should be assigned the same ID (e.g. user_id1):
>
> user_id1----cookie_id1----user_id2----cookie_id2----user_id3
>
> I imported the data into Neo4j and used different labels for user_id and
> cookie_id. There is only one type of relationship (cookie_id -- user_id)
> and the direction doesn't matter.
>
> On Thursday, July 10, 2014 6:01:14 AM UTC+8, Michael Hunger wrote:
>>
>> This is a graph-global query with unlimited path length, so it might
>> generate many billions or trillions of paths to look at, especially if
>> you don't provide a direction.
>>
>> If your nodes are all users, then you are doing the equivalent of finding
>> all paths between the cross product of 16M^2.
>>
>> Perhaps you can describe the actual use case that you are trying to solve?
>>
>> Michael
>>
>> On Wed, Jul 9, 2014 at 4:23 PM, Guitao Ding <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I have recently been learning to use Neo4j for relation analysis. Today
>>> I found that my Cypher query took too long to finish.
>>>
>>> I used the batch importer <https://github.com/jexp/batch-import/tree/20>
>>> to import all data into Neo4j. I wanted to find the smallest user_id
>>> connected (directly or indirectly, with no limit on path length) to each
>>> user_id.
>>>
>>> Below are the details:
>>>
>>> neo4j version: 2.1.2
>>> number of nodes: 16M (three labels)
>>> number of relationships: 10M
>>> cypher query:
>>>
>>> match (n:user_id)-[:mapping*]-(d:user_id)
>>> with n.value as user_id,
>>> case when min(d.value) > n.value then n.value else min(d.value) end as
>>> people_id
>>> return user_id, people_id
>>>
>>> What should I do to improve my query performance? Any suggestions would
>>> be appreciated!
>>>
>>> Thanks in advance.
>>>
>>> Guitao
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
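
For readers landing on this thread: the task described above (give every user_id the smallest user_id reachable through shared cookie_ids) is a connected-components problem, which is why enumerating all variable-length paths is so expensive. The thread does not show the GraphX job that was eventually used, so the following is only a minimal sketch of the same idea in plain Python with a union-find structure, not the poster's code. The function name `assign_people_ids` and the sample pairs are hypothetical, and it assumes the user_id/cookie_id pairs fit in memory.

```python
def assign_people_ids(edges):
    """Map each user_id to the smallest user_id in its connected component.

    `edges` is an iterable of (user_id, cookie_id) pairs (hypothetical input
    format). Users and cookies are tagged so equal values cannot collide.
    """
    parent = {}

    def find(x):
        # Walk up to the root of x's set.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the walk directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Each user--cookie link merges the two sets they belong to.
    for user_id, cookie_id in edges:
        u, c = ("u", user_id), ("c", cookie_id)
        parent.setdefault(u, u)
        parent.setdefault(c, c)
        union(u, c)

    # Track the smallest user_id seen under each component root.
    smallest = {}
    for kind, value in parent:
        if kind == "u":
            root = find((kind, value))
            if root not in smallest or value < smallest[root]:
                smallest[root] = value

    return {
        value: smallest[find((kind, value))]
        for kind, value in parent
        if kind == "u"
    }


if __name__ == "__main__":
    # Users 1, 2, 3 are chained through cookies "a" and "b"; user 7 is alone.
    sample = [(1, "a"), (2, "a"), (2, "b"), (3, "b"), (7, "z")]
    print(assign_people_ids(sample))
```

Each pair is processed once, so the work grows roughly linearly with the number of relationships rather than with the number of paths. This sketch omits union by rank for brevity; on a dataset of the size mentioned in the thread, a distributed connected-components implementation such as the one in GraphX is the more realistic choice.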
