I switched to Spark GraphX and it solved my problem perfectly. :)
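
The task quoted below amounts to finding the connected components of the user/cookie graph and labeling each component with its smallest user_id. GraphX offers a connected-components operator for this. The following is only a minimal single-machine sketch of the same idea in plain Python using union-find, not the GraphX job itself. The function name `assign_people_ids` and the input format (a list of `(user_id, cookie_id)` pairs) are assumptions made for illustration.

```python
def assign_people_ids(edges):
    """Map each user_id to the smallest user_id in its connected component.

    `edges` is an iterable of (user_id, cookie_id) pairs (hypothetical input
    format). Direction is ignored, matching the undirected mapping in the thread.
    """
    parent = {}

    def find(x):
        # Locate the root of x's set, then compress the path behind it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for user_id, cookie_id in edges:
        # Tag nodes by kind so a user_id and a cookie_id with the same
        # value are never confused with each other.
        u = ("user", user_id)
        c = ("cookie", cookie_id)
        parent.setdefault(u, u)
        parent.setdefault(c, c)
        union(u, c)

    # Smallest user_id seen in each component, keyed by component root.
    smallest = {}
    users = [node for node in list(parent) if node[0] == "user"]
    for node in users:
        root = find(node)
        value = node[1]
        if root not in smallest or value < smallest[root]:
            smallest[root] = value

    return {node[1]: smallest[find(node)] for node in users}


if __name__ == "__main__":
    sample = [(1, "c1"), (2, "c1"), (2, "c2"), (3, "c2"), (7, "c9")]
    print(assign_people_ids(sample))
```

Each edge is processed once, so no paths are enumerated. That is why a component-based approach avoids the path explosion that the variable-length Cypher match runs into.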

On Thursday, July 10, 2014 9:05:44 AM UTC+8, Guitao Ding wrote:
>
>
> Hi Michael,
>
> The data comes from a single website. Each user_id is the ID of a registered
> user, and every user_id is linked to one or more cookie_ids (a cookie_id is
> the value of one cookie). Likewise, each cookie_id is linked to one or more
> user_ids, so user_id and cookie_id form a many-to-many mapping. My goal is to
> find all user_ids that are linked to each other (via cookie_ids, with no limit
> on the path length) and assign them one shared ID (here I used the smallest
> user_id in the path).
>
> For example:
> All user_ids (user_id1, user_id2, user_id3) in the following path should
> be assigned the same ID (e.g. user_id1):
> user_id1----cookie_id1----user_id2----cookie_id2----user_id3
>
> I imported the data into Neo4j and used different labels for user_id and
> cookie_id. There is only one type of relationship (cookie_id -- user_id),
> and the direction doesn't matter.
>
> On Thursday, July 10, 2014 6:01:14 AM UTC+8, Michael Hunger wrote:
>>
>> This is a graph-global query with unbounded path lengths, so it might
>> generate many billions or trillions of paths to look at,
>> especially if you don't provide a direction.
>>
>> If your nodes are all users, then you are doing the equivalent of finding
>> all paths between every pair in the 16M x 16M cross product.
>>
>> Perhaps you can describe the actual use case you are trying to solve?
>>
>> Michael
>>
>>
>>
>> On Wed, Jul 9, 2014 at 4:23 PM, Guitao Ding <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I have recently been learning to use Neo4j for relationship analysis. Today
>>> I found that my Cypher query takes too long to finish.
>>>
>>> I used the batch importer <https://github.com/jexp/batch-import/tree/20>
>>> to import all the data into Neo4j. I want to find the smallest user_id
>>> connected (directly or indirectly, with no limit on path length) to each
>>> user_id. The details are below:
>>>
>>> Neo4j version: 2.1.2
>>> Number of nodes: 16M (three labels)
>>> Number of relationships: 10M
>>> Cypher query:
>>>
>>> match (n:user_id)-[:mapping*]-(d:user_id)
>>> with n.value as user_id,
>>> case when min(d.value) > n.value then n.value else min(d.value) end as people_id
>>> return user_id, people_id
>>>
>>>
>>> What should I do to improve my query performance? Any suggestions would 
>>> be appreciated!
>>>
>>> Thanks in advance.
>>>
>>>  Guitao
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
