Thank you Michael, 

This is great insight, and helps me better understand query structure. As I 
need the limit to be a parameter, I think it is not possible to use path 
length as the limiter. It also turns out not every post is categorized, so 
I found this variation on your suggestion to be the one that worked best, 
and cut DB hits by 45%!

// Get latest posts and their categories
PROFILE MATCH (r:Reference {id: 
"latest"})-[:REFERS_TO]->(:Post)-[:PREVIOUS*0..]->(p)
WITH p LIMIT 20
OPTIONAL MATCH (p)-[:MEMBER_OF]->(c:Category)
RETURN p, COLLECT(c.uuid) as c

Your second suggestion, collecting, sorting and unwinding the paths is 
complicated and harder to maintain/explain. Did you suggest this as an 
academic alternative, or potentially more performant (if I didn't have the 
parameter requirement)?

Next, I sometimes need to get just the recent posts belonging to any of a 
set of categories (which will be a parameter as well). Could you comment on 
whether the following query could be further optimized? It takes over 7x 
more db hits than the simpler query (@ limit 20, and may be non-linear as 
limit grows, not sure why):

// Get latest posts in 2 categories
PROFILE MATCH (r:Reference {id: 
"latest"})-[:REFERS_TO]->(:Post)-[:PREVIOUS*0..9999]->(p)-[:MEMBER_OF]->(c:Category)
WHERE c.uuid IN ['b582bd70-e16c-11e5-8722-fef6f6021414', 
'b583cee0-e16c-11e5-8722-fef6f6021414']
WITH p, c LIMIT 20
RETURN p, COLLECT(c.uuid) as c

Regards, and thanks again.





On Saturday, March 5, 2016 at 4:40:18 AM UTC-6, Michael Hunger wrote:
>
> Hi Jesse,
>
> your first query has to aggregate over all the p's first to get the 
> cagetories (you never know if the last p is the same as a previous one in 
> the result set and has yet another category that you also have to collect 
> for the post)
>
> yours query just takes the first 50 pairs of posts and categories and then 
> groups them
>
> you could also do this which should be more efficient.
>
> PROFILE MATCH (r:Reference {id: "latest"})-->(:Post)-[:PREVIOUS*0..]->(p)
> WITH p LIMIT 50
> MATCH (p)-[:MEMBER_OF]->(c:Category)
> RETURN p, COLLECT(c.uuid) AS c
>
> Also if you're interested in only the latest 50 you can also just limit 
> the path length
>
> PROFILE MATCH (r:Reference {id: "latest"})-->(:Post)-[:PREVIOUS*0..50]->(p)
> WITH p LIMIT 50
> MATCH (p)-[:MEMBER_OF]->(c:Category)
> RETURN p, COLLECT(c.uuid) AS c
>
> also add the relationship-type between Reference and post, so that the 
> planner can do a better job
>
> another thing you can try is to just get the longest path and then take 
> all the nodes of the path.
>
> PROFILE MATCH (r:Reference {id: "latest"})-->(first:Post)
> MATCH path = (first)-[rels:PREVIOUS*0..50]->(p)
> WITH path
> ORDER BY size(rels) DESC LIMIT 1
> UNWIND nodes(path) as p
> MATCH (p)-[:MEMBER_OF]->(c:Category)
> RETURN p, COLLECT(c.uuid) AS c
>
> HTH Michael
>
> On Fri, Mar 4, 2016 at 10:26 PM, Jesse Hemingway <[email protected] 
> <javascript:>> wrote:
>
>> I'm pretty new to neo4j, and I'm working on optimizing my queries. For my 
>> example, let's say we have a lot of 'Post' nodes that can grow in an 
>> unbounded fashion. I'm using a linked-list to be able to scan back through 
>> recent posts without consulting every node. This works, but my question has 
>> to do with a minor semantic change that means the difference between 
>> consulting LIMIT nodes, or ALL nodes. First, the bad one; this query 
>> consults all Post nodes:
>>
>> MATCH (r:Reference {id: 
>> "latest"})-->(:Post)-[:PREVIOUS*0..]->(p)-[:MEMBER_OF]->(c:Category)
>> RETURN p, COLLECT(c.uuid) AS c LIMIT 50
>>
>> However, this next query only consults only the last 50, as I'd like. 
>> Could someone explain why putting the LIMIT on the WITH makes all the 
>> difference?
>>
>> PROFILE MATCH (r:Reference {id: 
>> "latest"})-->(:Post)-[:PREVIOUS*0..]->(p)-[:MEMBER_OF]->(c:Category)
>> WITH p, c LIMIT 50
>> RETURN p, COLLECT(c.uuid) AS c
>>
>> Thanks in advance for any insight!
>>
>>
>>
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to