Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-07 Thread Gregory Taylor
Hello Martijn,

Thanks for the reply, my responses are inline below.

On Wed, Aug 6, 2014 at 5:38 PM, Martijn van Oosterhout klep...@svana.org
wrote:

 On Wed, Aug 06, 2014 at 05:28:09PM -0400, Gregory Taylor wrote:
  We are working on a threaded comment system, and found this post by
 Disqus
  to be super helpful:
 
 
 http://cramer.io/2010/05/30/scaling-threaded-comments-on-django-at-disqus/
 
  The CTE works wonderfully, and we're really happy with the results. The
  last obstacle is figuring out how to sort by a votes field, meanwhile
  preserving the tree structure.

 What do you mean exactly? Do you mean that want everything at the same
 level to be sorted by vote?


Each level of the tree should be sorted by vote, while retaining the
correct hierarchy. So the top level entry with the most votes should be at
the top, plus all of the items beneath it (with each level of the tree
under that row being sorted correctly).



  If we ORDER BY path, votes (assuming we have the same structure as in
 the
  article), we never need tie-breaking on path, so the votes part of
 this
  doesn't even come into the equation.
 
  I suspect we need to do some path manipulation, but I'm not too sure of
  where to begin with this. I attempted incorporating votes into the
 path,
  but I failed pretty badly with this. It's probably way off, but here's my
  last (failed) attempt:
 
  https://gist.github.com/gtaylor/e3926a90fe108d52a4c8

 I think what you need to do is do the ordering withing the CTE itself.
 Something like:

 WITH RECUSIVE cte () AS (
SELECT ... ORDER BY vote DESC
 UNION ALL
SELECT ... JOIN cte ... ORDER BY vote DESC
 ) SELECT * from cte;


It looks like you can't order within a CTE.



 Or another idea, add a column that is the path of the parent:

 WITH RECUSIVE cte () AS (
SELECT array[] as path_parent, array[id] as path, ... ORDER BY vote DESC
 UNION ALL
SELECT cte.path as path_parent, cte.path || comments.id as path, ...
 JOIN cte ... ORDER BY vote DESC
 ) SELECT * from cte order by path, votes desc;


I got this recommendation from someone else, and think that it's probably
the way to go. I've been playing with it unsuccessfully so far, though.
Most certainly because I've got something weirded up. Here's what I have:


WITH RECURSIVE cte (
id, discussion_id, body, num_votes,
class_section_id, modified_time,
author_id, reply_parent_id,
path, votes_path, depth
)  AS (
SELECT  discussion_response.id, discussion_response.discussion_id,
discussion_response.body, discussion_response.num_votes,
discussion_response.last_edited_time,
discussion_response.class_section_id,
discussion_response.author_id,
discussion_response.reply_parent_id,
array[id] AS path,
array[num_votes, id] AS votes_path,
1 AS depth
FROMdiscussion_response
WHERE   reply_parent_id IS NULL AND discussion_id=2763

UNION ALL

SELECT  discussion_response.id, discussion_response.discussion_id,
discussion_response.body, discussion_response.num_votes,
discussion_response.last_edited_time,
discussion_response.class_section_id,
discussion_response.author_id,
discussion_response.reply_parent_id,
cte.path || discussion_response.id,
cte.votes_path || discussion_response.num_votes ||
discussion_response.id,
cte.depth + 1 AS depth
FROMdiscussion_response
JOIN cte ON discussion_response.reply_parent_id = cte.id
WHERE discussion_response.discussion_id=2763
)
SELECT * FROM cte ORDER BY votes_path DESC, path DESC LIMIT 50 OFFSET 0;

The problem with this is that non-root level (depth  1) rows end up at the
top because of the ordering by votes_path. For example:

id=292839, num_votes=0, reply_parent_id=211957,
votes_path={2,211957,0,292839}, path={211957,292839}, depth=2
id=211957, num_votes=2, reply_parent_id=NULL, votes_path={2,211957},
path={211957}, depth=1

I understand why it is ordered this way, it's just not what I was hoping
for. Ideally this ends up like this:

id=211957, num_votes=2, reply_parent_id=NULL, votes_path={2,211957},
path={211957}, depth=1
id=292839, num_votes=0, reply_parent_id=211957,
votes_path={2,211957,0,292839}, path={211957,292839}, depth=2

Sorting by path causes the correct tree structure to be returned and in
the right order, but obviously it's not
sorted at all by votes.

-- 
Greg Taylor
http://gc-taylor.com


Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-07 Thread Vik Fearing
On 08/07/2014 01:22 PM, Gregory Taylor wrote:
 I got this recommendation from someone else, and think that it's
 probably the way to go. I've been playing with it unsuccessfully so far,
 though. Most certainly because I've got something weirded up. Here's
 what I have:
 
 
 WITH RECURSIVE cte (
 id, discussion_id, body, num_votes,
 class_section_id, modified_time,
 author_id, reply_parent_id,
 path, votes_path, depth
 )  AS (
 SELECT  discussion_response.id http://discussion_response.id,
 discussion_response.discussion_id,
 discussion_response.body, discussion_response.num_votes,
 discussion_response.last_edited_time,
 discussion_response.class_section_id,
 discussion_response.author_id,
 discussion_response.reply_parent_id,
 array[id] AS path,
 array[num_votes, id] AS votes_path,
 1 AS depth
 FROMdiscussion_response
 WHERE   reply_parent_id IS NULL AND discussion_id=2763
 
 UNION ALL
 
 SELECT  discussion_response.id http://discussion_response.id,
 discussion_response.discussion_id,
 discussion_response.body, discussion_response.num_votes,
 discussion_response.last_edited_time,
 discussion_response.class_section_id,
 discussion_response.author_id,
 discussion_response.reply_parent_id,
 cte.path || discussion_response.id
 http://discussion_response.id,
 cte.votes_path || discussion_response.num_votes ||
 discussion_response.id http://discussion_response.id,
 cte.depth + 1 AS depth
 FROMdiscussion_response
 JOIN cte ON discussion_response.reply_parent_id = cte.id
 http://cte.id
 WHERE discussion_response.discussion_id=2763
 )
 SELECT * FROM cte ORDER BY votes_path DESC, path DESC LIMIT 50 OFFSET 0;
 
 The problem with this is that non-root level (depth  1) rows end up at
 the top because of the ordering by votes_path. For example:
 
 id=292839, num_votes=0, reply_parent_id=211957,
 votes_path={2,211957,0,292839}, path={211957,292839}, depth=2
 id=211957, num_votes=2, reply_parent_id=NULL, votes_path={2,211957},
 path={211957}, depth=1
 
 I understand why it is ordered this way, it's just not what I was hoping
 for. Ideally this ends up like this:
 
 id=211957, num_votes=2, reply_parent_id=NULL, votes_path={2,211957},
 path={211957}, depth=1
 id=292839, num_votes=0, reply_parent_id=211957,
 votes_path={2,211957,0,292839}, path={211957,292839}, depth=2
 
 Sorting by path causes the correct tree structure to be returned and
 in the right order, but obviously it's not
 sorted at all by votes.

Just export the order from your CTE.

WITH RECURSIVE tree AS (
SELECT dr.id,
   ...,
   array[dr.id] as path,
   1 as depth,
   row_number() over (order by dr.num_votes desc) as sort_order
FROM discussion_response AS dr
WHERE dr.reply_parent_id IS NULL
  AND dr.discussion_id = 2763

UNION ALL

SELECT dr.id,
   ...,
   tree.path || dr.id,
   tree.depth + 1
   row_number() over (order by dr.num_votes desc)
FROM discussion_response AS dr
JOIN tree ON tree.id = dr.reply_parent_id
WHERE NOT array[dr.id] @ tree.path
)
SELECT *
FROM tree
ORDER BY depth, sort_order
LIMIT 50;
-- 
Vik


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-07 Thread Gregory Taylor
On Thu, Aug 7, 2014 at 8:12 AM, Vik Fearing vik.fear...@dalibo.com wrote:

 Just export the order from your CTE.

 WITH RECURSIVE tree AS (
 SELECT dr.id,
...,
array[dr.id] as path,
1 as depth,
row_number() over (order by dr.num_votes desc) as sort_order
 FROM discussion_response AS dr
 WHERE dr.reply_parent_id IS NULL
   AND dr.discussion_id = 2763

 UNION ALL

 SELECT dr.id,
...,
tree.path || dr.id,
tree.depth + 1
row_number() over (order by dr.num_votes desc)
 FROM discussion_response AS dr
 JOIN tree ON tree.id = dr.reply_parent_id
 WHERE NOT array[dr.id] @ tree.path
 )
 SELECT *
 FROM tree
 ORDER BY depth, sort_order
 LIMIT 50;


It looks like this clobbers the hierarchy by sorting by depth first. I'm
trying to preserve said hierarchy so I can paginate using OFFSET/LIMIT
easily. I'm not sure what I'm shooting for is even possible, though.

-- 
Greg Taylor
http://gc-taylor.com


Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-07 Thread Paul Jungwirth
 Or another idea, add a column that is the path of the parent:

I don't think this will work. The problem is you need the full path to
keep the children with their parents, but you also need the score. If
you make the path an array of (-votes, id) tuples (perhaps flattened
for simplicity), then you get the correct ordering. That way at every
stage you are sorting by votes, but still keeping children with their
parents:

comments= WITH RECURSIVE cte (id, message, author, path, parent_id,
depth, votes)  AS (
SELECT  id,
message,
author,
array[-votes,id] AS path,
parent_id,
1 AS depth, votes
FROMcomments
WHERE   parent_id IS NULL
UNION ALL
SELECT  comments.id,
comments.message,
comments.author,
cte.path || -comments.votes || comments.id,
comments.parent_id,
cte.depth + 1 AS depth, comments.votes
FROMcomments
JOIN cte ON comments.parent_id = cte.id
)
SELECT id, message, author, path, depth, votes FROM cte
ORDER BY path;
 id |   message   | author |   path| depth | votes
+-++---+---+---
  5 | Very interesting post!  | thedz  | {-3,5}| 1 | 3
  8 | Fo sho, Yall| Mac| {-3,5,-12,8}  | 2 |12
  7 | Agreed  | G  | {-3,5,-5,7}   | 2 | 5
  6 | You sir, are wrong  | Chris  | {-3,5,-3,6}   | 2 | 3
  1 | This thread is really cool! | David  | {-1,1}| 1 | 1
  3 | I agree David!  | Daniel | {-1,1,-4,3}   | 2 | 4
  2 | Ya David, we love it!   | Jason  | {-1,1,-3,2}   | 2 | 3
  4 | gift Jason  | Anton  | {-1,1,-3,2,-15,4} | 3 |15
(8 rows)

Time: 0.966 ms

Paul


On Wed, Aug 6, 2014 at 2:38 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 On Wed, Aug 06, 2014 at 05:28:09PM -0400, Gregory Taylor wrote:
 We are working on a threaded comment system, and found this post by Disqus
 to be super helpful:

 http://cramer.io/2010/05/30/scaling-threaded-comments-on-django-at-disqus/

 The CTE works wonderfully, and we're really happy with the results. The
 last obstacle is figuring out how to sort by a votes field, meanwhile
 preserving the tree structure.

 What do you mean exactly? Do you mean that want everything at the same
 level to be sorted by vote?

 If we ORDER BY path, votes (assuming we have the same structure as in the
 article), we never need tie-breaking on path, so the votes part of this
 doesn't even come into the equation.

 I suspect we need to do some path manipulation, but I'm not too sure of
 where to begin with this. I attempted incorporating votes into the path,
 but I failed pretty badly with this. It's probably way off, but here's my
 last (failed) attempt:

 https://gist.github.com/gtaylor/e3926a90fe108d52a4c8

 I think what you need to do is do the ordering withing the CTE itself.
 Something like:

 WITH RECUSIVE cte () AS (
SELECT ... ORDER BY vote DESC
 UNION ALL
SELECT ... JOIN cte ... ORDER BY vote DESC
 ) SELECT * from cte;

 Or another idea, add a column that is the path of the parent:

 WITH RECUSIVE cte () AS (
SELECT array[] as path_parent, array[id] as path, ... ORDER BY vote DESC
 UNION ALL
SELECT cte.path as path_parent, cte.path || comments.id as path, ... JOIN 
 cte ... ORDER BY vote DESC
 ) SELECT * from cte order by path, votes desc;

 Hope this helps,
 --
 Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
-- Arthur Schopenhauer

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.10 (GNU/Linux)

 iQIVAwUBU+KgXUvt++dL5i1EAQgKzQ//fWqd56vcwKsYQDtbUE3Q2/ohUinYxpb6
 HgS9HoEs8QU3b4yzE6VOVXcUcN3/z6PPx4Mz3rqFOVgsFcZR2umGAaVw5oEr57Bd
 mqFDVgUxq8Xio2tijO0XFU89fh+/Cvus08CRh+OH6POLe6M76ox6cmFPtQzeaEon
 iFKXZZRIzFv7zpoE3xsQ7wgqSF44L0TIJIjdw3Dhcs8fN+T/jO0hJtUMKidGwbbv
 9f08r9kjSMBYAhKCPXZHy/By/E91DhA8GjJFL1MloHPol/lzSkn7v7amWJZaILyE
 g3ghGUG1YhPJPA3Dw2VBKWzumNyu8kXSzTvzN6PacFToCf2ZIfTJH59ehPqztt0o
 FC6auCvO1vWS3NbOKSwdBVvXb/bJsIM3uqN16LSVhHqUp75eOFp5AWKJMCjQF1hE
 MkHk5xyz2CWsYZTlzqCKtGxRjFEbxKGjtqsxcM4qKM3uSjMG/ZhaAY6FZFLIage0
 yxsHrE5N+zfDAGV1EplxxtzMHUEqyFnBYQNRHUSChLPCkgrluOeFFRQU22aVpUUL
 vbPIBI8E16bbtU6zwnE3DoMdBm1Pq5E4c+URbfbzJhGB1e/DkDqf7pOZjojLJ9ue
 DRP777bBbsYwtCdS69kiIDkfwA2f7lliILI9wpnKSg64SIWlCR6NVWFTsfU8OP4l
 cJw8kApkDr4=
 =8bEW
 -END PGP SIGNATURE-




-- 
_
Pulchritudo splendor veritatis.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-07 Thread Gregory Taylor
On Thu, Aug 7, 2014 at 11:57 AM, Paul Jungwirth p...@illuminatedcomputing.com
 wrote:

  Or another idea, add a column that is the path of the parent:

 I don't think this will work. The problem is you need the full path to
 keep the children with their parents, but you also need the score. If
 you make the path an array of (-votes, id) tuples (perhaps flattened
 for simplicity), then you get the correct ordering. That way at every
 stage you are sorting by votes, but still keeping children with their
 parents:

 comments= WITH RECURSIVE cte (id, message, author, path, parent_id,
 depth, votes)  AS (
 SELECT  id,
 message,
 author,
 array[-votes,id] AS path,
 parent_id,
 1 AS depth, votes
 FROMcomments
 WHERE   parent_id IS NULL
 UNION ALL
 SELECT  comments.id,
 comments.message,
 comments.author,
 cte.path || -comments.votes || comments.id,
 comments.parent_id,
 cte.depth + 1 AS depth, comments.votes
 FROMcomments
 JOIN cte ON comments.parent_id = cte.id
 )
 SELECT id, message, author, path, depth, votes FROM cte
 ORDER BY path;
  id |   message   | author |   path| depth |
 votes

 +-++---+---+---
   5 | Very interesting post!  | thedz  | {-3,5}| 1 |
   3
   8 | Fo sho, Yall| Mac| {-3,5,-12,8}  | 2 |
  12
   7 | Agreed  | G  | {-3,5,-5,7}   | 2 |
   5
   6 | You sir, are wrong  | Chris  | {-3,5,-3,6}   | 2 |
   3
   1 | This thread is really cool! | David  | {-1,1}| 1 |
   1
   3 | I agree David!  | Daniel | {-1,1,-4,3}   | 2 |
   4
   2 | Ya David, we love it!   | Jason  | {-1,1,-3,2}   | 2 |
   3
   4 | gift Jason  | Anton  | {-1,1,-3,2,-15,4} | 3 |
  15
 (8 rows)

 Time: 0.966 ms


This is outstanding, Paul. I'm still checking things over, but it looks
like this is going to work. It looks like I was really close, but didn't
think to go negative, and I had one of my arrays flip-flopped from what
you've got. I made those two changes and it would appear that this is
perfect.

Much appreciated, I would have been beating my head against this for a lot
longer without the help!


[GENERAL] Recursive CTE trees + Sorting by votes

2014-08-06 Thread Gregory Taylor
We are working on a threaded comment system, and found this post by Disqus
to be super helpful:

http://cramer.io/2010/05/30/scaling-threaded-comments-on-django-at-disqus/

The CTE works wonderfully, and we're really happy with the results. The
last obstacle is figuring out how to sort by a votes field, meanwhile
preserving the tree structure.

If we ORDER BY path, votes (assuming we have the same structure as in the
article), we never need tie-breaking on path, so the votes part of this
doesn't even come into the equation.

I suspect we need to do some path manipulation, but I'm not too sure of
where to begin with this. I attempted incorporating votes into the path,
but I failed pretty badly with this. It's probably way off, but here's my
last (failed) attempt:

https://gist.github.com/gtaylor/e3926a90fe108d52a4c8

Any ideas would be greatly appreciated! If we can retain the path structure
and also sort by votes, we'll be able to paginate freely without issues.

-- 
Greg Taylor
http://gc-taylor.com


Re: [GENERAL] Recursive CTE trees + Sorting by votes

2014-08-06 Thread Martijn van Oosterhout
On Wed, Aug 06, 2014 at 05:28:09PM -0400, Gregory Taylor wrote:
 We are working on a threaded comment system, and found this post by Disqus
 to be super helpful:
 
 http://cramer.io/2010/05/30/scaling-threaded-comments-on-django-at-disqus/
 
 The CTE works wonderfully, and we're really happy with the results. The
 last obstacle is figuring out how to sort by a votes field, meanwhile
 preserving the tree structure.

What do you mean exactly? Do you mean that want everything at the same
level to be sorted by vote?

 If we ORDER BY path, votes (assuming we have the same structure as in the
 article), we never need tie-breaking on path, so the votes part of this
 doesn't even come into the equation.
 
 I suspect we need to do some path manipulation, but I'm not too sure of
 where to begin with this. I attempted incorporating votes into the path,
 but I failed pretty badly with this. It's probably way off, but here's my
 last (failed) attempt:
 
 https://gist.github.com/gtaylor/e3926a90fe108d52a4c8

I think what you need to do is do the ordering withing the CTE itself.
Something like:

WITH RECUSIVE cte () AS (
   SELECT ... ORDER BY vote DESC
UNION ALL
   SELECT ... JOIN cte ... ORDER BY vote DESC
) SELECT * from cte;

Or another idea, add a column that is the path of the parent:

WITH RECUSIVE cte () AS (
   SELECT array[] as path_parent, array[id] as path, ... ORDER BY vote DESC
UNION ALL
   SELECT cte.path as path_parent, cte.path || comments.id as path, ... JOIN 
cte ... ORDER BY vote DESC
) SELECT * from cte order by path, votes desc;
  
Hope this helps,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature