Michael Segel wrote:
On Wednesday 30 November 2005 12:42, Rick Hillegas wrote:

Hi Michael,

You could streamline your recursive walk by using a temporary table and
a database procedure. The temporary table would hold the ids you
recursively harvest. It would be populated by your database procedure,
which would walk up the levels of your hierarchy. When the procedure
returned, you could then join the temporary table to your
discussion_item table to get the details you needed. It's not an elegant
solution, but by running the procedure inside the database, you would
eliminate a lot of network handshaking.

Derby does not support hierarchical queries. You're welcome to log an
enhancement request for this feature.


Well if you're going to use Temp tables, why not go outside of just SQL statements and then use cursors?

By using prepared statements you could do this and maybe use a little bit of recursion. Since this is SELECT, you'd want to make sure that you do this out side of a Transaction. (Transactions and recursive SQL statements can get messy.)

Pretty straight forward from there.
You only need to prepare two statements, and each recursion requires a unique resultSet to be declared.

Your table would look like this:
Table foo:
msg_id int -- (you can autogenerate or use long or something else depending on # of messages and retension rules.)
        parent_id int (set default to 0)
        msg_author ...
        msg_header ...
msg_txt ...
Then create an index on msg_id and also on parent_id;

This can be done a couple of ways or permutations.
Here's the simplest.

Make the following CLASS variables;
A database connection con;
Create a vector to store the result sets.
int rec_lvl = 0 (this is the level of recursion)
Create and prepare the following select statement:
SELECT msg_id, msg_author, msg_header, msg_txt
FROM   foo
WHERE parent_id = ?
ORDER BY msg_id;

You then have two methods to write:
1) A Method to pull the data out of the result set, and pop it on to a vector, including the rec_lvl variable. (This will be used to build your tree)

2) The recursive method that finds all the children to the input value of parent_id.

So the first time you call the method you pass in the value of 0 for the parent id. Then for each record you get back, you pass the result set to the popOnVector and the current rec_lvl value. You then call the recursive routine, passing in the current msg_id and (rec_lvl + 1);

Thats all it takes.

When presenting a discussion thread, there are typically two things you want to take into account:
* Ordering of messages
* Indentation

Ordering:
* All replies to a specific message (messages having a 'parent') should be displayed directly under that parent message. * Within a sub-group of messages, sharing the same parent, messages should be ordered on its timestamp.

Indentation:
* Every message which has a parent should be more indented than the parent.
* Every message having the same distance to the "root" message (the one without a parent) should have the same indentation.

With the current table definitions I have seen suggested here, neither of these can be computed dynamically with a single, non-recursive SQL query.

The indentation part is easy: Don't compute it dynamically, store it in the table. Whenever you add a new record to the table, you know the id of the parent message. The indentation level for the record to be inserted is the parent's indentation level plus one.

The ordering part is more tricky. What information do we have in each record that can be used for ordering.

message id - most likely monotonically increasing with each message added (chronologically)
parent id
indentation level
timestamp - this one is, like the message id, monotonically increasing, so it does not add any information for us to use when sorting

So the table is (wrt sorting) a set of triplets: (message_id, parent_id, indentation). For a database to sort this set, it must be able to look at two random triplets and determine which of those comes first, without considering other triplets that may be in the set. Is this possible?

As an example, look at these two triplets:

(8,3,2)
(7,6,3)

Which comes first?

It's impossible to tell, since it depends on the values of other rows. Consider these two message trees:

(1,null,0) <- root message
 (2,1,1) <- 2nd message received, 1 is parent, indentation is 1
  (4,2,2) < - Received after message 3 below, but is reply to 2
 (3,1,1)
  (8,3,2)
 (5,1,1)
  (6,5,2)
   (7,6,3)

Here, (8,3,2) comes before (7,6,3), because of the subgroups they belong to.

(1,null,0)
 (2,1,1)
  (6,2,2)
   (7,6,3)
 (3,1,1)
  (8,3,2)
 (4,1,1)
 (5,1,1)

Here, (7,6,3) comes before (8,3,2).

So I think you don't get the complete ordering for free from SQL, you'll have to do some of the sorting in your java code. It should be possible to do with a 1-pass algorithm, I think.

--
Oyvind Bakksjo
Sun Microsystems, Database Technology Group
Trondheim, Norway
http://weblogs.java.net/blog/bakksjo/

Reply via email to