> If you are doing this often, you could leave spaces in the left and right
> values so that you could minimize the number of rows that need to be
> updated. The article makes every leaf use x and x+1 for left and right which
> forces another update to add a child. If instead you used x and x+20 you'd
> leave space for more children without any updates. This could be applied
> from top to bottom, starting with the root category getting 0 and MAX_INT
> for its values.

Then I would have to check what values are available when inserting,
and possibly normalise every so often. I'll think about that, and when
I have enough data in the database I'll set up a test system to play
with the possibility.


> However, it's probably not even worth applying that complexity until you
> prove that frequent category additions are causing problems. Most systems
> will be querying against the categories table far more frequently, and
> that's where this model pays off. If you want to see all products in
> category X and its subcategories, it's a single *non-recursive* query.
> That's huge if you are doing a lot of searches like this.

You are right, that non-recursive bit is important. In fact, I think
that I'm convinced. Thanks!


>> But what a mess this would be if the two methods go outĀ of sync!
>
> Sure, but these values would be maintained by your code--not end-users. It
> just comes down to making sure your code is correct through appropriate unit
> tests. By moving the logic to a stored procedure, you can ensure the table
> is locked during the updates to keep two users from adding a new category
> simultaneously.

So long as it is in fact my code, that's fine. But when others start
maintaining it and not reading comments, it may get ugly. That does
not apply to this particular pet project, but it is a consideration
for future projects.


>> That pays off more? For the guy writing code or for the database
>> memory requirement?
>
> Performance-wise. The nested set method looks to be moderately more complex
> code-wise, but luckily that is done just once while querying the database is
> done again and again. As with all optimizations, it's best to measure and
> make sure there's a problem before trying to solve it. Once you've built a
> few hierarchical systems, you'll be able to make a gut call up front.

I see, thanks. Good point about making sure that the problem exists
before trying to fix it, I've seen people optimise away where there is
no bottleneck.


>> Only two update statements, but they are affecting on average half the
>> database's rows!
>
> Of a single table: categories. Hopefully you have far more items that get
> categorized than you do categories.
>

True.

>> Which do you call the hierarchical model? That term is not used in the
>> linked article.
>
> Well, both models are hierarchical in the sense that there's a parent-child
> relationship. By hierarchical here I mean that the method of implementation
> involves each category pointing to its parent directly via a parent_id
> column. Searching for all subcategories of category X requires searching
> first for all children, then all grandchildren, and so on, resulting in a
> recursive query.
> Using the nested sets model requires a single non-recursive query to get the
> same data.
>

I do agree that the non-recursive method at retrieval time advantage
far outweighs the update-half-the-table issue upon addition of an
additional category.

Thanks!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to