Regarding nested sets, there is one problem - it's not just slower on
write, it gets progressively slower the more nodes in the tree since in
the usual implementation you need to update the left/right values for
every single set in the tree... which potentially amounts to every row
in the table. There are some options to use fractional numbers rather
than integers but this isn't perfect either.


One case we saw on one of our customers is a hierarchical check:

SELECT SUM(aff.size) FROM "artefact" a JOIN "artefact_file_files" aff ON
aff.artefact = a.id WHERE a.path LIKE '%/12345/%' LIMIT 2

Without heading to nested sets, there might be a way to minimise this
particular pain point - if artefact 1 has a path of /2/3/1/, a simple
two column table could be created and kept up to date when artefact
paths change, that records (1, 2), (1, 3) and (1, 1) - meaning that
querying that aspect of the hierarchy comes quite a bit cheaper as you
can say 'which artefacts are in the same hierarchy'. It's not a complete
replacement but for just purely simple 'is in the same hierarchy as'
checks, it might be a cheap win.

-- 
You received this bug notification because you are a member of Mahara
Contributors, which is subscribed to Mahara.
Matching subscriptions: Subscription for all Mahara Contributors -- please ask 
on #mahara-dev or mahara.org forum before editing or unsubscribing it!
https://bugs.launchpad.net/bugs/1427885

Title:
  Change "artefact.path" column to use the "nested set" technique for
  managing hierarchical data

Status in Mahara:
  Triaged

Bug description:
  Originally, we just had each artefact store its parent ID. This is
  slow because it requires running multiple queries to find all the
  descendants of a node.

  Then, we added a "path" element to each artefact. This is better, but
  you can't get a performance improvement by indexing the column,
  because most of the queries rely on the "LIKE" operator. (See
  https://bugs.launchpad.net/mahara/+bug/1423700 )

  So if we want to squeeze more performance out of this, I think the one
  remaining thing to look into is the "nested set" technique. This
  technique results in very fast searches for descendants, with the cost
  of somewhat slower writes. http://mikehillyer.com/articles/managing-
  hierarchical-data-in-mysql/

  There are even existing PHP libraries for using the technique, such as
  this one: http://www.sideralis.org/baobab/

To manage notifications about this bug go to:
https://bugs.launchpad.net/mahara/+bug/1427885/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~mahara-contributors
Post to     : mahara-contributors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mahara-contributors
More help   : https://help.launchpad.net/ListHelp

Reply via email to