Re: [whatwg] Recursion and loops of Microdata items

Philip Jägenstedt Fri, 10 Jun 2011 00:08:15 -0700

On Thu, 09 Jun 2011 19:01:24 +0200, Ian Hickson <[email protected]> wrote:

On Thu, 9 Jun 2011, Philip Jägenstedt wrote:

As for the spec, I don't think it can or needs to define the algorithm in a
form suitable for implementation. Something along these lines would be much
clearer for reference:
1. create a (possibly disconnected) graph of all the items in the document (or
subtree)
2. find the strongly connected components
3. create a list of "loopy" items: those that are in the same component as any
other item

The traversal would remain mostly as before, but whenever an item is
encountered, one checks if it is in the list of "loopy" items and if so
ignores it. Since "loopy" is a global property, you'll see the same
properties regardless of the path taken to reach it, which may or may not
be the case with the current spec. (In any case, it's a nice feature.)
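
The three steps above can be sketched roughly as follows. This is only an illustration, not text from the spec or from any implementation: the graph is assumed to be a plain dict mapping a hypothetical item id to the ids of the items it references, Tarjan's algorithm is one of several ways to find strongly connected components, and the names `strongly_connected_components` and `loopy_items` are made up for the example. Treating an item that references itself directly as loopy is an extra assumption beyond the wording of step 3.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # hypothetical: item id -> ids of items it references


def strongly_connected_components(graph: Graph) -> List[Set[str]]:
    """Tarjan's algorithm; returns each component as a set of item ids."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    components: List[Set[str]] = []

    def visit(node: str) -> None:
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for child in graph.get(node, []):
            if child not in index:
                visit(child)
                lowlink[node] = min(lowlink[node], lowlink[child])
            elif child in on_stack:
                lowlink[node] = min(lowlink[node], index[child])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members off the stack.
            component: Set[str] = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def loopy_items(graph: Graph) -> Set[str]:
    """Items that share a strongly connected component with another item."""
    loopy: Set[str] = set()
    for component in strongly_connected_components(graph):
        if len(component) > 1:
            loopy |= component
    # Assumption beyond the quoted text: an item that references itself
    # directly is also treated as loopy.
    for node, children in graph.items():
        if node in children:
            loopy.add(node)
    return loopy
```

Because the result depends only on the graph, an item is either in the set or not, whichever path the traversal takes to reach it. The recursive `visit` is kept for brevity; very deep reference chains would need an iterative version.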


The main reason I didn't do something like this with the current spec was
that I was trying to minimise the work needed when implementing the API in
a dynamic situation. The above would imply that any time anything in the
document changed in a way that could affect microdata, you'd have to redo
the whole document before the next time the API was invoked. That seems
expensive. (Consider the WHATWG spec, which has microdata in it and is
about 5MB. Do you really want to crawl the whole document looking for
microdata each time the API is invoked?)

What I had tried to do when implementing the spec is start at whatever
point in the DOM the API call was related to, and then search for loops
from that point, and drop anything loopy. That's still expensive, but it
at least minimises the total amount of work.

Does that make sense?

If the expense isn't a big deal then I don't mind doing it the other way
too, but this seems like an API that we're going to have enough trouble
getting implemented in the first place without giving implementors a
reason to avoid doing it at all.

I don't think the spec needs to be giving suggestions for efficient implementation for live collections, because we inevitably won't implement exactly that algorithm anyway. It is a problem if the algorithm doesn't clearly map to some simpler higher-level behavior, as we certainly don't want to emulate some edge-cases just to follow the letter of the spec.

But, let's disregard the exact algorithm for a moment and only consider the actual requirement we're suggesting: "any item which can reach itself by following itemrefs should be removed"

It seems to me that it's quite possible to check this criterion while traversing using an algorithm of similar structure to what is already in the spec. One issue is that one must first find all the loopy items and then remove them in one step, rather than interleaving the checking/removing. However, it seems that this could be solved by simply flagging them instead of actually removing them, so that they will still take part in later loop-checks.
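
A minimal sketch of the flag-instead-of-remove idea, under stated assumptions: items and their itemref links are reduced to a dict from a hypothetical item id to the ids it references, and the function names (`can_reach_itself`, `flag_loopy_from`, `visible_items`) are invented for illustration. It is not the spec's algorithm and makes no attempt at efficiency (each item gets its own reachability search); it only shows that flagging keeps every item available for later loop-checks.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # hypothetical: item id -> ids it references


def can_reach_itself(graph: Graph, item: str) -> bool:
    """True if following references from `item` leads back to `item`."""
    seen: Set[str] = set()
    pending = list(graph.get(item, []))
    while pending:
        current = pending.pop()
        if current == item:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(graph.get(current, []))
    return False


def flag_loopy_from(graph: Graph, start: str) -> Set[str]:
    """Flag every loopy item reachable from `start` without removing any.

    Nothing is deleted from `graph` during the checks, so an item flagged
    early still takes part in the checks of items examined later.
    """
    flagged: Set[str] = set()
    seen: Set[str] = set()
    pending = [start]
    while pending:
        item = pending.pop()
        if item in seen:
            continue
        seen.add(item)
        if can_reach_itself(graph, item):
            flagged.add(item)
        pending.extend(graph.get(item, []))
    return flagged


def visible_items(graph: Graph, start: str) -> List[str]:
    """Items reachable from `start` once flagged items are ignored."""
    flagged = flag_loopy_from(graph, start)
    result: List[str] = []
    seen: Set[str] = set()
    pending = [start]
    while pending:
        item = pending.pop()
        if item in seen or item in flagged:
            continue
        seen.add(item)
        result.append(item)
        pending.extend(graph.get(item, []))
    return result
```

Had the first loopy item been deleted as soon as it was found, the remaining members of its loop could no longer reach themselves and would survive, so the outcome would depend on traversal order. Flagging avoids that.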

Am I missing something significant about the spec'd algorithm that would make it more efficient than the above?

If we just go ahead and show an efficient (enough) implementation of loop removal using the suggested criteria, I assume that would be convincing enough? Or do you really think that itemref is useless and complicated enough that it would be better to throw it overboard?


--
Philip Jägenstedt
Core Developer
Opera Software
