Re: Why does HTMLDocumentImpl need to override getElementById(String) ???

Jacob Kjome Fri, 22 Sep 2006 23:43:51 -0700

At 02:06 PM 9/22/2006, you wrote:
>Jacob Kjome <[EMAIL PROTECTED]> wrote on 09/22/2006 12:17:53 PM:
>
>> Quoting Michael Glavassevich <[EMAIL PROTECTED]>:
>>
>> > I imagine it was overridden because the getElementById() defined by DOM
>> > HTML [1] (IDs are attributes named "id") has different behaviour than the
>> > one in DOM Core [2] (IDs are attributes of type ID).
>>
>> But the HTML4.01 DTD defines "id" as of type ID...
>> http://www.w3.org/TR/html4/sgml/dtd.html#coreattrs
>
>And if the document has no DTD or references a different DTD which
>declares "id" to be some other type then it isn't an ID. The "identifiers"
>map only includes elements which have an ID attribute (see
>http://xerces.apache.org/xerces2-j/javadocs/api/org/w3c/dom/Attr.html#isId()).
>Since HTML DOM treats "id" as an ID irrespective of what its declared type
>is there's no guarantee that you'll find it in the map.
>
>> > There probably wasn't
>> > much thought about making it efficient and since there's really no one
>> > maintaining the HTML DOM implementation it hasn't improved much over the
>> > years.
>> >
>>
>> Point taken, but how much effort do you think it would be?  If the
>> HTML4.01 DTD defines "id" as of type ID, then couldn't Xerces track
>> these IDs in the "identifiers" map while building the document?
>
>Sure, if "id" has been declared in the DTD then it will be in the map.
>
>> I'm talking about the DOMParser building it using an HTML parser
>> configuration such as that provided by NekoHTML.  Or is it up to
>> NekoHTML to ensure this happens?
>
>NekoHTML might auto-magically assign the type of these attributes to ID
>but a conforming XML parser won't.
>
>> Should I be talking to you or Andy Clark?
>
>Andy would know better than anyone else. (I've never used NekoHTML.)
>


I've got a question out to Andy on this.  I'll let you know what I find out.
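
In the meantime, the DOM Core behaviour Michael describes is easy to see with the DOM parser bundled in the JDK. The following is only a minimal sketch: the class name IdTypeDemo and the helper findable() are made up for illustration, and the sample markup is plain XML with no DTD. It shows that getElementById() finds nothing until the "id" attribute is explicitly flagged as being of type ID.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IdTypeDemo {
    // Parse a string with the JDK's default (non-validating) DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample markup", e);
        }
    }

    // Hypothetical helper: reports whether DOM Core getElementById() can see
    // the given id, optionally after flagging every "id" attribute as an ID.
    static boolean findable(String xml, boolean markIdAttributes, String id) {
        Document doc = parse(xml);
        if (markIdAttributes) {
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("id")) {
                    e.setIdAttribute("id", true); // DOM Level 3 Core
                }
            }
        }
        return doc.getElementById(id) != null;
    }

    public static void main(String[] args) {
        // No DTD here, so "id" is just an ordinary attribute.
        String xml = "<html><body><p id=\"intro\">hi</p></body></html>";
        System.out.println(findable(xml, false, "intro")); // false
        System.out.println(findable(xml, true, "intro"));  // true
    }
}
```

This is the gap the HTML DOM override papers over: it treats any attribute named "id" as an identifier whether or not anything declared it as one.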


>> Ultimately, would you accept a change in HTMLDocumentImpl to let the
>> superclass determine the Id from its internal "identifiers" map, or
>> would that introduce more risk than you are willing to accept, namely
>> that the parser might not populate the "identifiers" map for HTML?
>> Of course, it could be implemented to fall back to existing behavior
>> if the parent class can't find it in the "identifiers" map (or the
>> map is null), allowing for optimized behavior for parsers that choose
>> to properly populate the map and safe fallback for parsers that don't.
>>
>> What do you think?
>
>Provided the method (fallback included) continues to do what the DOM HTML
>spec says that should be fine. Are you planning to contribute a patch? :-)
>

What do you think of this (below)? It actually goes one step further by not necessarily counting on the parsing process to build up the "identifiers" map. It first checks super.getElementById() (which uses the "identifiers" map). If it finds the element, it returns it immediately. No unnecessary recursion. If it doesn't find it, it falls back to the pre-existing recursive behavior to find the element. If it is found, it populates the "identifiers" map with the element so the next call to getElementById() will be optimized. Finally, it returns the element.


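    public synchronized Element getElementById( String elementId )
    {
        // Fast path: ask the superclass, which consults the "identifiers" map.
        Element idElement = super.getElementById(elementId);
        if (idElement != null) {
            return idElement;
        }
        // Slow path: the pre-existing recursive search of the tree.
        idElement = getElementById( elementId, this );
        if (idElement != null) {
            // Remember the hit so the next lookup of this Id takes the fast path.
            putIdentifier(elementId, idElement);
        }
        return idElement;
    }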

What do you think? With this in place, even if no parse-time solution is found for HTML documents to populate the "identifiers" map, there's still some optimization provided in the case that getElementById() is called more than once on the same Id. I know, it's kind of odd, but there are some situations where such a thing happens and it doesn't hurt anything to account for it. And with a parse-time solution, the less efficient recursion will never have to be run because the "identifiers" map will already have been populated, allowing for fully optimized access to elements with Ids.
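
To make the repeated-lookup case concrete, here is a rough, self-contained sketch of the same lookup pattern outside of Xerces. Everything in it is hypothetical and for illustration only: CachedIdLookup, its nested Node class and the treeWalks counter are stand-ins, not Xerces classes. It checks a map first, falls back to a recursive walk, and records any hit so the next lookup of that Id skips the walk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CachedIdLookup {
    /** Hypothetical stand-in for a DOM element: an id plus child nodes. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();
        Node(String id) { this.id = id; }
        Node add(Node child) { children.add(child); return this; }
    }

    private final Node root;
    // Plays the role of the "identifiers" map.
    private final Map<String, Node> identifiers = new HashMap<>();
    // Counts how often the slow recursive search had to run.
    int treeWalks = 0;

    CachedIdLookup(Node root) { this.root = root; }

    synchronized Node getElementById(String id) {
        if (id == null) {
            return null;
        }
        // Fast path: a previous lookup already recorded this id.
        Node hit = identifiers.get(id);
        if (hit != null) {
            return hit;
        }
        // Slow path: depth-first search of the whole tree.
        treeWalks++;
        hit = search(root, id);
        if (hit != null) {
            identifiers.put(id, hit); // remember it for next time
        }
        return hit;
    }

    private static Node search(Node node, String id) {
        if (id.equals(node.id)) {
            return node;
        }
        for (Node child : node.children) {
            Node found = search(child, id);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node root = new Node(null).add(new Node("a").add(new Node("b")));
        CachedIdLookup doc = new CachedIdLookup(root);
        doc.getElementById("b");
        doc.getElementById("b");
        System.out.println(doc.treeWalks); // 1
    }
}
```

Note that misses are not remembered, so looking up an Id that does not exist walks the tree every time. The sketch also does nothing about entries going stale if an element is later removed or its id changes; a real patch would have to consider how that is handled.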

Now, hopefully I'll get a good answer from Andy so I can take full advantage of the optimization!



later,

Jake

>> Jake
>>
>> > [1] http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-html.html
>> > [2] http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-getElBId
>> >
>> > Michael Glavassevich
>> > XML Parser Development
>> > IBM Toronto Lab
>> > E-mail: [EMAIL PROTECTED]
>> > E-mail: [EMAIL PROTECTED]
>
>Thanks.
>
>Michael Glavassevich
>XML Parser Development
>IBM Toronto Lab
>E-mail: [EMAIL PROTECTED]
>E-mail: [EMAIL PROTECTED]
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]

