On Thu, 2009-07-23 at 21:13 +0800, Bo Yang wrote: > Thanks for your review, jmb. > > > > > In general, this is ok. There are quite a lot of typos in this code -- > > mostly in comments, though some in code. I've highlighted the code ones, > > but it'd be good if the ones in comments are fixed, too. > > Forgive me for my poor English and careless, I will check them out. > > > seem to be a fair few functions that are lacking API documentation. This > > needs fixing before merge. > > Yes, I will add comments for all API.
Thanks. > > > > There's a bunch of reference counting stuff that's not quite right in > > this diff, but as you're working on this, I've not commented on it. > > Now, most of the ref count stuff should be fixed by the leakage test phase. > > >> Index: test/run-test.sh > >> =================================================================== > >> --- /dev/null 2009-04-16 19:17:07.000000000 +0100 > >> +++ test/run-test.sh 2009-07-15 02:08:31.000000000 +0100 > >> @@ -0,0 +1,130 @@ > >> +#!/bin/bash > > > > Bash is not available on all supported build hosts. Thus, this script is > > not portable. > > > > Compilation of test binaries should be handled by the Makefile. > > > > Running the test binaries should be performed by a testrunner written in > > Perl. There's one in the core buildsystem that may be suitable. > > I want to migrate the test framework to Makefile and testrunner after > HTML module, because the now test framework can work and migrating it > will cost much time. How do you think about this? As long as there is a plan, then I don't mind. What I don't want to see happen is that the current system gets left in place because noone gets around to replacing it. > >> Index: test/README > >> =================================================================== > >> --- /dev/null 2009-04-16 19:17:07.000000000 +0100 > >> +++ test/README 2009-07-15 02:08:31.000000000 +0100 > >> @@ -0,0 +1,2 @@ > >> +Note the lib/ directory contain the old test related files, although they > >> +are not workable, I leave them here for later use if there is a chance. > > > > Is there any need for this. Also, why preserve the lib directory -- > > version control does that for you. > > Removed. Ok. > >> + DERIVATION_RESTRICTION = 0x00000001, > >> + DERIVATION_EXTENSION = 0x00000002, > >> + DERIVATION_UNION = 0x00000004, > >> + DERIVATION_LIST = 0x00000008 > >> +} dom_type_info_derivation_method; > > > > Is this a bitfield? > > Yes, it is, see > http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#TypeInfo. > > >> +dom_exception resource_mgr_create_hashtable(struct resource_mgr *res, > >> + size_t chains, hash_func f, struct hash_table **ht); > >> + > >> +#endif > > > > I'm not sure I understand the need for a resource manager. Can you > > explain? > > > > Generally, a resource manager is used to alloc/dealloc memory and > intern strings. In the now implementation, the dom_document itself is > both a Document and a resource manager. I think it is better to > separate the resource management function out of the Document and make > it a single object as resource_manager. Well, all allocation (aside from DocumentType nodes) occurs in the context of a document. Thus, by definition, the Document has to manage resources. It doesn't really make sense to me to have separate resource management objects floating around. > >> +#include <libwapcaplet/libwapcaplet.h> > >> + > >> +#include "core/node.h" > >> +#include "core/document_type.h" > >> + > >> +#include "utils/utils.h" > >> +#include "utils/validate.h" > >> +#include "utils/namespace.h" > >> + > >> +#include "bootstrap/implementation.h" > >> + > >> +static dom_alloc _alloc; > >> +static void * _pw; > > > > Lose the space after '*'. > > > > I'm not comfortable about these being static. Is there any guarantee > > that they're shared by different implementation instances? Note that > > DOMImplementations may need language-specific hooks (e.g. for > > normalisation). > > > > I am sorry but I did not realize why a single static allocator is not enough. Is it possible for there to be multiple DOMImplementations? If so, then a static allocator simply won't work. > >> +/* Clone a dom_string */ > >> +dom_exception dom_string_clone(dom_alloc alloc, void *pw, struct > >> dom_string *str, > >> + struct dom_string **ret); > > > > Would this be better named as _copy()? > > It is not just copying, it realloc the string with new allocator, so I > think clone is better. Right. This needs explaining in the documentation. > >> +/* Create a DOM string from a buffer > >> + * We will not alloc any memory in this method because the client provide > >> + * it. This function call mainly used for create a string from lwc_string > >> */ > >> +dom_exception dom_string_create_from_lwcstring(dom_alloc alloc, void *pw, > >> + struct lwc_context_s *ctx, struct lwc_string_s *str, > >> + struct dom_string **ret); > > > > The documentation here is wrong. Also, does this not complicate things > > hugely for the client? They'd have to know they're dealing with an > > lwc_string. > > > > I have changed the comment and move the API to internal. Ok. > >> @@ -117,6 +120,12 @@ > >> dom_exception dom_document_create_string(struct dom_document *doc, > >> const uint8_t *data, size_t len, struct dom_string **result); > >> > >> +dom_exception dom_document_create_lwcstring(struct dom_document *doc, > >> + const uint8_t *data, size_t len, struct lwc_string_s > >> **result); > > > > What is this for? Why is it in the public API? > >> +dom_exception dom_document_create_string_from_lwcstring(struct > >> dom_document *doc, > >> + struct lwc_string_s *str, struct dom_string **result); > >> + > > > > Similarly. > > > > If the client has a lwc_string * and want to call our API which take a > dom_string, this function is very helpful for that situation and it is > a common situation. That doesn't explain why dom_document_create_lwcstring() is in the public API. > >> Index: include/dom/core/implementation.h > >> =================================================================== > >> --- include/dom/core/implementation.h (revision 8546) > >> +++ include/dom/core/implementation.h (working copy) > >> @@ -30,20 +30,19 @@ > >> dom_exception dom_implementation_get_feature( > >> struct dom_implementation *impl, > >> struct dom_string *feature, struct dom_string *version, > >> - void **object, > >> - dom_alloc alloc, void *pw); > >> + void **object); > > > > How is this going to create an object if it doesn't know what allocator > > to use? An implementation may be used by more than one document. > > I intend to use the static allocator above to allocate the object. > Generally, the bootstrap part all use the static allocator which is > set by calling dom_implregistry_dom_implementation_initialise. And > that function is called by the dom_initialise, this means when the > client initialise DOM, he should provide the allocator for > implementation related stuff. And when he create document, it provide > the document allocator then. Ok. > >> Index: include/dom/core/namednodemap.h > >> =================================================================== > >> --- include/dom/core/namednodemap.h (revision 8546) > >> +++ include/dom/core/namednodemap.h (working copy) > >> @@ -80,4 +80,11 @@ > >> _dom_namednodemap_remove_named_item_ns((dom_namednodemap *) > >> (m), \ > >> (dom_string *) (n), (dom_string *) (l), (dom_node **) (r)) > >> > >> +bool _dom_namednodemap_equal(struct dom_namednodemap *m1, > >> + struct dom_namednodemap *m2); > >> + > >> +#define dom_namednodemap_equal(m1, m2) _dom_namednodemap_equal( \ > >> + (struct dom_namednodemap *) (m1), \ > >> + (struct dom_namednodemap *) (m2)) > > > > Is this not implementation detail? Why is it in the public API? > > I think our clients may need to test whether two NodeMap are equal. > Such as in JS. Would they not just perform simple equality checking -- i.e. comparing the objects' addresses in memory? > >> Index: include/dom/bootstrap/implregistry.h > >> =================================================================== > >> --- include/dom/bootstrap/implregistry.h (revision 8546) > >> +++ include/dom/bootstrap/implregistry.h (working copy) > >> @@ -15,16 +15,18 @@ > >> struct dom_implementation_list; > >> struct dom_string; > >> > >> +/* Initialise the dom_implementation */ > >> +dom_exception dom_impleregistry_dom_implementation_initialise( > >> + dom_alloc allocator, void *ptr); > > > > Why does the registry need initialising? > > Set the allocator for various implementation stuff alloc/dealloc. Right. > > > >> /* Retrieve a DOM implementation from the registry */ > >> dom_exception dom_implregistry_get_dom_implementation( > >> struct dom_string *features, > >> - struct dom_implementation **impl, > >> - dom_alloc alloc, void *pw); > >> + struct dom_implementation **impl); > >> > >> /* Get a list of DOM implementations that support the requested features > >> */ > >> dom_exception dom_implregistry_get_dom_implementation_list( > >> struct dom_string *features, > >> - struct dom_implementation_list **list, > >> - dom_alloc alloc, void *pw); > >> + struct dom_implementation_list **list); > > > > These API changes prevent implementation objects being created on the > > fly. > > I am sorry, but could you please why? It turns out that I completely misunderstood this change. Ignore my comments here. > >> - } else { > >> + } else if (colon == 0) { > >> + /* Some name like ":name" */ > >> + if (namespace != NULL) > >> + return DOM_NAMESPACE_ERR; > > > > ":name" means an element in a namespace with no prefix. It's perfectly > > valid, assuming I'm remembering XML namespaces correctly. > > In the spec, http://www.w3.org/TR/2004/REC-xml-names11-20040204/#ns-qualnames > . It said a QName can't start with a ':' character. So it does. I stand corrected. > >> + > >> + while (slen > 0) { > >> + err = parserutils_charset_utf8_to_ucs4(s, slen, &c, &clen); > >> + if (err != PARSERUTILS_OK) { > >> + return (uint32_t) -1; > >> + } > > > > If this function is called a lot, you'll find that > > parserutils_charset_utf8_char_byte_length() is faster, as it doesn't > > perform the conversion of utf8 to ucs4 for every character (it simply > > considers the first byte of the UTF-8 character and returns the implied > > byte length. Obviously, you'll need to convert the indexth character to > > UCS-4 to return it. > > I will try to change this. Ok. > >> > >> +/* Walk the logic-adjacent text in document order */ > >> +static dom_exception walk_lat_order(dom_node_internal *node, > >> walk_operation opt, > >> + walk_order order, dom_string **ret, bool *cont); > >> +/* Walk the logic-adjacent text */ > >> +static dom_exception walk_lat(dom_text *text, walk_operation opt, > >> + dom_string **ret); > > > > Can these two please be renamed to be more obvious -- "lat" is entirely > > opaque to someone reading the code. > > > >> +dom_exception walk_lat_order(dom_node_internal *node, walk_operation opt, > >> + walk_order order, dom_string **ret, bool *cont) > > > > Needs documenting. > > > > It should also iterate across the subtree, not recurse. Recursion over > > deep trees can result in stack overflows. Algorithms that recurse over > > the tree should not appear anywhere in libdom. > > > >> +dom_exception walk_lat(dom_text *text, walk_operation opt, > >> + dom_string **ret) > > > > As above. > > I will change this. Thanks. > >> --- src/core/element.c (revision 8546) > >> +++ src/core/element.c (working copy) > >> +/* Note: The DOM implentation usally use uppercase letter for element > >> names > >> + * and lowercase for attribute names. > >> + */ > > > > This is completely untrue. Case sensitivity entirely depends upon the > > source document language. Libdom can make no assumptions about it. > > > >> +#define ID_ATTRIBUTE "id" > > > > Is there a need for this? The language binding is probably the place > > that needs to be able to extract ID attributes. > > I will change this. Ok. > >> +static dom_exception hashtable_clone(struct hash_table *sh, > >> + struct dom_document *sd, struct hash_table **th, > >> + struct dom_document *td, dom_node_internal *parent); > > > > Should this not be in hashtable.c? > > There is only one place where I need to clone a hash_table, so I put > it here. You are right it is in the hash_table, but putting it here is > simple, because the Element now what is the key and value in the > hash_table. I disagree. It should be a generic function within hashtable.c. Something like: struct hash_table *hash_clone(struct hash_table *hash, dom_alloc alloc, void *alloc_pw, void *(*clone_key)(void *key, void *pw, dom_alloc alloc, void *alloc_pw), void *key_pw, void *(*clone_value)(void *value, void *pw, dom_alloc alloc, void *alloc_pw), void *value_pw); > >> +DIR_SOURCES := \ > >> + string.c node.c \ > >> + attr.c characterdata.c element.c \ > >> + implementation.c impllist.c \ > >> + text.c typeinfo.c comment.c \ > >> + namednodemap.c nodelist.c \ > >> + cdatasection.c document_type.c entity_ref.c pi.c \ > >> + doc_fragment.c document.c > > > > Is there a reason why these are no longer in alphabetical order? > > I rearrange the files according on their dependence and category. > String and Node are types that all other types need. > cdatasection/doc_type/entiry_ref/pi are XML related types. Right. J.
