Sounds like a good idea. If I get time to benchmark the new implementation over the holiday, I'll post any results I generate.

peter
"BAZLEY, Sebastian" <[EMAIL PROTECTED]> wrote:

The HTML parser using JTidy now does a single scan of the DOM, using a recursive method.

Functionally, it seems OK. It picks up all the images in the image test file, and ignores duplicates, but I've not tested it for speed or memory usage.

Additional functionality:
- it should find all background images (e.g. in table, TD, TR etc)
- it handles the BASE tag
- it should find

These additional extractions may need tweaking a bit to limit their scope.

The other change I made was to treat the uniqueURL Set as a Collection. It still behaves the same, but it means that the parser could be called with a different Collection that allows duplicates.

So I propose to add an overloaded version of getEmbeddedResourceURLs() (to all 3 parsers) that accepts a Collection as a parameter - it can then be used to retrieve the resources with duplicates.

While setting up the unit tests for JTidy, I found a minor problem - once getParser() has been called, the parser cannot be changed.

I think it would be useful if this restriction was removed, and propose to do this by splitting the getParser() code between the HTMLParser class and the implementation, by adding a factory method to each implementation, which would then be free to re-use parsers, or create a new one each time, as necessary.

The HTMLParser class has a useful unit test (which I borrowed for the JTidy parser). It would be nice if this could be used to test all the parsers, not just the default. If HTMLParser is changed to be able to get multiple parsers, I think this would happen as a by-product. [Each parser could have a test to instantiate itself via getParser()].

Hope this all makes sense ...

S.

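For anyone following the thread, here is a rough sketch of how the proposed Collection overload might behave. It is not the actual parser code: the class name SketchParser, the String-based signatures, and the toy regex that only matches src="..." attributes are all assumptions made for illustration. The only idea taken from the proposal is that the caller chooses the Collection, so a Set drops duplicates and a List keeps them.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a parser; not the real HTMLParser API.
class SketchParser {
    // Toy extraction (assumption): only src="..." attributes are matched.
    private static final Pattern SRC = Pattern.compile("src=\"([^\"]*)\"");

    // Existing-style call: the default collection is a Set, so duplicates are ignored.
    static Collection<String> getEmbeddedResourceURLs(String html) {
        return getEmbeddedResourceURLs(html, new LinkedHashSet<String>());
    }

    // Proposed overload: the caller supplies the Collection that receives the URLs.
    static Collection<String> getEmbeddedResourceURLs(String html, Collection<String> urls) {
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1)); // a Set silently ignores repeats, a List keeps them
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<img src=\"a.png\"><img src=\"b.png\"><img src=\"a.png\">";
        System.out.println(getEmbeddedResourceURLs(html));                          // unique URLs
        System.out.println(getEmbeddedResourceURLs(html, new ArrayList<String>())); // with duplicates
    }
}
```

Passing an ArrayList returns all three matches, while the single-argument call returns only the two distinct URLs, so existing callers would see no change in behaviour.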