Hello, I'm working on information extraction from the web and would like to programmatically determine the relative x-y locations of arbitrary elements of an HTML document on the rendered page. I'm particularly interested in text, but I want the sizes of images and other page elements to be factored into the locations returned.
I haven't had any luck finding something that does this out of the box, so I'm about to attempt hacking the Mozilla layout engine to give me the information I need. Before I do, I have two questions. First, does anyone know of a simple way to programmatically determine the x-y coordinates of text and other HTML elements that doesn't involve substantial modifications to the browser? Second, if coding it myself is what I need to do, does anyone have any pointers? Is this a reasonable way to proceed?

Thanks!
Jeff
_______________________________________________
mozilla-layout mailing list
http://mail.mozilla.org/listinfo/mozilla-layout
