I am developing a Java application that currently uses Mozilla's
Webclient as its embedded browser. The application needs to perform
the normal functions of a web browser (which Webclient does
beautifully), as well as inspect the DOM behind the scenes.
One thing in particular we would like to do is to find out what controls
('input' tags, etc.) belong to a particular form. The trouble is that
sometimes the parser seems to do some peculiar things when the HTML is
not properly formed. For example, <input> tags that were contained
within <form> tags in the source HTML show up before the <form>
element in the DOM.
Example:
Go to http://www.ti.com/ and inspect the DOM. There is one form
on the page, but some of its hidden inputs show up *above* the
form.
Normally I would say, "the HTML is not well-formed, what can you do?"
But the weird thing is that both the webclient and Mozilla itself are
able to submit the form with the proper arguments.
Is there a way to find all of the inputs that belong to a form using
Webclient (possibly via XPCOM, some home-brewed JNI, etc.)?
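
To make the problem concrete, here is a minimal sketch of the kind of
descendant walk that misses those inputs. It does not use Webclient at
all: it assumes the DOM is available as a standard org.w3c.dom.Document
and uses the JDK's XML parser as a stand-in. The class name, method
names, and the small markup string are hypothetical, made up to mimic
the tree shape described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class FormInputWalk {
    // Counts <input> elements that are descendants of the given form element.
    public static int countInputsInside(Element form) {
        return form.getElementsByTagName("input").getLength();
    }

    // Parses a small well-formed XML string standing in for the browser's DOM.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical tree shape: one hidden input ended up as a
        // sibling *before* the form instead of inside it.
        String xml = "<body>"
                + "<input type='hidden' name='a'/>"
                + "<form><input type='text' name='q'/></form>"
                + "</body>";
        Document doc = parse(xml);
        Element form = (Element) doc.getElementsByTagName("form").item(0);

        int inside = countInputsInside(form);
        int total = doc.getElementsByTagName("input").getLength();

        // The walk under the form sees only one of the two inputs.
        System.out.println("inputs under form: " + inside
                + ", inputs in document: " + total);
    }
}
```

With a tree shaped like this, walking the form's descendants finds
only one of the two inputs, which is the mismatch we are seeing.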
-Chris