> 1) The viewer displays a page with a form in it. (New document type?)
>     (anyone working on displaying a form in plucker???)

        A few new custom gadgets have to be created here (LstDrawList
populated with form elements, etc.). Remember, we have to populate the
FormsDB (on the Palm) with all the <option...> elements that appeared in the
original web page the form came from. For example:

        [Select your location]

        <INPUT TYPE="HIDDEN" NAME="section" VALUE="generic">
        <SELECT NAME="location">
        <OPTION SELECTED VALUE=""> Select your location
        <OPTION VALUE="AL"> Alabama
        <OPTION VALUE="AK"> Alaska
        <OPTION VALUE="AZ"> Arizona
        <OPTION VALUE="AR"> Arkansas
        <OPTION VALUE="CA"> California
        <OPTION VALUE="CO"> Colorado
        <OPTION VALUE="CT"> Connecticut
        <OPTION VALUE="DE"> Delaware
        <OPTION VALUE="DC"> District of Columbia
        <OPTION VALUE="FL"> Florida
        ...

        You get the idea. That's just one dropdown. We have to be
intelligent about how we parse form data. It would be great if Python had
an XML or smart HTML parser for this (*cough* perl *cough*).
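
        To make the dropdown case concrete, here is a rough sketch of
pulling the hidden inputs and <option> lists out of a page like the one
above. It assumes a current Python with the standard html.parser module;
the SelectExtractor class and its attribute names are made up for
illustration and are not part of the Plucker parser.

```python
from html.parser import HTMLParser


class SelectExtractor(HTMLParser):
    """Collect hidden inputs and the options of each <select> in a page."""

    def __init__(self):
        super().__init__()
        self.selects = {}        # select name -> list of [value, label]
        self.hidden = {}         # hidden input name -> value
        self._select = None      # name of the <select> we are inside, if any
        self._in_option = False  # True while reading an option's label text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)      # tag and attribute names arrive lowercased
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            self.hidden[attrs.get("name") or ""] = attrs.get("value") or ""
        elif tag == "select":
            self._select = attrs.get("name") or ""
            self.selects[self._select] = []
        elif tag == "option" and self._select is not None:
            # <OPTION> is usually left unclosed, so a new one ends the last.
            self.selects[self._select].append([attrs.get("value") or "", ""])
            self._in_option = True

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
            self._in_option = False
        elif tag == "option":
            self._in_option = False

    def handle_data(self, data):
        if self._in_option and self._select is not None:
            self.selects[self._select][-1][1] += data.strip()
```

        Feed it the page text, call close(), and the selects and hidden
dictionaries hold what would have to be written into the FormsDB.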

> 2) User enters/selects data and presses <submit>.
>     Since the actual POST only has a URL and some data pairs, write this
>     to a memo (in category PluckerForms?) along with the URL of the
>     original page and database name.

        What about forms with multiple submit buttons, or *NO* submit
button? (Like those nasty Javascript forms showing up lately on WAY too many
sites, with onClick() handlers doing the submit, argh!)
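
        For the multiple-button case, HTML only sends the name/value of
the button that was actually pressed, so the viewer has to record which
one it was. A minimal sketch of building the POST body under that
assumption (build_post_body and its arguments are hypothetical names, not
existing Plucker code):

```python
from urllib.parse import urlencode


def build_post_body(fields, submit_buttons, pressed=None):
    """Encode form data, including only the submit button that was pressed.

    fields         -- list of (name, value) pairs the user filled in
    submit_buttons -- list of (name, value) pairs, one per submit button
    pressed        -- name of the button the user tapped, or None
    """
    pairs = list(fields)
    if pressed is not None:
        for name, value in submit_buttons:
            if name == pressed:
                pairs.append((name, value))
                break
    return urlencode(pairs)
```

        A form with no submit button at all just passes pressed=None and
gets the plain field pairs; the Javascript-driven ones are still out of
reach.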

        Also, and this begs for the Python equivalent of HTML::LinkExtor,
how do you extract the absolute URL of the script which processes the form
if it's referenced as '../../../login.cgi', or if the parser is configured
with --staybelow=http://some/path/below/that/cgi's/location? There's got to
be some error checking here. Perhaps ignore --staybelow where the URL is
referenced by a form? What about an image which is a submit button? We
can't take care of all of these cases, and we're certainly not going to
write a Javascript execution engine here to handle Javascript forms
(Javascript can't actually be "parsed", as you know; it must be executed
and the return value trapped, which is not easy).
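
        The relative-URL half of this is the easy part. A sketch, assuming
the standard urllib.parse.urljoin and assuming --staybelow amounts to a
plain prefix test (I have not checked that against the parser); the
example.com URLs and the is_form_action flag are invented for
illustration:

```python
from urllib.parse import urljoin


def resolve_action(page_url, action):
    """Turn a form's (possibly relative) action into an absolute URL."""
    return urljoin(page_url, action)


def allowed_by_staybelow(url, staybelow, is_form_action=False):
    """Prefix check, with an exemption for URLs referenced by a form."""
    if is_form_action:
        return True  # assumption: form targets bypass the restriction
    return url.startswith(staybelow)


page = "http://example.com/a/b/c/form.html"
target = resolve_action(page, "../../../login.cgi")
# target is now "http://example.com/login.cgi"
```

        Whether exempting form targets is the right policy is exactly the
open question; the flag just shows where the decision would live.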

> 3) The conduit/separate program/whoever creates a home.html from the
>    memo info to build a new DB called:
>   <origDBname>FormsDB and contains pages generated from the POST requests
>    and named for the original page on which they appeared.

        You and your MemoDB.. just kidding.

        I think we should definitely serialize the FormsDB name, since I
could have wired-0612-2001.pdb, wired-0528-2001.pdb, and so on, each with
its own forms in it. Something unique, or auto-assign the FormsDB a random
number at sync time? Not sure yet how this would work. Ponders...
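
        One way the "something unique" option might look: derive the name
from the original database name plus a short checksum, so it stays stable
between syncs instead of being random. This is only a sketch; the
forms_db_name function, the "-Forms" suffix, and the 31-character limit
used here are my assumptions, not anything the viewer does today.

```python
import zlib

MAX_DB_NAME = 31  # assumed Palm OS limit on database name length


def forms_db_name(orig_db_name, suffix="-Forms"):
    """Build a per-database FormsDB name that survives truncation."""
    base = orig_db_name
    if base.lower().endswith(".pdb"):
        base = base[:-4]
    # A short checksum of the full original name keeps two long names
    # distinct even when their truncated prefixes are identical.
    crc = zlib.crc32(orig_db_name.encode("utf-8")) & 0xFFFFFFFF
    tail = "-" + format(crc, "08x")[:4] + suffix
    return base[:MAX_DB_NAME - len(tail)] + tail
```

        wired-0612-2001.pdb and wired-0528-2001.pdb then each get their
own FormsDB, and the same input always maps to the same name.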

        What about --no-urlinfo or the 'Details' dialog? What about forms
which require 'logins' (i.e. credentials encoded in the URI)?

> 4) Conduit/user runs plucker-build to get the data.

        EEP! Stop. Right here we have a logistics issue. We have to make
sure we do this in the right order. Take, for example, a site like my
bank's, which lets me log in and check my account balance and other things.
It has three consecutive web pages, each with a form to fill out (dropdowns
for state, account type, etc.), which must be submitted in order before I am
shown the account balance. If they are submitted "alphabetically" (by page),
it will clearly fail. In effect, I only really need the last URI, with the
data from the previous forms encoded in it, to get to the data I need.
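
        A sketch of the ordering fix, assuming the viewer stamps every
saved submission with a running counter at the moment the user presses
submit. The Submission record and its seq field are hypothetical; nothing
like them exists yet.

```python
from collections import namedtuple

# seq is a counter assigned on the Palm when the form was submitted.
Submission = namedtuple("Submission", "seq page action body")


def replay_order(submissions):
    """Return submissions in the order the user made them, not by page name."""
    return sorted(submissions, key=lambda s: s.seq)


saved = [
    Submission(2, "account.html", "/choose", "type=checking"),
    Submission(1, "login.html", "/login", "state=AK"),
    Submission(3, "balance.html", "/show", "range=30"),
]
ordered = [s.page for s in replay_order(saved)]
# ordered == ["login.html", "account.html", "balance.html"]
```

        Sorting by page name would have replayed account.html first and
failed at the bank's login step.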

        Also, as I mentioned earlier, we have to make sure that POST of the
Palm-side FormsDB happens before *ANY* gathering of new content over port 80
(or 8080 or whatever) occurs.

        The other thing to consider here is the "palm-in-cradle-waiting"
timespan. There are still many people who have set up cron jobs to fetch
content for them, and they plop their Palm in the cradle and sync. If we
require that the Palm sit in the cradle (so we can query it for Form data)
at each sync, that may cripple that functionality for those users. We have
to think of the best way to do this *AT SYNC TIME*, so that when the user is
about to *RECEIVE* a new pdb, their Palm is queried for any existing FormsDB
data, and then the conduit/parser can grok it and do something with it.

        Additionally, we need to check that there *IS* data in the form on
the Palm before doing that 360-degree sync. We now have to consider
separating "independent" databases (static content, no forms, no need to
"query" them for information to POST) from "dynamic" databases, which have
to be queried from the desktop (conduit/parser) for whatever information
they hold that needs to be sent back to the server.
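
        Roughly, the split could look like the sketch below. It assumes
each database is described by a small dict with a has_forms flag and a
pending list of saved submissions; both keys are invented here for
illustration.

```python
def is_dynamic(db):
    """A database is "dynamic" if the pages in it contain forms."""
    return bool(db.get("has_forms"))


def needs_query(db):
    """Only query the Palm when a dynamic database actually holds data."""
    return is_dynamic(db) and len(db.get("pending", ())) > 0


def split_databases(databases):
    """Partition databases into (independent, dynamic) lists."""
    independent, dynamic = [], []
    for db in databases:
        (dynamic if is_dynamic(db) else independent).append(db)
    return independent, dynamic
```

        The cron-job users with only independent databases never hit the
needs_query path, so nothing changes for them.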

> 5) Conduit/user syncs new DB.

        If we're doing a 360-degree sync, the arguments for using an actual
caching mechanism come into play here. Drop the files onto disk (freeing
the memory in the array, which also makes the parser faster), then run
across them for precedence, then query the Palm, extract form data (if any),
POST to the website, refresh whatever pages need to be refreshed in the
local cache, then gather, build, sync.
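
        Written out as a skeleton, the sequence above is something like
the following. Every argument is a caller-supplied callable with a
made-up name; this only pins down the order of operations, not how any
step works.

```python
def run_sync(query_palm, post_form, refresh_pages, gather, build, sync):
    """Run one sync cycle and return how many forms were POSTed."""
    pending = query_palm()        # saved form submissions, possibly empty
    for form in pending:
        stale = post_form(form)   # URLs whose cached copy is now stale
        refresh_pages(stale)
    gather()                      # all POSTs happen before any gathering
    build()
    sync()
    return len(pending)
```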

        That's one idea, and certainly not the best approach. I use Plucker
from a dozen or so machines, including Network HotSync to a remote machine
which listens for my sync. Keeping several versions of the cache on all of
those machines is a bad design.

        Keep the neurons firing, we're getting some good ideas down here.



/d

