"David A. Desrosiers" wrote:
>
> Architecturally, this is how it "should" be done, AvantSlow or
> otherwise:
>
> 1. Sync to a URI which includes a form, which brings down an empty,
> non-cast form (no values in the elements).
>
> 2. Create a database of that URI, which should *NOT* be physically
> located inside any other database proper. What I mean here is that
> if you grab a top-level URI, and have --maxdepth=3, and on
> page3.html, there's a form, this form should *NOT* go inside
> MyURI.pdb, it should be externally linked (and writeable) as
> MyURI-formXX.pdb, where XX is the number of the forms in that
> gather process (example is 4 pages which are linked, each with
> their own forms).
>
<snip>
>
> 3. Form values are saved in some writable place. Remember that
> MyURI.pdb is not writeable at all in our current scheme, nor is
> deletion or modification of pages themselves.
If the form is externally linked and writeable in its own DB, wouldn't
that be the logical place to write to?
> 4. Run the parser again, but this time it communicates with the Palm
> itself, querying for any remaining form values which show up as
> "DirtyRecords" in the databases which it queries.
This seems like a job for the conduit rather than the parser. One of
our goals here is to eliminate the need for HotSyncing. It's probably not
the most useful for the rest of the Plucker community, but I'll share:
an IE toolbar button launches a program which calls the spider then
serially sends the file the parser made to the palm via a rudimentary
protocol. The conduit and parser in our case become integrated, at
least as it appears to the user. As these features get more complex we're
practically re-writing the HotSync manager, which seems like a lot of
pointless work. But you do have complete control over exactly what
happens, how easy it is, and how long it takes.
The form download should obviously happen while this transfer takes
place. In the HotSync world, it'd be a conduit that does the work.
> 5. Once the desktop parser has the "Dirty" records in tow, it can then
> build a POST response, sending it to the server, and awaiting for
> responses from the server to those values.
Yep.
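
For what it's worth, building the POST itself is the easy part. A minimal
sketch in present-day Python, assuming the conduit has already handed the
dirty records over as (field, value) pairs; build_post, the URL, and the
field names are all made up for illustration, not anything in Plucker today:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(action_url, dirty_records):
    """Turn (field, value) pairs pulled from the Palm into a POST request.

    Nothing is sent here; the caller decides when to open the request.
    """
    body = urlencode(dirty_records).encode("ascii")
    return Request(
        action_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical form values, standing in for records read off the Palm.
req = build_post(
    "http://example.com/search",
    [("q", "plucker forms"), ("lang", "en")],
)
```

The request object would then be passed to urllib.request.urlopen() and the
response body handed on to the parser.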
> 6. Content is returned from the server, parser gathers it up again,
> and somehow has to push this back to the Palm, into a place where
> MyURI.pdb can then locate the content as "new", and allow the user
> to interact with it. This assumes, of course, that you were *ONLY*
> sync'ing to move this form data around, not to create new PDB's at
> this time. If you wanted to sync new form data, *AND* update the
> database, it could present a problem also, or if you were creating
> new databases during that sync.
It's the "allow the user to interact with it" part that still needs some
more thought. How and when exactly are these form results going to be
displayed? I propose some sort of "form manager" where you can get a
list of each of the POSTs/GETs made to a form (including date, time, and
data submitted) and have a chance to view the results returned for any
of these submissions.
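
To make the "form manager" idea a bit more concrete, here is a rough sketch
of the bookkeeping it would need, written desktop-side in Python for
readability. Submission, FormHistory, and every field name are my own
invention; the real thing would live in a Palm database with whatever record
layout we settle on:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Submission:
    method: str                        # "POST" or "GET"
    submitted_at: datetime             # date and time of the submission
    data: dict                         # the values the user filled in
    result_html: Optional[str] = None  # filled in once the server has answered


@dataclass
class FormHistory:
    form_url: str
    submissions: list = field(default_factory=list)

    def record(self, method, data, when):
        """Log one submission and return it so a result can be attached later."""
        sub = Submission(method, when, dict(data))
        self.submissions.append(sub)
        return sub

    def pending(self):
        """Submissions that have no returned result yet."""
        return [s for s in self.submissions if s.result_html is None]

    def listing(self):
        """One line per submission: date, time, method, submitted data."""
        return [
            f"{s.submitted_at:%Y-%m-%d %H:%M} {s.method} {s.data}"
            for s in self.submissions
        ]


# Illustrative values only.
history = FormHistory("http://example.com/order")
first = history.record("POST", {"item": "42"}, datetime(2001, 5, 1, 9, 30))
history.record("GET", {"q": "status"}, datetime(2001, 5, 2, 18, 0))
first.result_html = "<p>Order received</p>"
```

The listing is what the user would scroll through; tapping an entry that is
no longer pending would open its returned result.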
> a. Where do we store the form data on the Palm?
I think the architecture we're looking for is similar to the way email
is handled on the Palm. When you send an email from Plucker, that email
is dumped into the MailDB which then has a conduit on the desktop to
deal with it. Forms should have their own DB, if not several, as you
suggested, for each page they live off of.
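
A small sketch of the one-DB-per-form naming you described, just to pin it
down. The two-digit, zero-padded counter is my assumption about what "XX"
means, and form_db_names is a hypothetical helper:

```python
def form_db_names(base_name, pages_with_forms):
    """Map each page that carries a form to its own external database name.

    Forms are numbered in the order the gather process found them.
    """
    names = {}
    for index, page_url in enumerate(pages_with_forms, start=1):
        names[page_url] = f"{base_name}-form{index:02d}.pdb"
    return names


# Illustrative page list from a single gather run.
names = form_db_names(
    "MyURI",
    ["http://example.com/page3.html", "http://example.com/page4.html"],
)
```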
> b. Where do we store the form data at sync time while a parser run is
> gathering OTHER (non-posted form content)?
What?
> c. How do we support SSL-based forms in the current Python scheme,
> without requiring the user to recompile Python with libssl?
Oh dear god. Second version, please... :)
> d. Do we need to support cookies, locked into the form database on the
> palm, and POST that back in the "session" we create when we sync
> that content back up to the server at POST time?
Cookies would be nice, except they usually have expiration times so
trying to use them at next sync would fail. At this point, we're going
to have to have *some* level of cooperation from the web site itself.
Either non-expiring cookies or a unique ID embedded in the form which
would get returned with the form results.
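
The embedded-ID route could look something like this on our side: pull any
hidden fields out of the form as it was downloaded and send them back along
with whatever the user typed. This is only a sketch in present-day Python;
the field name "form_id" is hypothetical and depends entirely on what the
site chooses to embed:

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.hidden[attributes["name"]] = attributes.get("value") or ""


def merge_hidden_fields(form_html, user_values):
    """Return the values to submit: the site's hidden fields plus user input."""
    collector = HiddenFieldCollector()
    collector.feed(form_html)
    collector.close()
    merged = dict(collector.hidden)
    merged.update(user_values)  # user-entered values win on a name clash
    return merged


# Hypothetical form as the spider would have fetched it.
form_html = (
    '<form action="/order" method="post">'
    '<input type="hidden" name="form_id" value="abc123">'
    '<input type="text" name="qty">'
    "</form>"
)
merged = merge_hidden_fields(form_html, {"qty": "2"})
```

The hidden values would have to be stored in the form DB next to the user's
input so they survive until the next sync.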
> e. How do we handle the precedence of forms in a depth-first vs.
> breadth-first gathering process?
I'm not sure what you mean here. Forms live on individual pages; what
sort of order is there between them?
> f. How do we handle the threading of gathers? POST form values at the
> end of the parser run? (one sync required; parse, sync, pull form
> data, push to server, get response, build pdb, push new data to
> Palm) At the beginning? (two sync's required; one to get the form
> values, disconnect, parse, sync to push new data to the Palm). If I
> have a 2-meg database and am sync'ing without a cradle, I don't
> want my Palm to sit there locked up over IrDA while content is
> being GATHERED, I only want to wait while content is being SYNC'd.
Like I said, it seems more appropriate for the conduit to be managing
the records from the Palm. The conduit should be able to feed the
returned HTML directly into the parser. Hrmmmmm.... yeah, I'm still not
comfortable with this. It seems to me that the parser should be just
that, a parser, and not have to worry about pulling stuff from the palm.
The parser currently could read in HTML that has been dumped to a local
file, but then it'd have a local, incorrect URL attached to it, yes?
It'd be nice to be able to feed the parser some HTML code and assign a
specified URL to it. I really want to be able to select a block of text
in a browser and be able to transfer only that portion to the Palm,
which we could do if this was implemented.
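
Roughly what I have in mind, as a sketch rather than a patch against the
real parser: hand over the HTML together with the URL it should be treated
as coming from, and resolve relative links against that URL instead of the
local file path. LinkResolver and parse_fragment are made-up names, and this
only handles <a href> links:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkResolver(HTMLParser):
    """Collect <a href> targets, resolved against an assigned base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def parse_fragment(html, assigned_url):
    """Parse HTML from any source as if it had been fetched from assigned_url."""
    resolver = LinkResolver(assigned_url)
    resolver.feed(html)
    resolver.close()
    return resolver.links


# A block of text selected in a browser, plus the page it came from.
links = parse_fragment(
    '<p>See <a href="page2.html">next</a> and <a href="/top.html">top</a></p>',
    "http://example.com/docs/page1.html",
)
```

The same entry point would serve for HTML returned from a form POST, since
the conduit knows which URL the response belongs to.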
> Hope this churns some neurons.
It certainly has. :) Good discussion.