On Wed, 2008-02-13 at 16:21 +0100, Stefano Zacchiroli wrote:
> On Tue, Feb 12, 2008 at 09:04:14PM +0000, Adam D. Barratt wrote:
> > following some of my recent commits, I've decided to take the bull by
> > its proverbial horns and look at converting some of our current HTML
> > scraping to use the BTS SOAP interface.
>
> That's great news!
:-)

[...]
> > bts
> > ---
> >
> > The perennial trigger for discussions about replacing HTML scraping
> > with SOAP. Sadly the fact that bts (rather usefully :) supports
> > offline working and local caches of bug content means we're largely
> > stuck with parsing the generated HTML.
>
> Uhm, does it? I might start asking dumb questions since I've never used
> the offline part of bts, but what features are actually provided in
> offline mode? According to the manpage:
>
> * show/bugs clearly should work offline, but in that case we are anyhow
>   showing either an HTML page or a mailbox, so it isn't really related
>   to SOAP or scraping HTML, since in one of the two cases the HTML is
>   actually the final target of our action

If you're using bts cache, or show/bugs with cache mode set to full, you
don't just get an HTML page, but a set of HTML pages, attachments,
mboxes and version graph images.

In theory, one should be able to navigate between and within each of the
pages as if one were online, assuming the relevant files are in the
cache - the version images are displayed, and links to individual
messages, source and binary package bug pages, maintainer bug pages,
attachments, etc. are mangled to refer to the local files (together with
a link to the online version). It doesn't always work, but that's the
theory.

The regexes in mangle_cache_file() and href_to_filename() have made my
head hurt at least once^Wtwice - particularly the ones I wrote ;).

Adam

--
To unsubscribe, send mail to [EMAIL PROTECTED]
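[To give a flavour of the href-to-filename mangling discussed above, here is a rough sketch in Python. It is purely illustrative: the function name, URL patterns and filename scheme below are assumptions for the example, not the actual regexes from devscripts' mangle_cache_file()/href_to_filename(), which live in Perl and handle many more cases.]

```python
import re

def href_to_local(href):
    """Map a BTS href to a local cache filename, or None if we don't
    know how to cache it. Patterns and naming are illustrative only."""
    # A whole bug report page: bugreport.cgi?bug=NNNNNN -> NNNNNN.html
    m = re.match(r'(?:https?://bugs\.debian\.org/)?'
                 r'cgi-bin/bugreport\.cgi\?bug=(\d+)$', href)
    if m:
        return m.group(1) + '.html'
    # An individual message within a bug -> NNNNNN-MSG.html
    m = re.match(r'(?:https?://bugs\.debian\.org/)?'
                 r'cgi-bin/bugreport\.cgi\?bug=(\d+)&msg=(\d+)$', href)
    if m:
        return '%s-%s.html' % (m.group(1), m.group(2))
    # A package bug listing: pkgreport.cgi?pkg=NAME -> NAME.html
    m = re.match(r'(?:https?://bugs\.debian\.org/)?'
                 r'cgi-bin/pkgreport\.cgi\?pkg=([\w.+-]+)$', href)
    if m:
        return m.group(1) + '.html'
    return None  # leave unknown links pointing at the online BTS

print(href_to_local('cgi-bin/bugreport.cgi?bug=123456'))  # 123456.html
```

The real cache mangler also has to rewrite each matched href in the saved HTML to point at the local file while keeping a link to the online version, which is where the headache-inducing regexes come in.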
