Geoff Hutchison writes:
> > to get temporary read/write CVS access to the source code on
> > SourceForge?
>
> Done.
Thanks.
> BTW, the current development on 3.2 is on the htdig-3-2-x branch and
> the mainline is pretty broken. You should either repopulate the
> mainline from the htdig-3-2-0-b3 release or branch off of
> htdig-3-2-x.

Well, I just tried to do a naive join of the 3-2-x branch onto the
mainline and it resulted in 158 conflicts. I'll leave this to more
experienced hands to sort out :-) For now I'll branch from 3-2-0-b3.

> My suggestion is to take a look at the 3.2 code if you haven't
> already and think about what sorts of "components" you need besides
> what's there now.
One of my goals is to separate the act of retrieving a page from the
act of computing its statistics and updating the database. The reason
is that I want to be able to stuff into the system pages that might
come from a completely different source--perhaps a local cache,
perhaps pages that have been preprocessed somehow, etc.
My preliminary thought is that there should be a kind of

    internet.get_document(url, doc)

method, which stores the contents of the URL in doc, and another

    statistics.tally(doc)

method, which accumulates the statistics from the (already-read)
document and possibly stuffs them into the database.
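
To make the split concrete, here is a rough sketch in C++ of how the two
halves might fit together. None of these names exist in the ht://Dig
source: Document, DocumentSource, CacheSource, Statistics, and index_url
are all hypothetical, the "statistics" are plain word counts, and the
in-memory cache merely stands in for any non-network source of pages.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory representation of a retrieved page.
struct Document {
    std::string url;
    std::string contents;
};

// Anything that can fill in a Document for a URL: the network, a local
// cache, a preprocessor, and so on.
class DocumentSource {
public:
    virtual ~DocumentSource() = default;
    // Returns true and fills in `doc` if the URL could be retrieved.
    virtual bool get_document(const std::string& url, Document& doc) = 0;
};

// Stand-in for a local cache. A network retriever would implement the
// same interface.
class CacheSource : public DocumentSource {
public:
    void store(const std::string& url, const std::string& contents) {
        pages_[url] = contents;
    }
    bool get_document(const std::string& url, Document& doc) override {
        auto it = pages_.find(url);
        if (it == pages_.end()) return false;
        doc.url = url;
        doc.contents = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> pages_;
};

// Accumulates statistics from already-read documents. Here that is just
// a case-insensitive word count; writing to a database is left out.
class Statistics {
public:
    void tally(const Document& doc) {
        std::string word;
        for (unsigned char c : doc.contents) {
            if (std::isalnum(c)) {
                word += static_cast<char>(std::tolower(c));
            } else if (!word.empty()) {
                ++counts_[word];
                word.clear();
            }
        }
        if (!word.empty()) ++counts_[word];  // flush the last word
        ++documents_;
    }
    std::size_t count(const std::string& word) const {
        auto it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }
    std::size_t documents() const { return documents_; }
private:
    std::map<std::string, std::size_t> counts_;
    std::size_t documents_ = 0;
};

// Retrieval and tallying stay independent; this helper only chains them.
bool index_url(DocumentSource& source, Statistics& stats,
               const std::string& url) {
    Document doc;
    if (!source.get_document(url, doc)) return false;
    stats.tally(doc);
    return true;
}
```

Because tally() only ever sees a Document, it does not care whether the
page came from the network or from somewhere else, which is the point of
the separation.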
But we'll see. There's no way to know whether a refactoring will work
except to try it.
Michael
--
Michael Haggerty
[EMAIL PROTECTED]
_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-dev