On 12/01/2010 19:54, Magnus Hagander wrote:
On Tue, Jan 12, 2010 at 18:34, Dave Page <dp...@pgadmin.org> wrote:
On Tue, Jan 12, 2010 at 10:24 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
"Joshua D. Drake" <j...@commandprompt.com> writes:
On Tue, 2010-01-12 at 10:24 +0530, Dave Page wrote:
So just to put this into perspective and give anyone paying attention
an idea of the pain that lies ahead should they decide to work on
this:

- We need to import the old archives (of which there are hundreds of
thousands of messages, the first few years of which have, umm, minimal
headers).
- We need to generate thread indexes
- We need to re-generate the original URLs for backwards compatibility
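
To give a feel for the second item, here is a rough sketch of building a thread index from the In-Reply-To and References headers. It is not what the current archives do; the function names and the dict-of-headers input are assumptions made for illustration. Messages whose parent cannot be found, which is likely for the early messages with minimal headers, simply end up as thread roots.

```python
from collections import defaultdict


def parent_of(headers):
    """Return the Message-ID this message replies to, or None.

    Simplification: assumes In-Reply-To holds a bare Message-ID. Old mail
    clients sometimes put free text there, which would need extra parsing.
    """
    in_reply_to = headers.get("In-Reply-To", "").strip()
    if in_reply_to:
        return in_reply_to
    # Fall back to the last entry of References, the closest ancestor.
    refs = headers.get("References", "").split()
    return refs[-1] if refs else None


def build_thread_index(messages):
    """Build (roots, children) from a list of header dicts.

    `messages` is a hypothetical input format: one dict per message with
    at least a "Message-ID" key. Messages without one are skipped here.
    """
    known = {m["Message-ID"] for m in messages if m.get("Message-ID")}
    children = defaultdict(list)
    roots = []
    for m in messages:
        mid = m.get("Message-ID")
        if not mid:
            continue
        parent = parent_of(m)
        if parent in known and parent != mid:
            children[parent].append(mid)
        else:
            # No usable parent: treat the message as the start of a thread.
            roots.append(mid)
    return roots, dict(children)
```

A real importer would also need a fallback (subject and date matching, for instance) for messages that carry neither header.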

Now there's encouragement :-)

Or, we just leave the current infrastructure in place and use a new one
for all new messages going forward. We shouldn't limit our ability to
have a decent system due to decisions of the past.

-1.  What's the point of having archives?  IMO the mailing list archives
are nearly as critical a piece of the project infrastructure as the CVS
repository.  We've already established that moving to a new SCM that
fails to preserve the CVS history wouldn't be acceptable.  I hardly
think that the bar is any lower for mailing list archives.

Now I think we could possibly skip the requirement suggested above for
URL compatibility, if we just leave the old archives on-line so that
those URLs all still resolve.  But if we can't load all the old messages
into the new infrastructure, it'll basically be useless for searching
purposes.

(Hmm, re-reading what you said, maybe we are suggesting the same thing,
but it's not clear.  Anyway my point is that Dave's first two
requirements are real.  Only the third might not be.)

The third isn't actually that hard to do in theory. The
message numbers are basically the zero-based position in the mbox
file, and the rest of the URL is obvious.
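
As a minimal sketch of that idea, the following maps each message's zero-based position in an mbox file to its Message-ID using Python's standard mailbox module, then formats a path from it. The path layout in legacy_url is a made-up placeholder, not the actual archive URL scheme, and both function names are assumptions.

```python
import mailbox


def message_numbers(mbox_path):
    """Map zero-based position in the mbox file to the Message-ID header."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        # Iterating an mbox yields messages in file order.
        return {pos: msg.get("Message-ID") for pos, msg in enumerate(box)}
    finally:
        box.close()


def legacy_url(list_name, year, month, position):
    """Format a path for one message.

    Hypothetical pattern for illustration only; substitute the real
    layout of the old archive URLs.
    """
    return f"/{list_name}/{year:04d}-{month:02d}/msg{position:05d}.php"
```

With a table like this, a redirect handler only has to look up the Message-ID for an old position and point at wherever the new system stores that message.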

The third part is trivial. The search system already does 95% of it.
I've already implemented exactly that kind of redirect thing on top of
the search code once just as a poc, and it was less than 30 minutes of
hacking. Can't seem to find the script ATM though, but you get the
idea.

Let's not focus on that part, we can easily solve that.

Agreed. That's the part that worries me less.


Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
