[Wikitech-l] mail archive permalinks [was Re: Mailman archives broken?]

2012-08-17 Thread Adam Wight

Tilman, thanks for those links.

I think the base-32 encoded hash of the Message-Id, discussed at 
[http://wiki.list.org/display/DEV/Stable+URLs], gives us a 
straightforward and effective solution to the problem.  Ten characters 
or so should be plenty.  This would produce URLs like:
[http://lists.wikimedia.org/pipermail/wikitech-l/2012-August/OHRDQGOX35.html]
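
To make the idea concrete, here is a rough Python sketch of how such a hash could be computed. It assumes SHA-1 as the hash function and strips the angle brackets from the Message-Id before hashing; both are assumptions for illustration, as are the function names and the URL layout, which just imitates the example URL above.

```python
import base64
import hashlib


def message_id_hash(message_id: str, length: int = 10) -> str:
    """Return a short, stable identifier derived from a Message-Id header."""
    # Assumption: surrounding whitespace and angle brackets are not part of
    # the message's identity, so they are dropped before hashing.
    msgid = message_id.strip()
    if msgid.startswith("<") and msgid.endswith(">"):
        msgid = msgid[1:-1]
    # SHA-1 (an assumed choice) yields 20 bytes, which base-32 encodes to
    # exactly 32 characters with no "=" padding.
    digest = hashlib.sha1(msgid.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii")[:length]


def permalink(list_name: str, month: str, message_id: str) -> str:
    """Build a hashed archive URL (hypothetical layout, modeled on the example)."""
    return (
        "http://lists.wikimedia.org/pipermail/"
        f"{list_name}/{month}/{message_id_hash(message_id)}.html"
    )
```

Calling permalink("wikitech-l", "2012-August", "<some-id@example.org>") would give a URL of the same shape as the one above. Ten base-32 characters carry 50 bits, so accidental collisions should only become likely once an archive holds on the order of tens of millions of messages.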


We could prefix these with a parent directory that serves as a 
versioning scheme for our hash, allowing us to create forwarding rules 
if the permalink rules change in the future.  For example (I have no 
experience with this, so it might not work), we could generate an 
.htaccess file at the root of each old archive directory that redirects 
each of the old sequential URLs to its new, hashed location.
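
As an untested sketch of that forwarding idea, the following Python generates mod_alias Redirect lines for one month's directory. The "v1" version directory, its position in the path, the function name, and the sequential file name in the usage note are all hypothetical; the mapping from old file names to hashes would have to come from the existing archive.

```python
def htaccess_rules(list_name, month, mapping, version="v1",
                   site="http://lists.wikimedia.org"):
    """Return .htaccess text redirecting old sequential pages to hashed ones.

    mapping: dict of old file name (e.g. "000123.html") -> message hash.
    """
    # Old location: the existing pipermail directory for this month.
    old_dir = f"/pipermail/{list_name}/{month}"
    # New location: same archive, under a hypothetical version directory.
    new_dir = f"{site}/pipermail/{list_name}/{version}/{month}"
    lines = [
        f"Redirect permanent {old_dir}/{old_name} {new_dir}/{msg_hash}.html"
        for old_name, msg_hash in sorted(mapping.items())
    ]
    return "\n".join(lines) + "\n"
```

For instance, htaccess_rules("wikitech-l", "2012-August", {"000123.html": "OHRDQGOX35"}) would emit a single permanent redirect from the old 000123.html page to the hashed URL under v1/. Whether Apache on the list server honors .htaccess files in those directories is something that would need checking.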


-Adam

On 08/17/2012 08:00 AM, Tilman Bayer wrote:

On Fri, Aug 17, 2012 at 4:26 AM, MZMcBride z...@mzmcbride.com wrote:

Guillaume Paumier wrote:

I was told yesterday that the mailman/pipermail archives were broken,
in that permalinks were no longer linking to the messages they used to
link to (therefore not being permalinks at all).

This is pretty devastating. It's difficult to overstate the importance of
Mailman archives in documenting Wikimedia's history (or even history before
Wikimedia was a concept). I've come across links such as the one at
https://en.wikipedia.org/wiki/Wikipedia:Tim_Starling_Day that I can't even
find anywhere in the Mailman archives any longer. :-(

MZMcBride


Many historical Signpost articles are affected as well:
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=pipermail+wikitech+prefix%3AWikipedia%3AWikipedia+Signpost%2F2

BTW, here's Brion dreaming about a stable archiving system in 2007 ...
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/28993

In the same year, the lead developer of Mailman said that fixing this
problem of breaking URLs was absolutely critical
(http://mail.python.org/pipermail/mailman-developers/2007-July/019632.html)
and some ideas were thrown around
(http://wiki.list.org/display/DEV/Stable+URLs), but apparently this
huge data integrity problem still hasn't been solved.




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] mail archive permalinks [was Re: Mailman archives broken?]

2012-08-17 Thread Adam Wight

Mailman 3 already has code to add this X-Message-ID-Hash header and to 
integrate with mail archiving tools.


-Adam
