Stefan Kühn wrote:
Hello!

At the moment I and some other users of the dumps check http://dumps.wikimedia.org/ every day/week/month for a new dump.

Please see the following shell script I am using to fetch "-latest-pages-articles.xml.bz2" from plwiki and plwikisource.

This script runs on a Solaris 5.9 machine.

Unfortunately, the dumps no longer carry a "Last-Modified" header, so wget currently downloads them every day. Is there any possibility of getting those headers back?

#! /bin/sh

WIKIHOME=/home/wiki
HTTP_PROXY=http://my.proxy:8080/
export HTTP_PROXY

# Default to these two wikis when no database names are given.
[ "$#" = "0" ] && set plwikisource plwiki

[ -d "${WIKIHOME}/log" ] || mkdir -p "${WIKIHOME}/log"
for DB in "$@"
do
        WIKI_XML="${WIKIHOME}/${DB}-latest-pages-articles.xml"
        WIKI_BZIP="${WIKI_XML}.bz2"
        LOGFILE="${WIKIHOME}/log/${DB}-`date +%Y%m%d-%H%M.log`"
        # -N (timestamping) fetches the file only if the remote copy is newer.
        /usr/local/bin/wget -o "${LOGFILE}" -P "${WIKIHOME}" -N \
            "http://download.wikimedia.org/${DB}/latest/${DB}-latest-pages-articles.xml.bz2"
        # Unpack only when the archive is newer than the extracted XML.
        if /usr/bin/test "${WIKI_BZIP}" -nt "${WIKI_XML}"
        then
            /usr/bin/bzip2 -dc "${WIKI_BZIP}" > "${WIKI_XML}"
            /usr/bin/touch -r  "${WIKI_BZIP}" "${WIKI_XML}"
        else
            # Nothing new was fetched, so drop the log for this run.
            /usr/bin/rm "${LOGFILE}"
        fi
done
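
To check whether the server sends a "Last-Modified" header again, something like the following could be used. It is only a minimal sketch: the function name has_last_modified is made up for illustration and is not part of the script above, and the usage line assumes a wget build that supports -S and --spider.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): reads HTTP response
# headers on stdin and succeeds if a Last-Modified header is among them.
# wget -S indents the header lines, so leading blanks are allowed.
has_last_modified() {
    grep -i '^ *Last-Modified:' >/dev/null
}

# Example use (assumption: wget prints the server headers on stderr with -S):
#   wget -S --spider "$URL" 2>&1 | has_last_modified && echo "header present"
```

If the function fails for a dump URL, wget -N has no timestamp to compare against, which matches the daily re-download described above.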


Thanks for your help!

Stefan Kühn
http://de.wikipedia.org/wiki/User:Stefan_K%C3%BChn




_______________________________________________
Toolserver-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
