Stefan Kühn wrote:
Hello! At the moment, I and some other users of the dumps check http://dumps.wikimedia.org/ every day/week/month for a new dump.
Please see the following shell script, which I use to fetch "-latest-pages-articles.xml.bz2" for plwiki and plwikisource.
This script runs on a Solaris 5.9 machine. Unfortunately, the dumps no longer carry a "Last-Modified" header, so wget -N cannot tell whether the remote file has changed and currently downloads them again every day. Would it be possible to have those headers back?
#! /bin/sh
WIKIHOME=/home/wiki
HTTP_PROXY=http://my.proxy:8080/
export HTTP_PROXY

# Default to both wikis when no database names are given.
[ "$#" = "0" ] && set plwikisource plwiki
[ -d "${WIKIHOME}/log" ] || mkdir -p "${WIKIHOME}/log"

for DB in "$@"
do
    WIKI_XML="${WIKIHOME}/${DB}-latest-pages-articles.xml"
    WIKI_BZIP="${WIKI_XML}.bz2"
    LOGFILE="${WIKIHOME}/log/${DB}-`date +%Y%m%d-%H%M.log`"
    # -N: download only if the remote file is newer than the local copy.
    /usr/local/bin/wget -o "${LOGFILE}" -P "${WIKIHOME}" -N \
        http://download.wikimedia.org/${DB}/latest/${DB}-latest-pages-articles.xml.bz2
    # Unpack only when the archive is newer than the extracted XML.
    if /usr/bin/test "${WIKI_BZIP}" -nt "${WIKI_XML}"
    then
        /usr/bin/bzip2 -dc "${WIKI_BZIP}" > "${WIKI_XML}"
        # Give the XML the archive's timestamp so the next -nt test is false.
        /usr/bin/touch -r "${WIKI_BZIP}" "${WIKI_XML}"
    else
        # Nothing new was fetched; discard the log.
        /usr/bin/rm "${LOGFILE}"
    fi
done
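To check whether the server sends the header at all, the response headers can be inspected directly. The following is only a rough sketch, not part of the script above: last_modified is a hypothetical helper name, and it assumes the headers arrive on stdin in the form that wget -S prints them (one header per line, possibly indented).

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original script):
# read HTTP response headers from stdin and print the value of the
# Last-Modified header. Prints nothing and returns non-zero if it is absent.
last_modified() {
    # Header names are case-insensitive; wget -S indents each header line.
    value=`grep -i '^ *Last-Modified:' | head -1 | sed -e 's/^[^:]*: *//' | tr -d '\r'`
    [ -n "$value" ] && echo "$value"
}

# Example use (wget -S writes the server response to stderr, and --spider
# requests the headers without downloading the file):
#   /usr/local/bin/wget -S --spider "$URL" 2>&1 | last_modified \
#       || echo "no Last-Modified header; wget -N will download again"
```

If the helper prints nothing for the dump URL, wget -N has no remote timestamp to compare against, which would match the daily downloads described above.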
Thanks for your help!
Stefan Kühn
http://de.wikipedia.org/wiki/User:Stefan_K%C3%BChn
_______________________________________________
Toolserver-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
