On 13.10.01, 19:36:23, Tim Kynerd wrote:
> Hi,
> 
> I have been running a plucker-build script that plucks news from a couple of
> sites.  However, this script generally takes 35-40 minutes (!) to run, and
> since I live in Sweden and pay for even local phone calls by the minute, I'd
> like to shorten this.
> 
> Just for the heck of it, I've just installed wwwoffle, which caches HTML
> documents from the Web, and played with it a little bit.  I can easily get
> it to download and cache the documents I'd like to pluck (which should take
> less time than downloading *and* parsing them, right?) -- but it stores them
> in a hashed form in special directories, and they're only accessible through
> a proxy server on my local machine.

> 
> Is there any way to make plucker use this proxy server?  I checked the docs
> and tried setting up a .pluckerrc file with the "http_proxy=" option in the
> [DEFAULT] section, but when I try to pluck a document that's in the wwwoffle
> cache, the system still brings up the Internet connection, indicating that
> plucker isn't trying to use the proxy server to access that document.

It seems that (on Unix, at least) the Plucker Python scripts honour
the "http_proxy" and "ftp_proxy" environment variables. Just set them
to "http://name.of.your.proxy:port/" (in my case it's
"http://wwwproxy:3128/", using a private squid proxy on the local
network).

Set the environment variable in the shell from which you start the
plucker scripts. On my system it definitely queries the proxy, as I
can see from the logs. Also note that the http_proxy variable's
content needs to be in the form of a URL, preferably with a port
number.
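
As a rough illustration (not taken from the Plucker sources, so treat
it as an assumption about the mechanism), this is how the urllib
module in current Python versions picks a proxy up from the
environment. The helper name proxy_for is made up for this sketch, and
the URL is just the example value from above; for a local wwwoffle
you would substitute its own host and port.

```python
import os
import urllib.request

# Example value only: the squid proxy mentioned above.
# Normally you would export this in the shell instead of setting it here.
os.environ["http_proxy"] = "http://wwwproxy:3128/"


def proxy_for(scheme):
    """Return the proxy URL the environment defines for a scheme, or None."""
    # getproxies_environment() scans variables named <scheme>_proxy,
    # e.g. http_proxy and ftp_proxy, and maps scheme -> proxy URL.
    return urllib.request.getproxies_environment().get(scheme)


print(proxy_for("http"))
```

If this prints None in the shell you start the scripts from, the
variable is not set or not exported there.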

> 
> Or can anyone suggest some other intelligent way to do what I need to do?
> Any help is welcome.
> 
> Regards,
> Tim Kynerd
> 
> Sunrise in Stockholm today:  7:20
> Sunset in Stockholm today:  17:47
> My rail transit photos at http://www.kynerd.nu
> 

-- 
Bernd Sieker

NetBSD - the cathedral versus the bizarre.
                -- Julian Assange
