On Thu, Jun 18, 2009 at 01:54:48PM +0100, Francis Davey wrote:
> There's lots I'd like (such as auto splitting of PDFs)

I too think this would be a great usability feature.

parliament.uk makes this easier because downloading all PDFs to
whattheyclaimedfor.com's server is as simple as running:

wget --mirror --no-parent \
    http://mpsallowances.parliament.uk/mpslordsandoffices/hocallowances/allowances-by-mp/

(though it looks to be 6GB+ total)

Then (as has been suggested), ImageMagick can be used to split the PDFs into
PNGs:

convert Gerry_Adams_0708_ACA.pdf Gerry_Adams_0708_ACA.png

This results in one PNG per page, numbered from zero
(Gerry_Adams_0708_ACA-5.png being the 6th page, for example).

It shouldn't be hard to write a script that creates PNGs for each PDF such that
the PNGs are sanely structured. I can do it, though not in the next two days,
and I'll need to know whattheyclaimedfor.com's preferred directory structure
first.
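
Something along these lines might serve as a starting point. It is only a
rough sketch: the layout (one directory per PDF, named after the PDF, under an
output root) is a placeholder assumption until the preferred structure is
known, and the function names and the "page" file prefix are made up for
illustration.

```shell
#!/bin/sh
# Sketch only: the output layout below is a placeholder assumption,
# not whattheyclaimedfor.com's actual directory structure.

# Print the directory that should hold the PNGs for one PDF:
#   <out_root>/<PDF file name without .pdf>
png_dir_for() {
    printf '%s/%s\n' "$2" "$(basename "$1" .pdf)"
}

# Split one PDF into per-page PNGs inside its own directory.
split_pdf() {
    dir=$(png_dir_for "$1" "$2")
    mkdir -p "$dir" || return 1
    # For a multi-page PDF, ImageMagick writes page-0.png, page-1.png, ...
    convert "$1" "$dir/page.png"
}

# Split every PDF found under the mirror directory.
split_all() {
    find "$1" -type f -name '*.pdf' | while IFS= read -r pdf; do
        split_pdf "$pdf" "$2" || echo "failed: $pdf" >&2
    done
}

# Run only when both directories are given on the command line.
if [ "$#" -eq 2 ]; then
    split_all "$1" "$2"
fi
```

Saved as, say, split-pdfs.sh (a hypothetical name), it could be run against
the directory that wget creates for the mirrored host:

sh split-pdfs.sh mpsallowances.parliament.uk pngs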

-- 
Tom

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
