Re: [backstage-developer] Nice bit of crowd-sourcing

Dillon Farmer Fri, 19 Jun 2009 09:53:53 -0700

John O'Donovan wrote:

The Telegraph are (allegedly) producing a large supplement explaining some of the _redacted details_... I can neither confirm nor deny this information.

Cheers,


*John O'Donovan*
Chief Technical Architect

*BBC Future Media & Technology (Journalism)*
BC3 C1, Broadcast Centre, 201 Wood Lane, London

http://news.bbc.co.uk/
http://news.bbc.co.uk/sport/
http://news.bbc.co.uk/weather/

------------------------------------------------------------------------

*From:* [email protected] [mailto:[email protected]] *On Behalf Of* Alex Mace

*Sent:* 19 June 2009 16:30
*To:* [email protected]
*Subject:* Re: [backstage-developer] Nice bit of crowd-sourcing

I presume someone has tried the old trick of extracting the pictures from the PDF to make sure they didn't just apply the redaction over the top of the pictures? Sounds unlikely, but it's been done before...


On 19 Jun 2009, at 16:23, Brian Butterworth wrote:

2009/6/19 John O'Donovan <[email protected]>


    We did something similar, but it's the low tech version... :o)

    http://news.bbc.co.uk/1/hi/uk_politics/8106044.stm

    with things people have found being published here...


    http://news.bbc.co.uk/1/hi/uk_politics/8106650.stm

    I like what the Guardian have done with this - been playing with
    it...


After an hour I start wondering about using OCR software...

Does anyone know of a command line OCR tool that I could use? Something that works with PHP perhaps?

It seems very easy to get at the images from the http://mps-expenses.guardian.co.uk/page/X/ pages (for example http://mps-expenses.guardian.co.uk/page/194884/), as there is only one image in the whole document. Running OCR will generate lots of crud, but it could be matched against the human input to act as validation.
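
A minimal sketch of the validation idea above, written in Python rather than PHP and not taken from anyone on the list. It assumes the OCR text for a page has already been produced by some command line OCR tool and is simply passed in as a string. The function names `normalise` and `ocr_supports`, the 0.75 threshold, and the sample strings are all hypothetical. It uses fuzzy word matching from the standard library so that typical OCR slips (a zero for an "o") still count as agreement.

```python
import difflib
import re


def normalise(text):
    """Lower-case the text and collapse anything that is not a letter or digit to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ocr_supports(human_entry, ocr_text, threshold=0.75):
    """Return True if every word typed by the human has a close match in the OCR text.

    threshold is an assumed similarity cutoff (0..1) and would need tuning
    against real scans.
    """
    human_words = normalise(human_entry).split()
    ocr_words = normalise(ocr_text).split()
    if not human_words:
        return False  # nothing to validate
    for word in human_words:
        # get_close_matches returns the OCR words whose similarity ratio is >= cutoff
        if not difflib.get_close_matches(word, ocr_words, n=1, cutoff=threshold):
            return False
    return True


if __name__ == "__main__":
    noisy_ocr = "h0tel inv0ice 120 ~~"  # made-up OCR output with typical crud
    print(ocr_supports("hotel invoice 120", noisy_ocr))
    print(ocr_supports("taxi receipt 45", noisy_ocr))
```

Used this way the OCR output never has to be clean: it only has to contain something close to what the volunteer typed, and an entry with no support in the scan can be flagged for a second look.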


    Cheers,

    *John O'Donovan*
    Chief Technical Architect

    *BBC Future Media & Technology (Journalism)*
    BC3 C1, Broadcast Centre, 201 Wood Lane, London

    http://news.bbc.co.uk/
    http://news.bbc.co.uk/sport/
    http://news.bbc.co.uk/weather/

    ------------------------------------------------------------------------
    *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Brian Butterworth
    *Sent:* 19 June 2009 09:46
    *To:* [email protected]
    *Subject:* [backstage-developer] Nice bit of crowd-sourcing

    Nice bit of crowd-sourcing I think here:

    http://mps-expenses.guardian.co.uk/

    Shame the app's not AJAX, it would have been easier to use that way, but generally it's a great way of checking 77252 pages of documents.

    Kind-of-wondering why Auntie didn't do it first, but...

    All the best

    Brian Butterworth




--

Brian Butterworth

follow me on twitter: http://twitter.com/briantist

web: http://www.ukfree.tv - independent digital television and switchover advice, since 2002

It's so funny to see steganography in action in this context. BTW, is steganography the correct term for this?


Cheers,

Dillon
