Does anyone have any suggestions on this issue raised on the Wikisource-L
mailing list?


---------- Forwarded message ----------
From: Bodhisattwa Mandal
Date: Sat, Feb 20, 2016 at 4:02 AM
Subject: Re: [Wikisource-l] Vote for Google OCR-Wikisource integration
in 2015 community wishlist
To: "discussion list for Wikisource, the free library"
<wikisourc...@lists.wikimedia.org>


Hi,

The OCR4Wikisource script is evolving rapidly. More than 1,50,000 pages
have already been OCRed on both Tamil and Bengali Wikisource using it.
The idea and the tool have proved to be a game-changer for Indic
Wikisource projects.

And just when we were getting some hope, Google announced that they will
charge for doing OCR using Google Drive.
https://cloud.google.com/vision/

Is there any chance that WMF will negotiate with Google so that we can
do the mass OCR free of charge? I remember Asaf once said that this
possibility could be pursued. I think now is the time to do that.

Regards,

Yeah!
I'm really happy that the BUB tool is being revived, and about the new
OCR script. Thanks, everyone!

Aubrey

On Tue, Jan 5, 2016 at 9:53 PM, Asaf Bartov <abar...@wikimedia.org> wrote:
>
> On Tue, Jan 5, 2016 at 10:29 AM, Bodhisattwa Mandal 
> <bodhisattwa.rg...@gmail.com> wrote:
>>
>> Hi,
>>
>> I am happy to report that Shrinivasan has created a Python script to 
>> automate the process on Linux systems. The script uploads the PDF files to 
>> Google Drive, downloads the OCRed text, and splits and merges the text files 
>> so that they correspond to the PDF file. We have just tested the script on 
>> small files on Kannada and Bengali Wikisource, and it was successful. We are 
>> going to test the script with different types and sizes of files, and in 
>> other Indic languages, in the next few days.
>>
>> The script is at https://github.com/tshrinivasan/OCR4wikisource
>
>
> Fantastic news!
>
>    A.
>
>
> _______________________________________________
> Wikisource-l mailing list
> wikisourc...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>



_______________________________________________
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
<mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
