On Wed, Aug 21, 2013 at 11:23 AM, Tejaswini Niranjana t...@cscs.res.in wrote:
Colleagues working in Bangla say that in their experience it is faster,
cheaper, and less error-prone to create digital texts by typing them in.
"Cheaper" is an interesting word to use in this context. Are we still
@Sumana Harihareswara
Please look at the Bengali OCR project at https://code.google.com/p/banglaocr/
and its need for further development.
On Mon, Aug 19, 2013 at 10:12 PM, Sumana Harihareswara
suma...@wikimedia.org wrote:
On 08/19/2013 02:52 AM, L. Shyamal wrote:
Re-posting a now outdated query from meta
Hi Everyone,
In my opinion, it is always better to OCR the documents. I agree that
it is error-prone, but there is a Google Summer of Code project by
AnkurIndia that aims to improve the quality of OCR for Indian scripts.
Colleagues working in Bangla say that in their experience it is faster,
cheaper, and less error-prone to create digital texts by typing them in.
Once there is a larger body of digitised texts, and OCR technology for
Indian languages also improves, OCR could become the preferred option.
Tejaswini
Re-posting a now outdated query from meta
http://meta.wikimedia.org/wiki/Talk:India_Access_To_Knowledge/Events/Bangalore/Digitization_workshop_18August2013
Now that the workshop has been conducted, I think those who attended
could comment on whether it covered Indic language
On Mon, Aug 19, 2013 at 12:22 PM, L. Shyamal lshya...@gmail.com wrote:
Re-posting a now outdated query from meta
http://meta.wikimedia.org/wiki/Talk:India_Access_To_Knowledge/Events/Bangalore/Digitization_workshop_18August2013
The phrase "creating text based documents" which forms the basis of
Thank you Subhashish for the response at:
http://meta.wikimedia.org/wiki/Talk:India_Access_To_Knowledge/Events/Bangalore/Digitization_workshop_18August2013
Dear Shyamal, this workshop was meant to demonstrate to participants how to
create a home-made setup to scan books, edit the scanned images
Whether to OCR or not to OCR is a significant issue! When we OCR a page of
text, the result is often full of errors and loses its formatting, and
fixing it requires crowd-sourced correction. Many of us know about Project
Gutenberg. The site provides plain vanilla etexts. But what most people do
not
A brief account of the workshop from a participant.
The workshop was meant to show DIY digitization: rather than investing in a
scanner, one uses a simple digital camera for effective digitization of
books and documents.
The following were covered during the Workshop by Viswaprabha, who mainly
I am still traveling, in Bangalore and elsewhere, seeking out sources and
opportunities for old texts and other media for Wikimedia Commons from
various locations in India. I think different people look at the issue
through different layers and perspectives. This might call for a detailed
write-up on
On Mon, Aug 19, 2013 at 10:12 PM, Sumana Harihareswara
suma...@wikimedia.org wrote:
Is there a central list of the problems that OCR software (especially
open source OCR software) has with text written in Indic languages? If
so, I could help encourage people to fix those problems, as