What's the system they're using for corrections? Have they built a web application, or are they fixing the text in something like Word?

On 2011-12-29 12:54, Roger Loran Bailey wrote:
> As I said, Bookshare uses human proofreaders. The books are scanned and,
> of course, there are scanning errors. Then human volunteers download the
> scanned copy and proofread it, correcting the errors. The corrected copy
> is then uploaded and that is what goes into the Bookshare collection.
>
> On 12/29/2011 3:50 PM, Edward Betts wrote:
>> Does the Bookshare correction software maintain word page coordinates?
>>
>> On 2011-12-29 12:23, Roger Loran Bailey wrote:
>>> I have an idea. It might run into a problem with copyright issues, but I
>>> am not sure because I think it might come under the provisions in the
>>> copyright act that cover preparing books for use by the print impaired.
>>> I am not sure what it is, but I think Open Library has a relationship
>>> with Bookshare. Bookshare has human volunteers who proofread scans of
>>> books. I am one of them. Might it be possible for Open Library and
>>> Bookshare to share scanned books? That is, copies of books that are held
>>> by Bookshare could be turned over to Open Library to be posted as
>>> protected Daisy books and the scans that come from the Internet Archive
>>> could be turned over to Bookshare to be proofread by Bookshare
>>> volunteers. Then after they have been proofread and enter the Bookshare
>>> collection they could be copied and returned to Open Library to be
>>> posted as better copies of what was there before. Am I just fantasizing
>>> or might something like this be possible?
>>>
>>> On 12/29/2011 3:03 PM, Edward Betts wrote:
>>>> We don't currently have a system for recording the quality of the OCR or
>>>> correcting mistakes.
>>>>
>>>> As you point out, the OCR doesn't properly handle blackletter type.
>>>>
>>>> A system for correcting OCR is often requested. Conceptually it is quite
>>>> simple: just a web page that shows the page image and a way to edit the
>>>> text. We are keen to maintain page coordinate information for each word so
>>>> that we can highlight words in the book reader and search inside. This
>>>> makes the problem more difficult.
>>>>
>>>> We would like to build a correction system, but we don't have the
>>>> resources.
_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to
[email protected]
