So it doesn't match our requirements: it doesn't maintain the page coordinates of each word in the scanned image.

On 2011-12-29 16:37, Roger Loran Bailey wrote:
> Most of the Bookshare volunteers do use Word.
>
> On 12/29/2011 7:10 PM, Edward Betts wrote:
>> What's the system they're using for corrections? Have they built a web
>> application, or are they fixing the text in something like Word?
>>
>> On 2011-12-29 12:54, Roger Loran Bailey wrote:
>>> As I said, Bookshare uses human proofreaders. The books are scanned and,
>>> of course, there are scanning errors. Then human volunteers download the
>>> scanned copy and proofread it correcting the errors. The corrected copy
>>> is then uploaded and that is what goes into the Bookshare collection.
>>>
>>> On 12/29/2011 3:50 PM, Edward Betts wrote:
>>>> Does the bookshare correction software maintain word page coordinates?
>>>>
>>>> On 2011-12-29 12:23, Roger Loran Bailey wrote:
>>>>> I have an idea. It might run into a problem with copyright issues, but I
>>>>> am not sure because I think it might come under the provisions in the
>>>>> copyright act that covers preparing books for use by the print impaired.
>>>>> I am not sure what it is, but I think Open Library has a relationship
>>>>> with Bookshare. Bookshare has human volunteers who proofread scans of
>>>>> books. I am one of them. Might it be possible for Open Library and
>>>>> Bookshare to share scanned books? That is, copies of books that are held
>>>>> by Bookshare could be turned over to Open Library to be posted as
>>>>> protected Daisy books and the scans that come from the Internet Archive
>>>>> could be turned over to Bookshare to be proofread by Bookshare
>>>>> volunteers. Then after they have been proofread and enter the Bookshare
>>>>> collection they could be copied and returned to Open Library to be
>>>>> posted as better copies of what was there before. Am I just fantasizing
>>>>> or might something like this be possible?
>>>>>
>>>>> On 12/29/2011 3:03 PM, Edward Betts wrote:
>>>>>> We don't currently have a system for recording the quality of the OCR or
>>>>>> correcting mistakes.
>>>>>>
>>>>>> As you point out the OCR doesn't properly handle blackletter type.
>>>>>>
>>>>>> A system for correcting OCR is often requested, conceptually it is quite
>>>>>> simple. Just a web page that shows the page image and a way to edit the
>>>>>> text. We are keen to maintain page coordinate information for each word so
>>>>>> that we can highlight words in the book reader and search inside. This
>>>>>> makes the problem more difficult.
>>>>>>
>>>>>> We would like to build a correction system, but we don't have the
>>>>>> resources.

_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to
[email protected]
