Re: [ol-discuss] Ol-discuss Digest, Vol 47, Issue 5

AMBACHEW GEBEREMARIAME Tue, 14 Jun 2011 01:15:12 -0700

Thank you too!

On Mon, Jun 13, 2011 at 10:00 PM, <[email protected]> wrote:


> Send Ol-discuss mailing list submissions to
>        [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
> or, via email, send a message with subject or body 'help' to
>        [email protected]
>
> You can reach the person managing the list at
>        [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Ol-discuss digest..."
>
>
> Today's Topics:
>
>   1. Re: ol.org book reader (Lars Aronsson)
>   2. Re: ol.org book reader (Michael Ang)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 13 Jun 2011 13:10:03 +0200
> From: Lars Aronsson <[email protected]>
> Subject: Re: [ol-discuss] ol.org book reader
> To: Open Library -- general discussion <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 06/12/2011 04:14 PM, Karen Coyle wrote:
> > This doesn't answer your exact question, but the full text of the
> > digitized books is crawled. You can see this by doing a Google search
> > like:
> >
> > LOUISIANA SCOTT SHUMAN site:archive.org
> >
> > That's a very artificial search, but it gives you the idea. This isn't
> > related to the book reader but to the stored full text on the Internet
> > Archive.
>
> Exactly, that's my point: it "isn't related to the book reader",
> but I think it should be. It gives hits in ..._djvu.txt, but Google
> doesn't lead me to the right page.
>
>
>
> --
>   Lars Aronsson ([email protected])
>   Project Runeberg - free Nordic literature - http://runeberg.org/
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 13 Jun 2011 11:34:15 -0700
> From: Michael Ang <[email protected]>
> Subject: Re: [ol-discuss] ol.org book reader
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> We had the idea to have the OCR text on separate URLs by page (or
> similar) to improve search accessibility a few years ago and we may yet
> get there.  We're working on having the OCR text available for reading
> and correction (may not immediately be integrated with the BookReader).
>
> For the BookReader I might go with the new #! url fragments that are
> designed to allow web apps to dynamically update the url while still
> being accessible to search engines.
> http://code.google.com/web/ajaxcrawling/docs/specification.html
>
>   - mang
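
As a rough illustration of the #! scheme linked above: under that specification, a crawler that sees a "#!" fragment requests the same URL with the fragment moved into an `_escaped_fragment_` query parameter, and the server answers that request with a static snapshot of the page. The sketch below only shows that URL mapping. The function name `to_crawler_url`, the example.org URLs, and the simplified escaping (everything except unreserved characters, "/" and "=" is percent-encoded) are assumptions made for illustration, not BookReader code.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_crawler_url(url):
    """Map a '#!' URL to the '_escaped_fragment_' form a crawler would request.

    Hypothetical helper; URLs without a '#!' fragment are returned unchanged.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url
    # Simplified escaping: keep '/' and '=' readable, percent-encode the rest
    # (including '&', '#', '%' and '+').
    escaped = quote(parts.fragment[1:], safe="/=")
    token = "_escaped_fragment_=" + escaped
    # Append to an existing query string if there is one.
    query = parts.query + "&" + token if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    # Hypothetical reader URL with a per-page fragment.
    print(to_crawler_url("http://example.org/stream/book#!page/n5/mode/1up"))
```

The server side would then need to recognise `_escaped_fragment_` and return the OCR text of the requested page, which is the part that makes each page indexable.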
>
> On 6/11/11 7:49 PM, Lars Aronsson wrote:
> > Reading my own question again, I understand I didn't phrase it
> > very well:
> >> Can this be combined with making the text searchable
> >> by web search engines, like plain web pages?
> > Here's what I envision, and my question is if you have
> > any plans going in this direction:
> >
> > In the bookreader, one should not only be able to zoom
> > in and out or to activate the sound playback, but also to
> > view the OCR text and proofread the OCR text (like a
> > wiki page). To a search engine spider, only the view text
> > option should be available, and the buttons for previous
> > and next page should be plain links, so the text of each
> > page gets indexed under the right page URL.
> >
> > The way I would want the bookreader to appear to a
> > search spider is the way my existing website looks,
> > this example being the first page of Hamlet, in the
> > Swedish translation of 1861,
> > http://runeberg.org/hagberg/a/0183.html
> > Here is the scanned book page, and you can scroll
> > down to the OCR text below.
> >
> > If you google the role names "Voltimand, Cornelius,
> > Rosenkranz, Gyldenstern", you will see that it
> > is indexed by Google at this very URL. (English and
> > German editions spell the names a little differently.)
> >
> > I'd like to use the bookreader with its soft scrolling
> > and book page flipping for humans, but I don't
> > want to give up the direct per page indexing by
> > Google and other search engines. So, can the
> > two be combined? Did anybody try this?
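
The spider-facing view described above (OCR text only, with previous and next as plain links, one URL per page) might be sketched roughly as follows. This is a minimal sketch under stated assumptions: the URL pattern in `page_url`, the function names, and the HTML layout are all hypothetical and do not describe how the BookReader or runeberg.org actually work.

```python
from html import escape


def page_url(book_id, page):
    # Hypothetical per-page URL pattern; one stable URL per scanned page.
    return f"/{book_id}/page/{page}.html"


def render_page(book_id, page, total, ocr_text):
    """Return a plain HTML view of one page: OCR text plus prev/next links."""
    links = []
    if page > 1:
        links.append(f'<a rel="prev" href="{page_url(book_id, page - 1)}">Previous page</a>')
    if page < total:
        links.append(f'<a rel="next" href="{page_url(book_id, page + 1)}">Next page</a>')
    return (
        f"<html><head><title>{escape(book_id)}, page {page}</title></head><body>\n"
        f"<p>{' | '.join(links)}</p>\n"
        # Escape the OCR text so stray '<' or '&' cannot break the markup.
        f"<pre>{escape(ocr_text)}</pre>\n"
        "</body></html>"
    )


if __name__ == "__main__":
    print(render_page("hamlet", 1, 3, "Voltimand, Cornelius, Rosenkranz & Gyldenstern"))
```

Because the navigation is ordinary anchor links rather than script-driven page flipping, a crawler can follow them and index the text of each page under its own URL.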
> >
> >
>
>
>
> ------------------------------
>
> _______________________________________________
> Ol-discuss mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
> To unsubscribe from this mailing list, send email to
> [email protected]
>
> End of Ol-discuss Digest, Vol 47, Issue 5
> *****************************************
>
