Re: [ol-tech] Full text search and some code clarification

Ankush Wed, 25 Jun 2014 08:17:14 -0700

Hey Guys,

I am able to use the schema in solr-biblio/inside/ for the one called by
BookReader. However, I feel the schema is a bit outdated. Are you using
Solr 1.4.1 here?


I have made my setup with 1.4.1; however, I am running into some problems
here. For any search query, Solr isn't highlighting all the matched
occurrences. Only matches that fall within approximately the first 54K of
the field are being highlighted.

This is the stackoverflow question I have raised for the query -
http://stackoverflow.com/questions/24364900/show-all-occurrences-of-query-while-highlighting-in-solr-1-4
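
For context, here is a minimal sketch of the kind of highlighting request involved. It is an illustration under assumptions, not my exact setup: the host, the core path, and the helper name build_highlight_url are placeholders, and the field name body is taken from the indexing script. hl.snippets and hl.maxAnalyzedChars are standard Solr highlighting parameters; the latter caps how many characters of a field are analyzed for highlighting (default 51200), so it may be related to the cutoff described above.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field="body", max_analyzed_chars=51200):
    """Build a Solr select URL with highlighting enabled (hypothetical helper)."""
    params = {
        "q": query,
        "hl": "true",                # turn highlighting on
        "hl.fl": field,              # field to highlight (assumed name)
        "hl.snippets": 1000,         # allow many snippets per document
        "hl.maxAnalyzedChars": max_analyzed_chars,  # chars analyzed per field
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder host and core path; adjust to the local Solr install.
    print(build_highlight_url(
        "http://localhost:8983/solr/inside/select",
        "body:whale",
        max_analyzed_chars=1000000,
    ))
```

Raising max_analyzed_chars above the field length is the knob I would try first, though I have not confirmed that it resolves the problem.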

I know this is a Solr question, but any clarification on this would be
immensely helpful. For instance, if I am using the incorrect schema, please
direct me to the correct one.

Ankush Chadda
about.me/iamkhush


On Mon, Jun 16, 2014 at 9:40 AM, Anand Chitipothu <[email protected]> wrote:

> On 16-Jun-2014, at 12:49 AM, Ankush wrote:
>
> Hey,
>
> I am trying to implement fulltext search on my website, which uses the
> openlibrary framework. I don't have prior experience with Solr. Can you help
> me clear my doubts -
>
>
>    - The schema that you currently use for fulltext search is the inside
>    core of solr-biblio (
>    https://github.com/internetarchive/openlibrary/tree/master/conf/solr-biblio
>    ).
>
> No, it is used for searching work records in openlibrary, not fulltext
> search.
>
>  http://openlibrary.org/search
>
>
>    - Is solr-biblio used for all the searches on website?
>
> No, fulltext search uses a completely different Solr instance with a
> different schema.
>
>
>    - Now in order to index the books, I saw the script index_all.py (
>    https://github.com/internetarchive/openlibrary/tree/master/openlibrary/solr/inside/index_all.py
>    ). This script makes a request to fulltext/abbyy_to_text.php, gets
>    page_count and body, and uses them to index. Now abbyy_to_text.php is in
>    the BookReaderIA dir and uses extract_paragraphs.py to return the data.
>    What I cannot understand is that extract_paragraphs.py prints page_count
>    in 'meta:...' (
>    https://github.com/openlibrary/bookreader/blob/master/BookReaderIA/fulltext/extract_paragraphs.py#L155
>    ), but abbyy_to_text.php is trying to fetch a string 'page count' from the
>    data (
>    https://github.com/internetarchive/openlibrary/blob/master/openlibrary/solr/inside/index_all.py#L130
>    ). How is this working on your end?
>
> It is not in my head right now (I'm not the one who implemented it). I'll
> look at how it works and let you know.
>
> Anand
>
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> Archives: http://www.mail-archive.com/[email protected]/
> To unsubscribe from this mailing list, send email to
> [email protected]
>
