Re: [ol-tech] Full text search and some code clarification

Anand Chitipothu Sun, 15 Jun 2014 21:11:16 -0700

On 16-Jun-2014, at 12:49 AM, Ankush wrote:

> Hey,
> 
> I am trying to implement fulltext search on my website, which uses the
> openlibrary framework. I don't have prior experience with solr. Can you help me
> clear my doubts - 
> 
> Is the schema that you currently use for fulltext search the inside core of
> solr-biblio
> (https://github.com/internetarchive/openlibrary/tree/master/conf/solr-biblio)?
>  
No, it is used for searching work records in openlibrary, not fulltext search:

http://openlibrary.org/search

> Is solr-biblio used for all the searches on the website?
No, fulltext search uses a completely different solr instance with a different
schema.
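
To make the split concrete, here is a minimal sketch of sending the two kinds of query to two separate solr instances. The hosts, ports, core names and the `title` field are assumptions made up for illustration, not the actual Open Library setup; only the `body` field name comes from the question below, and `/select` with `q`, `rows` and `wt` is the standard solr query handler.

```python
from urllib.parse import urlencode

# Hypothetical base URLs: the real hosts and core names are not given in this thread.
WORKS_SOLR = "http://localhost:8983/solr/works"
FULLTEXT_SOLR = "http://localhost:8984/solr/inside"


def select_url(base_url, query, rows=10):
    """Build a solr /select URL for one query against one instance."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/select?{params}"


# Each search type goes to its own instance, and each instance has its own
# schema, so the field names in the query differ as well.
works_url = select_url(WORKS_SOLR, "title:hamlet")      # "title" is an assumed field
fulltext_url = select_url(FULLTEXT_SOLR, "body:hamlet")  # "body" as in the question
```

Because the schemas differ, a query written for one instance will generally not work against the other without changing the field names.
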
> Now in order to index the books, I saw the script index_all.py
> (https://github.com/internetarchive/openlibrary/tree/master/openlibrary/solr/inside/index_all.py).
> This script hits fulltext/abbyy_to_text.php, gets page_count and body,
> and uses them to index. Now abbyy_to_text.php is in the BookReaderIA dir, and
> it uses extract_paragraphs.py to return the data. What I cannot understand is
> that extract_paragraphs.py prints page_count in 'meta:...'
> (https://github.com/openlibrary/bookreader/blob/master/BookReaderIA/fulltext/extract_paragraphs.py#L155),
> but abbyy_to_text.php is trying to fetch a string 'page count' from the
> data
> (https://github.com/internetarchive/openlibrary/blob/master/openlibrary/solr/inside/index_all.py#L130).
> How is this working on your end?
I don't remember offhand (I'm not the one who implemented it). I'll look
at how it works and let you know.

Anand

