
                 Key: SOLR-380
                 URL: https://issues.apache.org/jira/browse/SOLR-380
             Project: Solr
          Issue Type: New Feature
          Components: search
            Reporter: Tricia Williams
            Priority: Minor


"Paged-Text" FieldType for Solr
> 
> A chance to dig into the guts of Solr. The problem: If we index a
> monograph in Solr, there's no way to convert search results into
> page-level hits. The solution: have a "paged-text" fieldtype which keeps
> track of page divisions as it indexes, and reports page-level hits in the
> search results.
> 
> The input would contain page milestones: <page id="234"/>. As Solr
> processed the tokens (using its standard tokenizers and filters), it would
> concurrently build a structural map of the item, indicating which term
> position marked the beginning of which page: <page id="234"
> firstterm="14324"/>. This map would be stored in an unindexed field in
> some efficient format.
> 
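
The index-time step described above can be sketched roughly as follows. This is an illustration only, not Solr code: the function name build_page_map is hypothetical, and plain whitespace splitting stands in for Solr's real tokenizers and filters. It records, for each page milestone, the position of the first term that follows it.

```python
import re

# Matches page milestones such as <page id="234"/> in the input text.
PAGE_MARK = re.compile(r'<page\s+id="([^"]+)"\s*/>')


def build_page_map(text):
    """Return (tokens, page_map).

    page_map is a list of (first_term_position, page_id) pairs in document
    order. Whitespace splitting is an assumption standing in for Solr's
    analysis chain.
    """
    tokens = []
    page_map = []
    cursor = 0
    for match in PAGE_MARK.finditer(text):
        # Tokenize the text that precedes this milestone.
        tokens.extend(text[cursor:match.start()].split())
        # The next token emitted will be the first term on this page.
        page_map.append((len(tokens), match.group(1)))
        cursor = match.end()
    # Tokenize whatever follows the last milestone.
    tokens.extend(text[cursor:].split())
    return tokens, page_map


sample = ('<page id="1"/>the quick brown fox '
          '<page id="2"/>jumps over '
          '<page id="3"/>the lazy dog')
tokens, page_map = build_page_map(sample)
for first_term, page_id in page_map:
    print('<page id="%s" firstterm="%d"/>' % (page_id, first_term))
```

Keeping the map sorted by first term position is a design choice that makes the search-time lookup a simple binary search.
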
> At search time, Solr would retrieve term positions for all hits that are
> returned in the current request, and use the stored map to determine page
> ids for each term position. The results would imitate the results for
> highlighting, something like:
> 
> <lst name="pages">
>         <lst name="doc1">
>                 <int name="pageid">234</int>
>                 <int name="pageid">236</int>
>         </lst>
>         <lst name="doc2">
>                 <int name="pageid">19</int>
>         </lst>
> </lst>
> <lst name="hitpos">
>         <lst name="doc1">
>                 <lst name="234">
>                         <int name="pos">14325</int>
>                 </lst>
>         </lst>
>         ...
> </lst>
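
The search-time lookup could work along these lines. Again this is only a sketch under assumptions: the helper names are hypothetical, the map is assumed to be a list of (first term position, page id) pairs sorted by position, and every map entry other than page 234 at position 14324 is invented for illustration. A binary search finds the last page whose first term is at or before the hit position.

```python
from bisect import bisect_right


def page_for_position(page_map, position):
    """Return the id of the page containing a term position, or None.

    page_map must be sorted by first term position.
    """
    first_terms = [first for first, _ in page_map]
    # Index of the last page whose first term is <= position.
    index = bisect_right(first_terms, position) - 1
    return page_map[index][1] if index >= 0 else None


def pages_for_hits(page_map, positions):
    """Group hit positions by page id, similar to the "hitpos" list."""
    hits = {}
    for position in sorted(positions):
        page_id = page_for_position(page_map, position)
        if page_id is not None:
            hits.setdefault(page_id, []).append(position)
    return hits


# Hypothetical map; only page 234 / position 14324 comes from the issue text.
page_map = [(14100, "233"), (14324, "234"), (14610, "235"), (14890, "236")]
print(pages_for_hits(page_map, [14900, 14325]))
```

A position that falls before the first recorded page is dropped here; a real implementation would need to decide how to report such hits.
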

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
