Re: question from a beginner

2011-11-07 Thread Chris Hostetter

: So for example, if searching on "Santa Clara"  I would like to display all
: sections/paragraphs where "Santa Clara" occurs in the document. 

can you clarify what you mean by "display" and how you intend to use that 
info?

it may be obvious to you what you mean by "display" but depending on the 
answer there are different approaches to take.

for example: you may be able to tune highlighting to show you all of the 
"snippets" of each document where the search matches, but the results will 
still just contain one "document" for your file, so things like facet 
counts would still only return "1" doc per file -- and if you want a UI 
that lets the user "click" on each match, you would still only have one 
file to return to them for all of the snippets in each document.
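
A minimal sketch of the highlighting option described above, not a definitive setup. It only builds the request URL. The base URL, the field name "content", and the snippet limits are assumptions for illustration; hl, hl.fl, hl.snippets and hl.fragsize are the standard highlighting parameters, and the highlighted field has to be stored in the schema for snippets to come back.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, phrase, field, max_snippets=20, fragsize=200):
    """Build a Solr select URL that asks for highlighted snippets of `field`."""
    params = [
        ("q", f'{field}:"{phrase}"'),         # phrase query against one field
        ("hl", "true"),                       # turn highlighting on
        ("hl.fl", field),                     # field to highlight (must be stored)
        ("hl.snippets", str(max_snippets)),   # max snippets returned per document
        ("hl.fragsize", str(fragsize)),       # approximate snippet size in characters
        ("wt", "json"),                       # response format
    ]
    return base_url + "?" + urlencode(params)


# Assumed values: a local example server and a stored field named "content".
url = build_highlight_query("http://localhost:8983/solr/select", "Santa Clara", "content")
print(url)
```

The response would still list one document per file. The snippets arrive in a separate highlighting section keyed by document id.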

alternately, you could split the Word file up into multiple files per 
section (or paragraph, or page -- whatever you want), and index them 
independently, and then each "matching document" would correspond to the 
section of the file that you've split up - so facet counts and stuff 
like that would correspond to your individual little files, and users 
could click on each result and your UI could return that micro-file, 
etc...
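
A rough sketch of the splitting option above, assuming the text has already been extracted from the Word file as plain text with blank lines between paragraphs. The function name and the field names "id", "source" and "content" are hypothetical and would need to match your own schema. Each paragraph gets its own unique id built from the file name and a running number, so no keyword has to be picked by hand.

```python
import re


def split_into_section_docs(source_id, text):
    """Turn one extracted text into one small document per paragraph."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        {
            "id": f"{source_id}#p{n}",  # unique key: file name plus paragraph number
            "source": source_id,        # lets the UI link back to the original file
            "content": paragraph,       # the text that gets searched and displayed
        }
        for n, paragraph in enumerate(paragraphs, start=1)
    ]


sample = "Clinic A, Santa Clara.\n\nClinic B, San Jose.\n\n\nSupport group, Santa Clara."
docs = split_into_section_docs("resources.doc", sample)
for doc in docs:
    print(doc["id"], "->", doc["content"])
```

Each dictionary would then be sent to solr as its own document, so a search for "Santa Clara" matches the two paragraphs that mention it rather than the file as a whole.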

bottom line: think through your entire use case, and how you want users to 
interact with the results, and then model your documents based on that.  
then figure out how to feed the data into solr to match your model.


-Hoss


RE: question from a beginner

2011-10-31 Thread Bent Jensen
Yes, that certainly crossed my mind, but I have no idea of how to do that.
Would I need to pick a unique keyword from every paragraph and use that for
the index?

-Original Message-
From: Michael Sokolov [mailto:soko...@ifactory.com] 
Sent: Monday, October 31, 2011 5:20 AM
To: solr-user@lucene.apache.org
Cc: Phil Scadden
Subject: Re: question from a beginner

You might also consider indexing each paragraph as a separate document 
if the documents are very large.

-Mike

On 10/30/2011 11:51 PM, Phil Scadden wrote:
> Look up highlighting. http://wiki.apache.org/solr/HighlightingParameters
>
>
> Notice: This email and any attachments are confidential. If received in
> error please destroy and immediately notify us. Do not copy or disclose the
> contents.
>



Re: question from a beginner

2011-10-31 Thread Michael Sokolov
You might also consider indexing each paragraph as a separate document 
if the documents are very large.


-Mike

On 10/30/2011 11:51 PM, Phil Scadden wrote:

Look up highlighting. http://wiki.apache.org/solr/HighlightingParameters







Re: question from a beginner

2011-10-30 Thread Phil Scadden
Look up highlighting. http://wiki.apache.org/solr/HighlightingParameters





question from a beginner

2011-10-30 Thread Bent Jensen
Not sure if this is appropriate for this list, but I will try anyway and
hope to get a few pointers.

 

I am trying to help a Rehabilitation Research Center set up a document
search on their website (as a volunteer). They have a Word document with a
lot of information about resources and contact places for rehab patients.

 

So for example, if searching on "Santa Clara"  I would like to display all
sections/paragraphs where "Santa Clara" occurs in the document. 

I am trying to use the solr example, and have indexed the document. However,
when I do a search on Santa Clara, I only get back fields such as
author, content_type, id, etc. Also, it only shows one search result per
file, even though the file has multiple occurrences of the search term.

 

If I use  q=*:*, then all the document text is returned.

 

I hope you can point me in the right direction.

 

thanks in advance

 

Ben