What are your actual highlighting requirements? You could try things like maxAnalyzedChars, requireFieldMatch, etc....
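As a rough sketch of what that could look like, the script below builds a select URL that adds hl.maxAnalyzedChars and hl.requireFieldMatch to the query from the original post. The hl. prefix is how these parameters are spelled on the wiki page; the field name body comes from the schema quoted further down, while the build_hl_url helper and the 10000-character cap are illustrative assumptions, not recommended values.

```shell
#!/bin/sh
# Sketch only: build a Solr select URL that limits how much work the
# highlighter has to do. Host, port and field name follow the thread;
# build_hl_url and the cap value are hypothetical, chosen for illustration.

SOLR_BASE="http://localhost:8983/solr"

build_hl_url() {
    # $1 = query term
    # $2 = maximum number of characters the highlighter should analyze
    printf '%s/select?q=%s&fl=id,score&hl=true&hl.fl=body&hl.snippets=5&hl.maxAnalyzedChars=%s&hl.requireFieldMatch=true\n' \
        "$SOLR_BASE" "$1" "$2"
}

# Print the URL; feed it to curl to actually run the query, e.g.
#   curl "$(build_hl_url testing 10000)"
build_hl_url testing 10000
```

Passing the printed URL to curl runs the query; comparing QTime for a few different caps should show whether the amount of analyzed text is what makes highlighting slow.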
http://wiki.apache.org/solr/HighlightingParameters has a good list, but
you've probably already seen that page....

Best
Erick

On Tue, Jun 29, 2010 at 9:11 PM, Peter Spam <ps...@mac.com> wrote:

> To follow up, I've found that my queries are very fast (even with &fq=),
> until I add &hl=true. What can I do to speed up highlighting? Should I
> consider injecting a line at a time, rather than the entire file as a field?
>
>
> -Pete
>
> On Jun 29, 2010, at 11:07 AM, Peter Spam wrote:
>
> > Thanks for everyone's help - I have this working now, but sometimes the
> > queries are incredibly slow!! For example, <int name="QTime">461360</int>.
> > Also, I had to bump up the min/max RAM size to 1GB/3.5GB for things to
> > inject without throwing heap memory errors. However, my data set is very
> > small! 36 text files, for a total of 113MB. (It will grow to many TB, but
> > for now, this is a test). The largest file is 34MB.
> >
> > Therefore, I'm sure I'm doing something wrong :-) Here's my config:
> >
> > ----------------------------------------------------------------------
> >
> > For the schema.xml, <types> is all default. For fields, here are the
> > only lines that aren't commented out:
> >
> > <field name="id" type="string" indexed="true" stored="true" required="true" />
> > <field name="body" type="text" indexed="true" stored="true" multiValued="true"/>
> > <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
> > <field name="build" type="string" indexed="true" stored="true" multiValued="false"/>
> > <field name="device" type="string" indexed="true" stored="true" multiValued="false"/>
> > <dynamicField name="*" type="ignored" multiValued="true" />
> >
> > ... then, for the rest:
> >
> > <uniqueKey>id</uniqueKey>
> >
> > <!-- field for the QueryParser to use when an explicit fieldname is absent -->
> > <defaultSearchField>body</defaultSearchField>
> >
> > <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
> > <solrQueryParser defaultOperator="AND"/>
> >
> > ----------------------------------------------------------------------
> >
> > Invoking: java -Xmx3584M -Xms1024M -jar start.jar
> >
> > ----------------------------------------------------------------------
> >
> > Injecting:
> >
> > #!/bin/sh
> >
> > J=0
> > for i in `find . -name \*.txt`; do
> >   (( J++ ))
> >   curl "http://localhost:8983/solr/update/extract?literal.id=doc$J&fmap.content=body" -F "myfi...@$i";
> > done;
> >
> > echo "------------- Committing"
> > curl "http://localhost:8983/solr/update/extract?commit=true"
> >
> > ----------------------------------------------------------------------
> >
> > Searching:
> >
> > http://localhost:8983/solr/select?q=testing&hl=true&fl=id,score&hl.snippets=5&hl.mergeContiguous=true
> >
> > -Pete
> >
> > On Jun 28, 2010, at 5:22 PM, Erick Erickson wrote:
> >
> >> try adding &hl.fl=text
> >> to specify your highlight field. I don't understand why you're only
> >> getting the ID field back though. Do note that the highlighting
> >> is after the docs, related by the ID.
> >>
> >> Try a (non highlighting) query of just * to verify that you're
> >> pointing at the index you think you are. It's possible that
> >> you've modified a different index with SolrJ than your web
> >> server is pointing at.
> >>
> >> Also, SOLR has no way of knowing you're modified your index
> >> with SolrJ, so it may not be automatically reopening an
> >> IndexReader so your recent changes may not be visible
> >> until you force the SOLR reader to reopen.
> >>
> >> HTH
> >> Erick
> >>
> >> On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam <ps...@mac.com> wrote:
> >>
> >>> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
> >>>
> >>>>> 1) I can get my docs in the index, but when I search, it
> >>>>> returns the entire document. I'd love to have it only
> >>>>> return the line (or two) around the search term.
> >>>>
> >>>> Solr can generate Google-like snippets as you describe.
> >>>> http://wiki.apache.org/solr/HighlightingParameters
> >>>
> >>> Here's how I commit my documents:
> >>>
> >>> J=0;
> >>> for i in `find . -name \*.txt`; do
> >>>   (( J++ ))
> >>>   curl "http://localhost:8983/solr/update/extract?literal.id=doc$J" -F "myfi...@$i";
> >>> done;
> >>>
> >>> echo "------------- Committing"
> >>> curl "http://localhost:8983/solr/update/extract?commit=true"
> >>>
> >>>
> >>> Then, I try to query using
> >>> http://localhost:8983/solr/select?rows=10&start=0&fl=*,score&hl=true&q=testing
> >>> but I only get back the document ID rather than the snippet:
> >>>
> >>> <doc>
> >>>   <float name="score">0.05030759</float>
> >>>   <arr name="content_type">
> >>>     <str>text/plain</str>
> >>>   </arr>
> >>>   <str name="id">doc16</str>
> >>> </doc>
> >>>
> >>> I'm using the schema.xml from the "lucid imagination: Indexing text and
> >>> html files" tutorial.
> >>>
> >>>
> >>>
> >>> -Pete