Thanks for everyone's help - I have this working now, but sometimes the queries are incredibly slow!! For example, <int name="QTime">461360</int> (QTime is reported in milliseconds, so that's over 7.5 minutes). Also, I had to bump up the min/max RAM size to 1GB/3.5GB for things to inject without throwing heap memory errors. However, my data set is very small! 36 text files, for a total of 113MB. (It will grow to many TB, but for now, this is a test.) The largest file is 34MB.

Therefore, I'm sure I'm doing something wrong :-)  Here's my config:

-----------------------------------------------------------------------------------------------

For the schema.xml, <types> is all default. For fields, here are the only lines that aren't commented out:

   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="body" type="text" indexed="true" stored="true" multiValued="true"/>
   <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
   <field name="build" type="string" indexed="true" stored="true" multiValued="false"/>
   <field name="device" type="string" indexed="true" stored="true" multiValued="false"/>
   <dynamicField name="*" type="ignored" multiValued="true" />

... then, for the rest:

   <uniqueKey>id</uniqueKey>

   <!-- field for the QueryParser to use when an explicit fieldname is absent -->
   <defaultSearchField>body</defaultSearchField>

   <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
   <solrQueryParser defaultOperator="AND"/>

-----------------------------------------------------------------------------------------------

Invoking:

java -Xmx3584M -Xms1024M -jar start.jar

-----------------------------------------------------------------------------------------------

Injecting:

#!/bin/sh

J=0
for i in `find . -name \*.txt`; do
    (( J++ ))
    curl "http://localhost:8983/solr/update/extract?literal.id=doc$J&fmap.content=body" -F "myfi...@$i";
done;

echo "------------- Committing"
curl "http://localhost:8983/solr/update/extract?commit=true"

-----------------------------------------------------------------------------------------------

Searching:

http://localhost:8983/solr/select?q=testing&hl=true&fl=id,score&hl.snippets=5&hl.mergeContiguous=true


-Pete


On Jun 28, 2010, at 5:22 PM, Erick Erickson wrote:

> try adding &hl.fl=text
> to specify your highlight field. I don't understand why you're only
> getting the ID field back though.
> Do note that the highlighting
> is after the docs, related by the ID.
>
> Try a (non highlighting) query of just * to verify that you're
> pointing at the index you think you are. It's possible that
> you've modified a different index with SolrJ than your web
> server is pointing at.
>
> Also, SOLR has no way of knowing you've modified your index
> with SolrJ, so it may not be automatically reopening an
> IndexReader so your recent changes may not be visible
> until you force the SOLR reader to reopen.
>
> HTH
> Erick
>
> On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam <ps...@mac.com> wrote:
>
>> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>>
>>>> 1) I can get my docs in the index, but when I search, it
>>>> returns the entire document. I'd love to have it only
>>>> return the line (or two) around the search term.
>>>
>>> Solr can generate Google-like snippets as you describe.
>>> http://wiki.apache.org/solr/HighlightingParameters
>>
>> Here's how I commit my documents:
>>
>> J=0;
>> for i in `find . -name \*.txt`; do
>> (( J++ ))
>> curl "http://localhost:8983/solr/update/extract?literal.id=doc$J"
>> -F "myfi...@$i";
>> done;
>>
>> echo "------------- Committing"
>> curl "http://localhost:8983/solr/update/extract?commit=true"
>>
>>
>> Then, I try to query using
>> http://localhost:8983/solr/select?rows=10&start=0&fl=*,score&hl=true&q=testing
>> but I only get back the document ID rather than the snippet:
>>
>> <doc>
>> <float name="score">0.05030759</float>
>> <arr name="content_type">
>> <str>text/plain</str>
>> </arr>
>> <str name="id">doc16</str>
>> </doc>
>>
>> I'm using the schema.xml from the "lucid imagination: Indexing text and
>> html files" tutorial.
>>
>>
>>
>> -Pete
>>
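
For reference, the two checks Erick suggests in the quoted reply (naming the highlight field with hl.fl, and running a match-all query to confirm which index the server is serving) might be tried from the shell roughly as below. This is only a sketch: the function names are made up for illustration, the host and port are the ones used elsewhere in this thread, "body" is assumed to be the stored field that the extracted content is mapped to, and `*:*` is used as the match-all query in place of a bare `*`.

```shell
#!/bin/sh
# Sketch only: host/port come from the thread; function names are hypothetical.
SOLR_URL="http://localhost:8983/solr"

# Build a select URL that names the highlight field explicitly via hl.fl.
build_highlight_url() {
    query="$1"   # search term, assumed to be URL-safe already
    field="$2"   # stored field to highlight, e.g. body
    printf '%s/select?q=%s&hl=true&hl.fl=%s&fl=id,score&hl.snippets=5\n' \
        "$SOLR_URL" "$query" "$field"
}

# Build a match-all query URL; rows=0 returns only the document count.
build_match_all_url() {
    printf '%s/select?q=*:*&rows=0\n' "$SOLR_URL"
}

# Print the URLs; with Solr running they can be fetched with curl, e.g.
#   curl "$(build_match_all_url)"
#   curl "$(build_highlight_url testing body)"
build_match_all_url
build_highlight_url testing body
```

If numFound from the match-all query is not the 36 documents expected, the server is probably looking at a different index than the one being updated.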