Re: Very basic questions: Faceted front-end?
Have you had a look at www.twigkit.com? Could be worth the bucks...

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 1 July 2010, at 00.59, Peter Spam wrote:

> Wow, thanks Lance - it's really fast now!
>
> The last piece of the puzzle is setting up a nice front-end. Are there any pre-built front-ends available, that mimic Google (for example), with facets?
>
> -Peter
>
> On Jun 29, 2010, at 9:04 PM, Lance Norskog wrote:
>
> To highlight a field, Solr needs some extra Lucene values. If these are not configured for the field in the schema, Solr has to re-analyze the field to highlight it. If you want faster highlighting, you have to add term vectors to the schema. Here is the grand map of such things:
>
> http://wiki.apache.org/solr/FieldOptionsByUseCase
>
> On Tue, Jun 29, 2010 at 6:29 PM, Erick Erickson erickerick...@gmail.com wrote:
>
> What are your actual highlighting requirements? You could try things like maxAnalyzedChars, requireFieldMatch, etc.
>
> http://wiki.apache.org/solr/HighlightingParameters has a good list, but you've probably already seen that page.
>
> Best
> Erick
>
> On Tue, Jun 29, 2010 at 9:11 PM, Peter Spam ps...@mac.com wrote:
>
> To follow up, I've found that my queries are very fast (even with fq=), until I add hl=true. What can I do to speed up highlighting? Should I consider injecting a line at a time, rather than the entire file as a field?
>
> -Pete
>
> On Jun 29, 2010, at 11:07 AM, Peter Spam wrote:
>
> Thanks for everyone's help - I have this working now, but sometimes the queries are incredibly slow!! For example, <int name="QTime">461360</int>. Also, I had to bump up the min/max RAM size to 1GB/3.5GB for things to inject without throwing heap memory errors. However, my data set is very small! 36 text files, for a total of 113MB. (It will grow to many TB, but for now, this is a test). The largest file is 34MB. Therefore, I'm sure I'm doing something wrong :-) Here's my config:
>
> ---
>
> For the schema.xml, types is all default. For fields, here are the only lines that aren't commented out:
>
> <field name="id" type="string" indexed="true" stored="true" required="true" />
> <field name="body" type="text" indexed="true" stored="true" multiValued="true"/>
> <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
> <field name="build" type="string" indexed="true" stored="true" multiValued="false"/>
> <field name="device" type="string" indexed="true" stored="true" multiValued="false"/>
> <dynamicField name="*" type="ignored" multiValued="true" />
>
> ... then, for the rest:
>
> <uniqueKey>id</uniqueKey>
> <!-- field for the QueryParser to use when an explicit fieldname is absent -->
> <defaultSearchField>body</defaultSearchField>
> <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
> <solrQueryParser defaultOperator="AND"/>
>
> ---
>
> Invoking:
>
> java -Xmx3584M -Xms1024M -jar start.jar
>
> ---
>
> Injecting:
>
> #!/bin/sh
> J=0
> for i in `find . -name \*.txt`; do
>   (( J++ ))
>   curl "http://localhost:8983/solr/update/extract?literal.id=doc$J&fmap.content=body" -F myfi...@$i;
> done;
> echo - Committing
> curl "http://localhost:8983/solr/update/extract?commit=true"
>
> ---
>
> Searching:
>
> http://localhost:8983/solr/select?q=testing&hl=true&fl=id,score&hl.snippets=5&hl.mergeContiguous=true
>
> -Pete
>
> On Jun 28, 2010, at 5:22 PM, Erick Erickson wrote:
>
> Try adding hl.fl=text to specify your highlight field. I don't understand why you're only getting the ID field back though. Do note that the highlighting is after the docs, related by the ID.
>
> Try a (non highlighting) query of just * to verify that you're pointing at the index you think you are. It's possible that you've modified a different index with SolrJ than your web server is pointing at.
>
> Also, SOLR has no way of knowing you've modified your index with SolrJ, so it may not be automatically reopening an IndexReader, so your recent changes may not be visible until you force the SOLR reader to reopen.
>
> HTH
> Erick
>
> On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam ps...@mac.com wrote:
>
> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>
> 1) I can get my docs in the index, but when I search, it returns the entire document. I'd love to have it only return the line (or two) around the search term.
>
> Solr can generate Google-like snippets as you describe. http://wiki.apache.org/solr/HighlightingParameters
>
> Here's how I commit my documents:
>
> J=0; for i in `find . -name \*.txt`; do (( J++ )) curl
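
Lance's and Erick's suggestions in the chain above can be combined into a single request. The sketch below only builds and prints the query URL; it does not contact a server. The host, port and `body` field come from Peter's setup, while the helper name `highlight_url` and the value 51200 for `hl.maxAnalyzedChars` are assumptions chosen for illustration. It also assumes the `body` field has been re-indexed with term vectors enabled (the `termVectors`, `termPositions` and `termOffsets` attributes set to `true` on the field in schema.xml), which is what Lance's advice amounts to as far as I can tell.

```shell
#!/bin/sh
# Sketch: build a highlighting query URL; nothing is sent to Solr here.
# SOLR_BASE matches the thread; the parameter values are illustrative assumptions.
SOLR_BASE="http://localhost:8983/solr"

highlight_url() {
    # $1 = query term (assumed to need no URL-encoding in this sketch)
    printf '%s/select?q=%s&fl=id,score&hl=true&hl.fl=body&hl.snippets=5&hl.maxAnalyzedChars=51200&hl.requireFieldMatch=true\n' \
        "$SOLR_BASE" "$1"
}

highlight_url testing
# To run the query against a live server: curl "$(highlight_url testing)"
```

Restricting highlighting to one field with `hl.fl` and capping the analyzed text with `hl.maxAnalyzedChars` limits how much of a 34MB document the highlighter has to look at, at the cost of missing matches that occur beyond the cap.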
Re: Very basic questions: Faceted front-end?
Solr trunk now has a built-in UI, and it is also something that works with Solr 1.4 as well (with some effort). Here's how to get it working with Solr 1.4:

http://www.lucidimagination.com/blog/2009/11/04/solritas-solr-1-4s-hidden-gem/

In Solr trunk, all you have to do is navigate to /solr/browse and you get a google-like UI that does highlighting, faceting, spell-checking, etc. There's a partial screenshot (of the debug feature) attached to this issue:

https://issues.apache.org/jira/browse/SOLR-1957

Erik

On Jun 30, 2010, at 9:21 PM, Peter Spam wrote:

> Ah, I found this: https://issues.apache.org/jira/browse/SOLR-634 ... aka solr-ui. Is there anything else along these lines? Thanks!
>
> -Peter
Re: Very basic questions: Faceted front-end?
Very nice indeed! That definitely needs to be shouted about in the docs.

Any way to make it work with facet queries or can dismax requests not do that? I tried adding a few facet.query parameters but it came back with nothing in the facet list.

Mark

On 1 Jul 2010, at 12:36 pm, Erik Hatcher wrote:

> Solr trunk now has a built-in UI, and it is also something that works with Solr 1.4 as well (with some effort). Here's how to get it working with Solr 1.4:
>
> http://www.lucidimagination.com/blog/2009/11/04/solritas-solr-1-4s-hidden-gem/
>
> In Solr trunk, all you have to do is navigate to /solr/browse and you get a google-like UI that does highlighting, faceting, spell-checking, etc. There's a partial screenshot (of the debug feature) attached to this issue:
>
> https://issues.apache.org/jira/browse/SOLR-1957
>
> Erik
Re: Very basic questions: Faceted front-end?
On Jul 1, 2010, at 10:33 AM, Mark Allan wrote:

> Very nice indeed! That definitely needs to be shouted about in the docs.

Why thanks! And yeah, marketing isn't my strong point, but it is indeed a way cool feature of Solr that deserves more attention than I can give it.

> Any way to make it work with facet queries or can dismax requests not do that? I tried adding a few facet.query parameters but it came back with nothing in the facet list.

You'll have to adjust the templates to pull facet queries out into the view. I'll try to do that later today unless you beat me to it and provide a patch :) It'll be pretty trivial to do so. It also needs to support date range faceting.

Erik
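
For reference, facet queries are requested with repeated `facet.query` parameters, and Solr reports their counts in a `facet_queries` section of the response, separate from the `facet_fields` section. That would be consistent with a template that renders only field facets showing nothing for them. A minimal sketch, assuming the `timestamp` field from Peter's schema and two illustrative date ranges; the helper name `facet_query_url` is hypothetical, and the URL is only printed, not fetched:

```shell
#!/bin/sh
# Sketch: build a select URL carrying several facet.query parameters.
# Nothing is sent to Solr; host and port are taken from the thread.
SOLR_BASE="http://localhost:8983/solr"

facet_query_url() {
    # $1 = main query; remaining arguments = already URL-encoded facet queries
    url="$SOLR_BASE/select?q=$1&rows=0&facet=true"
    shift
    for fq in "$@"; do
        url="$url&facet.query=$fq"
    done
    printf '%s\n' "$url"
}

# Illustrative ranges (assumptions): last day and last week on the timestamp field.
facet_query_url testing \
    'timestamp:%5BNOW-1DAY%20TO%20NOW%5D' \
    'timestamp:%5BNOW-7DAYS%20TO%20NOW%5D'
```

The facet queries are passed in pre-encoded (`%5B`, `%20`, `%5D` for `[`, space and `]`) so the sketch stays free of any encoding helper.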
Re: Very basic questions: Faceted front-end?
Wow, thanks Lance - it's really fast now!

The last piece of the puzzle is setting up a nice front-end. Are there any pre-built front-ends available, that mimic Google (for example), with facets?

-Peter

On Jun 29, 2010, at 9:04 PM, Lance Norskog wrote:

> To highlight a field, Solr needs some extra Lucene values. If these are not configured for the field in the schema, Solr has to re-analyze the field to highlight it. If you want faster highlighting, you have to add term vectors to the schema. Here is the grand map of such things:
>
> http://wiki.apache.org/solr/FieldOptionsByUseCase
>
> [...]
>
> On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam ps...@mac.com wrote:
>
> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>
> 1) I can get my docs in the index, but when I search, it returns the entire document. I'd love to have it only return the line (or two) around the search term.
>
> Solr can generate Google-like snippets as you describe. http://wiki.apache.org/solr/HighlightingParameters
>
> Here's how I commit my documents:
>
> J=0; for i in `find . -name \*.txt`; do (( J++ )) curl "http://localhost:8983/solr/update/extract?literal.id=doc$J" -F myfi...@$i; done;
> echo - Committing
> curl "http://localhost:8983/solr/update/extract?commit=true"
>
> Then, I try to query using
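
The commit loop quoted above relies on `(( J++ ))`, which is a bash construct and may fail under a strict `/bin/sh`. The following is a dry-run sketch of the same idea in POSIX sh: it prints one curl command per file plus a final commit, and executes none of them. The helper name `print_index_commands` and the form-field name `upload` are placeholders of mine (the field name in the original message is truncated in the archive); the URL parameters are the ones used in the thread.

```shell
#!/bin/sh
# Dry-run sketch of the indexing loop: prints the curl commands, runs nothing.
# "upload" is a placeholder form-field name (assumption, not from the thread).
SOLR_BASE="http://localhost:8983/solr"

print_index_commands() {
    # $1 = directory to scan for .txt files
    j=0
    find "$1" -name '*.txt' | sort | while IFS= read -r f; do
        j=$((j + 1))   # POSIX arithmetic instead of the bash-only (( J++ ))
        printf 'curl "%s/update/extract?literal.id=doc%s&fmap.content=body" -F "upload=@%s"\n' \
            "$SOLR_BASE" "$j" "$f"
    done
    # A single commit after all documents have been posted.
    printf 'curl "%s/update/extract?commit=true"\n' "$SOLR_BASE"
}

# Usage (review the printed commands before running any of them):
#   print_index_commands /path/to/text/files
```

Reading file names line by line also copes with spaces in paths, which the backtick `find` loop does not; file names containing newlines or double quotes are not handled by this sketch.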
Re: Very basic questions: Faceted front-end?
Ah, I found this: https://issues.apache.org/jira/browse/SOLR-634 ... aka solr-ui. Is there anything else along these lines? Thanks!

-Peter

On Jun 30, 2010, at 3:59 PM, Peter Spam wrote:

> Wow, thanks Lance - it's really fast now!
>
> The last piece of the puzzle is setting up a nice front-end. Are there any pre-built front-ends available, that mimic Google (for example), with facets?
>
> -Peter