LetterTokenizer + EdgeNGram + apostrophe in query = invalid result
I have the following field defined in my schema: I have the default field set to "person" and have indexed the following document: The following queries return the result as expected using the standard request handler: vincent m d onofrio d'o onofrio d onofrio The following query fails: d'onofrio This is weird because "d'o" returns a result. As soon as I type the "n" I start to get no results. I ran this through the field analysis page and it shows that this query is being tokenized correctly and should yield a result. I am using a build of trunk Solr (r1073990) and the example solrconfig.xml. I am also using the example schema with the addition of my ngram field. Any ideas? I have tried this with other words containing an apostrophe and they all stop returning results after 4 characters. Thanks, Matt Weber
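The field definition referenced above did not survive in the archive. A hypothetical setup matching the subject line (LetterTokenizer plus EdgeNGram at index time; all names and gram sizes here are illustrative, not the poster's actual config) might look like:

```xml
<!-- Hypothetical reconstruction; the original schema snippet was lost.
     Field and type names are illustrative. -->
<fieldType name="edgengram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LetterTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LetterTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="person" type="edgengram" indexed="true" stored="true"/>
```

Note that LetterTokenizer splits "d'onofrio" into the two tokens "d" and "onofrio", which is consistent with the symptom that "d'o" matches (an edge gram of "onofrio") while longer fragments behave unexpectedly.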
Re: Ramdirectory
I have used this without issue. In the example solrconfig.xml replace this line: with this one: Thanks, Matt Weber On Thu, Feb 24, 2011 at 7:47 PM, Bill Bell wrote: > Thanks - yeah that is why I asked how to use it. But I still don't know > how to use it. > > https://hudson.apache.org/hudson/job/Solr-3.x/javadoc/org/apache/solr/core/RAMDirectoryFactory.html > > > https://issues.apache.org/jira/browse/SOLR-465 > > > > > > > Is that right? Examples? Options? > > Where do I put that in solrconfig.xml ? Do I put it in > mainIndex/directoryProvider ? > > I know that SOLR-465 is more generic, but > https://issues.apache.org/jira/browse/SOLR-480 seems easier to use. > > > > Thanks. > > > On 2/24/11 6:21 PM, "Chris Hostetter" wrote: > >> >>: I could not figure out how to setup the ramdirectory option in >>solrconfig.XML. Does anyone have an example for 1.4? >> >>it wasn't an option in 1.4. >> >>as Koji had already mentioned in the other thread where you chimed in >>and asked about this, it was added in the 3x branch... >> >>http://lucene.472066.n3.nabble.com/Question-Solr-Index-main-in-RAM-td2567166.html >> >> >> >>-Hoss > > > -- Thanks, Matt Weber
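The two solrconfig.xml lines referenced above were stripped from the archive. Based on the 3.x-era example config and the RAMDirectoryFactory added by SOLR-465, the swap was most likely something like the following (a sketch, not the exact original lines):

```xml
<!-- Likely original line in the 3.x example solrconfig.xml: -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

<!-- Likely replacement to hold the entire index in RAM: -->
<directoryFactory name="DirectoryFactory" class="solr.RAMDirectoryFactory"/>
```

Keep in mind that with RAMDirectoryFactory the index is not persisted across restarts.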
Re: field collapsing sums
You might want to see how the stats component works with field collapsing. Thanks, Matt Weber On Sep 30, 2009, at 5:16 PM, Uri Boness wrote: Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it might not be the most efficient. Cheers, Uri Joe Calderon wrote: hello all, i have a question on the field collapsing patch, say i have an integer field called "num_in_stock" and i collapse by some other column, is it possible to sum up that integer field and return the total in the output, if not how would i go about extending the collapsing component to support that? thx much --joe
Re: Showing few results for each category (facet)
So, you want to display 5 results from each category and still know how many results are in each category. This is a perfect situation for the field collapsing patch: https://issues.apache.org/jira/browse/SOLR-236 http://wiki.apache.org/solr/FieldCollapsing Here is how I would do it. Add a field to your schema called category or whatever. Then while indexing you populate that field with whatever category the document belongs in. While executing a search, collapse the results on that field with a max collapse of 5. This will give you at most 5 results per category. Now, at the same time enable faceting on that field and DO NOT use the collapsing parameter to recount the facet values. This means that the facet counts will reflect the non-collapsed results. This facet should only be used to get the count for each category, not displayed to the user. On your search results page that gets the collapsed results, you can put a link that says "Show all X results from this category" where X is the value you pull out of the facet. When a user clicks that link you basically do the same search with field collapsing disabled, and a filter query on the specific category they want to see, for example: &fq=category:people. Hope this helps. Thanks, Matt Weber On Sep 29, 2009, at 4:55 AM, Marian Steinbach wrote: On Tue, Sep 29, 2009 at 11:36 AM, Varun Gupta wrote: ... One way that I can think of doing this is by making as many queries as there are categories and show these results under each category. But this will be very inefficient. Is there any way I can do this? Hi Varun! I think that doing multiple queries doesn't have to be inefficient, since Solr caches subsequent queries for the same term and facets. Imagine this as your first query: - q: xyz - facets: myfacet and this as a second query: - q: xyz - fq: myfacet=a Compared to the first query, the second query will be very fast, since all the hard work has been done in query one and then cached. At least that's my understanding. Please correct me if I'm wrong. Marian
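The two requests described above can be sketched roughly as follows, using the SOLR-236 patch-era parameter names (the exact parameter names varied between versions of the patch, so treat these as an assumption):

```xml
<!-- Collapsed overview page: at most 5 results per category, plus
     uncollapsed facet counts to drive the "Show all X results" links:
     /select?q=widgets&collapse.field=category&collapse.max=5
            &facet=true&facet.field=category -->

<!-- "Show all" drill-down for one category, with collapsing disabled:
     /select?q=widgets&fq=category:people
            &facet=true&facet.field=category -->
```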
Re: Usage of Sort and fq
A description and examples of both parameters can be found here: http://wiki.apache.org/solr/CommonQueryParameters Thanks, Matt Weber On Sep 29, 2009, at 4:10 AM, Avlesh Singh wrote: /?q=*:*&fq=category:animal&sort=child_count%20asc Search for all documents (of animals), filter the ones that belong to the category "animal", and sort ascending by a field called child_count that contains the number of children for each animal. You can pass multiple fq's with more "&fq=..." parameters. Secondary, tertiary sorts can be specified using comma (",") as the separator, i.e. "sort=fieldA asc,fieldB desc, fieldC asc, ..." Cheers Avlesh On Tue, Sep 29, 2009 at 3:51 PM, bhaskar chandrasekar wrote: Hi, Can some one let me know how to use sort and fq parameters in Solr. Any examples would be appreciated. Regards Bhaskar
Re: Using two Solr documents to represent one logical document/file
Check out the field collapsing patch: http://wiki.apache.org/solr/FieldCollapsing https://issues.apache.org/jira/browse/SOLR-236 Thanks, Matt Weber On Sep 25, 2009, at 3:15 AM, Peter Ledbrook wrote: Hi, I want to index both the contents of a document/file and metadata associated with that document. Since I also want to update the content and metadata indexes independently, I believe that I need to use two separate Solr documents per real/logical document. The question I have is how do I merge query results so that only one result is returned per real/logical document, not per Solr document? In particular, I don't want to filter the results to satisfy any "max results" constraint. I have read that this can be achieved with a facet search. Is this the best approach, or is there some alternative? Thanks, Peter -- View this message in context: http://www.nabble.com/Using-two-Solr-documents-to-represent-one-logical-document-file-tp25609646p25609646.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is it possible to query for "everything" ?
Query for *:* Thanks, Matt Weber On Sep 14, 2009, at 4:18 PM, Jonathan Vanasco wrote: I'm using Solr for seach and faceted browsing Is it possible to have solr search for 'everything' , at least as far as q is concerned ? The request handlers I've found don't like it if I don't pass in a q parameter
Re: Searching for the '+' character
Why don't you create a synonym for + that expands to your customer's product name that includes the plus? You can even have your FE do this sort of replacement BEFORE submitting to Solr. Thanks, Matt Weber On Sep 14, 2009, at 11:42 AM, AHMET ARSLAN wrote: Thanks Ahmet, That's excellent, thanks :) I may have to increase the gramsize to take into account other possible uses but I can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory. Is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Yes you are right, preserveOriginal="1" will cause the original token to be indexed without modifications. Will I lose a lot of 'good' functionality by removing it? It depends on your input data. It is used to break one token into subwords. Like: "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot" If your input data set contains such words, you may need it. But I think just to make the last character searchable, using NGramFilter(s) is not an optimal solution. I don't know what type of dataset you have but, I think using two separate fields (with different types) for that is more suitable. One field will contain the actual data itself. The other will hold only the last character(s). You can achieve this by a copyField or programmatically during indexing. The type of the field lastCharsField will be using EdgeNGramFilter so that only the last character of token(s) will pass that filter. During searching you will search those two fields: originalField:\+ OR lastCharsField:\+ The query lastCharsField:\+ will return you all the products ending with +. Hope this helps.
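A sketch of the two-field approach described above. The field and type names mirror the ones used in the reply; the `side="back"` attribute on EdgeNGramFilterFactory (to keep grams from the end of each token) is an assumption about how the "last character" field would be built:

```xml
<!-- Illustrative sketch; names are taken from the reply, details assumed. -->
<fieldType name="lastChars" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- side="back" emits grams taken from the end of each token,
         so a trailing "+" becomes a searchable term -->
    <filter class="solr.EdgeNGramFilterFactory" side="back"
            minGramSize="1" maxGramSize="1"/>
  </analyzer>
</fieldType>

<field name="originalField" type="text" indexed="true" stored="true"/>
<field name="lastCharsField" type="lastChars" indexed="true" stored="false"/>
<copyField source="originalField" dest="lastCharsField"/>
```

At query time you would then search both fields, e.g. originalField:\+ OR lastCharsField:\+, as the reply suggests.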
Re: When to optimize?
I would say once a day is a pretty good rule of thumb. If you think this is a bit much and if you have few updates you can probably back that off to once every couple days to once a week. However, if you have a large batch update or your query performance starts to degrade, you will need to optimize your index. Thanks, Matt Weber On Sep 13, 2009, at 6:21 PM, William Pierce wrote: Folks: Are there good rules of thumb for when to optimize? We have a large index consisting of approx 7M documents and we currently have it set to optimize once a day. But sometimes there are very few changes that have been committed during a day and it seems like a waste to optimize (esp. since our servers are pretty well loaded). So I was looking to get some good rules of thumb for when it makes sense to optimize: Optimize when x% of the documents have been changed since the last optimize or some such. Any ideas would be greatly appreciated! -- Bill
Re: Solr Cell
Found my own answer, use the literal parameter. Should have dug around before asking. Sorry. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On Jul 23, 2009, at 2:26 PM, Matt Weber wrote: Is it possible to supply additional metadata along with the binary file when using Solr Cell? For example, I have a pdf called somefile.pdf and I have some external metadata related to that file. Such metadata might be things like author, publisher, source, date published, etc. I want to post the binary data for somefile.pdf to Solr Cell AND map my metadata into other fields in the same document that has the extracted text from the pdf. I know I could do this using Tika and SolrJ directly, but it would be much easier if Solr Cell can do it. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com
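For reference, the literal parameter is passed per field as literal.&lt;fieldname&gt;. A sketch of the handler config and a request (the author/publisher field names here are hypothetical, matching the metadata the question mentions):

```xml
<!-- solrconfig.xml: the Solr Cell handler -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- map Tika's extracted body into the "text" field -->
    <str name="fmap.content">text</str>
  </lst>
</requestHandler>

<!-- Request sketch: POST somefile.pdf with external metadata attached as
     literal.* parameters (field names are hypothetical):
     /update/extract?literal.id=doc1&literal.author=Smith
                    &literal.publisher=Acme -->
```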
Solr Cell
Is it possible to supply additional metadata along with the binary file when using Solr Cell? For example, I have a pdf called somefile.pdf and I have some external metadata related to that file. Such metadata might be things like author, publisher, source, date published, etc. I want to post the binary data for somefile.pdf to Solr Cell AND map my metadata into other fields in the same document that has the extracted text from the pdf. I know I could do this using Tika and SolrJ directly, but it would be much easier if Solr Cell can do it. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com
Re: Solr relevancy score - conversion
Solr does not support this. You can do it yourself by taking the highest score and using that as 100% and calculating other percentages from that number. For example if the max score is 10 and the next result has a score of 5, you would do (5 / 10) * 100 = 50%. Hope this helps. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On Jun 8, 2009, at 10:05 PM, Vijay_here wrote: Hi, I am using solr to index some of the legal documents, where i need the solr search engine to return a relevancy ranking score for each search result. As of now i am getting scores like 3.12, 1.23, 0.23 and so on. Would need a more proportionate score, like rounded to 100% (95% relevant, 80% relevant and so on). Is there a way to make solr return such scores of relevance. Any other approach to arrive at these scores would also be appreciated thanks vijay -- View this message in context: http://www.nabble.com/Solr-relevancy-score---conversion-tp23936413p23936413.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Home on Linux JBoss ignored
Check the dataDir setting in solrconfig.xml. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On Jun 5, 2009, at 6:03 AM, Dean Pullen wrote: I lied, it's actually saving data to: /usr/local/jboss-portal-2.7.1.GA/bin/C:\home\jboss\solr\data Which is a tad crazy! And I have no idea why! Dean. -Original Message- From: Dean Pullen [mailto:dean.pul...@msp-uk.com] Sent: 05 June 2009 09:47 To: solr-user@lucene.apache.org Subject: Solr Home on Linux JBoss ignored Hi all, Have an odd problem on JBoss 4.2.3 running on Redhat. It's odd, because the configuration works fine on Windows. Our Solr home is defined in the Solr.war web.xml as: [Linux] solr/home java.lang.String /home/jboss/solr [Windows] solr/home java.lang.String c:/home/jboss/solr However, on Linux Solr is still defaulting to JBoss Web's [Tomcat] work directory, i.e. /usr/local/jboss-portal-2.7.1.GA/server/default/work/jboss.web/localhost/solr Instead of the defined /home/jboss/solr Can anyone shed any light on this? Thanks, Dean. Scanned by MailDefender - managed email security from intY - www.maildefender.net
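The symptom (a Windows path appended to the JBoss working directory on Linux) is exactly what a hard-coded dataDir would produce. A likely fix in solrconfig.xml, sketched with assumed paths:

```xml
<!-- A hard-coded Windows path like this would yield the observed
     .../bin/C:\home\jboss\solr\data directory when resolved on Linux: -->
<!-- <dataDir>C:\home\jboss\solr\data</dataDir> -->

<!-- More portable: take the path from a system property, with a
     platform-appropriate default -->
<dataDir>${solr.data.dir:/home/jboss/solr/data}</dataDir>
```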
Re: Facet counts limit
1. The limit parameter takes a signed integer, so the max value is 2,147,483,647. 2. I don't think there is a defined limit, which means you are only limited by what your system can handle. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 20, 2009, at 11:41 AM, sachin78 wrote: Have two questions? 1) What is the limit on facet counts? ex: test(10,0). Is this valid? 2) What is the limit on the no of facets? How many facets can a query get? --Sachin -- View this message in context: http://www.nabble.com/Facet-counts-limit-tp23641105p23641105.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Search Query Questions
I think you will want to look at the Field Collapsing patch for this. http://issues.apache.org/jira/browse/SOLR-236 . Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 14, 2009, at 5:52 PM, Chris Miller wrote: Oh, one more question 3) Is there a way to effectively do a GROUP BY? For example, if I have a document that has a photoID attached to it, is there a way to return a set of results that does not duplicate the photoID field? Thanks, Chris Miller ServerMotion www.servermotion.com On May 14, 2009, at 7:46 PM, Chris Miller wrote: I have two questions: 1) How do I search for ALL items? For example, I provide a sort query parameter of "updated" and a rows query parameter of 10 to limit the query results. I still have to provide a search query, of course. What if I want to provide a list of ALL results that match this? Or, in this case, the most recent 10 updated documents? 2) How do I search for all documents with a field that has data? For example, I have a field "foo" that is optional and multi- valued. How do I search for documents that have this field set to anything. Thanks, Chris Miller ServerMotion www.servermotion.com
Re: Selective Searches Based on User Identity
Here is a good presentation on search security from the Infonortics Search Conference that was held a few weeks ago. http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf The approach you are using is called early-binding. As Jay mentioned, one of the downsides is updating the documents each time you have an ACL change. You could use the late-binding approach that checks each result after the query but before you display to the user. I don't recommend this approach, since it will strain your security infrastructure: you need to check whether the user can access each and every result. Good luck. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 1:21 PM, Jay Hill wrote: The only downside would be that you would have to update a document anytime a user was granted or denied access. You would have to query before the update to get the current values for grantedUID and deniedUID, remove/add values, and update the index. If you don't have a lot of changes in the system that wouldn't be a big deal, but if a lot of changes are happening throughout the day you might have to queue requests and batch them. -Jay On Tue, May 12, 2009 at 1:05 PM, Matt Weber wrote: I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. They actually use a modified base-32 encoded value for indexing, but that is because they don't have the luxury of untokenized/un-processed String fields like Solr. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 12:26 PM, Terence Gannon wrote: Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, perhaps three fields; 1) ownerUid (uid of the document's owner) 2) grantedUid (uid of users who have been granted access), and 3) deniedUid (uid of users specifically denied access to the document). 
These fields, coupled with some business rules around how they were populated should cover off all possibilities I think. Access to the Solr instance would have to be tightly controlled, but that's something that should be done anyway. You sure wouldn't want end users preparing their own XML and throwing it at Solr -- it would be pretty easy to figure out how to get around the access/denied fields and get at stuff the owner didn't intend. This approach mimics to some degree what is being done in the operating system, but it's still elegant and provides the level of control required. Anybody else have any thoughts in this regard? Has anybody implemented anything similar, and if so, how did it work? Thanks, and best regards... Terence
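A sketch of the early-binding schema Terence describes, with an example of the trusted filter the application (never the end user) would append at query time. The uid value u42 is hypothetical:

```xml
<field name="ownerUid"   type="string" indexed="true" stored="true"/>
<field name="grantedUid" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="deniedUid"  type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Query-time filter appended server-side for user u42:
     &fq=(ownerUid:u42 OR grantedUid:u42) AND -deniedUid:u42 -->
```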
Re: Selective Searches Based on User Identity
I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. They actually use a modified base-32 encoded value for indexing, but that is because they don't have the luxury of untokenized/un-processed String fields like Solr. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 12:26 PM, Terence Gannon wrote: Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, perhaps three fields; 1) ownerUid (uid of the document's owner) 2) grantedUid (uid of users who have been granted access), and 3) deniedUid (uid of users specifically denied access to the document). These fields, coupled with some business rules around how they were populated should cover off all possibilities I think. Access to the Solr instance would have to be tightly controlled, but that's something that should be done anyway. You sure wouldn't want end users preparing their own XML and throwing it at Solr -- it would be pretty easy to figure out how to get around the access/denied fields and get at stuff the owner didn't intend. This approach mimics to some degree what is being done in the operating system, but it's still elegant and provides the level of control required. Anybody else have any thoughts in this regard? Has anybody implemented anything similar, and if so, how did it work? Thanks, and best regards... Terence
Re: Facet counts for common terms of the searched field
I mean you can sort the facet results by frequency, which happens to be the default behavior. Here is an example field for your schema: <field name="textfieldfacet" type="text_ws" indexed="true" stored="true" multiValued="true" /> Here is an example query: http://localhost:8983/solr/select?q=textfield:copper&facet=true&facet.field=textfieldfacet&facet.limit=5 This will give you the top 5 words in the textfieldfacet. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 7:57 AM, sachin78 wrote: Thanks Matt for your reply. What do you mean by frequency (the default)? Can you please provide an example of what the schema and query will look like. --Sachin Matt Weber-2 wrote: You may have to take care of this at index time. You can create a new multivalued field that has minimal processing. Then at index time, index the full contents of textfield as normal, but then also split it on whitespace and index each word in the new field you just created. Now you will be able to facet on this new field and sort the facet by frequency (the default) to get the most popular words. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 7:33 AM, sachin78 wrote: Does anybody have an answer to this post? I have a similar requirement. Suppose I have a free text field that I index. If I search for textfield:copper, I have to get facet counts for the most common words found in the textfield. i.e. a search for textfield:glass should return facet counts for common words found in the textfield: semiconductor(10), iron(20), silicon(25), material(8), thin(25) and so on. Can this be done using tagging or MLT. Thanks, Sachin Raju444us wrote: I have a requirement. If I search for a text field, let's say "metal:glass", what I want is to get the facet counts for all the terms related to "glass" in my search results. window(100) since a window can be glass. 
plastic(10) plastic is a material just like glass Iron(10) Paper(15) Can I use MLT to get this functionality? Please let me know how I can achieve this. If possible, an example query. Thanks, Raju -- View this message in context: http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23503794.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23504241.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Facet counts for common terms of the searched field
You may have to take care of this at index time. You can create a new multivalued field that has minimal processing. Then at index time, index the full contents of textfield as normal, but then also split it on whitespace and index each word in the new field you just created. Now you will be able to facet on this new field and sort the facet by frequency (the default) to get the most popular words. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 7:33 AM, sachin78 wrote: Does anybody have an answer to this post? I have a similar requirement. Suppose I have a free text field that I index. If I search for textfield:copper, I have to get facet counts for the most common words found in the textfield. i.e. a search for textfield:glass should return facet counts for common words found in the textfield: semiconductor(10), iron(20), silicon(25), material(8), thin(25) and so on. Can this be done using tagging or MLT. Thanks, Sachin Raju444us wrote: I have a requirement. If I search for a text field, let's say "metal:glass", what I want is to get the facet counts for all the terms related to "glass" in my search results. window(100) since a window can be glass. plastic(10) plastic is a material just like glass Iron(10) Paper(15) Can I use MLT to get this functionality? Please let me know how I can achieve this. If possible, an example query. Thanks, Raju -- View this message in context: http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23503794.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr + wordpress
I actually wrote a plugin that integrates Solr with WordPress. http://www.mattweber.org/2009/04/21/solr-for-wordpress/ http://wordpress.org/extend/plugins/solr-for-wordpress/ https://launchpad.net/solr4wordpress Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 8, 2009, at 10:10 AM, Noble Paul നോബിള് नोब्ळ् wrote: Somebody has written an article on integrating Solr with WordPress http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/ -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Solr autocompletion in rails
First, your solrconfig.xml should have something similar to the following: <searchComponent name="termsComp" class="org.apache.solr.handler.component.TermsComponent"/> <requestHandler name="/autoSuggest" class="org.apache.solr.handler.component.SearchHandler"> <arr name="components"> <str>termsComp</str> </arr> </requestHandler> This will give you a request handler called "/autoSuggest" that you will use for suggestions. Then you need to write some rails code to access this. I am not very familiar with ruby, but I believe you might want to try http://wiki.apache.org/solr/solr-ruby . Make sure you set your query type to "/autoSuggest". If that won't work for you, then just use the standard http libraries to access the autoSuggest url directly and get json output. With any of these methods make sure you set the following parameters: terms=true terms.fl=source_field terms.lower=input_term terms.prefix=input_term terms.lower.incl=false For direct access to the json output you will want these as well: indent=true wt=json The terms.fl parameter specifies the field(s) you want to use as the source for suggestions. Make sure this field has very little processing done on it, maybe lowercasing and tokenization only. Here is an example url that should give you some output once things are working: http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=spell&terms.lower=t&terms.prefix=t&terms.lower.incl=false&indent=true&wt=json The next thing is to parse the json output and do whatever you want with the results. In my example, I just printed out each suggestion on a single line of the response because this is what the jQuery autocomplete plugin wanted. The easiest way to parse the json output is to use the json ruby library, http://json.rubyforge.org/. After you have your rails controller working you can hook it into your FE with some javascript like I did in the example on my blog. Hope this helps. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 7, 2009, at 7:37 AM, manisha_5 wrote: Thanks a lot for the information.
But I am still a bit confused about the use of the TermsComponent. Like where exactly are we going to put this code in Solr. For example I changed schema.xml to add the autocomplete feature. I read your blog too, it's very helpful. But still a little confused. :-(( Can you explain it a bit? Matt Weber-2 wrote: You will probably want to use the new TermsComponent in Solr 1.4. See http://wiki.apache.org/solr/TermsComponent . I just recently wrote a blog post about using autocompletion with TermsComponent, a servlet, and jQuery. You can probably follow these instructions, but instead of writing a servlet you can write a rails handler parsing the json output directly. http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/ . Thanks, Matt Weber On May 4, 2009, at 9:39 AM, manisha_5 wrote: Hi, I am new to solr. I am using a solr server to index the data and make searches in a Ruby on Rails project. I want to add the autocompletion feature. I tried with the xml patch in the schema.xml file of solr, but don't know how to test if the feature is working. Also I haven't been able to integrate the same in the Rails project that is using Solr. Can anyone please provide some help in this regards?? the patch of codes in Schema.xml is : -- View this message in context: http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23372020.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23428267.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Conditional/Calculated Fields (is it possible?)
I do not think this is possible. You will probably want to handle this logic on your side during indexing. Index the document with the first price, then as that price expires, update the document with the new price. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 6, 2009, at 4:32 AM, Andrew Ingram wrote: Hi everyone, I'm working on the search schema for ecommerce products and I'm having an issue with the prices. Basically, a product has two price values and a date, the product effectively has one price before the date and the other one after. This poses no problem for the site itself since I can use conditional logic, but I have no idea how to approach this with regards to solr queries. The price of a product is used for both faceting and sorting and should use whichever price is active at the time of the query. Is there any way to define a field whose value is a simple algorithm operating on the value of other fields? I'm quite happy to use a custom field type if necessary, though I'm not sure if what I want is even possible and I don't really know where to begin. Any help would be appreciated Regards, Andrew Ingram
Re: Multi-index Design
1 - A field that is called "type", which is probably a string field in which you index values such as "people", "organization", "product". 2 - Yes, for each document you are indexing, you will include its type, i.e. "person". 3, 4, 5 - You would have a core for each domain. Each domain will then have its own index that contains documents of all types. See http://wiki.apache.org/solr/MultipleIndexes . Thanks, Matt Weber On May 5, 2009, at 11:14 AM, Michael Ludwig wrote: Chris Masters schrieb: - flatten the searchable objects as much as I can - use a type field to distinguish - into a single index - use multi-core approach to segregate domains of data Some newbie questions: (1) What is a "type field"? Is it to designate different types of documents, e.g. product descriptions and forum postings? (2) Would I include such a "type field" in the data I send to the update facility and maybe configure Solr to take special action depending on the value of the update field? (3) Like, write the processing results to a domain dedicated to that type of data that I could limit my search to, as per Otis' post? (4) And is that what's called a "core" here? (5) Or, failing (3), and lumping everything together in one search domain (core?), would I use that "type field" to limit my search to a particular type of data? Michael Ludwig
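A sketch of the two pieces described above: the type field in each domain's schema.xml, and one core per domain in solr.xml. The core names and instanceDir paths are made up for illustration:

```xml
<!-- schema.xml (per domain): untokenized type field, filterable with
     e.g. &fq=type:person -->
<field name="type" type="string" indexed="true" stored="true"/>

<!-- solr.xml: one core per domain; names and paths are hypothetical -->
<cores adminPath="/admin/cores">
  <core name="domainA" instanceDir="cores/domainA"/>
  <core name="domainB" instanceDir="cores/domainB"/>
</cores>
```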
Re: Solr autocompletion in rails
You will probably want to use the new TermsComponent in Solr 1.4. See http://wiki.apache.org/solr/TermsComponent . I just recently wrote a blog post about using autocompletion with TermsComponent, a servlet, and jQuery. You can probably follow these instructions, but instead of writing a servlet you can write a rails handler parsing the json output directly. http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/ . Thanks, Matt Weber On May 4, 2009, at 9:39 AM, manisha_5 wrote: Hi, I am new to solr. I am using a solr server to index the data and make searches in a Ruby on Rails project. I want to add the autocompletion feature. I tried with the xml patch in the schema.xml file of solr, but don't know how to test if the feature is working. Also I haven't been able to integrate the same in the Rails project that is using Solr. Can anyone please provide some help in this regards?? the patch of codes in Schema.xml is : <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" /> <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1" /> -- View this message in context: http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23372020.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autoSuggest
I am not sure you can return the results in order of frequency; you will have to sort the results yourself. Also, for autoSuggest you will want to add terms.prefix=<input term> and terms.lower.incl=false so your example will be: /autoSuggest?terms=true&indent=true&terms.fl=title&terms.rows=5&terms.lower=simp&terms.lower.incl=false&terms.prefix=simp&omitHeader=true To get results for multiple words such as "barack obama", you need to set the terms.fl parameter to an untokenized, un-processed field just as you would with a facet. So in your schema.xml, add a new string field, then use a copyField to copy the value of title into the new field and set terms.fl to the new field you just created after reindexing. Thanks, Matt Weber On May 4, 2009, at 6:46 AM, sunnyfr wrote: Hi, I would like to know how /autoSuggest works. I do have results when I hit: /autoSuggest?terms=true&indent=true&terms.fl=title&terms.rows=5&terms.lower=simp&omitHeader=true I've: 74 129 2 2 1 How can I ask it to suggest first the expressions which are more frequent in the database? How can I look even for two words, i.e. I look for "bara" ... make it suggest "barack obama" ??? thanks a lot, -- View this message in context: http://www.nabble.com/autoSuggest-tp23367848p23367848.html Sent from the Solr - User mailing list archive at Nabble.com. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com
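A sketch of the untokenized suggestion field described above. The field name title_exact is made up for illustration:

```xml
<!-- Untokenized copy of title so multi-word phrases like
     "barack obama" survive as a single term -->
<field name="title_exact" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>

<!-- then query the terms handler with terms.fl=title_exact -->
```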
Re: Highlight MoreLikeThis results?
There was a thread about this last week, and the verdict is that you currently can't highlight MoreLikeThis results. Thanks, Matt Weber On May 4, 2009, at 1:22 AM, jli...@gmail.com wrote: My query returns a number of MoreLikeThis results for a given document. I wonder if there is a way to highlight the terms in the MoreLikeThis results? Thanks.
Re: Term highlighting with MoreLikeThisHandler?
Yes, I understand you can't highlight a document within a document. However, with MLT you are using the interesting terms from the source document(s) to find similar results. An obvious solution would be to highlight the interesting terms that matched and thus made the result similar. Thanks, Matt Weber On Apr 29, 2009, at 9:27 PM, Walter Underwood wrote: Think about this for a moment. When you use MoreLikeThis, the query is a document. How do you highlight a document in another document? wunder On 4/29/09 9:21 PM, "Matt Weber" wrote: Any luck on this? I am experiencing the same issue. Highlighting works fine on all other request handlers, but breaks when I use the MoreLikeThisHandler. Thanks, Matt Weber On Apr 28, 2009, at 5:29 AM, Eric Sabourin wrote: Yes... at least I think so. The highlighting works correctly for me on another request handler... see below the request handler for my MoreLikeThisHandler query. Thanks for your help... Eric [requestHandler XML configuration mangled by the mail archive; the surviving values include fl=score,id,timestamp,type,textualId,subject,url,server and mlt.fl=subject,requirements,productName,justification,operation_exact, plus MLT and regex-fragmenter highlighting parameters] On Mon, Apr 27, 2009 at 11:30 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote: Eric, Have you tried using MLT with the parameters described on http://wiki.apache.org/solr/HighlightingParameters ? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Eric Sabourin To: solr-user@lucene.apache.org Sent: Monday, April 27, 2009 10:31:38 AM Subject: Term highlighting with MoreLikeThisHandler? I submit a query to the MoreLikeThisHandler to find documents similar to a specified document. This works, and I've configured my request handler to also return the interesting terms. Is it possible to have MLT return highlight snippets in the similar documents it returns? I mean, generate highlight snippets of the interesting terms? If so, how? Thanks... Eric -- Eric Sent from Halifax, NS, Canada
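The workaround suggested above (highlight the interesting terms yourself) can be sketched as a two-step client: first ask the MLT handler for its interesting terms, then re-query the similar documents with those terms as an ordinary query and highlighting enabled. `mlt.interestingTerms=list` and `mlt.fl` are real MLT parameters; the handler paths, field names, and the id-filter approach are assumptions for illustration, not the mailing-list poster's actual configuration.

```python
import urllib.parse

def build_mlt_url(doc_id, base="http://localhost:8983/solr/mlt"):
    # Step 1: ask the MoreLikeThis handler to also return the
    # interesting terms it extracted from the source document.
    params = {
        "q": "id:%s" % doc_id,
        "mlt.fl": "subject",
        "mlt.interestingTerms": "list",
        "wt": "json",
    }
    return base + "?" + urllib.parse.urlencode(params)

def build_highlight_url(terms, doc_ids, field="subject",
                        base="http://localhost:8983/solr/select"):
    # Step 2: query the similar documents with the interesting terms as
    # the query, so the standard highlighter produces snippets showing
    # why each document was considered similar.
    params = {
        "q": " ".join(terms),
        "fq": "id:(%s)" % " ".join(doc_ids),
        "hl": "true",
        "hl.fl": field,
        "wt": "json",
    }
    return base + "?" + urllib.parse.urlencode(params)
```

The second request is a plain search, so it works with the regular highlighting parameters even though the MoreLikeThisHandler itself does not highlight.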
Re: Term highlighting with MoreLikeThisHandler?
Any luck on this? I am experiencing the same issue. Highlighting works fine on all other request handlers, but breaks when I use the MoreLikeThisHandler. Thanks, Matt Weber On Apr 28, 2009, at 5:29 AM, Eric Sabourin wrote: Yes... at least I think so. The highlighting works correctly for me on another request handler... see below the request handler for my MoreLikeThisHandler query. Thanks for your help... Eric [requestHandler XML configuration mangled by the mail archive; the surviving values include fl=score,id,timestamp,type,textualId,subject,url,server and mlt.fl=subject,requirements,productName,justification,operation_exact, plus MLT and regex-fragmenter highlighting parameters] On Mon, Apr 27, 2009 at 11:30 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote: Eric, Have you tried using MLT with the parameters described on http://wiki.apache.org/solr/HighlightingParameters ? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Eric Sabourin To: solr-user@lucene.apache.org Sent: Monday, April 27, 2009 10:31:38 AM Subject: Term highlighting with MoreLikeThisHandler? I submit a query to the MoreLikeThisHandler to find documents similar to a specified document. This works, and I've configured my request handler to also return the interesting terms. Is it possible to have MLT return highlight snippets in the similar documents it returns? I mean, generate highlight snippets of the interesting terms? If so, how? Thanks... Eric -- Eric Sent from Halifax, NS, Canada