Re: how to do a Parent/Child Mapping using entities
Thanks Sascha for your post; I find it interesting, but in my case I don't want to use an additional field. I want to be able, with the same schema, to run a simple query like q=res_url:some_url as well as a query like the other one. In other words: is there any way to link two or more multivalued fields in the same document to each other? E.g., in this result:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="keyword">Key1</str>
    <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
    <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
  </doc>
</result>

I would like Solr to understand that, for this document, the value url1 of the res_url field is linked to the value 1 of the res_rank field, and that all of them are linked to the common field keyword. I think I should use a custom field analyzer or something like that, but I don't know what to do. Thanks for everything; any help would be appreciated.

Sascha Szott wrote:

Hi, you could create an additional index field res_ranked_url that contains the concatenated value of a url and its corresponding rank, e.g., res_rank + " " + res_url. Then q=res_ranked_url:"1 url1" retrieves all documents with url1 as the first url. A drawback of this workaround is that you have to use a phrase query, thus preventing wildcard searches for urls.

-Sascha

Hello everybody, I would like to know how to create an index supporting a parent/child mapping and then query the child to get the results. In other words, imagine that we have a database containing two tables: Keyword[id(int), value(string)] and Result[id(int), res_url(text), res_text(text), res_date(date), res_rank(int)]. For indexing I used the DataImportHandler to import the data; it works well, and my query response looks good for q=*:* (imagine that we have only these two keywords and their results):

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">*:*</str></lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
    <doc>
      <str name="id">2</str>
      <str name="keyword">Key2</str>
      <arr name="res_url"><str>url1</str><str>url5</str><str>url8</str><str>url7</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But the problem is when I type a query like q=res_url:url2 AND res_rank:1, meaning that I want to find the keywords for which the url url2 is ranked in the first position. I get a result like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">res_url:url2 AND res_rank:1</str></lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But this is not right, because the url in the 1st position of the results for keyword Key1 is url1, not url2. So what I want to ask is: is there any way to link the values of the multivalued fields together? In our case we can see that the previous result says that:
- url1 is present in the 1st position of Key1's results
- url2 is present in the 2nd position of Key1's results
- url3 is present in the 3rd position of Key1's results
- url4 is present in the 4th position of Key1's results
and I would like Solr to take this into account when executing queries. Any help please, and thanks for all :)
--
View this message in context: http://old.nabble.com/how-to-do-a-Parent-Child-Mapping-using-entities-tp26956426p26965478.html
Sent from the Solr - User mailing list archive at Nabble.com.
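Since Solr preserves the order of values within a multivalued field, one client-side workaround is to zip the parallel res_url/res_rank arrays by index after retrieval. A minimal sketch (class and method names are mine, not from the thread); note this only fixes post-retrieval mapping and does not make the combined filter query position-aware on the server:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParallelFieldZip {
    // Pair parallel multivalued fields (res_url, res_rank) by array index.
    // Solr returns multivalued fields in indexing order, so position i in
    // res_url corresponds to position i in res_rank.
    public static Map<String, String> zip(List<String> urls, List<String> ranks) {
        if (urls.size() != ranks.size()) {
            throw new IllegalArgumentException("parallel fields differ in length");
        }
        Map<String, String> rankByUrl = new LinkedHashMap<>();
        for (int i = 0; i < urls.size(); i++) {
            rankByUrl.put(urls.get(i), ranks.get(i));
        }
        return rankByUrl;
    }

    public static void main(String[] args) {
        Map<String, String> m = zip(List.of("url1", "url2", "url3", "url4"),
                                    List.of("1", "2", "3", "4"));
        System.out.println(m.get("url2")); // prints "2": url2 is ranked 2nd for Key1
    }
}
```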
Re: Using the new tokenizer API from a jar file
Thanks all for your interest, especially Uwe. I asked this question on solr-user at the beginning but got no reply; that's why I re-asked it at java-user. Thanks for your efforts, I will try it now.

On Mon, Dec 28, 2009 at 12:02 PM, Uwe Schindler u...@thetaphi.de wrote:

I opened https://issues.apache.org/jira/browse/LUCENE-2182 about this problem and already have a fix. This is really a bug. The solution is simple: you have to load the Impl class using the same classloader as the passed-in interface. The default for Class.forName is the classloader of AttributeSource.class, which is the wrong one.

- Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Monday, December 28, 2009 9:20 AM
To: java-u...@lucene.apache.org
Cc: solr-user@lucene.apache.org
Subject: RE: Using the new tokenizer API from a jar file

The question on this list was ok, as it shows a minor problem of using the new TokenStream API with Solr. His plugin was loaded correctly: if Lucene says that it cannot find the *Impl class, it was able to load the interface class before, so the JAR file is visible to the JVM. The problem is the following and has to do with classloaders:

1. We have different class loaders for different places in Solr. For plugins, Solr uses a SolrResourceLoader that searches for JAR files in the local lib folder before handing over to the webapp's classloader.
2. Initially, the Lucene JAR is loaded by the webapp's class loader.
3. If an AttributeImpl is placed into a JAR file, e.g. in the plugin folder of Solr (the lib folder where Solr loads all resources, stop words, ...), the loading mechanism inside AttributeSource.DEFAULT_ATTRIBUTE_FACTORY is unable to locate the class file, because Class.forName() always uses the class' own classloader and not the global/thread context one.
So AttributeSource will only find the class file if it is in the *same* directory as the lucene-core.jar file (WEB-INF/lib) and so accessible by the webapp's class loader. A good introduction to the problem is this one: http://www.theserverside.com/tt/articles/content/dm_classForname/DynLoad.pdf

The problem is described there for the JVM extensions folder, but it also applies to Solr, because Solr has another classloader for plugins. A fix in Lucene would be to use the thread's context class loader in AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, but I strongly discourage this, as it would break the whole AttributeSource functionality if you add two different attributes with the same class names from different class loaders to the AttributeSource. The only solution to the problem is placing the JAR file inside the WEB-INF/lib folder where lucene-core.jar is. Plugins in Solr cannot define their own attribute implementations. Alternatively, he could try to force-preload the class by calling Class.forName on the Impl class in his plugin initialization code. But I am not sure if this works (as Java handles classes from different classloaders differently).

Uwe

- Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Monday, December 28, 2009 4:27 AM
To: java-u...@lucene.apache.org
Subject: Re: Using the new tokenizer API from a jar file

: I tried to use it with solr and the problems began. It's always telling me
: that it cannot find the class GlossAttributeImpl. I think the problem is
: that my jar file is added to the class path at run time not from the command
: line. Do you have a good solution or workaround?
You're likely to get more helpful answers from other people in the Solr User community (solr-u...@lucene.a.o). As long as you put your jar in the lib directory under your solr home (or reference it using a <lib/> directive in your solrconfig.xml), Solr's plugin loader will take care of the classloading for you. If you are confident you have your jar in the correct place, please email solr-user with the ClassNotFound stack trace from your solr logs, as well as the hierarchy of files from your solr home (ie: the output of find .)

-Hoss

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
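The distinction Uwe describes, Class.forName() resolving against the calling class's loader rather than the thread context loader, can be poked at with the standard library alone. A small sketch (the MyFactory name in the comment is hypothetical):

```java
public class LoaderDemo {
    public static void main(String[] args) throws Exception {
        // Class.forName(name) resolves against the classloader of the *calling*
        // class (here LoaderDemo), not the thread context classloader. In Solr,
        // AttributeSource's factory calls Class.forName from lucene-core, so it
        // only sees classes visible to the loader that loaded lucene-core.
        ClassLoader caller = LoaderDemo.class.getClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        System.out.println("caller loader:  " + caller);
        System.out.println("context loader: " + context);

        // The three-argument form lets you pick the loader explicitly; the
        // "force preload" workaround mentioned above would use the plugin's own
        // loader, e.g. Class.forName(implName, true, MyFactory.class.getClassLoader()).
        Class<?> c = Class.forName("java.util.ArrayList", true, context);
        System.out.println(c.getName()); // prints "java.util.ArrayList"
    }
}
```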
Re: Checkin mistake - example does not work in trunk
ant example is how the solr.war gets generated for the example. It's not checked in. On Dec 29, 2009, at 10:22 PM, Lance Norskog wrote: The distributed binaries do not include the new spatial types, so the .../trunk/example/ store app does not start. Please either always check in the latest binaries (a pain), or edit the README.txt to include now first do an 'ant clean dist'. (And maybe not include the binaries?) http://svn.apache.org/repos/asf/lucene/solr/trunk/example/README.txt -- Lance Norskog goks...@gmail.com -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: Checkin mistake - example does not work in trunk
On Tue, Dec 29, 2009 at 10:22 PM, Lance Norskog goks...@gmail.com wrote: The distributed binaries do not include the new spatial types, so the .../trunk/example/ store app does not start. ? What distributed binaries are you referring to? The nightly builds? Are they missing a jar? -Yonik http://www.lucidimagination.com
Re: performance question
On Dec 29, 2009, at 2:19 PM, A. Steven Anderson wrote: Greetings! Is there any significant negative performance impact of using a dynamicField? There can be an impact if you are searching against a lot of fields or if you are indexing a lot of fields on every document, but for the most part in most applications it is negligible. Likewise for multivalued fields? No. Multivalued fields are just concatenated together with a large position gap underneath the hood. The reason why I ask is that our system basically aggregates data from many disparate data sources (structured, unstructured, and semi-structured), and the management of the schema.xml has become unwieldy; i.e. we currently have dozens of fields which grows every time we add a new data source. I was considering redefining the domain model outside of Solr which would be used to generate the fields for the indexing process and the metadata (e.g. display names) for the search process. Thoughts? It probably can't hurt to be more streamlined, but without knowing more about your model, it's hard to say. I've built apps that were totally dynamic field based and they worked just fine, but these were more for discovery than just pure search. In other words, the user was interacting with the system in a reflective model that selected which fields to search on. -Grant -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
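If you do go the generated-model route, one common way to tame a growing schema.xml is suffix-based dynamic fields, so new data sources need no schema edits at all. A hypothetical sketch (field names and types here are assumptions, not from the thread):

```xml
<!-- Hypothetical sketch: replace dozens of hand-declared fields with
     suffix-based dynamic fields; the generator maps each source attribute
     to a suffixed name (e.g. author_s, body_t, rank_i). -->
<dynamicField name="*_s"  type="string" indexed="true" stored="true"/>
<dynamicField name="*_t"  type="text"   indexed="true" stored="true"/>
<dynamicField name="*_i"  type="int"    indexed="true" stored="true"/>
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>
```

Display names and other per-field metadata would then live in the external model rather than the schema.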
Re: performance question
There can be an impact if you are searching against a lot of fields or if you are indexing a lot of fields on every document, but for the most part in most applications it is negligible. We index a lot of fields at one time, but we can tolerate the performance impact at index time. It probably can't hurt to be more streamlined, but without knowing more about your model, it's hard to say. I've built apps that were totally dynamic field based and they worked just fine, but these were more for discovery than just pure search. In other words, the user was interacting with the system in a reflective model that selected which fields to search on. Our application is as much about discovery as search, so this is good to know. Thanks for the feedback. It was very helpful. -- A. Steven Anderson Independent Consultant st...@asanderson.com
StreamingUpdateSolrServer
Hi All,

I'm testing StreamingUpdateSolrServer for indexing, but I don't see the final "finished: org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@..." line in my logs. Do I have to call a special function to wait until the update is effective?

Another question (maybe easy for you): I'm running Solr on Tomcat 5.0.28, and sometimes, not at the time of an rsync, big traffic, or a commit, it stops responding and the load (uptime) is very high.

Thank you for your help.

Patrick.
RE: Delete, commit, optimize doesn't reduce index file size
Is there another way to make this happen without making further changes to the index? Maybe a bounce of the servlet server?

On Tue, Dec 29, 2009 at 1:23 PM, markwaddle m...@markwaddle.com wrote:

I have an index that used to have ~38M docs at 17.2GB. I deleted all but 13K docs using a delete by query, commit and then optimize. A *:* query now returns 13K docs. The problem is that the files on disk are still 17.1GB in size. I expected the optimize to shrink the files. Is there a way I can shrink them now that the index only has 13K docs?

Are you on Windows? The IndexWriter can't delete files in use by the current IndexReader (like it can in UNIX) when the commit is done. If you make further changes to the index and do a commit, you should see the space go down.

-Yonik
http://www.lucidimagination.com
Automating implementation of SolrInfoMBean
Hi all, Is there a standard way to automatically update the values returned by the methods in SolrInfoMBean? Particularly those concerning revision control etc. I'm assuming folks don't just update that by hand every commit... Thanks! Mat
Build index by consuming web service
I am in need of a handler which consumes a web service and builds an index from the results returned by the service. Until now I was building the index by reading data directly from a database query using the DataImportHandler. There are new functional requirements to index calculated fields and allow search on them. I have exposed an application API as a web service which returns all attributes for indexing. How can I ask Solr to consume this service and index the attributes it returns? Any pointers would be appreciated.

Thanks,
--
View this message in context: http://old.nabble.com/Build-index-by-consuming-web-service-tp26970642p26970642.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to do a Parent/Child Mapping using entities
Ya, structured data gets a little funny. For starters, the order of multi-valued fields should be maintained, so if you have:

<doc>
  <field name="url">http://aaa</field>
  <field name="url_rank">5</field>
  <field name="url">http://bbb</field>
  <field name="url_rank">4</field>
</doc>

the response will return the values in order, so you can map them with array indices. I have played some tricks with a JSON field analyzer that gives you some more control. For example, if you index:

<doc>
  <field name="url">{ "url":"http://host/", "rank":5 }</field>
</doc>

then I use an analyzer that indexes the terms:

  url:http://host/
  rank:5

I just posted SOLR-1690, if you want to take a look at that approach.

ryan

On Dec 30, 2009, at 4:25 AM, magui wrote:

Thanks Sascha for your post; I find it interesting, but in my case I don't want to use an additional field. I want to be able, with the same schema, to run a simple query like q=res_url:some_url as well as a query like the other one. In other words: is there any way to link two or more multivalued fields in the same document to each other? E.g., in this result:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="keyword">Key1</str>
    <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
    <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
  </doc>
</result>

I would like Solr to understand that, for this document, the value url1 of the res_url field is linked to the value 1 of the res_rank field, and that all of them are linked to the common field keyword. I think I should use a custom field analyzer or something like that, but I don't know what to do. Thanks for everything; any help would be appreciated.

Sascha Szott wrote:

Hi, you could create an additional index field res_ranked_url that contains the concatenated value of a url and its corresponding rank, e.g., res_rank + " " + res_url. Then q=res_ranked_url:"1 url1" retrieves all documents with url1 as the first url. A drawback of this workaround is that you have to use a phrase query, thus preventing wildcard searches for urls.

-Sascha

Hello everybody, I would like to know how to create an index supporting a parent/child mapping and then query the child to get the results. In other words, imagine that we have a database containing two tables: Keyword[id(int), value(string)] and Result[id(int), res_url(text), res_text(text), res_date(date), res_rank(int)]. For indexing I used the DataImportHandler to import the data; it works well, and my query response looks good for q=*:* (imagine that we have only these two keywords and their results):

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">*:*</str></lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
    <doc>
      <str name="id">2</str>
      <str name="keyword">Key2</str>
      <arr name="res_url"><str>url1</str><str>url5</str><str>url8</str><str>url7</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But the problem is when I type a query like q=res_url:url2 AND res_rank:1, meaning that I want to find the keywords for which the url url2 is ranked in the first position. I get a result like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">res_url:url2 AND res_rank:1</str></lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But this is not right, because the url in the 1st position of the results for keyword Key1 is url1, not url2. So what I want to ask is: is there any way to link the values of the multivalued fields together? In our case we can see that the previous result says that:
- url1 is present in the 1st position of Key1's results
- url2 is present in the 2nd position of Key1's results
- url3 is present in the 3rd position of Key1's results
- url4 is present in the 4th position of Key1's results
and I would like Solr to take this into account when executing queries. Any help please, and thanks for all :)
--
View this message in context: http://old.nabble.com/how-to-do-a-Parent-Child-Mapping-using-entities-tp26956426p26965478.html
Sent from the Solr - User mailing list archive at Nabble.com.
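The key:value term scheme Ryan describes can be illustrated with a toy transformation. This is only a sketch of the term layout for a flat one-level JSON value; the real approach (see SOLR-1690) would emit these inside a TokenStream using a proper JSON parser:

```java
import java.util.ArrayList;
import java.util.List;

public class JsonTermSketch {
    // Illustrative only: turn a flat JSON object like
    //   { "url": "http://host/", "rank": 5 }
    // into key:value terms ("url:http://host/", "rank:5"), mimicking the
    // term scheme described above. Naive string splitting; values must not
    // themselves contain commas.
    public static List<String> terms(String json) {
        List<String> out = new ArrayList<>();
        String body = json.trim().replaceAll("^\\{|\\}$", "");
        for (String pair : body.split(",")) {
            String[] kv = pair.split(":", 2);          // limit 2: keep ':' in values
            String key = kv[0].replaceAll("[\"\\s]", "");
            String value = kv[1].trim().replaceAll("^\"|\"$", "");
            out.add(key + ":" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(terms("{ \"url\": \"http://host/\", \"rank\": 5 }"));
        // prints [url:http://host/, rank:5]
    }
}
```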
Re: weird sorting behavior
Hi, so this is only available in 1.5? I tried in 1.4 and got:

org.apache.solr.common.SolrException: Error loading class 'solr.CollationKeyFilterFactory'

Is there a way to do this in 1.4? The link Shalin sent is a 1.5 link, I think.

thanks
Joel

On Dec 25, 2009, at 10:52 PM, Robert Muir wrote:

Hello, as Shalin said, you might want to try CollationKeyFilterFactory. Below is an example (using the multilingual root locale) where the spaces will sort after the letters and numbers as you mentioned, but it will still not be case-sensitive. This is because the strength is 'secondary'. But are you really sure you want the spaces sorted after the letters and numbers? Or do you instead just want them ignored for sorting? If that is the case, then try 'primary', so that spaces, punctuation, accents, and things like that are ignored in the sort in addition to case: for example, "Test-1234" and "test1234" sort the same with primary, but not with secondary (the one with leading spaces will sort last). If all else fails, you can write custom rules for it too, as Shalin mentioned.

<fieldType name="collatedROOT" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language="" strength="secondary"/>
  </analyzer>
</fieldType>

On Fri, Dec 25, 2009 at 5:37 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Thu, Dec 24, 2009 at 11:51 PM, Joel Nylund jnyl...@yahoo.com wrote:

update, I tried changing to datatype string, and it sorts the numerics better, but the other sorts are not as good. Is there a way to control sorting for special chars? For example, I want blanks to sort after letters and numbers.

In the general case, CollationKeyFilterFactory will do the trick. You could create a custom rule set which sorts spaces after letters and numbers. See http://wiki.apache.org/solr/UnicodeCollation

using alphaOnlySort - sorts nicely for alpha, but numbers don't work
string - sorts nicely for numbers and letters, but special chars like blanks show up first in the list

alphaOnlySort has a PatternReplaceFilterFactory which removes all characters except a-z. This is the reason behind those weird results. You could try removing that filter and see if that's what you need.

--
Regards,
Shalin Shekhar Mangar.

--
Robert Muir
rcm...@gmail.com
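The strength levels Robert describes map onto java.text.Collator, which can be used to sanity-check what 'primary' vs 'secondary' will ignore before wiring up the field type (a sketch; the Lucene filter is built on the same Collator API):

```java
import java.text.Collator;
import java.util.Locale;

public class CollationStrengthDemo {
    // Compare two strings at a given collation strength, mirroring what the
    // strength attribute of CollationKeyFilterFactory controls.
    static int cmp(int strength, String a, String b) {
        Collator c = Collator.getInstance(Locale.ROOT);
        c.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        c.setStrength(strength);
        return c.compare(a, b);
    }

    public static void main(String[] args) {
        // primary: accents AND case are both ignored
        System.out.println(cmp(Collator.PRIMARY, "khó", "kho"));     // prints 0
        // secondary: accents become significant, case is still ignored
        System.out.println(cmp(Collator.SECONDARY, "khó", "kho") != 0); // prints true
        System.out.println(cmp(Collator.SECONDARY, "Test", "test"));    // prints 0
    }
}
```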
Result ordering for Wildcard/Prefix queries or ConstantScoreQueries
All documents matched for Wildcard and Prefix queries get the same score as they are scored as a ConstantScoreQuery. Example query - title:abc* In such cases, what determines the ordering of the results? Is it simply the same order in which those document terms appeared when enumerating through the terms of the field matched in the index? Also, would it be possible to specify criteria determining the ordering of such matches? I am assuming that should be possible but have little idea how that could be done. Kindly provide guidance/help. Regards, Prasanna.
score = result of function query
how can i make the score be solely the output of a function query? the function query wiki page details something like

q=boxname:findbox+_val_:product(product(x,y),z)&fl=*,score

but that doesn't seem to work

--joe
Requesting feedback on solr-spatial plugin
Hi all, I've been working on a small Solr plugin to expose the basic functionality of lucene-spatial as unobtrusively as possible. I've got a basic implementation up and passing tests, and I was hoping to get some feedback on it. Though I've coded against Lucene for a production app in the past, this is my first time writing code for Solr's plugin API, so I could easily be entirely on the wrong track. Honest (even brutal!) feedback would be very much appreciated: http://github.com/outoftime/solr-spatial Thanks much, Mat P.S. I definitely don't want to step on anyone's toes with the name - if solr-spatial is already in use, or reserved for a future official contrib for Solr, let me know and I'll come up with something else!
Re: how to do a Parent/Child Mapping using entities
Hi,

Thanks Sascha for your post; I find it interesting, but in my case I don't want to use an additional field. I want to be able, with the same schema, to run a simple query like q=res_url:some_url as well as a query like the other one.

You could easily write your own query parser (QParserPlugin, in Solr's terminology) that internally translates queries like

  q = res_url:<url> AND res_rank:<rank>

into

  q = res_ranked_url:"<rank> <url>"

thus hiding the res_ranked_url field from the user/client. I'm not sure, but maybe it's possible to utilize the order of values within the multi-valued field res_url directly in the newly created parser. This seems like the cleanest solution to me.

-Sascha

In other words: is there any way to link two or more multivalued fields in the same document to each other? E.g., in this result:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="keyword">Key1</str>
    <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
    <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
  </doc>
</result>

I would like Solr to understand that, for this document, the value url1 of the res_url field is linked to the value 1 of the res_rank field, and that all of them are linked to the common field keyword. I think I should use a custom field analyzer or something like that, but I don't know what to do. Thanks for everything; any help would be appreciated.

Sascha Szott wrote:

Hi, you could create an additional index field res_ranked_url that contains the concatenated value of a url and its corresponding rank, e.g., res_rank + " " + res_url. Then q=res_ranked_url:"1 url1" retrieves all documents with url1 as the first url. A drawback of this workaround is that you have to use a phrase query, thus preventing wildcard searches for urls.

-Sascha

Hello everybody, I would like to know how to create an index supporting a parent/child mapping and then query the child to get the results.

In other words, imagine that we have a database containing two tables: Keyword[id(int), value(string)] and Result[id(int), res_url(text), res_text(text), res_date(date), res_rank(int)]. For indexing I used the DataImportHandler to import the data; it works well, and my query response looks good for q=*:* (imagine that we have only these two keywords and their results):

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">*:*</str></lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
    <doc>
      <str name="id">2</str>
      <str name="keyword">Key2</str>
      <arr name="res_url"><str>url1</str><str>url5</str><str>url8</str><str>url7</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But the problem is when I type a query like q=res_url:url2 AND res_rank:1, meaning that I want to find the keywords for which the url url2 is ranked in the first position. I get a result like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params"><str name="q">res_url:url2 AND res_rank:1</str></lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url"><str>url1</str><str>url2</str><str>url3</str><str>url4</str></arr>
      <arr name="res_rank"><str>1</str><str>2</str><str>3</str><str>4</str></arr>
    </doc>
  </result>
</response>

But this is not right, because the url in the 1st position of the results for keyword Key1 is url1, not url2.

So what I want to ask is: is there any way to link the values of the multivalued fields together? In our case we can see that the previous result says that:
- url1 is present in the 1st position of Key1's results
- url2 is present in the 2nd position of Key1's results
- url3 is present in the 3rd position of Key1's results
- url4 is present in the 4th position of Key1's results
and I would like Solr to take this into account when executing queries. Any help please, and thanks for all :)
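The translation Sascha's proposed QParserPlugin would perform is essentially a query-string rewrite. A minimal sketch of just that step (the regex and class name are mine; the actual plugin wiring against Solr's API is omitted):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RankedUrlRewrite {
    // Sketch of the rewriting a custom QParserPlugin could do internally:
    // turn  res_url:<url> AND res_rank:<rank>  into the phrase query
    // res_ranked_url:"<rank> <url>" against the concatenated field,
    // hiding res_ranked_url from the client.
    private static final Pattern PAIR = Pattern.compile(
        "res_url:(\\S+)\\s+AND\\s+res_rank:(\\S+)");

    public static String rewrite(String q) {
        Matcher m = PAIR.matcher(q);
        if (m.matches()) {
            return "res_ranked_url:\"" + m.group(2) + " " + m.group(1) + "\"";
        }
        return q; // leave other queries untouched
    }

    public static void main(String[] args) {
        System.out.println(rewrite("res_url:url2 AND res_rank:1"));
        // prints res_ranked_url:"1 url2"
    }
}
```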
Re: score = result of function query
On Dec 30, 2009, at 5:27 PM, Joe Calderon wrote:

how can i make the score be solely the output of a function query? the function query wiki page details something like q=boxname:findbox+_val_:product(product(x,y),z)&fl=*,score

Wrap the non-function query part in parentheses and boost it by 0. In Solr 1.5, you will be able to sort by function query.

-Grant
Re: Result ordering for Wildcard/Prefix queries or ConstantScoreQueries
On Dec 30, 2009, at 3:21 PM, Prasanna R wrote:

All documents matched for Wildcard and Prefix queries get the same score as they are scored as a ConstantScoreQuery. Example query - title:abc* In such cases, what determines the ordering of the results? Is it simply the same order in which those document terms appeared when enumerating through the terms of the field matched in the index?

I'm assuming they are just in order of internal Lucene doc id, but I'd have to look to be sure. There were also some changes to Lucene that allowed the collectors to take docs out of order, but again, I'd have to check to see if that is the case.

Also, would it be possible to specify criteria determining the ordering of such matches? I am assuming that should be possible but have little idea how that could be done. Kindly provide guidance/help.

Sort? What problem are you trying to solve?

-Grant
Re: Requesting feedback on solr-spatial plugin
Hi Mat, Taking a quick look at your code via the gitHub browser (and not having downloaded or run it, that's for later! :) ), it looks _very_ clean, and well commented. Bravo! If you get a chance and are interested in participating in the SOLR spatial effort, there are a few issues you could take a look at, in particular, based on what you have so far, I would take a look at SOLR-1568, having to do with creating a QParserPlugin for spatial: http://issues.apache.org/jira/browse/SOLR-1568 SOLR-773 tracks the general progress of all of the spatial work, here: http://issues.apache.org/jira/browse/SOLR-773 There is also a wiki page for the community efforts: http://wiki.apache.org/solr/SpatialSearch If you're not familiar with it yet, there has been a ton of work on Local SOLR and LocalLucene as well. You may want to check out those pages too, located here: http://www.gissearch.com/localsolr Again, bravo on such a clean, easy to understand plugin! I'll try and test out your code and provide some feedback if I get a chance soon. Also I welcome and encourage your contribution/discussion on the SOLR mailing lists and wiki area. Cheers, Chris On 12/30/09 3:51 PM, Mat Brown m...@patch.com wrote: Hi all, I've been working on a small Solr plugin to expose the basic functionality of lucene-spatial as unobtrusively as possible. I've got a basic implementation up and passing tests, and I was hoping to get some feedback on it. Though I've coded against Lucene for a production app in the past, this is my first time writing code for Solr's plugin API, so I could easily be entirely on the wrong track. Honest (even brutal!) feedback would be very much appreciated: http://github.com/outoftime/solr-spatial Thanks much, Mat P.S. I definitely don't want to step on anyone's toes with the name - if solr-spatial is already in use, or reserved for a future official contrib for Solr, let me know and I'll come up with something else! ++ Chris Mattmann, Ph.D. 
Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.mattm...@jpl.nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Checkin mistake - example does not work in trunk
Rats! I did not rebuild after updating, so the new schema.xml tripped over my old example solr.war. Never mind. On 12/30/09, Yonik Seeley yo...@lucidimagination.com wrote: On Tue, Dec 29, 2009 at 10:22 PM, Lance Norskog goks...@gmail.com wrote: The distributed binaries do not include the new spatial types, so the .../trunk/example/ store app does not start. ? What distributed binaries are you referring to? The nightly builds? Are they missing a jar? -Yonik http://www.lucidimagination.com -- Lance Norskog goks...@gmail.com
serialize SolrInputDocument to java.io.File and back again?
I want to store a SolrInputDocument to the filesystem until it can be sent to the solr server via the solrj client. I will be using a quartz job to periodically query a table that contains a listing of SolrInputDocuments stored as java.io.File that need to be processed. Thanks for your time.
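A sketch of that round trip using plain Java serialization: SolrInputDocument is declared Serializable in SolrJ (worth verifying for your version), so the same ObjectOutputStream approach applies. A HashMap stands in here only to keep the example self-contained, without the solrj jar:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

public class DocSpool {
    // Spool a document to disk until the quartz job can send it to Solr.
    static void write(File f, HashMap<String, Object> doc) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(doc);
        }
    }

    @SuppressWarnings("unchecked")
    static HashMap<String, Object> read(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (HashMap<String, Object>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("doc", ".ser");
        HashMap<String, Object> doc = new HashMap<>();
        doc.put("id", "1");
        doc.put("title", "hello");
        write(f, doc);
        System.out.println(read(f).get("title")); // prints "hello"
        f.delete();
    }
}
```

One caveat with Java serialization across SolrJ upgrades: a changed serialVersionUID will make old spooled files unreadable, so an explicit format (XML/JSON) may be safer for long-lived queues.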
Re: Requesting feedback on solr-spatial plugin
Hi Grant, Thanks for the info and your point is well taken. I should have been clearer that I have no intention of this project being a long-term solution for spatial search in Solr - rather I was looking to build a rough and ready solution that gives some basic spatial search capabilities to tide us over until the real deal is available in Solr 1.5. That being said, I'd love to be of use in the official spatial efforts, so I'll be sure to take a look at the related tickets and see if there is anywhere I can help out. Mat On Wed, Dec 30, 2009 at 19:36, Grant Ingersoll gsi...@gmail.com wrote: Hi Mat, This is an area of active work in Solr right now (see SOLR-773 in JIRA for the top level tracking issue). Obviously you can do as you wish, but it would be really great if you chipped in on making the capabilities in Solr better (we've already added in the Lucene spatial jar, a bunch of distance functions, sort by functions and a few spatial field types) instead of doing something separate. In other words, spatial support is going to be baked into Solr 1.5, riding on the tail of a a whole slew of features that make Solr even more capable. See http://wiki.apache.org/solr/SpatialSearch for more details. Cheers, Grant On Dec 30, 2009, at 6:51 PM, Mat Brown wrote: Hi all, I've been working on a small Solr plugin to expose the basic functionality of lucene-spatial as unobtrusively as possible. I've got a basic implementation up and passing tests, and I was hoping to get some feedback on it. Though I've coded against Lucene for a production app in the past, this is my first time writing code for Solr's plugin API, so I could easily be entirely on the wrong track. Honest (even brutal!) feedback would be very much appreciated: http://github.com/outoftime/solr-spatial Thanks much, Mat P.S. 
I definitely don't want to step on anyone's toes with the name - if solr-spatial is already in use, or reserved for a future official contrib for Solr, let me know and I'll come up with something else!
Correct syntax for solrJ filter queries
I'm using SolrJ to construct a query and it works just fine until I add the following:

query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");

That generates this error:

Caused by: org.apache.solr.common.SolrException: Bad Request
Bad Request
request: http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1

What is the proper syntax for specifying a set of facet.queries?
RE: Search both diacritics and non-diacritics
I have followed it, but when I query with a diacritic it responds only with the non-diacritic form. What I want is to query without diacritics and have Solr return both the diacritic and non-diacritic matches :( Steven A Rowe wrote: Hi Olala, You can get something similar to what you want by copying the original field to another one where, as Hoss suggests, you apply ASCIIFoldingFilterFactory, and then rewrite queries to match against both fields, with higher boost given to the original field. @Hoss: Olala would benefit from a feature that AFAICT Solr doesn't currently have: the ability to add synonyms based on arbitrary transforms. Steve On 12/28/2009 at 5:33 AM, Olala wrote: I tried but it is still not correct :( hossman wrote: I am developing a search engine with Solr, and now I want to search both with and without diacritics. For example: if I query kho, it should return kho, khó, khò, ...; but if I query khó, it returns only khó. Does anyone have a solution? I have used <filter class="solr.ISOLatin1AccentFilterFactory"/> but it is not correct :( try ASCIIFoldingFilterFactory instead. -Hoss -- View this message in context: http://old.nabble.com/Search-both-diacritics-and-non-diacritics-tp26897627p26975115.html Sent from the Solr - User mailing list archive at Nabble.com.
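Conceptually, what ASCIIFoldingFilterFactory does for Latin input is strip the accent marks so that "kho", "khó", and "khò" all index and query as the same term. The stdlib sketch below approximates that idea with java.text.Normalizer (it is not the Lucene implementation, which handles many more character mappings): decompose the string, then drop the combining marks.

```java
import java.text.Normalizer;

public class Main {
    // Decompose accented characters (NFD) and strip the combining marks,
    // approximating ASCIIFoldingFilterFactory's behavior for Latin input.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("khó"));
        System.out.println(fold("khò"));
    }
}
```

The key point for Olala's problem is that the folding filter must appear in both the index-time and query-time analyzer chains: then a query for "khó" is folded to "kho" and matches documents containing any of the variants.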
Re: Correct syntax for solrJ filter queries
Use query.addFacetQuery(str) instead. Erik On Dec 30, 2009, at 10:16 PM, Jay Fisher wrote: I'm using SolrJ to construct a query and it works just fine until I add the following:

query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");

That generates this error:

Caused by: org.apache.solr.common.SolrException: Bad Request
Bad Request
request: http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1

What is the proper syntax for specifying a set of facet.queries?
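A side note on the "+" signs in the failing query: SolrJ URL-encodes parameter values itself, so query strings should be passed with literal spaces (e.g. "price:[* TO 500]"), not pre-encoded "+". The "+" is just the URL encoding of a space, as this quick stdlib check illustrates (a sketch of the encoding only, not of the SolrJ internals):

```java
import java.net.URLEncoder;

public class Main {
    public static void main(String[] args) throws Exception {
        // Pass the raw query to SolrJ; it produces the escaped form itself.
        String raw = "price:[* TO 500]";
        System.out.println(URLEncoder.encode(raw, "UTF-8"));
    }
}
```

So the working call, per Erik's reply, would look like query.addFacetQuery("price:[* TO 500]") with an unescaped space.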
Re: absolute search
Can anyone help me??? plz! Olala wrote: uhm, I am sorry, this is the debug :)

<lst name="debug">
  <str name="rawquerystring">book</str>
  <str name="querystring">book</str>
  <str name="parsedquery">+DisjunctionMaxQuery((name:book)~0.01) ()</str>
  <str name="parsedquery_toString">+(name:book)~0.01 ()</str>
  <lst name="explain">
    <str name="19534">
      7.903358 = (MATCH) sum of:
        7.903358 = (MATCH) fieldWeight(name:book in 19533), product of:
          1.0 = tf(termFreq(name:book)=1)
          7.903358 = idf(docFreq=79, maxDocs=79649)
          1.0 = fieldNorm(field=name, doc=19533)
    </str>
    <str name="5925">
      3.951679 = (MATCH) sum of:
        3.951679 = (MATCH) fieldWeight(name:book in 5924), product of:
          1.0 = tf(termFreq(name:book)=1)
          7.903358 = idf(docFreq=79, maxDocs=79649)
          0.5 = fieldNorm(field=name, doc=5924)
    </str>
    <str name="5933">
      3.951679 = (MATCH) sum of:
        3.951679 = (MATCH) fieldWeight(name:book in 5932), product of:
          1.0 = tf(termFreq(name:book)=1)
          7.903358 = idf(docFreq=79, maxDocs=79649)
          0.5 = fieldNorm(field=name, doc=5932)
    </str>
    <str name="8049">
      3.951679 = (MATCH) sum of:
        3.951679 = (MATCH) fieldWeight(name:book in 8048), product of:
          1.0 = tf(termFreq(name:book)=1)
          7.903358 = idf(docFreq=79, maxDocs=79649)
          0.5 = fieldNorm(field=name, doc=8048)
    </str>
    <str name="9358">
      3.951679 = (MATCH) sum of:
        3.951679 = (MATCH) fieldWeight(name:book in 9357), product of:
          1.0 = tf(termFreq(name:book)=1)
          7.903358 = idf(docFreq=79, maxDocs=79649)
          0.5 = fieldNorm(field=name, doc=9357)
    </str>
  </lst>
  <str name="QParser">DisMaxQParser</str>
  <null name="altquerystring"/>
  <null name="boostfuncs"/>
  <arr name="filter_queries"><str/></arr>
  <arr name="parsed_filter_queries"/>
  <lst name="timing">
    <double name="time">0.0</double>
    <lst name="prepare">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
    </lst>
    <lst name="process">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
    </lst>
  </lst>
</lst>

Erick Erickson wrote: Hmmm, nothing jumps out at me. What does Luke show you is actually in your index in the field in question? And what does adding debugQuery=on to the query show?
On Thu, Dec 24, 2009 at 8:44 PM, Olala hthie...@gmail.com wrote: Oh, yes, that is my schema config:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<field name="name" type="text" indexed="true" stored="true" multiValued="true"/>

And my solrconfig.xml for search with dismax:

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float
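For "absolute" (exact) matching alongside an analyzed field like the one above, one common approach - a sketch only, not the poster's actual config - is to copy the field into a sibling that keeps the whole value as a single token, then boost that sibling in the dismax qf. The names "name_exact" and "text_exact" here are made up for illustration.

```xml
<!-- Sketch: a minimally analyzed sibling field for exact matching.
     KeywordTokenizerFactory keeps the full field value as one token. -->
<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="name_exact" type="text_exact" indexed="true" stored="false" multiValued="true"/>
<copyField source="name" dest="name_exact"/>
```

In the dismax handler's defaults, qf could then be something like "name name_exact^10", so documents whose value matches the query exactly rank above stemmed or partial matches.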
Re: Result ordering for Wildcard/Prefix queries or ConstantScoreQueries
On Wed, Dec 30, 2009 at 5:04 PM, Grant Ingersoll gsi...@gmail.com wrote: On Dec 30, 2009, at 3:21 PM, Prasanna R wrote: All documents matched for Wildcard and Prefix queries get the same score as they are scored as a ConstantScoreQuery. Example query - title:abc* In such cases, what determines the ordering of the results? Is it simply the same order in which those document terms appeared when enumerating through the terms of the field matched in the index? I'm assuming they are just in order of internal Lucene doc id, but I'd have to look for sure. There were also some changes to Lucene that allowed the collectors to take docs out of order, but again, I'd have to check to see if that is the case. Also, would it be possible to specify criteria determining the ordering of such matches? I am assuming that should be possible but have little idea how that could be done. Kindly provide guidance/help. Sort? What problem are you trying to solve? I am using a prefix query to match a bunch of documents and would like to specify an ordering for the documents matched for that prefix query. This is part of the work I am doing in implementing an autocomplete feature and I am using the dismax query parser with some custom modifications. I assume you mean that I can apply a sort ordering to the prefix query matches as part of the results handler. I was not aware of that. Will look into that. Thanks a lot for the help. Regards, Prasanna.
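Since every match of a constant-score prefix query gets the same score, an explicit sort parameter is one way to make the ordering deterministic. A minimal sketch, reusing the field name from the example query above:

```
q=title:abc*&sort=title asc&rows=10
```

In 1.4-era SolrJ the equivalent would be set on the SolrQuery object, e.g. query.addSortField("title", SolrQuery.ORDER.asc). For autocomplete, sorting lexically on the matched field itself usually gives the expected suggestion order.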
Re: Correct syntax for solrJ filter queries
Thanks! That did it. ~ Jay On Wed, Dec 30, 2009 at 9:58 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Use query.addFacetQuery(str) instead. Erik On Dec 30, 2009, at 10:16 PM, Jay Fisher wrote: I'm using SolrJ to construct a query and it works just fine until I add the following:

query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");

That generates this error:

Caused by: org.apache.solr.common.SolrException: Bad Request
Bad Request
request: http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1

What is the proper syntax for specifying a set of facet.queries?
numFound is changing when querying across distributed search with the same query
Hi all. I have found a problem with distributed search. When I use ?q=keyword&start=0&rows=20 to query across the distributed setup, it returns numFound=181; then when I change the start param from 0 to 100, it returns numFound=131. Why does the same query return different numFound values? -- View this message in context: http://old.nabble.com/numFound-is-changing-when-query-across-distributed-seach-with-the-same-query.-tp26976128p26976128.html Sent from the Solr - User mailing list archive at Nabble.com.