Re: mysolr python client
Done! Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2011/12/1 Marc SCHNEIDER marc.schneide...@gmail.com Hi Marco, Great! Maybe you can add it on the Solr wiki? ( http://wiki.apache.org/solr/IntegratingSolr). Regards, Marc. On Thu, Dec 1, 2011 at 10:42 AM, Jens Grivolla j+...@grivolla.net wrote: On 11/30/2011 05:40 PM, Marco Martinez wrote: For anyone interested, recently I've been using a new Solr client for Python. It's easy and pretty well documented. If you're interested its site is: http://mysolr.redtuna.org/ Do you know what advantages it has over pysolr or solrpy? On the page it only says mysolr was born to be a fast and easy-to-use client for Apache Solr’s API and because existing Python clients didn’t fulfill these conditions. Thanks, Jens
mysolr python client
Hi all, For anyone interested, I've recently been using a new Solr client for Python. It's easy to use and pretty well documented. Its site is: http://mysolr.redtuna.org/ Bye! Marco Martínez Bautista
Re: Error Instantiating QParserPlugin
It seems that the problem is the QParserPlugin2 class. Marco Martínez Bautista
2011/10/20 karan.jindal1...@rediffmail.com wrote:
Hi, while trying to create a customized query parser plugin for Solr 3.2, I got an instantiation error. As mentioned in various places, I created two classes: 1) MyQParserPlugin extends QParserPlugin 2) MyQParser extends QParser
org.apache.solr.common.SolrException: Error Instantiating QParserPlugin, MyQParserPlugin is not a org.apache.solr.search.QParserPlugin
 at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:428)
 at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:448)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1548)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1542)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1575)
 at org.apache.solr.core.SolrCore.initQParsers(SolrCore.java:1492)
 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:558)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
 at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
 at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
 at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
 at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
 at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
 at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
 at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
 at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
 at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
 at org.mortbay.jetty.Server.doStart(Server.java:224)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
 at java.lang.reflect.Method.invoke(Unknown Source)
 at org.mortbay.start.Main.invokeMain(Main.java:194)
 at org.mortbay.start.Main.start(Main.java:534)
 at org.mortbay.start.Main.start(Main.java:441)
 at org.mortbay.start.Main.main(Main.java:119)
Any idea what's going on? Thanks, Karan
Re: Solr scraping: Nutch and other alternatives.
Hi Luis, Have you tried the copyField function with custom analyzers and tokenizers? bye, Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2011/10/18 Luis Cappa Banda luisca...@gmail.com Hello everyone. I've been thinking about a way to retrieve information from a domain (for example, http://www.ign.com) to process and index. My idea is to use Solr as a searcher. I'm familiarized with Apache Nutch and I know that the latest version has a gateway to Solr to retrieve and index information with it. I tried it and it worked fine, but it's a little bit complex to develop plugins to process info and index it in a new field desired. Perhaps one of you have tried another (and better) alternative to data mine web information. Which is your recommendation? Can you give me any scraping suggestion? Thank you very much. Luis Cappa.
Re: Controlling the order of partial matches based on the position
Hi, I would use a custom function query that uses termPositions to calculate the order of the values in the field and so accomplish your requirement. Marco Martínez Bautista 2011/10/18 aronitin aro_ni...@yahoo.com wrote: Guys, it's been almost a week but there are no replies to the question that I posted. If it's a small problem and already answered somewhere, please point me to that post. Otherwise please suggest any pointer to handle the requirement mentioned in the question. Nitin -- View this message in context: http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3429823.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: PositionIncrement gap and multi-valued fields.
Hi Luis, As far as I know, the position increment gap only affects some queries, such as phrase queries that use slop. It does not affect Lucene's similarity scoring formula, the Lucene Practical Scoring Function (see http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Similarity.html):
score(q,d) = coord(q,d) · queryNorm(q) · ∑_{t in q} ( tf(t in d) · idf(t)^2 · t.getBoost() · norm(t,d) )
The first two factors are related to normalizing the query. In the summation, the first two factors are related to the frequency of the term, in the document and in the index; the third is the boost of the term in the query; and the last one encapsulates a few index-time boost and length factors. The length factor is calculated from the number of terms, and the position increment gap does not create more tokens, so this factor does not affect the score either. But if you use, for example, a multivalued field with a position increment gap of 100 and you run a phrase query with a slop smaller than 100, you prevent matches that span two separate values of the field. For example, take q=test:"A B"~99 against doc1, whose field test (position increment gap=100) holds <str>A</str> <str>B</str>: you don't get a match for this doc, but with the query q=test:"A B"~101 you will get doc1 as a match. Bye! Marco Martínez Bautista
2011/8/8 Luis Cappa Banda luisca...@gmail.com wrote: Hello! I have a question about the behaviour of searching over field types that have positionIncrementGap defined. For example, suppose that: 1. We have a field called test defined as multi-valued and whitespace tokenized. 2. The index has a single document with a test value: <str>TEST1</str> <str>AAA BBB</str> <str>CCC DDD</str> <str>EEE FFF</str> <str>TEST2</str> I read that positionIncrementGap defines "the virtual space between the last token of one field instance and the first token of the next instance" (source: http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html ). When it says "last token of one field instance", does it mean the last token of the first entry of the multi-valued content? In our example it would be TEST1. Anyway, I've been running some tests, setting positionIncrementGap to high and low values. Can anybody explain in detail what implications a higher or a lower value has for the Solr scoring algorithm? I would like to understand how this value affects matching in fields and also the calculation of the final score (maybe a bigger gap implies more spaces and a worse score when the value matches, etc.). Thank you for reading so far!
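The slop behaviour discussed in this thread can be sketched with a small model. This is only an illustration in plain Python, not Lucene code: the helper names `token_positions` and `phrase_matches` are hypothetical, tokens are split on whitespace, and the phrase check only handles two terms in order, which is simpler than Lucene's real sloppy phrase matching.

```python
def token_positions(values, position_increment_gap):
    """Assign a position to every whitespace token of a multi-valued field.

    Assumption: each token advances the position by 1 and the gap is added
    once between consecutive values. The gap adds positions, not tokens.
    """
    positions = []
    position = -1
    for index, value in enumerate(values):
        if index > 0:
            position += position_increment_gap
        for token in value.split():
            position += 1
            positions.append((token, position))
    return positions


def phrase_matches(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between them."""
    firsts = [pos for token, pos in positions if token == first]
    seconds = [pos for token, pos in positions if token == second]
    return any(0 <= s - f - 1 <= slop for f in firsts for s in seconds)


doc1 = token_positions(["A", "B"], position_increment_gap=100)
print(doc1)
print(phrase_matches(doc1, "A", "B", slop=99))
print(phrase_matches(doc1, "A", "B", slop=101))
```

Under these assumptions the field still contains only two tokens, which is consistent with the point above that the gap changes which phrase queries match without changing the length factor.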
term positions performance
Hi, I am developing a new term proximity query and I am using term positions to get the positions of each term. I would like to know if there are any tips for increasing the performance of term positions, at index time or at query time. All the fields on which I use term positions are indexed. Thanks in advance, Marco Martínez Bautista
Re: term positions performance
Also, I developed this query as a function query; I wonder whether implementing it as a normal query would increase the performance. Marco Martínez Bautista
Re: embeded solrj doesn't refresh index
You should send a commit to your embedded Solr. Marco Martínez Bautista 2011/7/20 Jianbin Dai j...@huawei.com wrote: Hi, I am using embedded solrj. After I add a new doc to the index, I can see the changes through solr web, but not from embedded solrj. But after I restart the embedded solrj, I do see the changes. It works as if there was a cache. Does anyone know the problem? Thanks. Jianbin
function queries scope
Hi, I need to apply function query operations to the score of a given query, but only on the docset that I get from that query, and I don't know if this is possible. Example: q=shops in madrid returns 1 docs with a specific score for each doc, but now I need to do something like q=sum(product(2,query(shops in madrid),productValueField) and this returns all the docs in my index. I know that I can do it via filter queries, e.g. q=sum(product(2,query(shops in madrid),productValueField)&fq=shops in madrid but this runs the query twice, and I don't want that because performance is important to our application. Is there another approach to accomplish this? Thanks in advance, Marco Martínez Bautista
Re: function queries scope
Thanks, but it's not what I'm looking for, because the BoostQParserPlugin multiplies the score of the query by the function queries defined in its b param, and I can't use edismax because we have our own qparser. It seems that I have to code another qparser. Thanks anyway, Yonik. Marco Martínez Bautista 2011/6/7 Yonik Seeley yo...@lucidimagination.com wrote: One way is to use the boost qparser: http://search-lucene.com/jd/solr/org/apache/solr/search/BoostQParserPlugin.html q={!boost b=productValueField}shops in madrid Or you can use the edismax parser, which has a boost parameter that does the same thing: defType=edismax&q=shops in madrid&boost=productValueField -Yonik http://www.lucidimagination.com
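The two requests Yonik suggests can be assembled safely with the standard library, which avoids losing the `&` separators or the local-params braces. This is a minimal sketch: the endpoint `http://localhost:8983/solr/select` and the helper names are assumptions for illustration, and nothing here contacts a server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed endpoint, adjust to your setup


def boost_qparser_url(user_query, boost_function):
    """Build a request that uses the boost qparser via local params."""
    local_params = "{!boost b=%s}" % boost_function
    return SOLR_SELECT + "?" + urlencode({"q": local_params + user_query})


def edismax_url(user_query, boost_function):
    """Build the equivalent request with the edismax boost parameter."""
    params = {"defType": "edismax", "q": user_query, "boost": boost_function}
    return SOLR_SELECT + "?" + urlencode(params)


def decode_params(url):
    """Decode the query string again, to check what Solr would receive."""
    return parse_qs(urlparse(url).query)


print(boost_qparser_url("shops in madrid", "productValueField"))
print(decode_params(edismax_url("shops in madrid", "productValueField")))
```

Both variants multiply the query score by the function value, which is why they do not cover the additive case asked about in this thread.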
Re: function query apply only in the subset of the query
No, this query returns a few more documents than when I run it through the Lucene query parser. I'm going to write another query parser that sends a simple term query and see what the output is; when I have it, I will report back to the list. Marco Martínez Bautista 2011/4/12 Yonik Seeley yo...@lucidimagination.com wrote: On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez mmarti...@paradigmatecnologico.com wrote: Thanks, but I tried this and saw that it works in a standard scenario; in my query, however, I use my own query parser and it seems that the AND is not applied, so all the docs in the index are returned: My query: _query_:{!bm25}car AND _val_:marketValue - 67000 docs returned This would seem to point to your generated query {!bm25}car matching all docs for some reason? -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
Re: function query apply only in the subset of the query
It seems that the problem is in my own query. Now I need to investigate what differs between a normal query and my implementation of the query, because when it is used alone it works properly. Thanks, Marco Martínez Bautista
function query apply only in the subset of the query
Hi everyone, My situation is the following: I need to add the value of a field to the score of the docs returned by the query, but not to all the docs. Example:
q=car returns 3 docs:
1- name=car ford marketValue=1 score=1.3
2- name=car citroen marketValue=2 score=1.3
3- name=car mercedes marketValue=0.5 score=1.3
But if I want to add the marketValue to the score, the returned list is:
q=car+_val_:marketValue
1- name=bus marketValue=5 score=5
2- name=car citroen marketValue=2 score=3.3
3- name=car ford marketValue=1 score=2.3
4- name=car mercedes marketValue=0.5 score=1.8
Is it possible to apply the function query only to the documents returned by the first query? Thanks in advance, Marco Martínez Bautista
Re: function query apply only in the subset of the query
Thanks, but I tried this and saw that it works in a standard scenario; in my query, however, I use my own query parser and it seems that the AND is not applied, so all the docs in the index are returned: My query: _query_:{!bm25}car AND _val_:marketValue - 67000 docs returned. Solr query parser: car AND _val_:marketValue - 300 docs returned. Thanks, Marco Martínez Bautista 2011/4/12 Erik Hatcher erik.hatc...@gmail.com wrote: Try using AND (or set q.op): q=car+AND+_val_:marketValue
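The difference between the original query and Erik's AND variant can be reproduced with a toy model. This is a sketch under stated assumptions, not Solr's scoring: the documents and the 1.3 text score are taken from the example in this thread, the function clause is assumed to match every document, and normalization factors are ignored. The names `DOCS`, `TERM_SCORE` and `search` are hypothetical.

```python
DOCS = [
    {"name": "bus", "marketValue": 5.0},
    {"name": "car citroen", "marketValue": 2.0},
    {"name": "car ford", "marketValue": 1.0},
    {"name": "car mercedes", "marketValue": 0.5},
]
TERM_SCORE = 1.3  # text score reported in the thread for each "car" match


def search(term, require_term):
    """Score every doc as term score (if it matches) plus marketValue.

    require_term=False models `car _val_:marketValue` (optional clauses);
    require_term=True models `car AND _val_:marketValue`.
    """
    results = []
    for doc in DOCS:
        matches_term = term in doc["name"].split()
        if require_term and not matches_term:
            continue
        score = doc["marketValue"] + (TERM_SCORE if matches_term else 0.0)
        results.append((doc["name"], round(score, 2)))
    return sorted(results, key=lambda item: item[1], reverse=True)


print(search("car", require_term=False))
print(search("car", require_term=True))
```

With optional clauses the function clause alone is enough to pull in the bus document; making the term clause mandatory restricts the function to the documents that match the text query.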
Re: Different Results..
We need more information about the analyzers and tokenizers of the default field of your search. Marco Martínez Bautista 2010/12/22 satya swaroop satya.yada...@gmail.com wrote: Hi all, I am getting different results when I use certain escape characters. For example: 1) when I use the request http://localhost:8080/solr/select?q=erlang!ericson the result obtained is <result name="response" numFound="1934" start="0"> 2) when the request is http://localhost:8080/solr/select?q=erlang/ericson the result is <result name="response" numFound="1" start="0"> My question is: does Solr treat these two queries differently, and how does it handle !, / and all the other escape characters? Regards, satya
Re: White space in facet values
Try copying the values (with copyField) to a string field. Marco Martínez Bautista 2010/12/22 Peter Karich peat...@yahoo.de wrote: you should try fq=Product:"Electric Guitar" How do I handle facet values that contain whitespace? Say I have a field Product that I want to facet on. A value for Product could be Electric Guitar. How should I handle the white space in Electric Guitar during indexing? What about when I apply the constraint fq=Product:Electric Guitar? -- http://jetwick.com open twitter search
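When the facet value lives in an untokenized string field, the filter has to quote the whole value so the space is not read as a term separator. A minimal sketch of building such a filter follows; the helper name `facet_filter` is hypothetical and the escaping only covers backslashes and double quotes inside the value.

```python
from urllib.parse import urlencode


def facet_filter(field, value):
    """Return a filter query that treats `value` as one quoted phrase."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, escaped)


fq = facet_filter("Product", "Electric Guitar")
print(fq)
print(urlencode({"fq": fq}))
```

URL-encoding the parameter as a second step keeps the quotes and the space intact when the request is sent.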
Re: Solr search speed very low
You should use the solr.WhitespaceTokenizerFactory tokenizer in your field type so that your terms get indexed. Once the data is indexed that way, you don't need the * in your queries, which makes for a heavy query for Solr. Marco Martínez Bautista 2010/8/25 Andrey Sapegin andrey.sape...@unister-gmbh.de wrote: Dear ladies and gentlemen, I'm a newbie with Solr and I didn't find an answer in the wiki, so I'm writing here. I'm analysing Solr performance and have one problem. Search time is about 7-10 seconds per query. I have a 5 GB *.csv database with about 15 fields and 1 key field (record number). I uploaded it to Solr without any problem using curl. This database contains information about books and I'm interested in keyword search on one of the fields (not the key field). I mean that if I search, for example, for the word Hello, I expect a response with sentences containing Hello: Hello all, Hello World, I say Hello to all, etc. I tested it from the console using the time command and curl: /usr/bin/time -o test_results/time_solr -a curl http://localhost:8983/solr/select/?q=itemname:*$query*&version=2.2&start=0&rows=10&indent=on; -6 21 test_results/response_solr So, my query is itemname:*$query*. itemname is the name of the field; $query is a bash variable containing only 1 word. All works fine. But unfortunately, search time is about 7-10 seconds per query. For comparison, Sphinx spent only about 0.3 seconds per query. If I use only $query, without the stars (*), I receive an answer pretty fast, but only exact matches. And I want to see any sentence containing my $query in the response. That's why I'm using stars. Now the question: is my query syntax (field:*word*) correct for keyword search? Why is the response time so long? Can I reduce search time? Thank you in advance, Kind regards, Andrey Sapegin, Software Developer, Unister GmbH, Barfußgässchen 11 | 04109 Leipzig andrey.sape...@unister-gmbh.de www.unister.de
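The reason tokenizing at index time removes the need for wildcards can be shown with a toy inverted index. This is a plain Python sketch, not how Solr stores its index: `build_index`, `term_lookup` and `wildcard_scan` are hypothetical names, tokens are split on whitespace only, and matching is case sensitive.

```python
from collections import defaultdict


def build_index(records):
    """Map every whitespace token to the ids of the records containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in text.split():
            index[token].add(record_id)
    return index


def term_lookup(index, word):
    """Exact term query: one dictionary lookup."""
    return sorted(index.get(word, set()))


def wildcard_scan(index, fragment):
    """*fragment* query: every distinct term has to be inspected."""
    hits = set()
    for token, record_ids in index.items():
        if fragment in token:
            hits |= record_ids
    return sorted(hits)


records = {1: "Hello all", 2: "Hello World", 3: "I say Hello to all", 4: "Goodbye"}
index = build_index(records)
print(term_lookup(index, "Hello"))
print(wildcard_scan(index, "Hello"))
```

Both calls find the same sentences here, but the term lookup touches one entry while the wildcard scan walks the whole term dictionary, which is the cost that grows with a 5 GB data set.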
Re: Search Results optimization
You can give stapler a higher boost to accomplish your requirement. Marco Martínez Bautista 2010/8/13 Hasnain hasn...@hotmail.com wrote: Hi all, My question is related to search results. I want to customize my query so that for the query stapler hammer I get all items containing the word stapler first and then the items containing hammer. Right now the results are mixed up; I want them sorted, i.e. all stapler results on top and hammer results at the bottom, not mixed. I haven't changed any configuration files... -- View this message in context: http://lucene.472066.n3.nabble.com/Search-Results-optimization-tp1129374p1129374.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: index pdf files
To help you, we need the definitions of your fields in your schema.xml and the query that you run when you search for a single word. Marco Martínez Bautista 2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C] xiao...@mail.nlm.nih.gov wrote: I wrote a simple Java program to import a PDF file. I get a result when I search for *:* from the admin page, but I get nothing if I search for a word. I wonder if I did something wrong or missed a setting. Here is part of the result I get for the *:* search:
<doc>
<arr name="attr_Author"><str>Hristovski D</str></arr>
<arr name="attr_Content-Type"><str>application/pdf</str></arr>
<arr name="attr_Keywords"><str>microarray analysis, literature-based discovery, semantic predications, natural language processing</str></arr>
<arr name="attr_Last-Modified"><str>Thu Aug 12 10:58:37 EDT 2010</str></arr>
<arr name="attr_content"><str>Combining Semantic Relations and DNA Microarray Data for Novel Hypotheses Generation Combining Semantic Relations and DNA Microarray Data for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej Kastrin,2...
Please help me out if anyone has experience with PDF files. I really appreciate it! Thanks so much,
custom scoring phrase queries
Hi, I want to know if it's possible to get a higher score for a phrase query when the match is on the left side of the field. For example: doc1=name:stores peter john doc2=name:peter john stores doc3=name:peter john something If you search with name=peter john, the result set I want to get is: doc2 doc3 doc1, because in doc2 and doc3 the terms peter john are on the left side of the field and so should get a higher score. Thanks in advance, Marco Martínez Bautista
Re: custom scoring phrase queries
Hi Otis, In the end I built my own function query that gives a higher score if the value is at the start of the field. But is it possible to tell Solr to use SpanFirstQuery without coding? I think I have read that it's not possible. Thanks, Marco Martínez Bautista 2010/6/18 Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Marco, I don't think there is anything in Solr to do that (is there?), but you could do it with some coding if you combined the regular query with a SpanFirstQuery with a bigger boost: http://search-lucene.com/jd/lucene/org/apache/lucene/search/spans/SpanFirstQuery.html Oh, here are some examples, and at the bottom you will see exactly what I suggested above: http://search-lucene.com/c/Lucene:/src/java/org/apache/lucene/search/spans/package.html||SpanFirstQuery Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/
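The ranking asked for in this thread can be illustrated with a small position-aware scorer. This is neither Marco's function query nor Lucene's SpanFirstQuery, only a hypothetical sketch of the idea: the score 1 / (1 + start position) is an assumed formula, and the names `phrase_start` and `rank_by_position` are made up for the example.

```python
def phrase_start(tokens, phrase):
    """Return the index where `phrase` first occurs in `tokens`, or None."""
    size = len(phrase)
    for start in range(len(tokens) - size + 1):
        if tokens[start:start + size] == phrase:
            return start
    return None


def rank_by_position(docs, phrase_text):
    """Rank docs that contain the phrase; earlier matches score higher."""
    phrase = phrase_text.split()
    scored = []
    for doc_id, text in docs.items():
        start = phrase_start(text.split(), phrase)
        if start is not None:
            scored.append((1.0 / (1 + start), doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]


docs = {
    "doc1": "stores peter john",
    "doc2": "peter john stores",
    "doc3": "peter john something",
}
print(rank_by_position(docs, "peter john"))
```

In this model doc2 and doc3 tie because both matches start at the first position, and doc1 follows because its match starts one token later.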
Re: Distributed Search doesn't response the result set
Hi Scott, We need more information about your request. Can you post the query that you are sending to the servers? Marco Martínez Bautista 2010/6/7 Scott Zhang macromars...@gmail.com wrote: Hi all, I am trying to use Solr to search over 2 Lucene indexes. I followed the Solr tutorial and tested the distributed search example. It works. Then I used my own Lucene indexes. Searching each Solr instance works and returns the expected result. But when I do a distributed search using shards, it only returns numFound=14 and the result contains nothing. I don't know why. Can anyone help? Thanks.
Re: Distributed Search doesn't response the result set
Try adding the rows parameter to your request. I guess that your solrconfig.xml sets the default rows to 0 in your default request handler.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42

2010/6/7 Scott Zhang macromars...@gmail.com

Thanks for replying. Here is part of my schema.xml. I only have 4 fields in my document.

<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="type" type="string" indexed="true" stored="true" required="true"/>
  <field name="keyword_level1" type="text" indexed="true" stored="false"/>
  <field name="keyword_level2" type="text" indexed="true" stored="false"/>
  <dynamicField name="*_i" type="int" indexed="true" stored="true"/>
  <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  <dynamicField name="*_l" type="long" indexed="true" stored="true"/>
  <dynamicField name="*_t" type="text" indexed="true" stored="true"/>
  <dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
  <dynamicField name="*_f" type="float" indexed="true" stored="true"/>
  <dynamicField name="*_d" type="double" indexed="true" stored="true"/>
  <dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
  <!-- some trie-coded dynamic fields for faster range queries -->
  <dynamicField name="*_ti" type="tint" indexed="true" stored="true"/>
  <dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/>
  <dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/>
  <dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/>
  <dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>
  <dynamicField name="*_pi" type="pint" indexed="true" stored="true"/>
  <dynamicField name="ignored_*" type="ignored" multiValued="true"/>
  <dynamicField name="attr_*" type="textgen" indexed="true" stored="true" multiValued="true"/>
  <dynamicField name="random_*" type="random"/>
</fields>
<uniqueKey>id</uniqueKey>

I am running 2 instances as the tutorial shows: one on 8983, the other on 7574.

When I search on 8983:
URL: http://localhost:8983/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
I get:

<result name="response" numFound="17" start="0">
  <doc>
    <str name="id">89</str>
    <str name="type">product</str>
  </doc>
  <doc>
    <str name="id">90</str>
    <str name="type">product</str>
  </doc>
  ..

When I search on 7574:
URL: http://localhost:7574/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
I get:

<result name="response" numFound="17" start="0">
  <doc>
    <str name="id">89</str>
    <str name="type">product</str>
  </doc>
  <doc>
    <str name="id">90</str>
    <str name="type">product</str>
  </doc>
  <doc>
    <str name="id">91</str>
    <str name="type">product</str>
  </doc>

Since both instances use copies of the same Lucene index, the results are the same.

Then I use:
URL: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=marship
I get:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">31</int>
    <lst name="params">
      <str name="indent">true</str>
      <str name="q">marship</str>
      <str name="shards">localhost:8983/solr,localhost:7574/solr</str>
    </lst>
  </lst>
  <result name="response" numFound="14" start="0"/>
</response>

Note that numFound is 14.

When I try
URL: http://localhost:8983/solr/select?shards=localhost:8983/solr/&indent=true&q=marship
numFound=7, but still nothing is returned.

URL: http://localhost:8983/solr/select?shards=localhost:7574/solr/&indent=true&q=marship
returns numFound=7 too, and the result contains nothing.

Please help. Thanks.
Regards.
Scott

On Mon, Jun 7, 2010 at 3:47 PM, Marco Martinez mmarti...@paradigmatecnologico.com wrote:

Hi Scott,
We need more information about your request. Can you post the query that you are sending to the servers?

Marco Martínez Bautista

2010/6/7 Scott Zhang macromars...@gmail.com

Hi all,
I am trying to use Solr to search over 2 Lucene indexes. I followed the Solr tutorial and tested the distributed search example. It works. Then I switched to my own Lucene indexes. Searching each Solr instance works and returns the expected result. But when I do a distributed search using shards, it only returns numFound=14 and the result contains nothing. I don't know why. Can anyone help? Thanks.
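
As an illustration of the suggestion at the top of this thread, here is a minimal Python sketch that builds a distributed /select URL with rows set explicitly, so the request does not depend on whatever default the request handler has. The helper name distributed_select_url is hypothetical; the hosts, ports and query are the ones from Scott's message. It only builds the URL and says nothing about why the shard results came back empty.

```python
from urllib.parse import urlencode


def distributed_select_url(base, shards, query, rows=10):
    """Build a /select URL that fans the query out to the given shards.

    `base` is the Solr webapp URL; each shard is written as host:port/path
    without a scheme, as in the thread above.
    """
    params = {
        "shards": ",".join(shards),
        "q": query,
        "rows": rows,      # passed explicitly instead of relying on the handler default
        "indent": "true",
    }
    # Keep ':' ',' and '/' unescaped so the shards list stays readable.
    return base.rstrip("/") + "/select?" + urlencode(params, safe=":,/")


url = distributed_select_url(
    "http://localhost:8983/solr",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "marship",
)
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client.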
Re: solr.solr.home
Hi,

When you start Tomcat you can specify system properties, something like -Dsolr.solr.home=path/to/your/solr/home. For example, on Linux:

./startup.sh -Dsolr.solr.home=path/to/your/solr/home

Marco Martínez Bautista

2010/5/27 Antonello Mangone antonello.mang...@gmail.com

But where do I have to write this command?

System.setProperty("solr.solr.home", "whateverpathyou'dliketosetonyourfilesystem");

Claudio
Re: Any realtime indexing plugin available for SOLR
Maybe this will help you: http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin

Marco Martínez Bautista

2010/5/26 bbarani bbar...@gmail.com

Hi,
Sorry if I am asking this question again in this forum. Is there any plugin that I can use to do realtime indexing? We have an application that sits on top of a SQL Server DB, and updates happen on a day-to-day basis. Users would like to see the changes made to the DB immediately in the search results. I am thinking of using a JMS queue to achieve this, but before that I want to check whether anyone has implemented a similar requirement before. Any help or suggestions would be greatly appreciated.

Thanks,
bb
Re: disable caches in real time
Hi Chris,

Thank you for your answer. My understanding has always been that when you do a commit (replication does one), a new searcher is opened and you lose performance (queries per second) while the caches are regenerated. I think I did not explain my situation correctly before: with my setup I want to avoid this loss of performance in an environment with frequent updates.

Marco Martínez Bautista

2010/5/18 Chris Hostetter hossman_luc...@fucit.org

: I want to know if there is any approach to disable caches in a specific core
: from a multicore server.

only via the config.

: I have a multicore server where the core0 will be listen to the queries and
: other core (core1) that will be replicated from a master server. Once the
: replication has been done, i will swap the cores. My point is that i want to
: disable the caches in the core that is in charge of the replication to save
: memory in the machine.

that seems bizarrely complicated -- replication can work against a live core, no need to do the swap yourself, the replicationHandler takes care of this for you transparently (ie: you have one core, replicating from a master -- the old index will be searched by users, and have caches, and when the new version of the index is ready, the replication handler will swap the *index* in that core (but the core itself never changes)) ... it can even autowarm the caches on the new index for you before the swap if you configure it that way.

-Hoss
Re: Storing RandomSortField
Hi Alexandre,

I am not totally sure about this, but the random sort field is only used to apply a random sort to your searches, and you have to pass different values to get different orderings. It only applies at search time, so no value is indexed. You will find more information here: http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html

Marco Martínez Bautista

2010/5/18 Alexandre Rocco alel...@gmail.com

Hi guys,
Is there any way to make a RandomSortField be stored? I'm trying to do it for debugging purposes. My intention is to look at the values that are stored there to determine the sorting that is being applied to the results.

I tried to make it a stored field:

<field name="randomorder" type="random" stored="true"/>

I also tried to create another text field, copying the result from the random field like this:

<field name="randomorderdebug" type="text" indexed="true" stored="true"/>
<copyField source="randomorder" dest="randomorderdebug"/>

Neither approach worked. Is there any restriction on this kind of field that prevents it from being displayed in the results?

Thanks,
Alexandre
Re: Multifaceting on multivalued field
Hi,

This exception is thrown when the field does not exist in your index, but here the real cause is an error in your query syntax: !{ex=cars}cars should be {!ex=cars}cars, with the exclamation mark inside the braces.

Marco Martínez Bautista

2010/5/18 Peter Karich peat...@yahoo.de

Hi all,
I read about multifaceting [1] and tried it for myself. With multifaceting I would like to preserve the number of documents for the 'un-facetted case'. This works nicely with normal fields, but I get an exception [2] if I apply it to a multivalued field. Is this a bug or logical :-) ? If it is the latter, would anybody help me to understand this?

Regards,
Peter.

[1] http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html
[2] org.apache.solr.common.SolrException: undefined field !{ex=cars}cars
  at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
  at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
  at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
  at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
  at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
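
To make the corrected syntax concrete, here is a small Python sketch that assembles the parameters for a tag/exclude facet request. The function name multiselect_facet_params and the selected value "audi" are made up for illustration; the cars field and the {!ex=cars}cars form come from the thread. The local-params prefix must start with {! for Solr to treat it as local params rather than as part of the field name.

```python
from urllib.parse import urlencode


def multiselect_facet_params(field, selected_value, query="*:*"):
    """Return request parameters for faceting on `field` while ignoring
    the filter that is applied on that same field."""
    return [
        ("q", query),
        # Tag the filter so the facet can exclude it by name.
        ("fq", "{!tag=%s}%s:%s" % (field, field, selected_value)),
        ("facet", "true"),
        # '{!' opens the local params; '!{' would be read as a field name.
        ("facet.field", "{!ex=%s}%s" % (field, field)),
    ]


params = multiselect_facet_params("cars", "audi")
query_string = urlencode(params)  # braces, '!' and '=' get percent-encoded
print(query_string)
```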
Re: disable caches in real time
Any suggestions? I have thought about keeping two configurations per server and reloading each core with the appropriate config file, but I would prefer another solution if possible.

Thanks,
Marco Martínez Bautista

2010/5/14 Marco Martinez mmarti...@paradigmatecnologico.com

Hi,
I want to know if there is any way to disable caches in a specific core of a multicore server. My situation is the following: I have a multicore server where core0 listens to the queries and another core (core1) is replicated from a master server. Once the replication has been done, I swap the cores. My point is that I want to disable the caches in the core that is in charge of the replication, to save memory on the machine. Any suggestions will be appreciated.

Thanks in advance,
Marco Martínez Bautista
Re: Targeting two fields with the same query or one field gathering contents from both ?
No, the equivalent would be:

- A: (the lazy fox) OR B: (the lazy fox)
- C: (the lazy fox)

Imagine that B does not contain 'the lazy fox': with AND you get 0 results, even though 'the lazy fox' is in A and C.

Marco Martínez Bautista

2010/5/17 Xavier Schepler xavier.schep...@sciences-po.fr

Hey,
let's say I have:
- a field named A with specific contents
- a field named B with specific contents
- a field named C which contains only the contents of A and B, added with copyField.

Are these queries equivalent in terms of performance?
- A: (the lazy fox) AND B: (the lazy fox)
- C: (the lazy fox)

Thanks,
Xavier
disable caches in real time
Hi,
I want to know if there is any way to disable caches in a specific core of a multicore server. My situation is the following: I have a multicore server where core0 listens to the queries and another core (core1) is replicated from a master server. Once the replication has been done, I swap the cores. My point is that I want to disable the caches in the core that is in charge of the replication, to save memory on the machine. Any suggestions will be appreciated.

Thanks in advance,
Marco Martínez Bautista
Re: Question on pf (Phrase Fields)
I don't know if this solution meets your requirements, but you can use fq for the query when it contains only foo, and q when you search for more terms.

Marco Martínez Bautista

2010/5/13 Blargy zman...@hotmail.com

Is there any way to configure this so that it only takes effect if you match more than one word? For example, if I search for: foo it should have no effect on scoring, but if I search for foo bar then it should. Is this possible?

Thanks
Re: JTeam Spatial Plugin
Hi,

You can use localsolr (http://www.gissearch.com/localsolr), which supports sharding, if you need this feature.

Marco Martínez Bautista

2010/5/11 Jean-Sebastien Vachon js.vac...@videotron.ca

Hi,
Thanks for your suggestion, but I received more information about this issue from one of JTeam's developers. He told me that my problem was caused by the plugin not supporting sharding at this time. In my case, I noticed that individual shards were computing the distance through the geo_distance field. However, the master Solr instance controlling the shards was losing this information because of the lack of support for shards. For now there is no quick workaround that I know of.

Later,

On 2010-05-11, at 2:54 PM, Michael wrote:

Try using geo_distance in the return fields.

On Thu, Apr 29, 2010 at 9:26 AM, Jean-Sebastien Vachon js.vac...@videotron.ca wrote:

Hi All,
I am using JTeam's Spatial Plugin RC3 to perform spatial searches on my index and it works great. However, I can't seem to get it to return the computed distances. My query component is run before the geoDistanceComponent and the distanceField is set to distance. Fields for lat/long are defined as well, and the different tier fields are in the results. Increasing the radius causes the number of matches to increase, so I guess that my setup is working...

Here is a sample query and its output (I removed some of the fields to keep it short):

/select?passkey=sample&q={!spatial%20lat=40.27%20long=-76.29%20radius=22%20calc=arc}title:engineer&wt=json&indent=on&fl=*,distance

{
  "responseHeader":{
    "status":0,
    "QTime":69,
    "params":{
      "fl":"*,distance",
      "indent":"on",
      "q":"{!spatial lat=40.27 long=-76.29 radius=22 calc=arc}title:engineer",
      "wt":"json"}},
  "response":{"numFound":223,"start":0,"docs":[
    {
      "title":"Electrical Engineer",
      "long":-76.3054962158203,
      "lat":40.037899017334,
      "_tier_9":-3.004,
      "_tier_10":-6.0008,
      "_tier_11":-12.0016,
      "_tier_12":-24.0031,
      "_tier_13":-47.0061,
      "_tier_14":-93.00122,
      "_tier_15":-186.00243,
      "_tier_16":-372.00485},
  }}

This output suggests to me that everything is in place. Does anyone know how to fetch the computed distance? I tried adding the field 'distance' to my list of fields but it didn't work.

Thanks
Re: multivalue fields logic required
Hi,

Second solution: do not use multivalued fields; use two single-valued fields instead. In your example that would be:

doc1: dept: student1 city: city1 principalFlag:T
doc2: dept: student2 city: city2 principalFlag:F

So, if you search without specifying a city or a dept, add principalFlag:T to the query to avoid duplicates in your response. If you do specify a city or a dept, there is no need for principalFlag, because you will only get the documents that match those fields (no duplicates).

Third solution: post-process the response to eliminate the values you don't need, that is, keep only the city and the dept that should be in the query response.

Hope this helps.

Marco Martínez Bautista

2010/5/12 Jonty Rhods jonty.rh...@gmail.com

Hi Marco,
I am trying to patch in collapse component support (no luck so far). In the meantime I would like to know more about the 2nd and 3rd options you mentioned (logic in SolrJ).

with regards
Re: multivalue fields logic required
You should add a preprocessing step before indexing the data with that logic: multiply each document into as many documents as there are values in your multivalued field, with principalFlag:T on the first one.

Marco Martínez Bautista

2010/5/12 Jonty Rhods jonty.rh...@gmail.com

Hi Marco,
Thanks for the quick reply. I have another doubt about the 2nd solution: how do I set the flag for duplicate values? I am not sure about the number of duplicate rows (it could be any number), so how can I set the flag?

thanks
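
The preprocessing step described in this reply could look roughly like the Python sketch below. It assumes the multivalued fields are parallel lists (the n-th dept belongs to the n-th city), as in Jonty's example. The function name expand_multivalued and the extra row_id field are assumptions added for illustration: if id is the schema's uniqueKey, the expanded documents would need some distinct key, otherwise they would overwrite each other.

```python
def expand_multivalued(doc, multi_fields=("dept", "city")):
    """Split one document with parallel multivalued fields into several
    single-valued documents, flagging the first one as the principal."""
    lengths = {len(doc[f]) for f in multi_fields}
    if len(lengths) != 1:
        raise ValueError("multivalued fields must have the same number of values")

    # Fields that are simply repeated on every expanded document.
    shared = {k: v for k, v in doc.items() if k not in multi_fields}

    expanded = []
    for i in range(lengths.pop()):
        row = dict(shared)
        for f in multi_fields:
            row[f] = doc[f][i]
        row["row_id"] = "%s-%d" % (doc["id"], i)       # hypothetical unique key
        row["principalFlag"] = "T" if i == 0 else "F"  # only the first copy is flagged
        expanded.append(row)
    return expanded


docs = expand_multivalued({
    "id": "1",
    "name": "name of emp",
    "dept": ["student1", "student2", "student3"],
    "city": ["city1", "city2", "city3"],
})
for d in docs:
    print(d)
```

Because the flag is derived from the position in the list, the number of values per document does not need to be known in advance.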
Re: multivalue fields logic required
Hi Jonty,

I think you have three possible solutions:

1. Use the collapse component on your name field so that you don't get duplicate documents.
2. Create some simple logic in your index with flags, such as one flag that marks the first element of the same document (in your example you will have three different documents and the first one will have this flag=true). If the search only includes name, you will have to set this flag to true; otherwise the dept or the city will be specified and you will get one document back.
3. Post-process your data.

There may be more solutions, but these are the ones I have thought of right now.

Regards,
Marco Martínez Bautista

2010/5/6 Jonty Rhods jonty.rh...@gmail.com

thanks

: General solution is to index 3 different SolrDocument in your example. id and name fields will repeat themselves. All fields will be single-valued.

If I index 3 different documents, then a search by name + dept will return duplicate values. Is there any better way?

thanks

On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan iori...@yahoo.com wrote:

I recently started to work with Solr, so I am still very new to it. Sorry if I am logically wrong. I have two tables, parent and referenced (child). For that I set up multivalued fields. My schema details are:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="name" type="text" indexed="true" stored="true"/>
<field name="dept" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="city" type="text" indexed="true" stored="true" multiValued="true"/>

Indexed data details:

<doc>
  <arr name="dept">
    <str>student1</str>
    <str>student2</str>
    <str>student3</str>
  </arr>
  <arr name="city">
    <str>city1</str>
    <str>city2</str>
    <str>city3</str>
  </arr>
  <str name="id">1</str>
  <arr name="name">
    <str>name of emp</str>
  </arr>
</doc>

Now my question is: when the user searches by city2, I want to return employee2 and their id (for the multivalued field). Something like:

<doc>
  <arr name="dept">
    <str>student2</str>
  </arr>
  <arr name="city">
    <str>city2</str>
  </arr>
  <str name="id">1</str>
  <arr name="name">
    <str>name of emp</str>
  </arr>
</doc>

I had a similar need before. AFAIK you cannot do it with multivalued fields. The indexing order is preserved in a multivalued field. Maybe you can post-process the returned fields, capture the position of the matched city value, and use that index to display the correct dept value. But this is only easy if you are using a string or integer type for city and dept. The general solution is to index 3 different SolrDocuments in your example. The id and name fields will repeat themselves. All fields will be single-valued.
Re: hi to everyone
You should specify the core in your request, like http://localhost:8080/solr/core0/update?... where /solr/ is your webapp and 'core0' is the name of the core.

Marco Martínez Bautista

2010/5/6 Antonello Mangone antonello.mang...@gmail.com

Hi to everyone,
my name is Antonello Mangone and I'm a new user of Solr (this is my 4th day :D). I'm just a novice and I would like to ask a question: I'm using Solr in multicore mode, but I don't understand how to add XML documents to a particular core. Can someone help me?

Antonello
Re: hi to everyone
See this page http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl and the Solr tutorial http://lucene.apache.org/solr/tutorial.html (maybe you can use the post.jar).

Marco Martínez Bautista

2010/5/6 Antonello Mangone antonello.mang...@gmail.com

Ok, you're right :D Let me explain my situation. I have Solr locally on my machine in /home/antonello/solrtest. Inside the solrtest folder I have:

|_ build
|_ build.xml
|_ CHANGES.txt
|_ client
|_ common-build.xml
|_ contrib
|_ dist
|_ docs
|_ etc
|_ lib
|_ LICENSE.txt
|_ logs
|_ multicore
   |_ bandb
      |_ conf
         |_ schema.xml
         |_ solrconfig.xml
      |_ data
         |_ index
            |_ segments_1
            |_ segments.gen
   |_ solr.xml
|_ NOTICE.txt
|_ README.txt
|_ src
|_ start.jar
|_ start_multicore.sh
|_ webapps

I also have XML files in another place and I would like to add them to the bandb core. Is there a command to add an XML file to a particular core, given that we can have any number of cores?
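
For the question in this thread, a minimal Python sketch of posting an XML update message to one specific core. The helper name core_update_request, the base URL and port, and the sample document are assumptions for illustration; only the /solr/<core>/update path shape and the bandb core name come from the messages above. The request is only built here; sending it is left commented out because it needs a running server.

```python
from urllib.request import Request


def core_update_request(base_url, core, xml_bytes):
    """Build a POST request that sends an XML update message to one core."""
    url = "%s/%s/update" % (base_url.rstrip("/"), core)
    return Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical document; real field names depend on the core's schema.xml.
payload = b'<add><doc><field name="id">1</field></doc></add>'
req = core_update_request("http://localhost:8983/solr", "bandb", payload)
print(req.get_method(), req.full_url)

# To actually send it (and then a <commit/> message the same way):
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.status)
```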
Re: synonym filter problem for string or phrase
Hi Ranveer,

I don't see any stemming analyzer in the configuration of your 'text_sync' field. Also, you have <filter class="solr.TrimFilterFactory"/> at query time but not at index time; maybe that is your problem.

Regards,
Marco Martínez Bautista

2010/4/30 Jonty Rhods jonty.rh...@gmail.com

On 4/29/10 8:50 PM, Marco Martinez wrote:

Hi Ranveer,
If you don't specify a field in the q parameter, the search runs against the default search field defined in the solrconfig.xml. Is your default field a text_sync field?

Regards,
Marco Martínez Bautista

2010/4/29 Ranveer ranveer.s...@gmail.com

Hi,
I am trying to configure the synonym filter. My requirement is: when the user searches by a phrase like what is solr user? then it should be replaced with solr user. Something like:

what is solr user? => solr user

My schema for this particular field is:

<fieldType name="text_sync" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" tokenizerFactory="KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

It seems to work fine when I try it in analysis.jsp, but not by URL:
http://localhost:8080/solr/core0/select?q=what is solr user?
or
http://localhost:8080/solr/core0/select?q=what is solr user?

Please guide me to achieve the desired result.

Hi Marco,
thanks. Yes, my default search field is text_sync. I am getting results now, but not what I expect. The following is my synonym.txt:

what is bone cancer => bone cancer
what is bone cancer? => bone cancer
what is of bone cancer => bone cancer
what is symptom of bone cancer => bone cancer
what is symptoms of bone cancer => bone cancer

With the above I get results for all synonyms except the last one, what is symptoms of bone cancer => bone cancer. I think I am not getting the expected result because of stemming. However, when I check it in analysis.jsp, it gives the expected result. I am confused. I also want to know the best approach to configure synonyms for my requirement.

thanks
with regards

Hi,
I am also facing the same type of problem. I am a newbie, please help.

thanks
Jonty
Re: synonym filter problem for string or phrase
Hi Ranveer,

If you don't specify a field in the q parameter, the search runs against the default search field defined in the solrconfig.xml. Is your default field a text_sync field?

Regards,
Marco Martínez Bautista

2010/4/29 Ranveer ranveer.s...@gmail.com

Hi,
I am trying to configure the synonym filter. My requirement is: when the user searches by a phrase like what is solr user? then it should be replaced with solr user. Something like:

what is solr user? => solr user

My schema for this particular field is:

<fieldType name="text_sync" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" tokenizerFactory="KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

It seems to work fine when I try it in analysis.jsp, but not by URL:
http://localhost:8080/solr/core0/select?q=what is solr user?
or
http://localhost:8080/solr/core0/select?q=what is solr user?

Please guide me to achieve the desired result.
Re: Facet count problem
Hi Ranveer,

The wrong facet counts are caused by the tokenized field that you are using. If you want to facet on the whole string, use a fieldType that doesn't split the field into tokens, such as the string field.

Regards,
Marco Martínez Bautista http://www.paradigmatecnologico.com

2010/4/19 Ranveer Kumar ranveer.s...@gmail.com

Hi Erick,

My schema configuration is as follows.

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
        <!--<tokenizer class="solr.WhitespaceTokenizerFactory"/>-->
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <!-- Case insensitive stop word removal.
             add enablePositionIncrements=true in both the index and query
             analyzers to leave a 'gap' for more accurate phrase queries.
        -->
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.HTMLStripCharFilterFactory"/><!-- escapedTags="&lt;,&gt;"/ -->
        <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
        <!--<tokenizer class="solr.WhitespaceTokenizerFactory"/>-->
        <!--<tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>-->
        <!-- <filter class="solr.LengthFilterFactory" min="2" max="50"/> -->
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
    </fieldType>

    <field name="type" type="text" indexed="true" stored="true"/>
    <!-- copy field for default search-->
    <copyField source="type" dest="text"/>

On Mon, Apr 19, 2010 at 6:22 AM, Erick Erickson erickerick...@gmail.com wrote:

Can we see the actual field definitions from your schema file? Ahmet's question is vital and is best answered if you copy/paste the relevant configuration entries. But based on what you *have* posted, I'd guess you're trying to facet on tokenized fields, which is not recommended.

You might take a look at http://wiki.apache.org/solr/UsingMailingLists; it'll help you frame your questions in a manner that gets you your answers as fast as possible.

Best
Erick

On Sun, Apr 18, 2010 at 12:59 PM, Ranveer Kumar ranveer.s...@gmail.com wrote:

I am using text for type, which is static. For example: type is a field, and I am using it for categorization. For the news type I am using news, and for the blog type I am using blog. type is a text field.

On Apr 17, 2010 8:38 PM, Ahmet Arslan iori...@yahoo.com wrote:

I am facing a problem getting the facet result count. I must be wrong somewhere. I am getting proper ...

Are you faceting on a tokenized field? What is the fieldType of your field?
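
To illustrate why faceting on a tokenized field gives unexpected counts, here is a minimal Python sketch. It does not call Solr; it only imitates the effect with a simplified whitespace-and-lowercase analysis, which is an assumption standing in for the much longer analyzer chain in the schema above. The sample values and function names are hypothetical.

```python
from collections import Counter

# Hypothetical values of the `type` field, one per document (not from the thread).
docs = ["news", "blog", "press release", "press release", "news"]


def facet_counts_string(values):
    """Count whole field values, as faceting on an untokenized string field would."""
    return Counter(values)


def facet_counts_tokenized(values):
    """Count indexed terms after a simplified whitespace + lowercase analysis.

    Each distinct token is counted once per document. This only approximates
    what a real Solr analyzer chain produces.
    """
    counts = Counter()
    for value in values:
        counts.update(set(value.lower().split()))
    return counts


string_facets = facet_counts_string(docs)
token_facets = facet_counts_tokenized(docs)

print("string field facets:   ", dict(string_facets))
print("tokenized field facets:", dict(token_facets))
```

With the string-style counting, "press release" appears as one facet value; with the tokenized counting it is split into separate "press" and "release" buckets, which is the kind of mismatch described in the replies above.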
Re: Replication process on Master/Slave slowing down slave read/search performance
Hi Marcin,

This happens because, when the replication runs, the index changes and all the caches are rebuilt, so search performance decreases.

You can change your architecture to a multicore one to reduce the impact of the replication. Use two cores, one to receive the replication and the other to serve searches. When the replication is done, swap the cores, so that the core serving searches always has up-to-date caches.

Regards
Marco Martínez Bautista http://www.paradigmatecnologico.com

2010/4/9 Marcin mar...@feedsmanagement.com

Hi guys,

I have noticed that the master/slave replication process slows down read/search performance on the slave while replication is running.

Please help.

Cheers
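
The swap step suggested in the reply can be triggered through Solr's CoreAdmin handler with `action=SWAP`. The following Python sketch only builds the request URL. The host, port, and the core names `search` and `replica` are assumptions for illustration, and the `admin/cores` path depends on the `adminPath` configured in `solr.xml`.

```python
from urllib.parse import urlencode


def core_swap_url(solr_base, live_core, staging_core):
    """Build a CoreAdmin SWAP request URL for two cores.

    solr_base, live_core and staging_core are placeholders; adjust them
    to your own installation.
    """
    params = urlencode({"action": "SWAP", "core": live_core, "other": staging_core})
    return f"{solr_base.rstrip('/')}/admin/cores?{params}"


# Hypothetical setup: "search" serves queries, "replica" receives replication.
url = core_swap_url("http://localhost:8983/solr", "search", "replica")
print(url)
```

Once replication into the second core has finished, sending a GET request to this URL (for example with `urllib.request.urlopen(url)`) would exchange the two cores.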
Re: Solr query parser doesn't invoke analyzer for simple term query?
Hello,

You can see what happens with this search (which analyzers are used for this field and what their output is) on the analysis page of the default Solr web interface. I assume you are using the same analyzers and tokenizers for indexing and searching on this field in your schema.

Regards,
Marco Martínez Bautista

2010/3/17 Teruhiko Kurosaka k...@basistech.com

It seems that Solr's query parser doesn't pass a single-term query to the Analyzer for the field. For example, if I give it 2001年 (year 2001 in Japanese), the searcher returns 0 hits, but if I quote it with double quotes, it returns hits.

In this experiment, I configured schema.xml so that the field in question uses the morphological Analyzer my company makes, which is capable of splitting 2001年 into two tokens, 2001 and 年. I am guessing that this Analyzer is called ONLY IF the term is a phrase. Is my observation correct? If so, is there any configuration parameter that I can tweak to force any query for the text fields to be processed by the Analyzer?

One might ask why users won't put a space between 2001 and 年. Well, if they are clearly two separate words, people do that. But 年 works more like a suffix in this case, and in many Japanese speakers' minds 2001年 seems like one token, so many people won't. (Remember Japanese don't use spaces in normal writing.) Forcing the use of the Analyzer would also be useful for compound word handling, which is often desirable for languages like German.

Teruhiko "Kuro" Kurosaka
RLP + Lucene & Solr = powerful search for global contents
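
Besides the analysis page, the same check can be scripted against Solr's field analysis request handler. The Python sketch below only builds the request URL. It assumes the handler is registered at `/analysis/field` in `solrconfig.xml`, and the host and the field name `text` are placeholders, since the thread does not name the field.

```python
from urllib.parse import urlencode


def field_analysis_url(solr_base, field_name, index_text, query_text):
    """Build a request for the field analysis handler.

    Assumes the handler is mapped to /analysis/field. The response lists
    the tokens produced by each stage of the index and query analyzers.
    """
    params = urlencode({
        "analysis.fieldname": field_name,    # field whose analyzers are used
        "analysis.fieldvalue": index_text,   # text run through the index analyzer
        "analysis.query": query_text,        # text run through the query analyzer
        "wt": "json",
    })
    return f"{solr_base.rstrip('/')}/analysis/field?{params}"


# Hypothetical host and field name; 2001年 is the term from the question.
url = field_analysis_url("http://localhost:8983/solr", "text", "2001年", "2001年")
print(url)
```

Comparing the index-side and query-side token lists in the response shows whether the analyzer really splits 2001年 into two tokens on both sides.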