Re: Combining several fields for facets.
I have many values in each field, so I can't use facet queries (I don't know all the values). -- View this message in context: http://lucene.472066.n3.nabble.com/Combining-several-fields-for-facets-tp4160679p4161539.html Sent from the Solr - User mailing list archive at Nabble.com.
multiple terms order in query - eDismax
Hi,

We have an index with 3 documents. Apart from the id, each document contains a single field, let's call it 'text', as below:

* Doc1
  o text:home garden sky sea wolf
* Doc2
  o text:home wolf sea garden sky
* Doc3
  o text:wolf sea home garden sky

When executing the query: home garden apple, using the eDismax params:

* pf=text
* ps=1
* mm=2

we would like to get Doc1 and Doc3, in other words all the documents having at least 2 terms in close proximity (only 1 term off). The problem is that we get all 3 documents; it looks like the 'ps' parameter doesn't count.

Why is Doc2 included in the results? We expected Solr to omit it, since the slop it needs is larger than 1: we have home wolf sea garden (ps=2?).

Tomer Levi
Software Engineer, Big Data Group, Product Technology Unit
(T) +972 (9) 775-2693
tomer.l...@nice.com
www.nice.com
Re: demo app explaining solr features
And you can also check out the tutorials in any of the Solr books, including my Solr Deep Dive e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Mikhail Khludnev Sent: Sunday, September 28, 2014 1:35 AM To: solr-user Subject: Re: demo app explaining solr features On Sat, Sep 27, 2014 at 12:26 PM, Anurag Sharma anura...@gmail.com wrote: I am wondering if there is any demo app that can demonstrate all the features/capabilities of solr. My intention is to understand, use and play around all the features supported by solr. https://lucene.apache.org/solr/4_10_0/tutorial.html -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: multiple terms order in query - eDismax
pf and ps merely control boosting of documents, not selection of documents. mm controls selection of documents. So, hopefully at least doc3 is returned before doc2. -- Jack Krupansky From: Tomer Levi Sent: Sunday, September 28, 2014 5:39 AM To: solr-user@lucene.apache.org Subject: multiple terms order in query - eDismax Hi, We have an index with 3 documents, each document contains a single field let’s call it ‘text’ (except the id) as below: · Doc1 o text:home garden sky sea wolf · Doc2 o text:home wolf sea garden sky · Doc3 o text:wolf sea home garden sky When executing the query: home garden apple, Using eDismax params: · pf=text · ps=1 · mm=2 We would like to get Doc1 and Doc3, in other words all the documents having at least 2 terms in close proximity (only 1 term off). The problem is that we get all 3 documents, it looks like the ‘ps’ parameter doesn’t count. Why Doc2 included in the results? We expected that Solr will emit it since the ‘ps’ is larger than 1 = we have home wolf sea garden (ps=2?) Tomer Levi Software Engineer Big Data Group Product Technology Unit (T) +972 (9) 775-2693 tomer.l...@nice.com www.nice.com
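To make the selection-versus-boosting distinction concrete, here is a minimal Python sketch. It is not Solr code: `select_by_mm` is a hypothetical helper that only imitates the mm rule by counting how many query terms a document contains. The request parameters at the end reuse the values from the question; `defType` and `qf=text` are assumptions, since the original message does not show them. Proximity (pf/ps) never enters the selection step, which is why all three documents come back.

```python
from urllib.parse import urlencode

# Documents from the original question.
DOCS = {
    "Doc1": "home garden sky sea wolf",
    "Doc2": "home wolf sea garden sky",
    "Doc3": "wolf sea home garden sky",
}


def select_by_mm(docs, query, mm):
    """Hypothetical stand-in for mm: keep documents that contain
    at least `mm` distinct query terms. Term positions are ignored."""
    terms = set(query.split())
    return [name for name, text in docs.items()
            if len(terms & set(text.split())) >= mm]


# Every document contains both "home" and "garden", so mm=2 selects all three.
selected = select_by_mm(DOCS, "home garden apple", mm=2)
print(selected)

# Query string for the request described in the thread.
# defType and qf are assumptions; pf, ps and mm come from the question.
params = urlencode({
    "defType": "edismax",
    "q": "home garden apple",
    "qf": "text",
    "pf": "text",
    "ps": 1,
    "mm": 2,
})
print(params)
```

If proximity has to be a hard requirement rather than a boost, one option that may be worth testing is a sloppy phrase in the main query itself, for example "home garden"~1, since a phrase clause in q takes part in selection.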
Re: demo app explaining solr features
There is NOTHING that will explain all features of Solr. Solr is too deep. It starts from Hello World and gets into PhD level computational linguistics as well as specialist-level distributed systems. However, as mentioned in the other emails, there are resources, both online and paid that can get you from that Hello World to the point where you can use reference resources and Solr source code to chart your own path further. That was the goal with my book certainly and it would still do it, though it does not cover more recently added basics such as dynamic schemas. On the other hand, I just shadowed somebody going through it in intensive 3 days and then being ready to troubleshoot some hairy perl client issues. :-) Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 28 September 2014 06:17, Jack Krupansky j...@basetechnology.com wrote: And you can also check out the tutorials in any of the Solr books, including my Solr Deep Dive e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Mikhail Khludnev Sent: Sunday, September 28, 2014 1:35 AM To: solr-user Subject: Re: demo app explaining solr features On Sat, Sep 27, 2014 at 12:26 PM, Anurag Sharma anura...@gmail.com wrote: I am wondering if there is any demo app that can demonstrate all the features/capabilities of solr. My intention is to understand, use and play around all the features supported by solr. https://lucene.apache.org/solr/4_10_0/tutorial.html -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: demo app explaining solr features
Hi Anurag,

For the demo, you can post the XML files from the exampledocs folder to Solr, I think, and then use the browse request handler at http://localhost:8983/solr/browse. I am not too sure about the URL, but this should give you an idea of searching, faceting, geospatial search, etc. I hope this helps.

On Sep 28, 2014 4:39 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: There is NOTHING that will explain all features of Solr. Solr is too deep. It starts from Hello World and gets into PhD level computational linguistics as well as specialist-level distributed systems. However, as mentioned in the other emails, there are resources, both online and paid that can get you from that Hello World to the point where you can use reference resources and Solr source code to chart your own path further. That was the goal with my book certainly and it would still do it, though it does not cover more recently added basics such as dynamic schemas. On the other hand, I just shadowed somebody going through it in intensive 3 days and then being ready to troubleshoot some hairy perl client issues. :-) Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 28 September 2014 06:17, Jack Krupansky j...@basetechnology.com wrote: And you can also check out the tutorials in any of the Solr books, including my Solr Deep Dive e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky

-Original Message- From: Mikhail Khludnev Sent: Sunday, September 28, 2014 1:35 AM To: solr-user Subject: Re: demo app explaining solr features

On Sat, Sep 27, 2014 at 12:26 PM, Anurag Sharma anura...@gmail.com wrote: I am wondering if there is any demo app that can demonstrate all the features/capabilities of solr. My intention is to understand, use and play around all the features supported by solr.

https://lucene.apache.org/solr/4_10_0/tutorial.html -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
boosting words from specific list
Dear all, Hi, I was wondering how can I implement solr boosting words from specific list of important words? I mean I want to have a list of important words and tell solr to score documents based on the weighted sum of these words. For example let word school has weight of 2 and word president has the weight of 5. In this case a doc with 2 school words and 3 president words will has the total score of 19! I want to sort documents based on this score. How such procedure is possible in solr? Thank you very much. Best regards. -- A.Nazemian
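The arithmetic in the question can be checked with a short Python sketch. `weighted_score` is a hypothetical helper, not a Solr API; it just computes the weighted sum of word counts. `termfreq_sort` builds one possible sort expression from Solr's `termfreq`, `product` and `sum` function queries; the field name `text` is an assumption, and `termfreq` counts indexed terms, so the words would have to match what the field's analyzer actually produces.

```python
from collections import Counter

# Weights from the question: "school" counts 2, "president" counts 5.
WEIGHTS = {"school": 2, "president": 5}


def weighted_score(text, weights):
    """Weighted sum of the occurrences of each important word."""
    counts = Counter(text.lower().split())
    return sum(weight * counts[word] for word, weight in weights.items())


def termfreq_sort(field, weights):
    """Build a sort expression from Solr function queries.
    The field name is supplied by the caller and is hypothetical here."""
    parts = [f"product(termfreq({field},'{word}'),{weight})"
             for word, weight in weights.items()]
    return f"sum({','.join(parts)}) desc"


# 2 x "school" and 3 x "president": 2*2 + 3*5
doc = "school president school president president"
score = weighted_score(doc, WEIGHTS)
print(score)

sort_param = termfreq_sort("text", WEIGHTS)
print(sort_param)
```

The resulting string could be passed as the sort parameter of a request; whether it performs acceptably on a large index would need to be measured.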
Re: How does KeywordRepeatFilterFactory help giving a higher score to an original term vs a stemmed term
Hi, How about coord factor? Does it kick in when stemmed and original tokens both match? On Friday, September 26, 2014 1:32 AM, Diego Fernandez difer...@redhat.com wrote: The difference comes in the fact that when you query the same form it matches 2 tokens including the less common one. When you query a different form you only match on the more common form. So really you're getting the boost from both the tiny difference in TF*IDF and the extra token that you match on. However, I agree that adding a payload might be a better solution. - Original Message - Hi - but this makes no sense, they are scored as equals, except for tiny differences in TF and IDF. What you would need is something like a stemmer that preserves the original token and gives a 1 payload to the stemmed token. The same goes for filters like decompounders and accent folders that change meaning of words. -Original message- From:Diego Fernandez difer...@redhat.com Sent: Wednesday 17th September 2014 23:37 To: solr-user@lucene.apache.org Subject: Re: How does KeywordRepeatFilterFactory help giving a higher score to an original term vs a stemmed term I'm not 100% on this, but I imagine this is what happens: (using - to mean tokenized to) Suppose that you index: I am running home - am run running home If you then query running home - run running home and thus give a higher score than if you query runs home - run runs home - Original Message - The Solr wiki says A repeated question is how can I have the original term contribute more to the score than the stemmed version? In Solr 4.3, the KeywordRepeatFilterFactory has been added to assist this functionality. https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming (Full section reproduced below.) I can see how in the example from the wiki reproduced below that both the stemmed and original term get indexed, but I don't see how the original term gets more weight than the stemmed term. 
Wouldn't this require a filter that gives terms with the keyword attribute more weight? What am I missing?

Tom

A repeated question is "how can I have the original term contribute more to the score than the stemmed version?" In Solr 4.3, the KeywordRepeatFilterFactory has been added to assist this functionality. This filter emits two tokens for each input token, one of them is marked with the Keyword attribute. Stemmers that respect keyword attributes will pass through the token so marked without change. So the effect of this filter would be to index both the original word and the stemmed version. The 4 stemmers listed above all respect the keyword attribute. For terms that are not changed by stemming, this will result in duplicate, identical tokens in the document. This can be alleviated by adding the RemoveDuplicatesTokenFilterFactory.

<fieldType name="text_keyword" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

-- Diego Fernandez - 爱国 Software Engineer GSS - Diagnostics IRC: aiguofer on #gss and #customer-platform
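The token-overlap argument made earlier in this thread can be illustrated with a small Python simulation of the analysis chain from the wiki excerpt. This is only a sketch: `toy_stem` is a hypothetical stand-in for the Porter stemmer that handles just the two word forms used in the thread, and `analyze` imitates the whitespace tokenizer, keyword repeat, stemming and duplicate removal steps without using any Solr or Lucene code.

```python
def toy_stem(token):
    """Toy stand-in for a real stemmer; handles only '-ing' and plural '-s'."""
    if token.endswith("ing") and len(token) > 5:
        stem = token[:-3]
        # Undo a doubled final consonant: "runn" -> "run".
        if len(stem) > 2 and stem[-1] == stem[-2]:
            stem = stem[:-1]
        return stem
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Emit the original token and its stem, dropping the stem when identical."""
    out = []
    for token in text.split():            # whitespace tokenizer
        pair = [token, toy_stem(token)]   # keyword copy + stemmed copy
        if pair[0] == pair[1]:
            pair = pair[:1]               # remove the duplicate token
        out.extend(pair)
    return out


indexed = analyze("I am running home")
print(indexed)

# Querying the same form matches one more distinct token than a different form.
same_form = set(indexed) & set(analyze("running home"))
other_form = set(indexed) & set(analyze("runs home"))
print(len(same_form), len(other_form))
```

In this simulation the query in the indexed form overlaps on three tokens and the other form on two, which is the extra match described in the thread. It says nothing about how much that overlap moves the final score.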
Re: applicability of schema on document
Anurag, Just answering your three emails on two emailing lists. You seem to want to jump into the middle of Solr and then dig your way both ways. So, your question here is a mix of very basic and unnecessarily complicated. Specifically, you seem to be missing a link between schema.xml and solrconfig.xml. You need to look at both of the files if you are actually trying to understand what's going on, as opposed to just changing one thing to get from one point to another. Both files for multiple examples that come with Solr are heavily documented and give you MORE than you want to know. In fact, half of the solrconfig.xml is a bit scary to me right now and I wrote a book on Solr. But it has the answers and if you just read it, you will know better questions to ask. Regards, Alex. P.s. Feel free to ignore this, we'll help you anyway. It's just that the above might be MORE efficient than the strategy you selected so far. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 27 September 2014 04:05, Anurag Sharma anura...@gmail.com wrote: I am trying to understand how the schema and it's field types gets applied to a document. Is it based on a document Id e.g. how to specify solr to store fields like age as integer, dateofbirth as date but not as a string or vice versa. Also, is it possible to change the field type at runtime? Direct update using below command adds the values even when the id doesn't exist: curl http://localhost:8983/solr/update\?commit\=true -H 'Content-type:application/json' -d '[{dob: {add: [2014-02-12T12:00:00Z, 2014-07-16T12:00:00Z]}, id: }]' My doubts in the above scenario are: - Is it taking some default types based on parsed values? - Is it also possible to store multiple types in a single field? 
- What are the rules for schema less doc Above scenario is tried using solr.war in example/webapps Following are multiple schema files in example directory ./example/example-DIH/solr/db/conf/schema.xml ./example/example-DIH/solr/mail/conf/schema.xml ./example/example-DIH/solr/rss/conf/schema.xml ./example/example-DIH/solr/solr/conf/schema.xml ./example/example-DIH/solr/tika/conf/schema.xml ./example/example-schemaless/solr/collection1/conf/schema.xml ./example/multicore/core0/conf/schema.xml ./example/multicore/core1/conf/schema.xml ./example/solr/collection1/conf/schema.xml Any documentation or unit tests describing the flow, creating and using the schema will be helpful. Thanks Anurag
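The archive has stripped the quotes from the JSON in the curl command above. As a rough sketch, the body of such an update can be built in Python so that the quoting is generated rather than typed by hand. `build_atomic_add` is a hypothetical helper, and the id `doc-1` is a placeholder, because the id value is missing from the archived message.

```python
import json


def build_atomic_add(doc_id, field, values):
    """Build the JSON body for an update that adds values to one field.
    The 'add' modifier mirrors the command quoted in the message."""
    return json.dumps([{"id": doc_id, field: {"add": list(values)}}])


# "doc-1" is a placeholder id; the original id is not shown in the archive.
payload = build_atomic_add(
    "doc-1", "dob", ["2014-02-12T12:00:00Z", "2014-07-16T12:00:00Z"]
)
print(payload)
```

The string could then be sent with the same Content-type header as in the curl call. How the dob values end up typed depends on the schema, or on the schemaless configuration, of the core that receives them.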
Re: [ANN] Lucidworks Fusion 1.0.0
Hi, How we can see the demo for NLP? On Sep 24, 2014 4:43 PM, Grant Ingersoll gsing...@apache.org wrote: Hi Thomas, Thanks for the question, yes, I give a brief demo of it in action during my talk and we will have demos at our booth. I will also give a demo during the Webinar, which will be recorded. As others have said as well, you can simply download it and try yourself. Cheers, Grant On Sep 23, 2014, at 2:00 AM, Thomas Egense thomas.ege...@gmail.com wrote: Hi Grant. Will there be a Fusion demostration/presentation at Lucene/Solr Revolution DC? (Not listed in the program yet). Thomas Egense On Mon, Sep 22, 2014 at 3:45 PM, Grant Ingersoll gsing...@apache.org wrote: Hi All, We at Lucidworks are pleased to announce the release of Lucidworks Fusion 1.0. Fusion is built to overlay on top of Solr (in fact, you can manage multiple Solr clusters -- think QA, staging and production -- all from our Admin).In other words, if you already have Solr, simply point Fusion at your instance and get all kinds of goodies like Banana ( https://github.com/LucidWorks/Banana -- our port of Kibana to Solr + a number of extensions that Kibana doesn't have), collaborative filtering style recommendations (without the need for Hadoop or Mahout!), a modern signal capture framework, analytics, NLP integration, Boosting/Blocking and other relevance tools, flexible index and query time pipelines as well as a myriad of connectors ranging from Twitter to web crawling to Sharepoint. The best part of all this? It all leverages the infrastructure that you know and love: Solr. Want recommendations? Deploy more Solr. Want log analytics? Deploy more Solr. Want to track important system metrics? Deploy more Solr. 
Fusion represents our commitment as a company to continue to contribute a large quantity of enhancements to the core of Solr while complementing and extending those capabilities with value adds that integrate a number of 3rd party (e.g connectors) and home grown capabilities like an all new, responsive UI built in AngularJS. Fusion is not a fork of Solr. We do not hide Solr in any way. In fact, our goal is that your existing applications will work out of the box with Fusion, allowing you to take advantage of new capabilities w/o overhauling your existing application. If you want to learn more, please feel free to join our technical webinar on October 2: http://lucidworks.com/blog/say-hello-to-lucidworks-fusion/. If you'd like to download: http://lucidworks.com/product/fusion/. Cheers, Grant Ingersoll Grant Ingersoll | CTO gr...@lucidworks.com | @gsingers http://www.lucidworks.com Grant Ingersoll | @gsingers http://www.lucidworks.com
JSP support not configured in cygwin with Apache Solr
I'm starting Cygwin with Apache solr-4.3.1 like so:

@echo off
C:
chdir C:\cygwin\bin
rem bash --login -i
bash -c cd /cygdrive/c/solr-4.3.1/example/;java -Dsolr.solr.home=./example-DIH/solr/ -jar -Xms200m -Xmx1200m start.jar -OPTIONS=jsp

But when I go to http://localhost:8983/solr/db/admin/analysis.jsp I get a 500 error:

`Problem accessing /solr/db/admin/analysis.jsp. Reason: JSP support not configured`

Why?
Re: JSP support not configured in cygwin with Apache Solr
On 28 September 2014 21:06, PeterKerk petervdk...@hotmail.com wrote: But when I go to http://localhost:8983/solr/db/admin/analysis.jsp I get a 500 error `Problem accessing /solr/db/admin/analysis.jsp. Reason: JSP support not configured`

Where did you get that URL? Try just going to http://localhost:8983/solr/ and navigating the admin screens from there.

Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
Re: JSP support not configured in cygwin with Apache Solr
It was an old bookmark. I had not noticed the extra pages under the core selection. Found it now, thanks!
Flexible search field analyser/tokenizer configuration
I have a site which lists companies. I'm looking to improve my search, but I want to know which available analysers and tokenizers I should use for which scenario, and whether what I want is possible at all.

I want users to be able to search on the company title, for example for a company called "The Royal Garden". The Royal Garden should be found on these queries:

the royal garden
royal garden
the roy
The royal
RoYAl garden

So: case insensitive, and matching on parts of words. However, the query "the royal" should not return companies like:

the wall
the room
the restaurant

So words like "the", but also "a", should be ignored if they are the only match in the search query.

I now have this:

<fieldType name="searchtext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_search" type="searchtext" indexed="true" stored="true"/>

I'm testing on http://localhost:8983/solr/#/bm/analysis but I'm stuck. Also, I would think my scenario is pretty common and lots of users have already configured their Solr search to be flexible and powerful. Any good search configurations would be welcome!
Re: Flexible search field analyser/tokenizer configuration
WordDelimiterFilterFactory is rather a specialized and capricious beast. Possibly not the most suitable for your needs (it's for things like iPhone 6 == iphone6). Things you may want to look at: http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/ngram/EdgeNGramFilterFactory.html http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/core/StopFilterFactory.html http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/commongrams/CommonGramsFilterFactory.html http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/commongrams/CommonGramsQueryFilter.html Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 28 September 2014 22:50, PeterKerk petervdk...@hotmail.com wrote: I have a site which lists companies. I'm looking to improve my search, but I want to know which available analysers and tokenizers I should use for which scenario, and if it's at all possible. I want users to be able to search on the company title on for example a company called The Royal Garden The logic for this search should be as follows, The Royal Garden, should be found on queries: the royal garden royal garden the roy The royal RoYAl garden So case insensitive, matching on parts of words. However, a query the royal should not return companies like: the wall the room the restaurant So words like the, but also a should be ignored if these are the only match in the searchquery. 
I now have this:

<fieldType name="searchtext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_search" type="searchtext" indexed="true" stored="true"/>

I'm testing on http://localhost:8983/solr/#/bm/analysis but I'm stuck. Also, I would think my scenario is pretty common and lots of users have already configured their Solr search to be flexible and powerful...any good search configurations would be welcome!
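The behaviour asked for in this thread can be prototyped before touching the schema. The following Python sketch simulates one possible chain built from the ideas behind the filters linked in the reply: lowercasing, stop-word removal, and edge n-grams at index time only. It is a simulation, not Solr configuration. The stop-word set, the minimum gram length of 2, and the rule that every remaining query term must match are all assumptions made for illustration.

```python
STOPWORDS = {"the", "a"}  # assumed stop-word list


def edge_ngrams(token, min_len=2):
    """All prefixes of the token from min_len characters up to the full token."""
    return {token[:i] for i in range(min_len, len(token) + 1)}


def index_terms(title):
    """Index-time analysis: lowercase, drop stop words, expand to edge n-grams."""
    terms = set()
    for token in title.lower().split():
        if token not in STOPWORDS:
            terms |= edge_ngrams(token)
    return terms


def query_terms(query):
    """Query-time analysis: lowercase and drop stop words, no n-grams."""
    return [t for t in query.lower().split() if t not in STOPWORDS]


def matches(title, query):
    """Assumed rule: every non-stop-word query term must be indexed.
    A query made only of stop words matches nothing."""
    wanted = query_terms(query)
    indexed = index_terms(title)
    return bool(wanted) and all(t in indexed for t in wanted)


for q in ["the royal garden", "royal garden", "the roy", "The royal", "RoYAl garden"]:
    print(q, matches("The Royal Garden", q))

for title in ["The Wall", "The Room", "The Restaurant"]:
    print(title, matches(title, "the royal"))
```

Under these assumptions the five sample queries all find "The Royal Garden", while "the royal" finds none of the other three titles, because "the" never reaches the index or the query. Whether the real filters behave the same way is best confirmed on the analysis screen mentioned in the original message.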
Re: Solr Boosting Unique Values
ok.