Re: Solr hanging
Hi Chris Hostetter

Does that mean that the last two questions I posted haven't reached the mailing list?

Best regards
Trym

On 25-04-2012 19:58, Chris Hostetter wrote:
> : Subject: Solr hanging
> : References: 31fdac6b-c4d9-4383-865d-2faca0f09...@geekychris.com
> :     can4yxvff-mqoawbyow2rsf_v4tc8vpgb+z8auv-z3zp94vv...@mail.gmail.com
> : In-Reply-To: can4yxvff-mqoawbyow2rsf_v4tc8vpgb+z8auv-z3zp94vv...@mail.gmail.com
>
> https://people.apache.org/~hossman/#threadhijack
> Thread Hijacking on Mailing Lists
>
> When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is hidden in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult.
>
> -Hoss
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, Any more thoughts?? Thanks, PC Rao. -- View this message in context: http://lucene.472066.n3.nabble.com/EmbeddedSolrServer-and-StreamingUpdateSolrServer-tp3889073p3940383.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
In general -- I would not suggest mixing EmbeddedSolrServer with a different style (unless the other instances are read only). If you have multiple instances writing to the same files on disk, you are asking for problems.

Have you tried just using StreamingUpdateSolrServer for the daily update? I would suspect that it would be faster than EmbeddedSolrServer anyway.

ryan

On Wed, Apr 25, 2012 at 11:32 PM, pcrao purn...@gmail.com wrote:
> Hi,
> Any more thoughts??
> Thanks,
> PC Rao.
Re: Boosting fields in SOLR using Solrj
I would suggest debugging with browser requests -- then switching to Solrj after you are at first base. In particular, try adding the debugQuery=true parameter to the request and see what Solr thinks is happening.

The value that will work for the 'qt' parameter depends on what is configured in solrconfig.xml -- I suspect you want to point to a requestHandler that is configured to use the edismax query parser. This can be configured by default with:

  <lst name="defaults">
    <str name="defType">edismax</str>
  </lst>

ryan

On Wed, Apr 25, 2012 at 3:57 PM, Joe joe.pol...@gmail.com wrote:
> Hi,
>
> I'm using the solrj API to query my SOLR 3.6 index. I have multiple text fields, which I would like to weight differently. From what I've read, I should be able to do this using the dismax or edismax query types. I've tried the following:
>
>   SolrQuery query = new SolrQuery();
>   query.setQuery("title:apples oranges content:apples oranges");
>   query.setQueryType("edismax");
>   query.set("qf", "title^10.0 content^1.0");
>   QueryResponse rsp = m_Server.query(query);
>
> But this doesn't work. I've tried the following variations to set the query type, but it doesn't seem to make a difference.
>
>   query.setQueryType("dismax");
>   query.set("qt", "dismax");
>   query.set("type", "edismax");
>   query.set("qt", "edismax");
>   query.set("type", "dismax");
>
> I'd like to retain the full Lucene query syntax, so I prefer ExtendedDisMax to DisMax. Boosting individual terms in the query (as shown below) does work, but is not a valid solution, since the queries are automatically generated and can get arbitrarily complex in syntax.
>
>   query.setQuery("title:apples^10.0 oranges^10.0 content:apples oranges");
>
> Any help would be much appreciated.
Re: Boosting fields in SOLR using Solrj
On 26.04.2012 00:57, Joe wrote:
> Hi,
>
> I'm using the solrj API to query my SOLR 3.6 index. I have multiple text fields, which I would like to weight differently. From what I've read, I should be able to do this using the dismax or edismax query types. I've tried the following:
>
>   SolrQuery query = new SolrQuery();
>   query.setQuery("title:apples oranges content:apples oranges");
>   query.setQueryType("edismax");
>   query.set("qf", "title^10.0 content^1.0");
>   QueryResponse rsp = m_Server.query(query);

Why do you try to construct your own query when you're using an edismax query with a defined qf parameter? What you're searching for is the text "title:apples oranges content:apples oranges". Depending on your analyzer chain, it might be that "title:apples" and "content:apples" are each kept as one token, so nothing is found because there's no such token in the index.

Why don't you simply query for "apples oranges"? That's what (e)dismax is made for. Have a deeper look at http://wiki.apache.org/solr/DisMax.

BTW, if you used the above query in a Lucene parser, it would look for apples in the title and content fields, but look for oranges in your default search field. This is because you didn't quote "apples oranges". Since you want to use Edismax, you can ignore this; it's just that your current query won't work as expected in either case.

-Kuli
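For testing from a browser, as suggested earlier in this thread, the request could be assembled roughly as follows. This is only a sketch: it uses plain Java rather than Solrj, the host and handler path (http://localhost:8983/solr/select) are assumptions about a default setup, and the class and method names are made up for illustration. The parameters themselves (q, defType, qf, debugQuery) are the ones discussed above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the query string for an edismax request.
class EdismaxRequestSketch {

    // Parameters for a plain user query with per-field boosts in qf.
    static Map<String, String> edismaxParams(String userQuery) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", userQuery);                     // plain terms, no field prefixes
        params.put("defType", "edismax");               // select the query parser
        params.put("qf", "title^10.0 content^1.0");     // field weights from the question
        params.put("debugQuery", "true");               // show how Solr parsed the query
        return params;
    }

    // URL-encodes each key and value and joins them with '&'.
    static String buildQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Assumed base URL of a local example install.
        String base = "http://localhost:8983/solr/select";
        System.out.println(base + "?" + buildQueryString(edismaxParams("apples oranges")));
    }
}
```

Pasting the printed URL into a browser and reading the debug section of the response should show whether the boosts from qf are actually being applied before the same parameters are moved into Solrj.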
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Ryan,

I see. Yes, for incremental indexing (hourly) we use StreamingUpdateSolrServer, and it is faster than EmbeddedSolrServer. We also use the embedded server for full indexing on a daily basis; it is efficient for full indexing because it handles a large number of documents better.

Thanks,
PC Rao.
solr error after relacing schema.xml
I'm having trouble getting Solr and Haystack working together. I have Solr working (I can get the admin screen and query test data). I then created my search_indexes.py (per the getting started example; I have also run syncdb and added data to the Notes table). I ran manage.py build_solr_schema, which generates XML for the schema.xml file. I replaced the contents of schema.xml with the new content and restarted the Solr server. When I then try to open the admin screen, I get the following error:

HTTP ERROR 500
Problem accessing /solr/admin/. Reason:

  Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml

  org.apache.solr.common.SolrException: No cores were created, please check the logs for errors

What am I doing wrong?

- BillB1951
Using Customized sorting in Solr
Hi,

We are planning to move the search of one of our listing-based portals from the Sphinx search server to a Solr/Lucene search server, but we are facing a challenge in porting the customized sorting used in our portal. We only keep the last 60 days of data live. The algorithm is as follows:

1. Put all listings into 54 buckets (date buckets for 60 days), i.e. buckets of 7 days, 1 day, 1 day, ...
2. For each date bucket, make 2 buckets (paid / free).
3. For each paid / free bucket, cycle the advertisers on a uniqueness basis, i.e. inside a bucket the ordering should be the 1st listing of each advertiser, then the 2nd listing of each advertiser, and so on. In other words, within a *sub-bucket* the second listing of an advertiser is displayed only after the first listing of every advertiser has been displayed.

To take care of points 1 and 2 we have created a field named bucket_index at indexing time, and we get the results sorted by this field. However, we cannot find a way to create a sort field at index time, or think of a sort function, for point 3.

Please suggest if there is a way to do so in Solr.

Tia,
BC Rathore
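To make the ordering in point 3 concrete, here is a small in-memory sketch of the advertiser cycling within one sub-bucket. It is not a Solr feature and not necessarily how the portal implements it; the Listing type, its field names, and the idea of a per-listing "round" number are all assumptions for illustration. If such a round number could be computed at indexing time and stored in a (hypothetical) field next to bucket_index, sorting by bucket_index and then by that field would give the same order, at the cost of recomputing it when listings are added or removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: cycles advertisers inside a single paid/free sub-bucket.
class RoundRobinSketch {

    // Hypothetical listing record: an id plus the advertiser that owns it.
    static final class Listing {
        final String id;
        final String advertiser;

        Listing(String id, String advertiser) {
            this.id = id;
            this.advertiser = advertiser;
        }
    }

    // Returns listing ids ordered as: 1st listing of every advertiser,
    // then 2nd listing of every advertiser, and so on. Within a round the
    // input order of the bucket is preserved.
    static List<String> cycleByAdvertiser(List<Listing> bucket) {
        Map<String, Integer> seenPerAdvertiser = new HashMap<String, Integer>();
        TreeMap<Integer, List<String>> rounds = new TreeMap<Integer, List<String>>();
        for (Listing listing : bucket) {
            Integer seen = seenPerAdvertiser.get(listing.advertiser);
            int round = (seen == null) ? 1 : seen + 1; // 1 = advertiser's first listing
            seenPerAdvertiser.put(listing.advertiser, round);
            List<String> sameRound = rounds.get(round);
            if (sameRound == null) {
                sameRound = new ArrayList<String>();
                rounds.put(round, sameRound);
            }
            sameRound.add(listing.id);
        }
        List<String> ordered = new ArrayList<String>();
        for (List<String> sameRound : rounds.values()) { // ascending round number
            ordered.addAll(sameRound);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Listing> bucket = new ArrayList<Listing>();
        bucket.add(new Listing("a1", "A"));
        bucket.add(new Listing("a2", "A"));
        bucket.add(new Listing("b1", "B"));
        bucket.add(new Listing("a3", "A"));
        bucket.add(new Listing("c1", "C"));
        bucket.add(new Listing("b2", "B"));
        System.out.println(cycleByAdvertiser(bucket)); // [a1, b1, c1, a2, b2, a3]
    }
}
```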
Re: Dynamic creation of cores for this use case.
Take a look here:
http://wiki.apache.org/solr/CoreAdmin?highlight=%28create%29%7C%28core%29#CREATE

I'm not sure about your partner's view. As an alternative to creating individual cores, you could simply use a single index and use filter queries (fq) to restrict the selection to the relevant customers. If you also included a field with a value for which partner group the customer belonged to, you could use a similar technique there.

Best
Erick

On Wed, Apr 25, 2012 at 5:13 AM, pprabhcisco123 ppr...@gmail.com wrote:
> Hi everyone,
>
> I am new to Solr. I have a use case, described below, to create cores based on customers and partners. There are about 2500 customers, each customer having on an average of 1 devices. A partner is a group of customers, say 30. So we have a customer column and a partner column. The use case is to:
>
> 1. Create a VM running Solr, with one core per customer.
> 2. Index all of each customer's data (device info, config text, metadata, etc.) into a single core.
> 3. Index all 30 customers' data into the partner's view, after creating a partner per 30 customers.
>
> Can anyone please help me with this case?
>
> Thanks
> Prabhakaran. P
Re: Stats.facet on date returns error
Works on my machine (tm). I tried both trunk and 3.6, so I guess that means we need more details.

What version are you running on? What is your exact URL? Did you do anything like change the definition without blowing away your index and re-indexing?

Have you tried using Luke or the schema browser (or terms component) to examine your data and see if it's odd?

Best
Erick

On Wed, Apr 25, 2012 at 5:53 PM, Peter Markey sudoma...@gmail.com wrote:
> Hello,
>
> I have been trying the stats.facet option on a tdate field and I end up getting an error, even though Solr has only proper values for all the docs for the date field. It happens for any type of trie field. Any help would be appreciated.
>
> My date field is defined as:
>
>   <field name="doc_time" type="tdate" indexed="true" stored="false"/>
>
> error:
>
> SEVERE: org.apache.solr.common.SolrException: Invalid Date String:'M\-r/'
>     at org.apache.solr.schema.DateField.parseMath(DateField.java:168)
>     at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:321)
>     at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:300)
>     at org.apache.solr.schema.TrieField.toInternal(TrieField.java:330)
>     at org.apache.solr.schema.TrieDateField.toInternal(TrieDateField.java:102)
>     at org.apache.solr.request.UnInvertedField.getStats(UnInvertedField.java:609)
>     at org.apache.solr.handler.component.SimpleStats.getStatsFields(StatsComponent.java:235)
>     at org.apache.solr.handler.component.SimpleStats.getStatsCounts(StatsComponent.java:211)
>     at org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:70)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
>     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>     at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
>     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
>     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:309)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
auto warm up cache and new data
Please help me understand this. What will happen if I have cached data that changes after a commit, and I have autowarming set up? Will the old cached data still be accessible in the cache, so that I get old data? Does that mean that if autowarming copies all needed data to the new cache, I will probably never see new data?

Caches in Solr expire only when the searcher goes down, right?
Re: auto warm up cache and new data
The warmup process reloads the data from the new index. Caches in Solr expire with a new searcher, correct. You could have evictions too if a cache gets filled.

On Thu, Apr 26, 2012 at 8:33 AM, mizayah miza...@gmail.com wrote:
> Please help me understand this. What will happen if I have cached data that changes after a commit, and I have autowarming set up? Will the old cached data still be accessible in the cache, so that I get old data? Does that mean that if autowarming copies all needed data to the new cache, I will probably never see new data?
>
> Caches in Solr expire only when the searcher goes down, right?
Solr for routing a webapp
Hello,

I'm thinking about using a Solr index for routing a webapp. I have pregenerated base URLs in my index, e.g.

  /foo/bar1
  /foo/bar2
  /foo/bar3
  /foo/bar4
  /bar/foo1
  /bar/foo2
  /bar/foo3

I am trying to find a way to match /foo/bar3/parameter1/value1/parameter2/value2 without knowing in advance that the parameter and value segments are not part of the base URL. In effect, I need the best hit from the beginning of the path.

Is that possible, and are there any performance issues? I hope my problem is understandable!

Thanks in advance and best regards,
Bjoern
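One way to read "best hit from the beginning" is: the longest indexed base URL that is a whole-segment prefix of the request path. The sketch below shows that rule in plain Java against an in-memory set; it is only meant to pin down the expected result, not to suggest how Solr would do it. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustration only: longest-prefix matching on path segments.
class RouteMatchSketch {

    // Strips trailing path segments until the remainder is a known base URL.
    // Returns null when no base URL is a prefix of the request path.
    static String bestBaseUrl(Set<String> baseUrls, String requestPath) {
        String candidate = requestPath;
        while (!candidate.isEmpty()) {
            if (baseUrls.contains(candidate)) {
                return candidate;
            }
            int lastSlash = candidate.lastIndexOf('/');
            if (lastSlash < 0) {
                break; // no more segments to remove
            }
            candidate = candidate.substring(0, lastSlash);
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> baseUrls = new HashSet<String>(Arrays.asList(
                "/foo/bar1", "/foo/bar2", "/foo/bar3", "/foo/bar4",
                "/bar/foo1", "/bar/foo2", "/bar/foo3"));
        String request = "/foo/bar3/parameter1/value1/parameter2/value2";
        System.out.println(bestBaseUrl(baseUrls, request)); // /foo/bar3
    }
}
```

Because the match works on whole segments, a request for /foo/bar33 would not be routed to /foo/bar3.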
Format content field
Greetings all!

I have created an enterprise search architecture that includes both Nutch for crawling and Solr for indexing. I was so focused on the Nutch part that I didn't realize that my user interface (jQuery based) was lacking in appeal. One of my issues is the format of the text in the content field. Is there any way to force it to include spaces, etc. in the text? For instance, this is an example of a value:

  thereisno way to know.Next sentence goes here.BUT I am all squished

This is sample content from an HTML page.
Re: Dynamic creation of cores for this use case.
Hi,

Thanks, Erick, for your response. Actually, the total number of customers is 4500, and every group of about 30 customers is considered to be a partner or agent. The use case is to create a core for each customer as well as for each partner. Since it is very difficult to create cores statically in the solr.xml file for all 4500 customers, is there any way to create the cores dynamically, on the fly?

We are implementing Solr in our application, which is live now; customers and partners around the world will log in and search their devices. Our manager wants us to do a small POC on how Solr will fit into the application. I am quite depressed about this use case, as it has already taken longer than the deadline, so please help me with this.

Thanks
Re: solr replication failing with error: Master at: is not available. Index fetch failed
hello,

sorry - i overlooked this message - thanks for checking back and thanks for the info.

yes - replication seems to be working now. tailed from logs just now:

2012-04-26 09:21:33,284 INFO [org.apache.solr.handler.SnapPuller] (pool-12-thread-1) Slave in sync with master.
2012-04-26 09:21:53,279 INFO [org.apache.solr.handler.SnapPuller] (pool-12-thread-1) Slave in sync with master.
2012-04-26 09:22:13,279 INFO [org.apache.solr.handler.SnapPuller] (pool-12-thread-1) Slave in sync with master.
2012-04-26 09:22:33,279 INFO [org.apache.solr.handler.SnapPuller] (pool-12-thread-1) Slave in sync with master.
Re: Dynamic creation of cores for this use case.
On 26.04.2012 16:17, pprabhcisco123 wrote:
> The use case is to create a core for each customer as well as partner. Since it is very difficult to create cores statically in the solr.xml file for all 4500 customers, is there any way to create the cores dynamically or on the fly?

Yes, there is. Have a look at: http://wiki.apache.org/solr/CoreAdmin#CREATE

I suggest setting the persistent flag in solr.xml to true. I think all your cores will share the same configuration, so you can point all configuration directories to the same one and give each core its own unique data dir.

This should be relatively simple in theory. In practice, you might run into performance issues with such a configuration. It should be no big problem if at most a few hundred users work in parallel, but as soon as most cores are in use at the same time, I predict you'll have bad performance. Solr has no hard-coded limit on the number of cores, but each core has its own caches and readers. Depending on your machine configuration, this may be too much.

My suggestion is to try it out. It should work at first, and if you hit performance limits, you can then modify your configuration.

-Kuli
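As a rough sketch of the approach described above, the CoreAdmin CREATE request for each customer could be built like this. The parameter names (action, name, instanceDir, dataDir) are the ones listed on the CoreAdmin wiki page; everything else is an assumption for illustration: the base URL, the shared instance directory name "shared", the "data/<core name>" layout, and the class and method names. How a relative dataDir is resolved may depend on the Solr version, so treat the paths as placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds CoreAdmin CREATE URLs with a shared config
// directory and one data directory per core.
class CoreCreateUrlSketch {

    static String createCoreUrl(String solrBase, String coreName, String sharedInstanceDir) {
        return solrBase + "/admin/cores?action=CREATE"
                + "&name=" + encode(coreName)
                + "&instanceDir=" + encode(sharedInstanceDir)   // same conf/ for every core
                + "&dataDir=" + encode("data/" + coreName);     // unique index location
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Assumed base URL; prints the first three of the per-customer requests.
        String solrBase = "http://localhost:8983/solr";
        for (int customer = 1; customer <= 3; customer++) {
            String coreName = String.format("customer%04d", customer);
            System.out.println(createCoreUrl(solrBase, coreName, "shared"));
        }
    }
}
```

Each printed URL would then be requested once (for example with an HTTP client or a browser); with the persistent flag set, the new cores should be written back to solr.xml.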
Re: Solr for routing a webapp
Have you tried using mod_rewrite for this?

paul

On 26 Apr 2012, at 15:16, Björn Zapadlo wrote:
> Hello,
>
> I'm thinking about using a Solr index for routing a webapp. I have pregenerated base urls in my index. E.g.
>
>   /foo/bar1
>   /foo/bar2
>   /foo/bar3
>   /foo/bar4
>   /bar/foo1
>   /bar/foo2
>   /bar/foo3
>
> I try to find a way to match /foo/bar3/parameter1/value1/parameter2/value2 without knowing that parameter and value are not part of the base url. In fact I need the best hit from the beginning.
>
> Is that possible and are there any performance issues? I hope my problem is understandable!
>
> Thanks in advance and best regards,
> Bjoern
impact of EdgeNGramFilterFactory on indexing process?
Hello all,

i am experimenting with EdgeNGramFilterFactory on two of the fieldTypes in my schema:

  <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" side="front"/>

i believe i understand this, but want to verify:

1) will this increase my index time?
2) will this increase the number of documents in my index?

thank you
Re: solr error after relacing schema.xml
Try looking at the logs.

On Apr 25, 2012, at 10:53 PM, BillB1951 wrote:
> Trouble getting Solr and Haystack working together. I have Solr working (can get admin screen and query test data). I then create my search_indexes.py (per getting started example, I also have run syncdb and added data to the Notes table). I run manage.py build_solr_schema, it generates XML for the schema.xml file. I replace the contents of the schema.xml with the new content. I restart the Solr server, then try to start admin but get following error
>
> HTTP ERROR 500
> Problem accessing /solr/admin/. Reason:
>
>   Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml
>
>   org.apache.solr.common.SolrException: No cores were created, please check the logs for errors
>
> what am I doing wrong.
>
> - BillB1951

- Mark Miller
lucidimagination.com
Re: Recovery - too many updates received since start
On Apr 24, 2012, at 9:31 AM, Trym R. Møller wrote:
> Hi
>
> I experience that a Solr loses its connection with Zookeeper and re-establishes it. After Solr is reconnected to Zookeeper it begins to recover. It has been missing the connection approximately 10 seconds and meanwhile the leader slice has received some documents (maybe about 1000 documents). Solr fails to update peer sync with the log message:
>
> Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync
> WARNING: PeerSync: core=mycollection_slice21_shard1 url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start - startingUpdates no longer overlaps with our currentUpdates

You can configure the timeout here - I may have chosen a default that is too low based on some reports.

> Looking into PeerSync and UpdateLog I can see that 100 updates is the maximum allowed updates that a shard can be behind. Is it correct that this is not configurable and what are the reasons for choosing 100?

Yonik chose this - I'll let him expand on it if he sees this. I think it's not configurable currently, but perhaps with the right caveats as doc, it should be.

> I suspect that one must compare the work needed to replicate the full index with the performance loss/resource usage when enhancing the size of the UpdateLog?

Yeah, I think that is the gist of it. There may be another gotcha or two, I just don't remember at the moment. Yonik?

> Any comments regarding this are greatly appreciated.
>
> Best regards
> Trym

- Mark Miller
lucidimagination.com
Re: Question on Facet counts by grouped results
Never mind, I did not notice that this is coming in Solr 4.0. Any ideas on when Solr 4.0 will be out? Sohail
Re: solr error after relacing schema.xml
It does not appear that any logfiles were created.

- BillB1951
Re: solr error after relacing schema.xml
By default logging goes to std out. You probably want to configure real logging though: http://wiki.apache.org/solr/SolrJetty#Logging

On Apr 26, 2012, at 1:33 PM, BillB1951 wrote:
> It does not appear that any logfiles were created.
>
> - BillB1951

- Mark Miller
lucidimagination.com
Re: solr replication failing with error: Master at: is not available. Index fetch failed
On Apr 23, 2012, at 12:10 PM, geeky2 wrote:
> http://someip:someport/somepath/somecore/admin/replication/ is not available. Index fetch failed. Exception: Invalid version (expected 2, but 10) or the data in not in 'javabin' format

This is kind of a bug. When Solr tries to talk in javabin and gets an HTTP response instead (like a 404 response, which is what this likely is), it does this. Really it should detect this case and give you the proper error. I almost think someone made this change already in trunk based on what I was seeing yesterday, but I'm not sure.

- Mark Miller
lucidimagination.com
Re: Question on Facet counts by grouped results
On Apr 26, 2012, at 1:24 PM, Sohail Aboobaker wrote:
> Any ideas on when Solr 4.0 will be out?

We are hoping this year. There will be a series of alphas and betas that should start within a month or a few.

- Mark Miller
lucidimagination.com
Setting FuzzyConfig's prefixLength ?
I'd like to change Lucene's FuzzyConfig prefixLength from its default value of 0. Is there a way to configure that via Solr somehow? I've noticed references on the list to people recompiling Lucene from source in order to change this value, and I'm hoping not to need to resort to the same.

Thanks in advance,
Phill
searchable solr user mail archive
Hi, Is there a searchable archive for solr user emails available somewhere to avoid questions already asked on list? Sohail
Re: solr error after relacing schema.xml
Which version of Solr does Haystack expect? The schema builder might be targeting an older version of Solr.

On Thu, Apr 26, 2012 at 10:47 AM, Mark Miller markrmil...@gmail.com wrote:
> By default logging goes to std out. You probably want to configure real logging though: http://wiki.apache.org/solr/SolrJetty#Logging
>
> On Apr 26, 2012, at 1:33 PM, BillB1951 wrote:
>> It does not appear that any logfiles were created.
>>
>> - BillB1951
>
> - Mark Miller
> lucidimagination.com

--
Lance Norskog
goks...@gmail.com
Re: searchable solr user mail archive
: Is there a searchable archive for solr user emails available somewhere : to avoid questions already asked on list? https://wiki.apache.org/solr/SolrResources#Mailing_List_Archives Or just use the search box in the top right corner of the main solr website... http://lucene.apache.org/solr/ -Hoss
Re: QueryElevationComponent and distributed search
Can anyone help me understand the fix to QueryElevationComponent (in Solr 4.0) that makes it work for distributed search?
Re: Solr for routing a webapp
Or write your own query component: map /solr/* in the web.xml, expose the request through a thread-local set by a filter, and read it to set the appropriate query parameters... Performance-wise, this seems quite reasonable, I think.

paul

On 26 Apr 2012, at 16:58, Paul Libbrecht wrote:
> Have you tried using mod_rewrite for this?
>
> paul
>
> On 26 Apr 2012, at 15:16, Björn Zapadlo wrote:
>> Hello,
>>
>> I'm thinking about using a Solr index for routing a webapp. I have pregenerated base urls in my index. E.g.
>>
>>   /foo/bar1
>>   /foo/bar2
>>   /foo/bar3
>>   /foo/bar4
>>   /bar/foo1
>>   /bar/foo2
>>   /bar/foo3
>>
>> I try to find a way to match /foo/bar3/parameter1/value1/parameter2/value2 without knowing that parameter and value are not part of the base url. In fact I need the best hit from the beginning.
>>
>> Is that possible and are there any performance issues? I hope my problem is understandable!
>>
>> Thanks in advance and best regards,
>> Bjoern
Re: solr error after relacing schema.xml
I'm using Haystack 2.0.0Beta and Apache-Solr-3.6.0. I'm not sure how to determine the schema.xml version, but I do notice that the Solr example's schema.xml starts with

  <schema name="example" version="1.5">

and the schema.xml generated by Haystack starts with

  <schema name="default" version="1.4">

Can I specify another schema.xml generator for Haystack? If so, where?

- BillB1951
Re: Question on Facet counts by grouped results
Hi,

I am trying the nightly build of Solr 4.0. I downloaded the build and am able to start it. In 3.x, I copied the example directory and updated the schema.xml, and it worked fine. In 4.0 I did the same thing (made a copy of example), but when I change the schema, I get the following:

Apr 26, 2012 5:04:12 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: Can't find resource 'stopwords_en.txt' in classpath or 'solr/./conf/', cwd=/apps/servers/apache-solr-4.0-2012-04-26_08-10-58/apache-solr-4.0-2012-04-26_08-10-58/trusted

Do I need to copy some other files into my copied directory as well?

Sohail
Re: Solr4 CoreContainer failed to load with older version of Slf4j 1.5.2
I also ran into a problem using 1.6.1 - that's the breaks of progress I guess ;)

On Thu, Apr 26, 2012 at 4:07 PM, Gopal Patwa gopalpa...@gmail.com wrote:
> I am using the Solr4 nightly build apache-solr-4.0-2012-04-26_08-10-58, and I saw that the Slf4j version was upgraded to 1.6.4. Solr now fails to start if I want to use a previous version of Slf4j, like 1.5.2:
>
> 12:43:48,913 ERROR [SolrDispatchFilter] Could not start Solr. Check solr/home property and the logs
> 12:43:48,944 ERROR [SolrCore] null:java.lang.NoSuchMethodError: org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:395)
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:355)
>     at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:304)
>
> When I investigated the code, I found that it uses a new method, org.slf4j.impl.StaticLoggerBinder.getSingleton(), which was added to the CoreContainer class for initializing logging, but this method is not available in Slf4j version 1.5.2.
>
> Thanks
> Gopal

--
- Mark
http://www.lucidimagination.com
HTTP Auth and Distributed Search?
Hi, I'm wondering if there's any way to use container-based HTTP auth and Distributed Search configured in the SearchHandler that I haven't discovered aside from writing my own shard handler implementation. Thanks, Michael
Re: HTTP Auth and Distributed Search?
On Apr 26, 2012, at 5:25 PM, Michael Della Bitta wrote:
> Hi,
>
> I'm wondering if there's any way to use container-based HTTP auth and Distributed Search configured in the SearchHandler that I haven't discovered aside from writing my own shard handler implementation.
>
> Thanks,
> Michael

I think there is an ugly global way to support this by setting some global properties for HttpClient. I can't remember clearly offhand though.

We should add explicit support for this I think - just like we have for replication.

- Mark Miller
lucidimagination.com
Re: HTTP Auth and Distributed Search?
Really? Is that in a .properties file somewhere, or would I have to do it in code?

I was sort of hoping I'd be able to add the credentials to the URL in the shards field, but looking at the source, that won't fly. While we're on the topic, it might be nice to be able to specify the connection scheme, too (e.g. for HTTPS).

I'd be willing to make a patch if there's a decision on the way this should work.

Thanks,
Michael

On Thu, 2012-04-26 at 17:55 -0400, Mark Miller wrote:
> On Apr 26, 2012, at 5:25 PM, Michael Della Bitta wrote:
>> Hi,
>>
>> I'm wondering if there's any way to use container-based HTTP auth and Distributed Search configured in the SearchHandler that I haven't discovered aside from writing my own shard handler implementation.
>>
>> Thanks,
>> Michael
>
> I think there is an ugly global way to support this by setting some global properties for HttpClient. I can't remember clearly offhand though.
>
> We should add explicit support for this I think - just like we have for replication.
>
> - Mark Miller
> lucidimagination.com
Re: Using Customized sorting in Solr
Hi,

How about trying grouping with paging? First you do

  group=true&group.field=advertiserId&group.limit=1&group.offset=0&group.main=true&sort=something&group.sort=how-much-paid desc

That gives you one listing per advertiser, sorted the way you like. Then, to grab the next batch of ads, you go to group.offset=1, and so on.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 26. apr. 2012, at 08:10, solr user wrote:
> Hi,
>
> We are planning to move the search of one of our listing based portal to solr/lucene search server from sphinx search server. But we are facing a challenge in porting the customized sorting being used in our portal. We only have last 60 days of data live. The algorithm is as follows:
>
> 1. Put all listings into 54 buckets (Date bucket for 60 days) i.e. buckets of 7 day, 1 day, 1 day...
> 2. For each date bucket we make 2 buckets (Paid / free bucket)
> 3. For each paid / free bucket cycle the advertisers on uniqueness basis i.e. inside a bucket the ordering should be 1st listing of each advertiser, 2nd listing of each advertiser and so on. In other words, within a *sub-bucket* the second listing of an advertiser will be displayed only after the first listing of all advertisers has been displayed.
>
> For taking care of point 1 and 2 we have created a field named bucket_index at the time of indexing the data and get the results sorted by this index, but we are not able to find a way to create a sort field at index time or think of a sort function for the point no 3.
>
> Please suggest if there is a way to do so in solr.
>
> Tia,
> BC Rathore
Per-User Sorting on an ExternalFileField
I'm trying pretty hard to come up with a solution that lets me sort by per-user scores that I calculate based on my data. Today, I'm trying to use a combination of ExternalFileField and dynamic fields, where the presumption is that each user might have their own file full of scores. I think the fields are hooked up okay, but I can't sort on them because it appears ExternalFileField explicitly doesn't support this operation.

SEVERE: java.lang.UnsupportedOperationException
        at org.apache.solr.schema.ExternalFileField.getSortField(ExternalFileField.java:91)

I'm using Solr 3.5. Does anyone have a suggestion as to how to end up adding this extra dimension so that I can do per-user relevance? It seems like an oft-asked, rarely-answered question.

Thanks in advance,
Phill
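
For readers unfamiliar with the "one file of scores per user" setup described above, here is a minimal sketch of how such a file might be produced. It assumes the layout described in the ExternalFileField documentation as I understand it (a file named external_<fieldname> in the index data directory, one key=value pair per line); the function name write_user_scores, the field name user_42_score and the document keys are hypothetical and not taken from the thread.

```python
import os
import tempfile


def write_user_scores(data_dir, field_name, scores):
    """Write one external score file for one user's field.

    Assumption: the file is named external_<fieldname> and holds
    one key=value pair per line, keyed by the document's key field.
    """
    path = os.path.join(data_dir, "external_" + field_name)
    with open(path, "w", encoding="utf-8") as handle:
        # Sorting by key is a choice made here to keep the output deterministic.
        for doc_key in sorted(scores):
            handle.write("%s=%s\n" % (doc_key, scores[doc_key]))
    return path


# Hypothetical user field and document keys, written to a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = write_user_scores(demo_dir, "user_42_score", {"doc2": 0.5, "doc1": 3.0})
with open(demo_path, encoding="utf-8") as handle:
    demo_lines = handle.read().splitlines()
```

With a dynamic field per user, each user gets a file like this; the open question in the message is how to sort on those values, not how to load them.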
Re: Per-User Sorting on an ExternalFileField
On Fri, Apr 27, 2012 at 12:07 AM, Phill Tornroth famousactr...@gmail.com wrote:
> I'm using Solr 3.5. Does anyone have a suggestion as to how to end up adding this extra dimension so that I can do per-user relevance? It seems like an oft-asked, rarely-answered question.

Use a function that makes use of your ExternalFileField and alters the score, so that you can sort on the score?
Re: Per-User Sorting on an ExternalFileField
So, I did just issue:

sort=sub(my_user_score_field,0)+desc

It got me past the error, but it still doesn't appear to be actually using the values to sort. Any ideas as to why?

Phill

On Thu, Apr 26, 2012 at 4:35 PM, Stephane Bailliez sbaill...@gmail.com wrote:
> On Fri, Apr 27, 2012 at 12:07 AM, Phill Tornroth famousactr...@gmail.com wrote:
>> I'm using Solr 3.5. Does anyone have a suggestion as to how to end up adding this extra dimension so that I can do per-user relevance? It seems like an oft-asked, rarely-answered question.
>
> Use a function that makes use of your ExternalFileField and alters the score, so that you can sort on the score?
Re: impact of EdgeNGramFilterFactory on indexing process?
1) Yes. EdgeNGram will inevitably increase the number of tokens in your index, lengthening your index time. How much? Some, which means you'll have to try it to see whether the increase is acceptable. Some people can't take an increase of 10%. Some can take a 100% increase.

2) No. It will increase the number of _tokens_ in each document in your index, and the index size. But the number of documents is unchanged.

Best
Erick

On Thu, Apr 26, 2012 at 12:09 PM, geeky2 gee...@hotmail.com wrote:
> Hello all,
> I am experimenting with EdgeNGramFilterFactory on two of the fieldTypes in my schema.
>
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" side="front"/>
>
> I believe I understand this, but want to verify:
> 1) Will this increase my index time?
> 2) Will it increase the number of documents in my index?
>
> Thank you
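
To make the "more tokens, same number of documents" point concrete, here is a rough sketch of what front-edge n-gramming does to a single token with minGramSize=3 and maxGramSize=15. It is a plain-Python imitation for illustration, not the filter's actual implementation; the function name front_edge_ngrams and the sample token are made up for this example.

```python
def front_edge_ngrams(token, min_gram=3, max_gram=15):
    """Return the prefixes of token with lengths min_gram..max_gram.

    In this sketch, a token shorter than min_gram yields no grams at all.
    """
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


# One 12-character input token becomes ten indexed tokens.
grams = front_edge_ngrams("refrigerator")
```

A single word in a field therefore turns into up to 13 tokens (lengths 3 through 15), which is where the extra index time and index size come from, while the document count stays the same.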
Re: Question on Facet counts by grouped results
Yes, stopwords_en.txt. Or go into your schema file, find the usages of stopwords_en.txt, and change them to a stopwords file that exists in your setup.

Best
Erick

On Thu, Apr 26, 2012 at 5:15 PM, Sohail Aboobaker sabooba...@gmail.com wrote:
> Hi,
> I am trying the nightly build for Solr 4.0. I downloaded the build and am able to start it. In 3.x, I copied the example directory and updated the schema.xml. It worked fine, but in 4.0, when I did the same thing (made a copy of example) and changed the schema, I get the following:
>
> Apr 26, 2012 5:04:12 PM org.apache.solr.common.SolrException log
> SEVERE: null:java.lang.RuntimeException: Can't find resource 'stopwords_en.txt' in classpath or 'solr/./conf/', cwd=/apps/servers/apache-solr-4.0-2012-04-26_08-10-58/apache-solr-4.0-2012-04-26_08-10-58/trusted
>
> Do I need to copy some other files into my copied directory as well?
>
> Sohail
Benchmark Solr vs Elastic Search vs Sensei
Hi Solr users,

I've implemented a project to compare the performance of Solr, Elastic Search and SenseiDB:
https://github.com/vzhabiuk/search-perf

Solr version 3.5.0 was used. I used the default configuration, just enabled JSON updates, and used the following schema:
https://github.com/vzhabiuk/search-perf/blob/master/configs/solr/schema.xml

2.5 million documents were put into the index; after that I launched the indexing process to add another 500k docs, issuing a commit after each 500-doc batch. At the same time I launched a concurrent client that sent the following type of query:

((tags:moon-roof%20or%20tags:electric%20or%20tags:highend%20or%20tags:hybrid)%20AND%20(!tags:family%20AND%20!tags:chick%20magnet%20AND%20!tags:soccer%20mom))%20
OR%20((color:red%20or%20color:green%20or%20color:white%20or%20color:yellow)%20AND%20(!color:gold%20AND%20!color:silver%20AND%20!color:black))%20
OR%20mileage:[15001%20TO%2017500]%20OR%20mileage:[17501%20TO%20*]%20
OR%20city:u.s.a.*
facet=true&facet.field=tags&facet.field=color

The query is a high-level OR query consisting of 2 terms, 2 ranges and 1 prefix. It is designed to hit ~60-70% of all the docs.

Here is the performance result with indexing running in the background:

#Threads  min       median    mean      75%       qps
1         208.95ms  332.66ms  350.48ms  422.92ms  2.8
2         188.68ms  338.09ms  339.22ms  402.15ms  5.9
3         151.06ms  326.64ms  336.20ms  418.61ms  8.8
4         125.13ms  332.90ms  332.18ms  396.14ms  12.0

If there is no indexing process in the background, the result for 2.6 million docs is as follows:

#Threads  min       median    mean      75%       qps
1         106.70ms  199.66ms  199.40ms  234.89ms  5.1
2         128.61ms  199.12ms  201.81ms  229.89ms  9.9
3         110.99ms  197.43ms  203.13ms  232.25ms  14.7
4         90.24ms   201.46ms  200.46ms  227.75ms  19.9
5         106.14ms  208.75ms  207.69ms  242.88ms  24.0
6         103.75ms  208.91ms  211.23ms  238.60ms  28.3
7         113.54ms  207.07ms  209.69ms  239.99ms  33.3
8         117.32ms  216.38ms  224.74ms  258.74ms  35.5

I've got three questions so far:

1. With background indexing the latency is almost 2 times higher. Is there any way to overcome this?
2. How can we tune Solr to get better results?
3. In your opinion, what type of queries would be preferable for the benchmark?

With many thanks,
Volodymyr

BTW, here is the spec of my machine:
RedHat 6.1 64bit
Intel XEON e5620 @2.40 GHz, 8 cores
63 GB RAM
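
As a quick sanity check on the figures above: if each client thread sends one request at a time and waits for the response (an assumption about the benchmark client, not something stated in the message), throughput should be roughly the thread count divided by the mean latency. The sketch below applies that to two of the reported rows; the function name expected_qps is made up for this example.

```python
def expected_qps(threads, mean_latency_ms):
    """Estimate throughput for a closed-loop client.

    Assumes each thread issues its next request only after the
    previous one has returned.
    """
    return threads / (mean_latency_ms / 1000.0)


# 4 threads, mean latencies taken from the two tables above.
with_indexing = expected_qps(4, 332.18)      # reported qps: 12.0
without_indexing = expected_qps(4, 200.46)   # reported qps: 19.9
```

Both estimates land close to the reported qps values, which suggests the lower throughput during indexing is explained by the higher per-query latency rather than by a separate bottleneck in the client.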
Re: HTTP Auth and Distributed Search?
I believe you can set up certificates. You then store the certificates in a Java keystore file, and tell Java about the keystore at startup. Now, when you make an HTTP connection, the HTTP library automatically uses the certificates. You don't need any custom code in the HTTP client. (I think this is how it works, anyway.)

On Thu, Apr 26, 2012 at 3:01 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote:
> Really? Is that in a .properties file somewhere, or would I have to do it in code?
>
> I was sort of hoping I'd be able to add the credentials to the URL in the shards field, but looking at the source, that won't fly. While we're on the topic, it might be nice to be able to specify the connection scheme, too (e.g. for HTTPS).
>
> I'd be willing to make a patch if there's a decision on the way this should work.
>
> Thanks,
> Michael
>
> On Thu, 2012-04-26 at 17:55 -0400, Mark Miller wrote:
>> On Apr 26, 2012, at 5:25 PM, Michael Della Bitta wrote:
>>> Hi, I'm wondering if there's any way to use container-based HTTP auth and Distributed Search configured in the SearchHandler that I haven't discovered aside from writing my own shard handler implementation.
>>> Thanks, Michael
>>
>> I think there is an ugly global way to support this by setting some global properties for HttpClient. I can't remember clearly offhand though. We should add explicit support for this I think - just like we have for replication.
>>
>> - Mark Miller
>> lucidimagination.com

--
Lance Norskog
goks...@gmail.com