Re: how to selectively sort records keeping some at the bottom always.. ?
Thanks Yonik. It was very useful.

On Sat, Aug 29, 2009 at 3:11 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:

On Thu, Aug 27, 2009 at 10:29 AM, Preetam Rao <blogathan@gmail.com> wrote: Hi, If I have documents of type a, b and c, but when I sort by some criteria, let's say date, can I make documents of type c always appear at the bottom?

One way is to simply use sorting. You could have a string field called type_c with sortMissingFirst=true (see the example schema). Index yes for all documents that are of type c. Then to sort by date, use sort=type_c desc, date desc. If one needed to put type-c docs at the top or bottom, then index 1 for type c and 2 for other types, and then sort asc or desc as needed. -Yonik http://www.lucidimagination.com

So effectively I want one kind of record to always appear at the bottom, since they don't have valid data, whether the sort is ascending or descending. Would a function query help here? Or is it even possible? Thanks, Preetam
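Yonik's sentinel-value trick can be sanity-checked outside Solr. A minimal Python sketch of the two-key sort it produces (doc ids, field name type_rank, and dates are all invented for illustration):

```python
# Sketch of the two-key sort from Yonik's tip: give type-c docs a lower
# sentinel value (1) and everything else 2, then sort by (type_rank, date)
# descending, so type-c docs sink to the bottom regardless of their date.
docs = [
    {"id": "b1", "type_rank": 2, "date": "2009-05-01"},
    {"id": "c1", "type_rank": 1, "date": "2009-08-01"},  # type c: newest, but must sink
    {"id": "a1", "type_rank": 2, "date": "2009-03-01"},
]

# Equivalent of sort=type_rank desc, date desc
ranked = sorted(docs, key=lambda d: (d["type_rank"], d["date"]), reverse=True)
print([d["id"] for d in ranked])  # the type-c doc lands last despite its date
```

Since the type key always dominates the date key, the type-c documents stay at the bottom whichever direction the date tiebreak runs.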
how to selectively sort records keeping some at the bottom always.. ?
Hi, If I have documents of type a, b and c, but when I sort by some criteria, let's say date, can I make documents of type c always appear at the bottom? So effectively I want one kind of record to always appear at the bottom, since they don't have valid data, whether the sort is ascending or descending. Would a function query help here? Or is it even possible? Thanks, Preetam
Re: performance implications on using lots of values in fq
I don't have much idea of the performance of this many fqs, since I have usually used a very small number of them. But I am passing along my thoughts hoping it might help (since I did not see any response :-)

a) The filter cache size needs to be larger, so that the fqs can be cached. If an fq is not in the cache, AFAIK, each fq produces one Lucene query.
b) If the fqs are in the cache, the operation involving them reduces to intersecting N bit sets, where N is the number of fqs.

In the worst case, N fqs boil down to N Lucene queries and N bitset intersections. Just a wild guess - if you are doing something with radius search or a similar search involving lat/longs, you can try using LocalSolr, which takes care of all the details for you. -- Preetam

On Wed, Jul 23, 2008 at 11:58 PM, briand [EMAIL PROTECTED] wrote: I have documents in SOLR such that each document contains one to many points (latitudes and longitudes). Currently we store the multiple points for a given document in the db and query the db to find all of the document ids around a given point first. Once we have the list of ids, we populate the fq with those ids and the q value and send that off to SOLR to do a search. In the longest query to SOLR we're populating about 450 ids into the fq parameter at this time. I was wondering if anyone knows the performance implications of passing so many ids into the fq and when it would potentially be a problem for SOLR? Currently the query passing in 450 ids is not a problem at all and returns in less than a second. Thanks. -- View this message in context: http://www.nabble.com/performance-implications-on-using-lots-of-values-in-fq-tp18617397p18617397.html Sent from the Solr - User mailing list archive at Nabble.com.
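The intersection step described above can be sketched with plain Python sets standing in for Solr's cached filter bitsets (the filter names and doc ids are invented):

```python
# Each cached fq resolves to a set of matching doc ids - a stand-in for
# Solr's DocSet bitsets. An uncached fq would first cost one Lucene query
# to build its set; after that, each fq is just one intersection.
filter_cache = {
    "type:home":  {1, 2, 3, 5, 8, 13},
    "city:omaha": {2, 3, 5, 8},
}
q_hits = {1, 2, 3, 5, 21}          # docs matched by the main query q

results = set(q_hits)
for fq_docs in filter_cache.values():
    results &= fq_docs             # one bitset intersection per fq
print(sorted(results))
```

With N cached fqs this is exactly the "N bitset intersections" worst case from the message, minus the N Lucene queries that the cache avoids.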
Re: Solr/Lucene search term stats
hi, try using faceted search: http://wiki.apache.org/solr/SimpleFacetParameters

Something like facet=true&facet.query=title:(web2.0 OR ajax)

facet.query gives the number of matching documents for a query. You can run the examples in the above link and see how it works. You can also try using facet.field, which enumerates all the terms found in a given field and also tells how many documents contained each term. For both of the above, the set of documents it acts on is the result set of q. So if you want to get the facets for all documents, try q=*:*

On Tue, Jul 22, 2008 at 1:43 PM, Sunil [EMAIL PROTECTED] wrote: Hi All, I am working on a module using Solr, where I want to get the stats of each keyword found in each field. If my search term is: (title:(web2.0 OR ajax) OR description:(web2.0 OR ajax)) Then I want to know how many times web2.0/ajax were found in title or description. Any suggestion on how to get this information (apart from hl=true variable). Thanks, Sunil
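A facet request like the one above has to be URL-encoded when sent over HTTP, and facet.query may repeat once per query to count. A small sketch building the parameter string (the host, port, and field names are only examples):

```python
from urllib.parse import urlencode

# Build the facet request described above. Repeated facet.query params are
# allowed, so pass a list of (key, value) pairs rather than a dict.
params = [
    ("q", "*:*"),                                     # facet over all documents
    ("facet", "true"),
    ("facet.query", "title:(web2.0 OR ajax)"),
    ("facet.query", "description:(web2.0 OR ajax)"),
]
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Each facet.query comes back in the response with its own matching-document count, which is the per-field stat Sunil is after (a count of matching documents, not of term occurrences).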
Re: Filter by Type increases search results.
I have used fq the way it is with dismax and it works fine. fq is a standard parameter and not specific to dismax. So type:idea should work correctly. - Preetam

On Fri, Jul 18, 2008 at 11:30 AM, chris sleeman [EMAIL PROTECTED] wrote: btw, this *seems* to only work for me with the standard search handler. dismax and fq: don't seem to get along nicely... Wouldn't the dismax parser consider the filter query parameter as type idea and not value idea for solr field - type? I guess that's the reason this query doesn't work with dismax the way it works with the standard search handler. You can add a debugQuery=true parameter to check the actual parsed query. -Chris

On Tue, Jul 15, 2008 at 10:47 PM, Yonik Seeley [EMAIL PROTECTED] wrote: On Tue, Jul 15, 2008 at 11:10 AM, Norberto Meijome [EMAIL PROTECTED] wrote: On Tue, 15 Jul 2008 18:07:43 +0530 Preetam Rao [EMAIL PROTECTED] wrote: When I say filter, I meant q=fish&fq=type:idea

btw, this *seems* to only work for me with the standard search handler. dismax and fq: don't seem to get along nicely... but maybe it is just late and I'm not testing it properly.. It should work the same... the only thing dismax does differently now is change the type of the base query to dismax. -Yonik

-- Bill Cosby - Advertising is the most fun you can have with your clothes on.
Re: Multiple query fields in DisMax handler
If I understand the question correctly, you can provide init params, default params and invariant params in the appropriate request handler section in solrconfig.xml. So you can create a standard request handler with name dismaxL, whose defType is dismax, and set all parameters in its defaults section. Preetam

On Thu, Jul 17, 2008 at 4:35 PM, chris sleeman [EMAIL PROTECTED] wrote: Thanks a lot..this is, more or less, what I was looking for. However, is there a way to pre-configure the dismax query parser, with parameters like qf, pf, boost etc., in solrconfig.xml, rather than doing so at query time. So my actual query would look like http://localhost:8983/solr/select?q=query&fq={!dismaxL}CA&debugQuery=true, where dismaxL refers to a query parser defined in solrconfig, with all the necessary parameters. The q parameter would then use the default dismax parser defined for the handler and fq would use dismaxL. Regards, Chris

On Thu, Jul 17, 2008 at 5:15 AM, Erik Hatcher [EMAIL PROTECTED] wrote: On Jul 16, 2008, at 7:38 PM, Ryan McKinley wrote: (assuming you are using 1.3-dev), you could use the dismax query parser syntax for the fq param. I think it is something like: fq=!dismax your query

The latest committed syntax is: {!dismax qf=}your query For example, with the sample data: http://localhost:8983/solr/select?q=*:*&fq={!dismax qf="name"}ipod&debugQuery=true

I can't find the syntax now (Yonik?) but I don't know how you could pull out the qf, pf, etc. fields for the fq portion vs the q portion. You can add parameters like the qf above, within the {!dismax ... } area.
Erik
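A request-handler registration along these lines would pre-set the dismax parameters so the query does not have to repeat them. This is only a sketch: the handler name, field names, and boosts below are invented, loosely following the stock example solrconfig.xml:

```xml
<!-- Hypothetical handler in solrconfig.xml: defType=dismax plus defaults,
     so requests need not repeat qf/pf/tie at query time. -->
<requestHandler name="dismaxL" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">name^2.0 description</str>
    <str name="pf">name^4.0</str>
    <str name="tie">0.1</str>
  </lst>
</requestHandler>
```

Requests routed to this handler then pick up defType=dismax and the qf/pf defaults; whether a {!dismaxL} local-params reference can also see them is exactly the open question in this thread.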
Re: Multiple query fields in DisMax handler
I see that a QParser takes local params (those given via {!...}) as well as request params. It sets up the lookup chain as local params followed by request params. AFAIK, the request param lookup chain is set up as: those given in the url explicitly, then invariants, then defaults given in solrconfig for the corresponding request handler. Since you are not using dismax params for the main query and just want them to be available for the dismax parser, there are no conflicts, and I think you can set the qf, bf etc. in the named standard request handler that you are configuring in solrconfig, and the dismax query parser will automatically pick them up. -- Preetam

On Thu, Jul 17, 2008 at 5:48 PM, Erik Hatcher [EMAIL PROTECTED] wrote: A custom QParserPlugin could be created and implement an #init(NamedList) which you could parameterize via its solrconfig.xml configuration. That would be one way. Another trick, I think, would be to use request parameter substitution. The javadocs here might lead you to what you're after: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html I've not tinkered with this stuff myself other than a bit of trying to grok the capabilities Yonik built into this stuff, so having folks post back their experiences would be helpful to us all :) Erik

On Jul 17, 2008, at 8:11 AM, chris sleeman wrote: What I actually meant was whether or not I could create a configuration for a dismax query parser and then refer to it in my filter query. I already have a standard request handler with a dismax defType for my query field. I wanted to use another dismax parser for the fq param, along the lines of what Ryan and Erik had suggested. I just don't want to specify all the params for this dismax at query time.
My actual query would then simply look like http://localhost:8983/solr/select?q=*:*&fq={!dismaxL}CA, instead of specifying all the qf, pf, etc. fields as part of the dismax syntax within the query. Regards, Chris

On Thu, Jul 17, 2008 at 5:18 PM, Preetam Rao [EMAIL PROTECTED] wrote: If I understand the question correctly, you can provide init params, default params and invariant params in the appropriate request handler section in solrconfig.xml. So you can create a standard request handler with name dismaxL, whose defType is dismax, and set all parameters in its defaults section. Preetam

On Thu, Jul 17, 2008 at 4:35 PM, chris sleeman [EMAIL PROTECTED] wrote: Thanks a lot..this is, more or less, what I was looking for. However, is there a way to pre-configure the dismax query parser, with parameters like qf, pf, boost etc., in solrconfig.xml, rather than doing so at query time. So my actual query would look like http://localhost:8983/solr/select?q=query&fq={!dismaxL}CA&debugQuery=true, where dismaxL refers to a query parser defined in solrconfig, with all the necessary parameters. The q parameter would then use the default dismax parser defined for the handler and fq would use dismaxL. Regards, Chris

On Thu, Jul 17, 2008 at 5:15 AM, Erik Hatcher [EMAIL PROTECTED] wrote: On Jul 16, 2008, at 7:38 PM, Ryan McKinley wrote: (assuming you are using 1.3-dev), you could use the dismax query parser syntax for the fq param.
I think it is something like: fq=!dismax your query The latest committed syntax is: {!dismax qf=}your query For example, with the sample data: http://localhost:8983/solr/select?q=*:*&fq={!dismax qf="name"}ipod&debugQuery=true

I can't find the syntax now (Yonik?) but I don't know how you could pull out the qf, pf, etc. fields for the fq portion vs the q portion. You can add parameters like the qf above, within the {!dismax ... } area. Erik

-- Yogi Berra - A nickel ain't worth a dime anymore.
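The parameter lookup chain Preetam describes (local params consulted before request params, which in turn fall back to handler defaults) behaves like a layered map. A toy Python illustration with made-up parameter values (this models the described precedence, not actual Solr code):

```python
from collections import ChainMap

# Each layer overrides the ones after it, mirroring the described chain:
# {!...} local params -> request params from the URL -> solrconfig defaults.
local_params   = {"qf": "city^4"}                       # from {!dismax qf=...}
request_params = {"q": "new york"}                      # from the URL
handler_defaults = {"qf": "name description", "pf": "name"}  # from solrconfig.xml

params = ChainMap(local_params, request_params, handler_defaults)
print(params["qf"])   # local value wins over the solrconfig default
print(params["pf"])   # falls through to the handler default
```

This is why a dismax fq can pick up qf from the handler's defaults section when the local params omit it, yet still be overridden inline with {!dismax qf=...}.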
Re: Multiple query fields in DisMax handler
Oops.. this will only help you configure the defaults common to the main dismax query as well as the fq dismax query. For creating two sets of named dismax parsers that read params from solrconfig, I think one can extend the dismax QParser's currently empty init() method to set up the param hierarchy at query time such that the lookup chain becomes: local params, then qparser init params, and then request params. It is along similar lines to Erik's suggestion. But I am not sure if we want to create a custom parser for this or make it a general behavior.. --- preetam

On Thu, Jul 17, 2008 at 6:21 PM, Preetam Rao [EMAIL PROTECTED] wrote: I see that a QParser takes local params (those given via {!...}) as well as request params. It sets up the lookup chain as local params followed by request params. AFAIK, the request param lookup chain is set up as: those given in the url explicitly, then invariants, then defaults given in solrconfig for the corresponding request handler. Since you are not using dismax params for the main query and just want them to be available for the dismax parser, there are no conflicts, and I think you can set the qf, bf etc. in the named standard request handler that you are configuring in solrconfig, and the dismax query parser will automatically pick them up. -- Preetam

On Thu, Jul 17, 2008 at 5:48 PM, Erik Hatcher [EMAIL PROTECTED] wrote: A custom QParserPlugin could be created and implement an #init(NamedList) which you could parameterize via its solrconfig.xml configuration. That would be one way. Another trick, I think, would be to use request parameter substitution.
The javadocs here might lead you to what you're after: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html I've not tinkered with this stuff myself other than a bit of trying to grok the capabilities Yonik built into this stuff, so having folks post back their experiences would be helpful to us all :) Erik

On Jul 17, 2008, at 8:11 AM, chris sleeman wrote: What I actually meant was whether or not I could create a configuration for a dismax query parser and then refer to it in my filter query. I already have a standard request handler with a dismax defType for my query field. I wanted to use another dismax parser for the fq param, along the lines of what Ryan and Erik had suggested. I just don't want to specify all the params for this dismax at query time. My actual query would then simply look like http://localhost:8983/solr/select?q=*:*&fq={!dismaxL}CA, instead of specifying all the qf, pf, etc. fields as part of the dismax syntax within the query. Regards, Chris

On Thu, Jul 17, 2008 at 5:18 PM, Preetam Rao [EMAIL PROTECTED] wrote: If I understand the question correctly, you can provide init params, default params and invariant params in the appropriate request handler section in solrconfig.xml. So you can create a standard request handler with name dismaxL, whose defType is dismax, and set all parameters in its defaults section. Preetam

On Thu, Jul 17, 2008 at 4:35 PM, chris sleeman [EMAIL PROTECTED] wrote: Thanks a lot..this is, more or less, what I was looking for. However, is there a way to pre-configure the dismax query parser, with parameters like qf, pf, boost etc., in solrconfig.xml, rather than doing so at query time.
So my actual query would look like http://localhost:8983/solr/select?q=query&fq={!dismaxL}CA&debugQuery=true, where dismaxL refers to a query parser defined in solrconfig, with all the necessary parameters. The q parameter would then use the default dismax parser defined for the handler and fq would use dismaxL. Regards, Chris

On Thu, Jul 17, 2008 at 5:15 AM, Erik Hatcher [EMAIL PROTECTED] wrote: On Jul 16, 2008, at 7:38 PM, Ryan McKinley wrote: (assuming you are using 1.3-dev), you could use the dismax query parser syntax for the fq param. I think it is something like: fq=!dismax your query The latest committed syntax is: {!dismax qf=}your query For example, with the sample data: http://localhost:8983/solr/select?q=*:*&fq={!dismax qf="name"}ipod&debugQuery=true

I can't find the syntax now (Yonik?) but I don't know how you could pull out the qf, pf, etc.
Dismax request handler and sub phrase matches... suggestion for another handler..
Hi, Apologies if you are receiving this a second time... having a tough time with the mail server..

I take a user-entered query as-is and run it with the dismax query handler. The document fields have been filled from structured data, where different fields hold different attributes like number of beds, number of baths, city name, etc. A sample user query would look like "3 bed homes in new york". I would like this to match against city:new york and beds:3 beds. When I use the dismax handler with boosts and the tie parameter, I do not always get the most relevant top 10 results, because there seem to be many factors in play, one of which is not being able to recognize the presence of sub-phrases, and another not being able to ignore unwanted matches in unwanted fields.

What are your thoughts on having one more request handler like dismax, but which uses a sub-phrase query instead of a dismax query? It would also provide the below parameters, on a per-field basis, to help customize the behavior of the request handler and give more flexibility in different scenarios:

- phraseBoost - how much better a 3-word sub-phrase match is than a 2-word sub-phrase match
- useOnlyMaxMatch - if many sub-phrases match in the field, only the best score is used
- ignoreDuplicates - if a field has duplicate matches, pick only one match for scoring
- matchOnlyOneField - if a match is found in the first field, remove the matched terms while querying the other fields. For example, for me a city match is more important than matches in other fields. So, I do not want the "new" in "new york" to match all the other fields and skew the results, which is what I am seeing with dismax, irrespective of the high boosts.
- ignoreSomeLuceneScoreFactors - ignore the Lucene tf, idf, query norm or any such criteria which is not needed for this field, since if I want exact matches only, they are really not important. They also seem to play a big role in me not being able to get the most relevant top 10 results.
I see this handler might be useful in the below use cases:

a) the data is mostly exact, in that I am not trying to search on free text like mails, reviews, articles, web pages etc.
b) numbers and their bindings are important
c) exact phrase or sub-phrase matches are more important than rankings derived from tf, idf, query norm etc.
d) there is a need to make sure that in some cases some fields affect the scoring and in some they don't. I found this was the most difficult task, separating the noise matches from the required ones for my use case.

Your thoughts and suggestions on alternatives are welcome. I have also posted a question on sub-phrase matching on lucene-user, which is not related to having a solr handler with additional features like sub-phrase matching for user-entered queries. Thanks Preetam
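None of the proposed parameters exist in Solr; to make the proposal concrete, here is a toy Python sketch of what phraseBoost and useOnlyMaxMatch might mean. The function, its scoring formula, and the sample strings are all invented for illustration:

```python
# Toy illustration (not Solr code) of the proposed sub-phrase scoring:
# find query sub-phrases (2+ words) present in a field, and boost longer
# matches harder via phrase_boost ** (length - 1).
def subphrase_score(query, field_text, phrase_boost=2.0, use_only_max=True):
    q_words = query.lower().split()
    f_text = " " + field_text.lower() + " "
    scores = []
    for length in range(len(q_words), 1, -1):          # longest sub-phrases first
        for start in range(len(q_words) - length + 1):
            phrase = " ".join(q_words[start:start + length])
            if " " + phrase + " " in f_text:
                # a 3-word match outscores a 2-word match by phrase_boost
                scores.append(phrase_boost ** (length - 1))
    if not scores:
        return 0.0
    # useOnlyMaxMatch: keep just the best sub-phrase score for the field
    return max(scores) if use_only_max else sum(scores)

# "new york" (2 words) matches the city field as a sub-phrase
print(subphrase_score("3 bed homes in new york", "city: new york"))
```

A real implementation would live in a QParser and use Lucene phrase queries, but the scoring shape - longer contiguous matches dominate, term frequency ignored - is the behavior the proposal asks for.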
Re: Dismax request handler and sub phrase matches... suggestion for another handler..
I agree. If we do decide to implement another kind of request handler, it should be through the StandardRequestHandler defType attribute, which selects the registered QParser that generates the appropriate queries for Lucene. Preetam

On Tue, Jul 15, 2008 at 3:59 PM, Erik Hatcher [EMAIL PROTECTED] wrote: On Jul 15, 2008, at 4:45 AM, Preetam Rao wrote: What are your thoughts on having one more request handler like dismax, but which uses a sub-phrase query instead of a dismax query? It'd be better to just implement a QParser(Plugin) such that the StandardRequestHandler can use it (defType=dismax, for example). No need to have additional actual request handlers just to swap out query parsing logic anymore. Erik
Re: Filter by Type increases search results.
Hi Matt, Other than applying one more fq, does everything else remain the same between the two queries, like q and all the other parameters?

My understanding is that fq is an intersection with the set of results returned for q. So it should always be a subset of the results returned for q. So if one query uses just q, and the other uses q and fq, for the same q, the second will have an equal or smaller number of documents. Preetam

On Tue, Jul 15, 2008 at 4:10 PM, matt connolly [EMAIL PROTECTED] wrote: I'm using Solr with a Drupal site, and one of the fields in the schema is type. In my example development site, searching for the word fish returns 2 documents, one type='story', and the other type='idea'. If I filter by type:idea then I get 9 results, the correct first result, followed by 8 results that are of type='idea' but do not use the word fish at all. I have completely disabled synonyms (and rebuilt indexes) and this makes no difference. Any ideas why filtering the type results in more search documents matched? -- View this message in context: http://www.nabble.com/Filter-by-Type-increases-search-results.-tp18462188p18462188.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Filter by Type increases search results.
Hi Matt, When I say filter, I meant q=fish&fq=type:idea. What you are trying is a boolean OR of defaultSearchField:fish OR type:idea. It's not a filter, it's an OR. Obviously you will get a union of results... -- Preetam

On Tue, Jul 15, 2008 at 5:37 PM, matt connolly [EMAIL PROTECTED] wrote: Yes, the same, except for the filter. For example: http://localhost:8983/solr/select?q=fish returns: result name=response numFound=2 start=0 etc (followed by 2 docs) http://localhost:8983/solr/select?q=fish+type:idea returns: result name=response numFound=9 start=0 (followed by 9 docs) -Matt

Preetam Rao wrote: Hi Matt, Other than applying one more fq, does everything else remain the same between the two queries, like q and all the other parameters? My understanding is that fq is an intersection with the set of results returned for q. So it should always be a subset of the results returned for q. So if one query uses just q, and the other uses q and fq, for the same q, the second will have an equal or smaller number of documents. Preetam -- View this message in context: http://www.nabble.com/Filter-by-Type-increases-search-results.-tp18462188p18463448.html Sent from the Solr - User mailing list archive at Nabble.com.
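The difference Matt hit can be shown with doc-id sets (the ids below are invented, not his actual data): fq intersects, so it can never grow the result set, while OR-ing the extra clause into q unions, which can:

```python
# Invented doc ids: a couple of docs match "fish", several are type:idea.
fish  = {10, 11}
ideas = {11, 20, 21, 22, 23, 24, 25, 26, 27}

# q=fish&fq=type:idea -> intersection: a true filter, result shrinks
print(len(fish & ideas))

# q=fish type:idea with default OR -> union: "filtering" that grows results
print(len(fish | ideas))
```

The union picks up every type:idea document whether or not it mentions fish, which is exactly the "8 results that do not use the word fish" symptom in the original question.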
Re: estimating memory needed for solr instances...
Oops. Sorry for the typo. I will be careful next time. Thanks a lot for digging out the old thread :-) It was helpful. Should we remove the option useFilterForSortedQuery altogether if it's not being used anymore? --- Preetam

On Fri, Jul 11, 2008 at 2:10 AM, Chris Harris [EMAIL PROTECTED] wrote: I didn't know what option was being referred to here, but I eventually figured it out. If anyone else was confused, the option is called useFilterForSortedQuery, you can set it via solrconfig.xml, and, at least according to Yonik in late 2006, you probably won't want to enable it even if you *do* sort by something other than score: http://www.nabble.com/try-setting-useFilterForSortedQuery-to-false-td7822871.html#a7822871 Cheers, Chris

On Wed, Jul 9, 2008 at 12:00 AM, Preetam Rao [EMAIL PROTECTED] wrote: Since we do not sort the results, the sort will be by score which eliminates the option userFiterFprSortedQuerries.
estimating memory needed for solr instances...
Hi, Since we plan to share the same box among multiple solr instances on a 16GB RAM multi-core box, we need to estimate how much memory we need for our application. The index size on disk is 2.4GB, with close to 3 million documents. The plan is to use a dismax query with some fqs. Since we do not sort the results, the sort will be by score, which eliminates the option userFiterFprSortedQuerries.

Thus, assuming all q's will use the query result cache and all fqs will use the filter cache, the below is what I am thinking. I would like to know how to relate the index size on disk to its memory size. Would it be safe to assume, given the disk size of 2.4GB, that we can allow RAM for the whole index, plus 1GB for any other overhead, plus the cache size, which comes to 150MB (calculation below)? Thus making it around 4GB.

cache size calculation:

query result cache - size = 50K; since we paginate the results and each page has 10 items, and assuming each user will at most see 3 pages, we will set queryResultWindowSize to 30. Assuming this, for 50K queries we will use up 50K * 30 bits = 187KB, assuming results are stored in bitsets.

we use a few common fqs, let's say 200. Assuming each returns around 30K documents, it adds up to 200 * 30K bits = 750KB.

If we use a document cache of size 20K, assuming each document size is around 5KB at the max, it will take up 20K * 5KB = 100MB.

Thus we can increase the caches more drastically and still they will use up only 150MB or less. Is this reasoning on the caches correct? Thanks Preetam
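The back-of-envelope numbers above can be rechecked mechanically. A sketch that recomputes them using the message's own assumptions (50K cached queries of 30 doc positions, 200 fqs at ~30K matching docs, 20K cached docs of ~5KB; decimal K/M units to match the message):

```python
# Recompute the cache estimates under the message's assumptions.
KB, MB = 1_000, 1_000_000

# query result cache: 50K entries, 30 doc positions each, ~1 bit per position
query_cache_bytes = 50_000 * 30 // 8
print(query_cache_bytes // KB)    # 187 KB

# filter cache: 200 fqs, each ~30K matching docs at ~1 bit per doc
filter_cache_bytes = 200 * 30_000 // 8
print(filter_cache_bytes // KB)   # 750 KB

# document cache: 20K docs at ~5 KB each
doc_cache_bytes = 20_000 * 5 * KB
print(doc_cache_bytes // MB)      # 100 MB
```

Note these follow the message's "one bit per matching doc" model; a real Solr BitDocSet is sized by the total doc count (here 3 million docs, so ~375KB per cached filter), so the filter-cache figure is a lower bound.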
Re: estimating memory needed for solr instances...
Thanks for the responses, Ian, Jacob. While I could not locate the previous thread, this is what I understand: while we can fine-tune the cache parameters and other settings which we can directly control, with respect to the index files the key is to give enough RAM and let the OS do its best at keeping the index files in memory. -- Preetam

On Thu, Jul 10, 2008 at 7:12 AM, Ian Connor [EMAIL PROTECTED] wrote: I would guess so also, to a point. After you run out of RAM, indexing also takes a hit. I have noticed on a 2GB machine, when the index gets over 2GB, my indexing rate went down from 100/s to 40/s. After reaching 4GB it was down to 10/s. I am trying now with an 8GB machine to see how far I get through my data before slowing down.

On Wed, Jul 9, 2008 at 7:56 PM, Jacob Singh [EMAIL PROTECTED] wrote: My total guess is that indexing is CPU bound, and searching is RAM bound. Best, Jacob

Ian Connor wrote: There was a thread a while ago that suggested you just need to factor in the index's total size (Mike Klaas I think was the author). It was suggested that having the RAM is enough and the OS will cache the files as needed to give you the performance boost needed. If I misread the thread, please chime in - but it seems having enough RAM is the key to performance.

On Wed, Jul 9, 2008 at 3:00 AM, Preetam Rao [EMAIL PROTECTED] wrote: Hi, Since we plan to share the same box among multiple solr instances on a 16GB RAM multi-core box, we need to estimate how much memory we need for our application. The index size on disk is 2.4GB, with close to 3 million documents. The plan is to use a dismax query with some fqs. Since we do not sort the results, the sort will be by score, which eliminates the option userFiterFprSortedQuerries. Thus, assuming all q's will use the query result cache and all fqs will use the filter cache, the below is what I am thinking. I would like to know how to relate the index size on disk to its memory size?
Would it be safe to assume, given the disk size of 2.4GB, that we can allow RAM for the whole index, plus 1GB for any other overhead, plus the cache size, which comes to 150MB (calculation below)? Thus making it around 4GB. cache size calculation: query result cache - size = 50K; since we paginate the results and each page has 10 items, and assuming each user will at most see 3 pages, we will set queryResultWindowSize to 30. Assuming this, for 50K queries we will use up 50K * 30 bits = 187KB, assuming results are stored in bitsets. we use a few common fqs, let's say 200. Assuming each returns around 30K documents, it adds up to 200 * 30K bits = 750KB. If we use a document cache of size 20K, assuming each document size is around 5KB at the max, it will take up 20K * 5KB = 100MB. Thus we can increase the caches more drastically and still they will use up only 150MB or less. Is this reasoning on the caches correct? Thanks Preetam
Re: Integrate Solr with Tomcat in Linux
Set the solr home folder as follows: if you are using a jndi name for solr.home, or a command line argument for solr.home, then Solr will look for the conf and lib folders under that folder. If you are not using a jndi name, then it looks for the solr/conf and solr/lib folders under the current directory, which is the directory you started tomcat from. You can also get the conf and lib folders from the distribution's example folder. Hope this helps. Thanks Preetam

On Wed, Jul 9, 2008 at 9:28 AM, Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED] wrote: The context 'solr' is not initialized. The most likely reason is that you have not set the solr.home correctly. --Noble

On Wed, Jul 9, 2008 at 3:24 AM, sandeep kaur [EMAIL PROTECTED] wrote: Hi, As I am running tomcat after copying the solr files to the appropriate tomcat directories, I am getting the following error in the catalina log:

Jul 8, 2008 10:30:02 PM org.apache.catalina.core.AprLifecycleListener init INFO: The Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /usr/java/jdk1.6.0_06/jre/lib/i386/client:/usr/java/jdk1.6.0_06/jre/lib/i386:/usr/java/jdk1.6.0_06/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
Jul 8, 2008 10:30:02 PM org.apache.coyote.http11.Http11Protocol init INFO: Initializing Coyote HTTP/1.1 on http-8080
Jul 8, 2008 10:30:02 PM org.apache.catalina.startup.Catalina load INFO: Initialization processed in 285 ms
Jul 8, 2008 10:30:02 PM org.apache.catalina.core.StandardService start INFO: Starting service Catalina
Jul 8, 2008 10:30:02 PM org.apache.catalina.core.StandardEngine start INFO: Starting Servlet Engine: Apache Tomcat/6.0.9
Jul 8, 2008 10:30:02 PM org.apache.solr.servlet.SolrDispatchFilter init INFO: SolrDispatchFilter.init()
Jul 8, 2008 10:30:02 PM org.apache.solr.core.Config getInstanceDir INFO: Using JNDI solr.home: /home/user_name/softwares
Jul 8, 2008 10:30:02 PM org.apache.solr.core.Config setInstanceDir INFO: Solr home set to
'/home/user_name/softwares/'
Jul 8, 2008 10:30:02 PM org.apache.catalina.core.StandardContext start SEVERE: Error filterStart
Jul 8, 2008 10:30:02 PM org.apache.catalina.core.StandardContext start SEVERE: Context [/solr] startup failed due to previous errors
Jul 8, 2008 10:30:03 PM org.apache.coyote.http11.Http11Protocol start INFO: Starting Coyote HTTP/1.1 on http-8080
Jul 8, 2008 10:30:03 PM org.apache.jk.common.ChannelSocket init INFO: JK: ajp13 listening on /0.0.0.0:8009
Jul 8, 2008 10:30:03 PM org.apache.jk.server.JkMain start INFO: Jk running ID=0 time=0/30 config=null
Jul 8, 2008 10:30:03 PM org.apache.catalina.startup.Catalina start INFO: Server startup in 589 ms

In the browser, while typing http://localhost:8080/solr/admin I am getting the following error: HTTP Status 404 - /solr/admin type Status report message /solr/admin description The requested resource (/solr/admin) is not available. Apache Tomcat/6.0.9. Could anyone please suggest how to resolve this error. Thanks, Sandip

--- On Tue, 8/7/08, Shalin Shekhar Mangar [EMAIL PROTECTED] wrote: From: Shalin Shekhar Mangar [EMAIL PROTECTED] Subject: Re: Integrate Solr with Tomcat in Linux To: solr-user@lucene.apache.org, [EMAIL PROTECTED] Date: Tuesday, 8 July, 2008, 4:40 PM Take a look at http://wiki.apache.org/solr/SolrTomcat Please avoid replying to an older message when you're starting a new topic.

On Tue, Jul 8, 2008 at 4:36 PM, sandeep kaur [EMAIL PROTECTED] wrote: Hi, I have solr with jetty as the server application running on Linux. Could anyone please tell me the changes I need to make to integrate Tomcat with solr on Linux. Thanks, Sandip

--- On Mon, 7/7/08, Benson Margulies [EMAIL PROTECTED] wrote: From: Benson Margulies [EMAIL PROTECTED] Subject: Re: js client To: [EMAIL PROTECTED], solr-user solr-user@lucene.apache.org Date: Monday, 7 July, 2008, 11:43 PM The Javascript should have the right URL automatically if you get it from the ?js URL.
Anyway, I think I was the first person to say 'stupid' about that WSDL in the sample. I'm not at all clear on what you are doing at this point. Please send along the URL that works for you in soapUI and the URL that works for you in the <script>...</script> element.

On Mon, Jul 7, 2008 at 5:54 AM, Christine Karman [EMAIL PROTECTED] wrote: On Sun, 2008-07-06 at 10:25 -0400, Benson Margulies wrote: In the sample, it is a relative URL to the web service endpoint. The sample starts from a stupid WSDL with silly names for the service and the port. I'm sorry about using the word stupid. Take your endpoint deployment URL, the very URL that is logged when your
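Going back to the Tomcat solr.home discussion above: one common way to set the JNDI solr/home variable (along the lines of the SolrTomcat wiki approach) is a per-application context fragment. The file location and all paths below are only examples:

```xml
<!-- conf/Catalina/localhost/solr.xml (example paths): declares the webapp
     and the solr/home JNDI variable that Solr looks up at startup. -->
<Context docBase="/opt/solr/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/opt/solr/home" override="true"/>
</Context>
```

With this in place the catalina log should show "Using JNDI solr.home" pointing at the configured directory, as in Sandip's log above; the 404 on /solr/admin in that log came from the filter failing to start because the home directory lacked a usable conf folder.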