Re: Documents in facet results
Dear community, I'm wondering if there is a clean solution to my rather interesting problem. The following facet query results in a list of all facets and the number of documents matching each facet, as seen below.

Query:

  <str name="q">*:*</str>
  <str name="facet.limit">5</str>
  <str name="facet.field">en_atmosphere</str>
  <str name="rows">0</str>

Results:

  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="en_atmosphere">
        <int name="Snug and pleasant">675</int>
        <int name="Authentic">385</int>
        <int name="Modern and functional">378</int>
        <int name="Romantic">374</int>
        <int name="Modest">339</int>
      </lst>
    </lst>
  </lst>

Now I would like to have the documents as child nodes of the various facet fields, so that the result is something similar to:

  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="en_atmosphere">
        <docs facet="Snug and pleasant">
          <doc>...</doc>
          <doc>...</doc>
        </docs>
        <docs facet="Authentic">
          <doc>...</doc>
          <doc>...</doc>
        </docs>
        ...
      </lst>
    </lst>
  </lst>

Of course it would be possible to send a separate query for each facet to get the corresponding docs, or I can parse the response XML, but it would be more efficient if Solr could return the result as above. Thanks!

-- Jeffrey Gelens, Webengineer, Buyways B.V., Friesestraatweg 215c, 9743 AD Groningen, http://www.buyways.nl, Tel. 050 853 6600, Fax 050 853 6601, KvK 01074105

Probably the quickest way would be to write another XSLT transform to reformat the results to match your requirements, then add extra query params to call the transform:

  <str name="wt">xslt</str>
  <str name="tr">reformat.xsl</str>

Fergus.

=== Fergus McMenemie, Email: fer...@twig.me.uk, Techmore Ltd, Phone: (UK) 07721 376021, Unix/Mac/Intranets Analyst Programmer ===
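Until Solr can return documents nested under the facets directly, the one-query-per-facet fallback mentioned in the thread can be scripted against the XML response. A minimal sketch (the sample response is abridged from the thread; the printed query strings are illustrative and are not sent to a live server):

```python
import xml.etree.ElementTree as ET

# Abridged facet section of a Solr XML response, as quoted in the thread.
SAMPLE = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="en_atmosphere">
        <int name="Snug and pleasant">675</int>
        <int name="Authentic">385</int>
      </lst>
    </lst>
  </lst>
</response>"""

def facet_values(response_xml, field):
    """Return (value, count) pairs for one facet field of a Solr XML response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for lst in root.iter("lst"):
        if lst.get("name") == field:
            for entry in lst.findall("int"):
                pairs.append((entry.get("name"), int(entry.text)))
    return pairs

# One follow-up query per facet value would then fetch the matching docs.
for value, count in facet_values(SAMPLE, "en_atmosphere"):
    print('q=en_atmosphere:"%s"&rows=10' % value)
```

This costs one round trip per facet value, which is exactly the overhead the XSLT approach avoids by reshaping a single response server-side.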
Re: Solr vs Sphinx
Something that would be interesting is to share Solr configs for various types of indexing tasks: from a Solr configuration aimed at indexing web pages, to one handling large amounts of text, to one that indexes specific structured data. I could see those being posted on the wiki and helping folks who say "I want to do X, is there an example?". I think most folks start with the example Solr install and tweak from there, which probably isn't the best path... Eric

Yep, a Solr cookbook with lots of different example recipes. However, these would need to be very actively maintained to ensure they always represent best practice. While using Cocoon I made extensive use of the examples section of the Cocoon website. However, most of the (massive number of) examples represent obsolete Cocoon practice, or there were four or five examples doing the same thing in different ways with no text explaining the pros and cons of the different approaches. As a newcomer, this held me back and gave a bad impression of Cocoon.

I was wondering about a performance hints page. I was caught by an issue indexing CSV content where the use of overwrite=false made an almost 3x difference to my indexing speed. Still do not really know why!

On May 15, 2009, at 8:09 AM, Mark Miller wrote: In the spirit of good defaults: I think we should change the Solr highlighter to highlight phrase queries by default, as well as prefix, range, and wildcard constant-score queries. It's awkward to have to tell people they have to turn those on. I'd certainly prefer to have to turn them off if I have some limitation, rather than on. - Mark

Yep, I agree: all whizzy new features should ideally be on by default unless there is a significant performance penalty. It is not enough to issue a default solrconfig.xml with the feature on; it has to be on by default inside the code.

- Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal

Fergus
Re: query regarding Indexing xml files -db-data-config.xml
Hi, you may not need that enclosing entity if you only wish to index one file. baseDir is not required if you give an absolute path in fileName. There is no need to mention forEach or fields if you set useSolrAddSchema="true".

On Sat, May 16, 2009 at 1:23 AM, jayakeerthi s mail2keer...@gmail.com wrote:

Hi All, I am trying to index the fields from XML files. Here is the configuration that I am using.

db-data-config.xml:

  <dataConfig>
    <dataSource type="FileDataSource" name="xmlindex"/>
    <document name="products">
      <entity name="xmlfile" processor="FileListEntityProcessor"
              fileName="c:\test\ipod_other.xml" recursive="true" rootEntity="false"
              dataSource="null" baseDir="${dataimporter.request.xmlDataDir}">
        <entity name="data" processor="XPathEntityProcessor"
                forEach="/record | /the/record/xpath" url="${xmlfile.fileAbsolutePath}">
          <field column="manu" name="manu"/>
        </entity>
      </entity>
    </document>
  </dataConfig>

schema.xml has the field manu. The input XML file used to import the field is:

  <doc>
    <field name="id">F8V7067-APL-KIT</field>
    <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
    <field name="manu">Belkin</field>
    <field name="cat">electronics</field>
    <field name="cat">connector</field>
    <field name="features">car power adapter, white</field>
    <field name="weight">4</field>
    <field name="price">19.95</field>
    <field name="popularity">1</field>
    <field name="inStock">false</field>
  </doc>

Doing the full-import, this is the response I am getting:

  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">0</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2009-05-15 11:58:00</str>
    <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
    <str name="Committed">2009-05-15 11:58:00</str>
    <str name="Optimized">2009-05-15 11:58:00</str>
    <str name="Time taken">0:0:0.172</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>

Am I missing anything here, or is there a required format for the input XML? Please help with resolving this.
Thanks and regards, Jay -- - Noble Paul | Principal Engineer| AOL | http://aol.com
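Applying the three suggestions at the top of this thread (drop the enclosing FileListEntityProcessor entity, drop baseDir, set useSolrAddSchema) would reduce the config to roughly the sketch below. This is untested, and useSolrAddSchema assumes the input file is already in Solr add-document format (<add><doc><field .../></doc></add>); verify both points against the DataImportHandler wiki for your version:

```xml
<dataConfig>
  <dataSource type="FileDataSource" name="xmlindex"/>
  <document>
    <!-- absolute path taken from the original mail; no baseDir needed -->
    <entity name="data" processor="XPathEntityProcessor"
            url="c:\test\ipod_other.xml"
            useSolrAddSchema="true"/>
  </document>
</dataConfig>
```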
Re: Solr memory requirements?
I think that if you have any documents with norms in your index, you will still use norms for those fields even if the schema is changed later. Did you wipe and re-index after all your schema changes? -Peter

On Fri, May 15, 2009 at 9:14 PM, vivek sar vivex...@gmail.com wrote: Some more info. Profiling the heap dump shows org.apache.lucene.index.ReadOnlySegmentReader as the biggest object, taking up almost 80% of total memory (6G); see the attached screenshot for a smaller dump. There are some norms objects; I'm not sure where they are coming from, as I've set omitNorms=true for all indexed fields. I also noticed that if I run a query - let's say a generic query that hits 100 million records - and then follow up with a specific query that hits only 1 record, the second query causes an increase in heap. It looks like a few bytes are being loaded into memory for each document. I've checked the schema: all indexed fields have omitNorms=true, and all caches are commented out. I'm still looking for what else might put things in memory that don't get collected by GC. I also saw https://issues.apache.org/jira/browse/SOLR- for Solr 1.4 (which I'm using); not sure if that can cause any problem. I do use range queries for dates - would that have any effect? Any other ideas? Thanks, -vivek

On Thu, May 14, 2009 at 8:38 PM, vivek sar vivex...@gmail.com wrote: Thanks Mark. I checked all the items you mentioned: 1) I've omitNorms=true for all my indexed fields (stored-only fields I guess don't matter). 2) I've tried commenting out all caches in the solrconfig.xml, but that doesn't help much. 3) I've tried commenting out the first and new searcher listener settings in the solrconfig.xml - the only way that helps is that at startup the memory usage doesn't spike up, and that's only because there is no auto-warming query to run. But I noticed commenting out searchers slows down all other queries to Solr.
4) I don't have any sort or facet in my queries. 5) I'm not sure how to change the Lucene term index interval from Solr - is there a way to do that?

I've been playing around with this memory thing the whole day and have found that it's the search that's hogging the memory. Any time there is a search on all the records (800 million), the heap consumption jumps by 5G. This makes me think there has to be some configuration in Solr that's causing some terms per document to be loaded into memory. I've posted my settings several times on this forum, but no one has been able to pinpoint what configuration might be causing this. If someone is interested I can attach the solrconfig and schema files as well. Here are the settings again, under the query tag:

  <query>
    <maxBooleanClauses>1024</maxBooleanClauses>
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <queryResultWindowSize>50</queryResultWindowSize>
    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
    <HashDocSet maxSize="3000" loadFactor="0.75"/>
    <useColdSearcher>false</useColdSearcher>
    <maxWarmingSearchers>2</maxWarmingSearchers>
  </query>

and schema:

  <field name="id" type="long" indexed="true" stored="true" required="true" omitNorms="true" compressed="false"/>
  <field name="atmps" type="integer" indexed="false" stored="true" compressed="false"/>
  <field name="bcid" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="cmpcd" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="ctry" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="dlt" type="date" indexed="false" stored="true" default="NOW/HOUR" compressed="false"/>
  <field name="dmn" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="eaddr" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="emsg" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="erc" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="evt" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="from" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="lfid" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="lsid" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="prsid" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="rc" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="rmcd" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="rmscd" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="scd" type="string" indexed="true" stored="true" omitNorms="true" compressed="false"/>
  <field name="sip" type="string" indexed="false" stored="true" compressed="false"/>
  <field name="ts" type="date" indexed="true" stored="false" default="NOW/HOUR" omitNorms="true"/>
  <!-- catchall field, containing all other searchable text fields (implemented via copyField further on in this schema) -->
  <field name="all"
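On point (5) above, a hedged pointer: the Lucene term index interval can reportedly be set from solrconfig.xml in the indexDefaults section of Solr 1.4-era configs. Treat the element name and placement as assumptions to verify against your version; the value is illustrative:

```xml
<indexDefaults>
  <!-- every Nth term is held in the in-memory term index;
       a larger value means less heap at the cost of slower term seeks
       (the Lucene default is 128) -->
  <termIndexInterval>1024</termIndexInterval>
</indexDefaults>
```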
Re: Solr memory requirements?
I've never paid attention to the post/commit ratio. I usually do a commit after maybe 100 posts. Is there a guideline about this? Thanks.

On Wed, May 13, 2009 at 1:10 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: 2) ramBufferSizeMB dictates, more or less, how much memory Lucene/Solr will consume during indexing. There is no need to commit every 50K docs unless you want to trigger snapshot creation.
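For reference, both knobs mentioned above live in solrconfig.xml. A hedged sketch for Solr 1.3/1.4-era configs (values are illustrative, not recommendations):

```xml
<indexDefaults>
  <!-- Lucene buffers roughly this much in RAM before flushing a segment;
       this, rather than commit frequency, bounds indexing-time memory -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- optional: let Solr commit on its own instead of committing every N posts -->
  <autoCommit>
    <maxDocs>50000</maxDocs>
    <maxTime>60000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```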
Re: Autocommit blocking adds? AutoCommit Speedup?
Hi Jayson, It is on my list of things to do. I've been having a very busy week and am also working all weekend. I hope to get to it next week sometime, if no one else has taken it. cheers, -mike

On 8-May-09, at 10:15 PM, jayson.minard wrote: First cut of the updated handler is now in: https://issues.apache.org/jira/browse/SOLR-1155 It needs review from those that know Lucene better, and a double check for errors in locking or other areas of the code. Thanks. --j

jayson.minard wrote: Can we move this to patch files within the JIRA issue please? That will make it easier to review and help out, as a patch to current trunk. --j

Jim Murphy wrote: Yonik Seeley-2 wrote: ...your code snippet elided and edited below... Don't take this code as correct (or even compiling), but is this the essence? I moved shared access to the writer inside the read lock and kept the other non-commit bits in the write lock. I'd need to rethink the locking in a more fundamental way, but is this close to the idea?

  public void commit(CommitUpdateCommand cmd) throws IOException {
    if (cmd.optimize) {
      optimizeCommands.incrementAndGet();
    } else {
      commitCommands.incrementAndGet();
    }

    Future[] waitSearcher = null;
    if (cmd.waitSearcher) {
      waitSearcher = new Future[1];
    }

    boolean error = true;
    iwCommit.lock();
    try {
      log.info("start " + cmd);
      if (cmd.optimize) {
        closeSearcher();
        openWriter();
        writer.optimize(cmd.maxOptimizeSegments);
      }
    } finally {
      iwCommit.unlock();
    }

    iwAccess.lock();
    try {
      writer.commit();
    } finally {
      iwAccess.unlock();
    }

    iwCommit.lock();
    try {
      callPostCommitCallbacks();
      if (cmd.optimize) {
        callPostOptimizeCallbacks();
      }
      // open a new searcher in the sync block to avoid opening it
      // after a deleteByQuery changed the index, or in between deletes
      // and adds of another commit being done.
      core.getSearcher(true, false, waitSearcher);
      // reset commit tracking
      tracker.didCommit();
      log.info("end_commit_flush");
      error = false;
    } finally {
      iwCommit.unlock();
      addCommands.set(0);
      deleteByIdCommands.set(0);
      deleteByQueryCommands.set(0);
      numErrors.set(error ? 1 : 0);
    }

    // if we are supposed to wait for the searcher to be registered, then we should do it
    // outside of the synchronized block so that other update operations can proceed.
    if (waitSearcher != null && waitSearcher[0] != null) {
      try {
        waitSearcher[0].get();
      } catch (InterruptedException e) {
        SolrException.log(log, e);
      } catch (ExecutionException e) {
        SolrException.log(log, e);
      }
    }
  }

-- View this message in context: http://www.nabble.com/Autocommit-blocking-adds---AutoCommit-Speedup--tp23435224p23457422.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Autocommit blocking adds? AutoCommit Speedup?
Thanks Mike, I'm running it in a few environments that do not have post-commit hooks and so far have not seen any issues. A white-box review will be helpful for spotting things that may rarely occur, or any misuse of internal data structures that I do not know well enough to measure. --j

Mike Klaas wrote: Hi Jayson, It is on my list of things to do. I've been having a very busy week and am also working all weekend. I hope to get to it next week sometime, if no one else has taken it. cheers, -mike

On 8-May-09, at 10:15 PM, jayson.minard wrote: First cut of the updated handler is now in: https://issues.apache.org/jira/browse/SOLR-1155 It needs review from those that know Lucene better, and a double check for errors in locking or other areas of the code. Thanks. --j

-- View this message in context: http://www.nabble.com/Autocommit-blocking-adds---AutoCommit-Speedup--tp23435224p23587440.html Sent from the Solr - User mailing list archive at Nabble.com.
multicore for 20k users?
Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work or should we auto-insert a facet into each query generated by the person? Thanks for any advice, I am very new to solr. Any tiny push in the right direction would be appreciated. Thanks, Chris
Re: multicore for 20k users?
How much overlap is there among the 20k users' documents? If you create a separate index for each of them, will you be indexing 90% of the documents 20K times?

How many total documents could an individual user typically see? How many total distinct documents are you talking about?

Is the indexing strategy the same for all users (the same analysis, etc.)?

Is it actually possible to limit visibility by role rather than by user?

I would start with trying to put everything in one index -- if that is not possible, then look at a multi-core option.

On May 17, 2009, at 5:53 PM, Chris Cornell wrote: Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work or should we auto-insert a facet into each query generated by the person? Thanks for any advice, I am very new to solr. Any tiny push in the right direction would be appreciated. Thanks, Chris
Re: multicore for 20k users?
Thanks for helping Ryan,

On Sun, May 17, 2009 at 7:17 PM, Ryan McKinley ryan...@gmail.com wrote:

> how much overlap is there with the 20k user documents?

There are around 20k users, but each one has anywhere from zero to thousands of documents. The final overlap is unknown, because there is a current set of documents but each user will add documents on the fly (it's like their own personal search engine, in a way).

> if you create a separate index for each of them will you be indexing 90% of the documents 20K times?

Probably more like 5-10%.

> How many total documents could an individual user typically see?

Average is around 100 now, but we want them to be able to add more.

> How many total distinct documents are you talking about? Is the indexing strategy the same for all users? (the same analysis etc)

The indexing strategy is the same for each user.

> Is it actually possible to limit visibility by role rather than user?

No, it has to be by user, since it is a private document set. We just want to save on disk space when there are big documents that are the same across users (based on document checksum).

> I would start with trying to put everything in one index -- if that is not possible, then look at a multi-core option.

OK. Another thing is that we want to allow the user to restrict searches based on when the document was added... if we share an indexed item and insert some attribute into each query (like user:ralph), then it couldn't have date-added-based search. Unless a field was added like date-added-by-ralph, date-added-by-sally (ugh!). Or maybe disk space is cheap and we should just strive for simplicity? Thanks, Chris

On May 17, 2009, at 5:53 PM, Chris Cornell wrote: Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work or should we auto-insert a facet into each query generated by the person?
Thanks for any advice, I am very new to solr. Any tiny push in the right direction would be appreciated. Thanks, Chris
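For the single-shared-index option discussed in this thread, one way to get both the per-user restriction and a date-added filter is to index one Solr document per (user, document) pair and attach filter queries at search time. A sketch of the query side only; the field names user and date_added are illustrative, not from any schema in the thread:

```python
from urllib.parse import urlencode

def user_search_params(user, text, added_after=None):
    """Build Solr query params restricting results to one user's documents.

    With one indexed row per (user, document) pair, date_added can be
    per-user even when the underlying file is shared between users.
    """
    fq = ['user:%s' % user]
    if added_after:
        fq.append('date_added:[%s TO NOW]' % added_after)
    params = [('q', text)] + [('fq', f) for f in fq]
    return urlencode(params)

print(user_search_params('ralph', 'solr', '2009-05-01T00:00:00Z'))
```

The trade-off is the duplicate-storage cost the thread is trying to avoid: a shared big document would be stored once per user unless its body is deduplicated out of band (e.g. stored externally, keyed by the checksum already mentioned above).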
Re: multicore for 20k users?
Chris, Yes, disk space is cheap, and with so little overlap you won't gain much by putting everything in a single index. Plus, when each user has a separate index, it's easy to split users and distribute over multiple machines if you ever need to do that, and it's easy and fast to completely reindex one user's data without affecting other users, etc. Several years ago I built Simpy at http://www.simpy.com/ that way (but pre-Solr, so it uses Lucene directly) and never regretted it. There are way more than 20K users there, with many searches per second and with constant indexing. Each user has an index for bookmarks and an index for notes. Each group has its own index, shared by all group members. The main bookmark search is another index. People search is yet another index. And so on. Single server. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: Chris Cornell srchn...@gmail.com To: solr-user@lucene.apache.org Sent: Sunday, May 17, 2009 8:37:44 PM Subject: Re: multicore for 20k users? Thanks for helping Ryan, On Sun, May 17, 2009 at 7:17 PM, Ryan McKinley wrote: how much overlap is there with the 20k user documents? There are around 20k users but each one has anywhere from zero to thousands of documents. The final overlap is unknown because there is a current set of documents but each user will add documents on the fly (it's like their own personal search engine in a way). if you create a separate index for each of them will you be indexing 90% of the documents 20K times? Probably more like 5-10% How many total documents could an individual user typically see? Average is around 100 now but we want them to be able to add more. How many total distinct documents are you talking about? Is the indexing strategy the same for all users? (the same analysis etc) The indexing strategy is the same for each user. Is it actually possible to limit visibility by role rather than user?
No, it has to be by user since it is a private document set. We just want to save on disk space when there are big documents that are the same across users (based on document checksum). I would start with trying to put everything in one index -- if that is not possible, then look at a multi-core option. OK. Another thing is that we want to allow the user to restrict searches based on when the document was added... if we share an indexed item and insert some attribute into each query (like user:ralph) then it couldn't have date-added-based search. Unless a field was added like date-added-by-ralph, date-added-by-sally (ugh!). Or maybe disk space is cheap and we should just strive for simplicity? Thanks, Chris On May 17, 2009, at 5:53 PM, Chris Cornell wrote: Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work or should we auto-insert a facet into each query generated by the person? Thanks for any advice, I am very new to solr. Any tiny push in the right direction would be appreciated. Thanks, Chris
Re: multicore for 20k users?
A few questions: 1) What is the frequency of inserts? 2) How many cores need to be up and running at any given point? On Mon, May 18, 2009 at 3:23 AM, Chris Cornell srchn...@gmail.com wrote: Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work or should we auto-insert a facet into each query generated by the person? Thanks for any advice, I am very new to solr. Any tiny push in the right direction would be appreciated. Thanks, Chris -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: multicore for 20k users?
On Mon, May 18, 2009 at 8:18 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Chris, As far as I know, AOL is using Solr with lots of cores. What I don't know is how they are handling shutting down of idle cores, which is something you'll need to do if your machine can't handle all cores being open and their data structures being populated at all times. I know I had to do the same for Simpy. :)

We have a custom build of Solr: we do just-in-time automatic loading of cores, and LRU-based unloading of cores when the upper water mark is crossed.

Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: Chris Cornell srchn...@gmail.com To: solr-user@lucene.apache.org Sent: Sunday, May 17, 2009 10:11:10 PM Subject: Re: multicore for 20k users? On Sun, May 17, 2009 at 8:38 PM, Otis Gospodnetic wrote: Chris, Yes, disk space is cheap, and with so little overlap you won't gain much by putting everything in a single index. Plus, when each user has a separate index, it's easy to split users and distribute over multiple machines if you ever need to do that, it's easy and fast to completely reindex one user's data without affecting other users, etc. Several years ago I built Simpy at http://www.simpy.com/ that way (but pre-Solr, so it uses Lucene directly) and never regretted it. There are way more than 20K users there with many searches per second and with constant indexing. Each user has an index for bookmarks and an index for notes. Each group has its own index, shared by all group members. The main bookmark search is another index. People search is yet another index. And so on. Single server.

Thank you very much for your insight and experience; it sounds like we shouldn't be thinking about prematurely optimizing this. Has someone actually used multicore this way, though? With thousands of them?
Independently of advice in that regard, I guess our next step is to explore and create some dummy scenarios/tests to try and stress multicore (search latency is not as much of a factor as memory usage is). I'll report back on any conclusion we come to. Thanks! Chris -- - Noble Paul | Principal Engineer| AOL | http://aol.com
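The just-in-time loading with LRU unloading described above (from the custom AOL build) can be sketched generically. This is an illustration of the pattern, not their actual code; the load/unload callables stand in for whatever opening and closing a core costs:

```python
from collections import OrderedDict

class CoreCache:
    """Keep at most max_open cores loaded; evict the least recently used."""

    def __init__(self, max_open, load, unload):
        self.max_open = max_open
        self.load = load            # callable: name -> core object
        self.unload = unload        # callable: core object -> None
        self.open = OrderedDict()   # name -> core, least recently used first

    def get(self, name):
        if name in self.open:
            self.open.move_to_end(name)       # mark as recently used
            return self.open[name]
        if len(self.open) >= self.max_open:   # upper water mark crossed
            _, evicted = self.open.popitem(last=False)
            self.unload(evicted)              # close the coldest core
        core = self.load(name)                # just-in-time load
        self.open[name] = core
        return core
```

With 20k users and only a few hundred active at once, a cap like this bounds the memory cost of multicore to the working set rather than the user count.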
Re: multicore for 20k users?
On Sun, May 17, 2009 at 8:38 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Chris, Yes, disk space is cheap, and with so little overlap you won't gain much by putting everything in a single index. Plus, when each user has a separate index, it's easy to split users and distribute over multiple machines if you ever need to do that, it's easy and fast to completely reindex one user's data without affecting other users, etc. Several years ago I built Simpy at http://www.simpy.com/ that way (but pre-Solr, so it uses Lucene directly) and never regretted it. There are way more than 20K users there with many searches per second and with constant indexing. Each user has an index for bookmarks and an index for notes. Each group has its own index, shared by all group members. The main bookmark search is another index. People search is yet another index. And so on. Single server.

Thank you very much for your insight and experience; it sounds like we shouldn't be thinking about prematurely optimizing this. Has someone actually used multicore this way, though? With thousands of them? Independently of advice in that regard, I guess our next step is to explore and create some dummy scenarios/tests to try and stress multicore (search latency is not as much of a factor as memory usage is). I'll report back on any conclusion we come to. Thanks! Chris
Re: multicore for 20k users?
Chris, As far as I know, AOL is using Solr with lots of cores. What I don't know is how they are handling shutting down of idle cores, which is something you'll need to do if your machine can't handle all cores being open and their data structures being populated at all times. I know I had to do the same for Simpy. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: Chris Cornell srchn...@gmail.com To: solr-user@lucene.apache.org Sent: Sunday, May 17, 2009 10:11:10 PM Subject: Re: multicore for 20k users? On Sun, May 17, 2009 at 8:38 PM, Otis Gospodnetic wrote: Chris, Yes, disk space is cheap, and with so little overlap you won't gain much by putting everything in a single index. Plus, when each user has a separate index, it's easy to split users and distribute over multiple machines if you ever need to do that, it's easy and fast to completely reindex one user's data without affecting other users, etc. Several years ago I built Simpy at http://www.simpy.com/ that way (but pre-Solr, so it uses Lucene directly) and never regretted it. There are way more than 20K users there with many searches per second and with constant indexing. Each user has an index for bookmarks and an index for notes. Each group has its own index, shared by all group members. The main bookmark search is another index. People search is yet another index. And so on. Single server.

Thank you very much for your insight and experience; it sounds like we shouldn't be thinking about prematurely optimizing this. Has someone actually used multicore this way, though? With thousands of them? Independently of advice in that regard, I guess our next step is to explore and create some dummy scenarios/tests to try and stress multicore (search latency is not as much of a factor as memory usage is). I'll report back on any conclusion we come to. Thanks! Chris
Re: Order document result by face count
Patric - See the "Documents in facet results" thread for a creative method of handling this need with XSLT transformations. Cheers, --bemansell

On May 16, 2009 2:11 AM, patric.wi...@rtl.de wrote: Hello, I've got a little problem. My index contains a formatid which I count in my queries with facet.field:

  select?q=text%3A(TEST)&start=0&rows=100&facet=true&facet.field=formatid&facet.mincount=1&facet.sort=true

The facet fields are sorted by count, but my result is still sorted by the score! Can I change that so all documents are grouped by the facet count?

  <lst name="formatid">
    <int name="1">26</int>
    <int name="24">21</int>
    <int name="2">20</int>
    <int name="20">12</int>
    <int name="27">4</int>
    <int name="12">2</int>
    <int name="26">2</int>
    <int name="3">2</int>
    <int name="38">2</int>
    <int name="41">2</int>
    <int name="35">1</int>
  </lst>

Kind regards, Patric

This is a confidential communication intended only for the named addressees. If you received this communication in error, please notify us and return and delete it without reading it. This e-mail may not be disclosed, copied or distributed in any form without the written permission of the sender. In any case it may not be altered or otherwise changed. Whilst the sender believes that the information is correct at the date of the e-mail, no warranty or representation is given to this effect and no responsibility can be accepted by the sender.
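Besides the XSLT route, the regrouping Patric asks for can be done client-side: fetch the facet counts, then order the documents so that those whose formatid has the highest count come first. A minimal sketch; the document dicts and the score tiebreak are illustrative:

```python
def group_by_facet_count(docs, facet_counts, field):
    """Order docs so those whose field value has the highest facet count come first;
    within a group, fall back to score (descending)."""
    return sorted(docs, key=lambda d: (-facet_counts.get(d[field], 0),
                                       -d.get("score", 0.0)))

docs = [{"id": "a", "formatid": "24", "score": 1.2},
        {"id": "b", "formatid": "1",  "score": 0.9},
        {"id": "c", "formatid": "24", "score": 0.5}]
counts = {"1": 26, "24": 21}
print([d["id"] for d in group_by_facet_count(docs, counts, "formatid")])
```

This only works for documents actually returned in the page, so it is an approximation unless rows covers the full result set.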
Re: Solr core naming convention for multicores
Thank you Otis. One silly question: how would I know that a particular character is forbidden? I think Solr will give me an exception saying that some character is not allowed, right? Thanks, KK.

On Sun, May 17, 2009 at 3:12 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: KK, That should work just fine. Should any of the characters in email addresses turn out to be forbidden, just replace them consistently. For example, if @ turns out to be the problem, you could simply replace it with _. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: KK dioxide.softw...@gmail.com To: solr-user@lucene.apache.org Sent: Saturday, May 16, 2009 3:45:01 AM Subject: Solr core naming convention for multicores

Hi All, I'm trying to set up multicores for Solr [lol, finding the multicore config a bit difficult; any good/simple steps to do the same? any pointers?]. Let me come to the point: essentially what I want is that whenever a person registers for our service, I'll use his mail-id [this is unique] as the core name. I don't know if that's viable or not. As per the wiki example, the creation/registration of a new core is done like this:

  http://localhost:8983/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instance_directory&config=config_file_name.xml&schema=schema_file_name.xml&dataDir=data

This says the name is something like coreX, where X is a number. Is it possible to have a name like, say, alex...@abc.com? If not, maybe I'll have to map the mail-id to some unique number that I'll use as the core name. I don't want to do all this [and don't know how, either], hence my question. Do let me know some smart ways of doing the same. Note: I have to use the mail-id as the unique identifier. Thanks in appreciation. Thanks, KK
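Otis's "replace them consistently" suggestion can be made mechanical up front, rather than waiting for Solr to throw an exception. A sketch; the allowed-character set is an assumption, so check it against how your servlet container and filesystem treat core and directory names:

```python
import re

def core_name_for(email):
    """Derive a consistent, URL/filesystem-safe core name from an email address.

    Any character outside the assumed-safe set [A-Za-z0-9._-] is replaced
    with '_' - the same input always yields the same core name.
    """
    return re.sub(r'[^A-Za-z0-9._-]', '_', email)

print(core_name_for("jane.doe@example.com"))
```

Note the mapping is not injective ("a@b.com" and "a_b.com" collide), so if collisions are a concern, store the email-to-core mapping explicitly rather than deriving it.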
Re: start param for MoreLikeThis?
That's correct - you can paginate/offset MLT results only through the MoreLikeThisHandler, rather than the method you're using (StandardRequestHandler with mlt enabled). Cheers, --bemansell

On May 9, 2009 10:42 AM, jli...@gmail.com wrote: Hi. I'm using the StandardRequestHandler for MoreLikeThis queries. I find that although I can specify how many results I want returned with mlt.count, it seems I cannot specify a start location so that I can paginate the results. Is this the case? Thanks
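A hedged sketch of the handler-based approach: with the MoreLikeThisHandler registered (commonly at /mlt), the ordinary start and rows parameters should page through the similar-document list, since that list is the handler's main response. The path, field name, and id below are illustrative, so verify against your own configuration:

```
/solr/mlt?q=id:12345&mlt.fl=text&start=10&rows=10
```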