Re: optimize is taking too much time
I've always thought that these two events were effectively equivalent: the results of an optimize versus the results of Lucene _naturally_ merging all segments together into one. If they don't have the same effect, then what is the difference?

~ David Smiley

Otis Gospodnetic wrote:

Hello,

Solr will never optimize the whole index without somebody explicitly asking for it. Lucene will merge index segments on the master as documents are indexed. How often it does that depends on mergeFactor.

See: http://search-lucene.com/?q=mergeFactor+segment+mergefc_project=Lucenefc_project=Solrfc_type=mail+_hash_+user

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/

----- Original Message -----
From: mklprasad mklpra...@gmail.com
To: solr-user@lucene.apache.org
Sent: Fri, February 19, 2010 1:02:11 AM
Subject: Re: optimize is taking too much time

Jagdish Vasani-2 wrote:

Hi,

You should not optimize the index after each document insert. Instead, optimize after inserting a good number of documents, because an optimize merges all segments into one, according to the Lucene index settings.

Thanks,
Jagdish

On Fri, Feb 12, 2010 at 4:01 PM, mklprasad wrote:

Hi,

In my Solr I have 1,42,45,223 records, about 50 GB. Now when I load a new record and it tries to optimize the docs, it takes too much memory and time. Can anybody please tell me whether Solr has a property to get rid of this?

Thanks in advance

-- View this message in context: http://old.nabble.com/optimize-is-taking-too-much-time-tp27561570p27561570.html
Sent from the Solr - User mailing list archive at Nabble.com.

Yes, thanks for the reply. I have removed the optimize() call from my code, but I have two questions:

1. Will mergeFactor do any optimization internally, or do we have to specify it?
2. Even if Solr initiates an optimize, will it take a huge amount of time with a large index like 52 GB?

Thanks,
Prasad
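To make the mergeFactor point above a bit more concrete, here is a toy model in Python. It is not Lucene's actual merge policy code. It assumes that every flush writes one equally sized segment and that whenever mergeFactor segments pile up at one level they are merged into a single segment at the next level. The name simulate_merges and the flush counts are made up for illustration; 10 is used as the mergeFactor value.

```python
def simulate_merges(num_flushes, merge_factor):
    """Toy model: return the number of segments at each level after
    num_flushes flushes. Index 0 holds the smallest segments."""
    levels = []  # levels[i] = segments currently sitting at level i
    for _ in range(num_flushes):
        if not levels:
            levels.append(0)
        levels[0] += 1  # a newly flushed small segment
        level = 0
        # Cascade: merge_factor segments at one level become one
        # segment at the next level up.
        while levels[level] >= merge_factor:
            levels[level] -= merge_factor
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels


if __name__ == "__main__":
    for flushes in (9, 10, 25, 100):
        per_level = simulate_merges(flushes, merge_factor=10)
        print(flushes, "flushes ->", sum(per_level), "segments", per_level)
```

In this simplified model the index usually still consists of several segments after normal merging, whereas an explicit optimize asks for everything to be merged down to one segment regardless of where the counts stand.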
Why ASCIIFoldingFilter is not a CharFilter
Hello, Looking over the CharFilter franchise, it seems to me that the ASCIIFoldingFilter is a perfect candidate for being a CharFilter as it performs character level substitutions like MappingCharFilter. However it is not a CharFilter. Is there a reason why? -- Regards, Shalin Shekhar Mangar.
Re: Why ASCIIFoldingFilter is not a CharFilter
Won't some stemmers leave diacritics in the terms that ought to be removed before indexing?

On Feb 21, 2010, at 4:57 PM, Shalin Shekhar Mangar wrote:

Hello,

Looking over the CharFilter franchise, it seems to me that the ASCIIFoldingFilter is a perfect candidate for being a CharFilter as it performs character level substitutions like MappingCharFilter. However it is not a CharFilter. Is there a reason why?

--
Regards,
Shalin Shekhar Mangar.
Re: Why ASCIIFoldingFilter is not a CharFilter
Right, most stemmers expect the diacritics to be in their input to work correctly, too.

On Sun, Feb 21, 2010 at 5:19 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

Won't some stemmers leave diacritics in the terms that ought to be removed before indexing?

On Feb 21, 2010, at 4:57 PM, Shalin Shekhar Mangar wrote:

Hello,

Looking over the CharFilter franchise, it seems to me that the ASCIIFoldingFilter is a perfect candidate for being a CharFilter as it performs character level substitutions like MappingCharFilter. However it is not a CharFilter. Is there a reason why?

--
Regards,
Shalin Shekhar Mangar.

--
Robert Muir
rcm...@gmail.com
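The ordering concern raised in this thread can be illustrated with a small Python toy. Nothing here is Lucene code: fold_to_ascii is only a rough stand-in for what ASCIIFoldingFilter does (it drops combining marks after Unicode decomposition), and toy_stem is a hypothetical one-rule stemmer invented for this example. A CharFilter runs before tokenization and therefore before any stemmer, which corresponds to the "fold first" case below.

```python
import unicodedata


def fold_to_ascii(text):
    # Rough stand-in for ASCII folding: decompose characters, then drop
    # the combining marks (accents) that decomposition splits off.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(term):
    # Hypothetical stemmer with a single rule: strip a trailing
    # accented "e" (U+00E9). A plain "e" is left alone.
    return term[:-1] if term.endswith("\u00e9") else term


word = "aim\u00e9"  # "aime" with an acute accent on the final letter

# Fold first (what a CharFilter would imply): the accent is gone before
# the stemmer sees the term, so its rule never fires.
fold_then_stem = toy_stem(fold_to_ascii(word))

# Stem first, fold afterwards (a TokenFilter placed after the stemmer).
stem_then_fold = fold_to_ascii(toy_stem(word))

print(fold_then_stem, stem_then_fold)
```

The two orders give different terms for the same input, which is the kind of difference a stemmer that relies on diacritics would run into if folding always happened at the CharFilter stage.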
Sorting by a function that depends on the current result set
When sorting by an integer price field, I need prices more than 1 standard deviation below the mean of the current result set to be pushed to the end of the list. For example, with these values:

0, 20, 40, 100, 2000, 2000, 2000, 2000, 2000, 3000, 3000, 3000, 3000, 3000, 4000, 5000, 5000, 9000

Mean ~ 2675, StdDev ~ 2211, Cutoff = 2675 - 2211 = 464

So all prices under 464 will be pushed to the end of the list, and the resulting order would be:

ascending: 2000, 2000, 2000, 2000, 2000, 3000, 3000, 3000, 3000, 3000, 4000, 5000, 5000, 9000, 0, 20, 40, 100

descending: 9000, 5000, 5000, 4000, 3000, 3000, 3000, 3000, 3000, 2000, 2000, 2000, 2000, 2000, 0, 20, 40, 100

Is this possible within a single Solr query?
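To pin down the intended ordering, here is a minimal Python sketch of the arithmetic done on the client side. It does not answer whether Solr can do this in one query. The function name sort_low_prices_last is made up for illustration, and it assumes the sample standard deviation is meant, since that is what reproduces the ~2211 figure above. Following the example, the low-priced tail is kept in ascending order for both sort directions.

```python
from statistics import mean, stdev


def sort_low_prices_last(prices, descending=False):
    """Sort prices, pushing those below (mean - 1 stddev) to the end.

    Needs at least two prices, because the sample standard deviation
    is undefined for a single value.
    """
    cutoff = mean(prices) - stdev(prices)
    # Prices at or above the cutoff are sorted in the requested direction.
    main = sorted((p for p in prices if p >= cutoff), reverse=descending)
    # Prices below the cutoff go last, ascending, as in the example lists.
    tail = sorted(p for p in prices if p < cutoff)
    return main + tail, cutoff


if __name__ == "__main__":
    prices = [0, 20, 40, 100, 2000, 2000, 2000, 2000, 2000,
              3000, 3000, 3000, 3000, 3000, 4000, 5000, 5000, 9000]
    ascending, cutoff = sort_low_prices_last(prices)
    descending, _ = sort_low_prices_last(prices, descending=True)
    print("cutoff:", round(cutoff))
    print("ascending:", ascending)
    print("descending:", descending)
```

Because the cutoff depends on the mean and standard deviation of the whole result set, the statistics have to be known before any single document can be placed, which is what makes this awkward to express as an ordinary per-document sort.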
Using XSLT with DIH for a URLDataSource
Hi,

I have to load data into Solr from a URLDataSource that supplies me with an XML feed. In the simple case, where I just do a simple XSLT select, this works just fine, just as shown on the wiki (http://wiki.apache.org/solr/DataImportHandler).

But I need to do some manipulation of the XML feed first, so I am trying to apply a transform first, using something like this:

    <dataSource type="URLDataSource"/>
    <document>
      <entity name="products"
              pk="id"
              url="file:///D:/LucidWorks/lucidworks/solr/core-se/feedtest.xml"
              processor="XPathEntityProcessor"
              forEach="/products/product"
              xsl="file:///D:/LucidWorks/lucidworks/solr/core-se/test.xslt"
              transformer="script:updateRow">
        <field column="id" xpath="/products/product/@id"/>
        <field column="c-n" xpath="/products/product/categories/category/name"/>
        <field column="c-id" xpath="/products/product/categories/category/@id"/>
      </entity>
    </document>

But no matter what I do in my test.xslt, I get the same error:

    ...
    org.apache.solr.handler.dataimport.DataImportHandlerException: Error initializing XSL Processing Document # 1
    ...
    Caused by: javax.xml.transform.TransformerConfigurationException: Could not compile stylesheet

Can anyone help me out here? Or does anyone have a running example using XSLT with DIH?

Best regards,
Roland Villemoes
Alpha Solutions A/S
E-Mail: mailto:r...@alpha-solutions.dk
Web: http://www.alpha-solutions.dk/