Re: Correcting text at index time

2015-07-01 Thread Alessandro Benedetti
Honestly, if I had to write a custom UpdateRequestProcessor, I would go for a SynonymUpdateProcessor that takes as input the same synonym file format the SynonymFilter uses. It would be much easier to configure and use! Cheers 2015-07-01 2:55 GMT+01:00 Jack Krupansky jack.krupan...@gmail.com:
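
A minimal sketch of the idea above, assuming a hypothetical SynonymUpdateProcessor (it is proposed in this message, not something Solr ships) would first load explicit `a, b => c` rules from a synonyms-style file into a lookup table. The function name and the sample rules are illustrative assumptions. Equivalence lists without `=>` are skipped because they name no single replacement, and the whole right-hand side is treated as one replacement.

```python
def parse_synonym_rules(lines):
    """Build {abbreviation: replacement} from explicit 'a, b => c' rules."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        # Equivalence lists ("tv, television") have no single target; skip them.
        if "=>" not in line:
            continue
        left, right = line.split("=>", 1)
        target = right.strip()
        for source in left.split(","):
            source = source.strip()
            if source:
                mapping[source] = target
    return mapping


# Hypothetical rules; only "cst." -> "customer" comes from the thread.
rules = parse_synonym_rules([
    "# abbreviations to expand",
    "cst., cust. => customer",
    "tv, television",
])
```

The resulting table could then drive whatever replacement step rewrites the field value before it is stored.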

Re: Correcting text at index time

2015-07-01 Thread Jack Krupansky
Absolutely - I'm always in favor of coming up with additional work for other people to do. -- Jack Krupansky On Wed, Jul 1, 2015 at 6:04 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Honestly, if I had to write a custom UpdateRequestProcessor I would go for a

Re: Correcting text at index time

2015-06-30 Thread Jack Krupansky
You would have to have a separate instance of the update processor, each with one of the words. Or, you could code a JavaScript script with the stateless script update processor that has the long list of words and replacements as two arrays or an array of objects, and then iterate through the
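
The iteration described above might look roughly like the following. This is a sketch of the replacement logic only, written in Python rather than as a script for the stateless script update processor, so it uses no Solr API. The list contents (apart from cst. and customer) and the function name are assumptions made for illustration.

```python
import re

# Hypothetical word list as an "array of objects"; only cst. -> customer
# appears in the thread, the other entries are made up for illustration.
REPLACEMENTS = [
    {"short": "cst.", "full": "customer"},
    {"short": "acct.", "full": "account"},
    {"short": "qty.", "full": "quantity"},
]


def expand_abbreviations(text, replacements=REPLACEMENTS):
    """Replace each shortened word in text with its full form."""
    for entry in replacements:
        # Lookarounds keep the match from firing inside a longer word.
        pattern = r"(?<!\w)" + re.escape(entry["short"]) + r"(?!\w)"
        # A function replacement avoids backslash processing in the full form.
        text = re.sub(pattern, lambda _match, full=entry["full"]: full, text)
    return text
```

Inside a script-based processor the same loop would run over the field value of each incoming document before the document is stored.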

Re: Correcting text at index time

2015-06-30 Thread hossmaa
Hi all Thanks for the replies. So there's no getting away from doing it on my own then... @Jack: I need to replace a whole list of shortened words... It would make a crazy regex (which I incidentally wouldn't even know how to formulate). Cheers A.
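
On the "crazy regex" worry: the expression does not have to be written by hand. A sketch, assuming the shortened words are kept in a plain lookup table, is to generate one alternation from the table and look up each match. The table contents (apart from cst.) and the function names are hypothetical, and Python's regex syntax differs from Java's in places, so the generated pattern would need checking before being used in Solr.

```python
import re

# Hypothetical table; only "cst." -> "customer" comes from the thread.
ABBREVIATIONS = {"cst.": "customer", "acct.": "account"}


def build_pattern(abbreviations):
    """Compile one alternation that matches any key as a whole word."""
    # Longest keys first, so a longer key wins over a shorter one it contains.
    keys = sorted(abbreviations, key=len, reverse=True)
    alternation = "|".join(re.escape(key) for key in keys)
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")


PATTERN = build_pattern(ABBREVIATIONS)


def correct(text):
    """Replace every matched abbreviation with its table entry."""
    return PATTERN.sub(lambda match: ABBREVIATIONS[match.group(0)], text)
```

Matching here is case-sensitive, so `Cst.` would be left alone unless it is added to the table.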

RE: Correcting text at index time

2015-06-29 Thread hossmaa
Hi Markus Thanks for the reply. I'm already using the Synonyms filter and it is working fine (i.e., when I search for customer, it also returns documents containing cst.). What the synonyms filter does not do is to actually replace the word cst. with customer in the document. Just to be clearer:

RE: Correcting text at index time

2015-06-29 Thread Markus Jelsma
Hello - why not just use synonyms or StemmerOverrideFilter? Markus -Original message- From:hossmaa andreea.hossm...@gmail.com Sent: Monday 29th June 2015 14:08 To: solr-user@lucene.apache.org Subject: Correcting text at index time Hi everyone I'm wondering if it's possible

Re: Correcting text at index time

2015-06-29 Thread Walter Underwood
Yes, do this in an update request processor before it gets to the analyzer chain. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Jun 29, 2015, at 3:19 PM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, very hard to do currently. The _point_ of

Re: Correcting text at index time

2015-06-29 Thread Erick Erickson
Hmmm, very hard to do currently. The _point_ of stored fields is that an exact, verbatim copy of the input is returned in fl lists and this is violating that promise. I suppose some kind of custom update processor could work, but it's really roll-your-own functionality I think. Best, Erick On

Re: Correcting text at index time

2015-06-29 Thread Jack Krupansky
The regex replace processor can be used to do this: https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html -- Jack Krupansky On Mon, Jun 29, 2015 at 6:20 PM, Walter Underwood wun...@wunderwood.org wrote: Yes, do this in an update
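
For a single word, the pattern and replacement pair that such a regex replace step needs can be tried out in isolation first. The sketch below uses Python's `re` module, not the Solr processor itself, and the field text is made up. The points it illustrates are that the period in cst. must be escaped and that a leading word boundary keeps the pattern from matching inside a longer word.

```python
import re

# "\." matches a literal period; "\b" requires a word boundary before "cst".
PATTERN = re.compile(r"\bcst\.")
REPLACEMENT = "customer"


def fix_field(value):
    """Return value with every standalone 'cst.' replaced by 'customer'."""
    return PATTERN.sub(REPLACEMENT, value)
```

One pattern handles one word, which is why a long list of shortened words leads to either many processor instances or a generated expression.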