It is showing up in the search results. Just to confirm, does the
UpdateProcessor approach remove the characters during indexing, or only
after indexing has been done?

Regards,
Edwin

On 26 May 2015 at 21:30, Upayavira <u...@odoko.co.uk> wrote:

>
>
> On Tue, May 26, 2015, at 02:20 PM, Zheng Lin Edwin Yeo wrote:
> > Hi,
> >
> > Is there a way to remove special characters like \n during indexing of
> > rich text documents?
> >
> > I have quite a lot of leading \n \n in front of my indexed content of rich
> > text documents due to the spaces and empty lines in the original
> > documents, and it's causing the content to be flooded with '\n \n' at the
> > start before the actual content comes in. This causes the content to look
> > ugly, and also takes up unnecessary bandwidth in the system.
>
> Where is this showing up?
>
> If it is in search results, you must use an UpdateProcessor, as these
> run before fields are stored (e.g. RegexReplaceProcessorFactory).
>
> If you are concerned about facet results, then you can do it in an
> analysis chain, for example with a PatternReplaceFilterFactory.
>
> Upayavira
>
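For reference, here is a minimal sketch of what a regex-replace step would do
to the field value before it is stored. It is plain Python rather than Solr
code; the helper name and the sample text are made up for illustration, and
the pattern \A\s+ is only an assumption about what would suit the leading
'\n \n' runs described above.

```python
import re

# Hypothetical helper, not part of Solr: it mimics the effect of applying a
# regex replacement to a field value before the value is stored.
# \A anchors at the very start of the string; \s+ matches the run of spaces,
# tabs, and newlines that precedes the real content.
LEADING_WHITESPACE = re.compile(r"\A\s+")


def strip_leading_whitespace(value: str) -> str:
    """Remove leading whitespace and newlines; leave the rest untouched."""
    return LEADING_WHITESPACE.sub("", value)


# Made-up sample resembling text extracted from a rich text document.
extracted = "\n \n \n  Quarterly report\n\nRevenue grew."
cleaned = strip_leading_whitespace(extracted)
print(repr(cleaned))
```

Only the leading run is removed, so blank lines inside the document body are
kept. The same pattern syntax is also valid in Java regular expressions.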
