Koji Sekiguchi <k...@r.email.ne.jp> writes:

> This is correct when you have the mapping definition:
>
> "&lt;" => "<"
> "&gt;" => ">"
>    :              :
>
> But I thought you could not have them, but have only:
>
> "&uuml;" => "ü"
> "&auml;" => "ä"
>    :             :
>
> Didn't it solve your problem?

Hi Koji,

Oh, it seems I missed part of your suggestion. So you propose to
have mappings for all entities except the troublesome lt, gt, and amp?

That should work, as long as it is okay that whitespace follows those
characters. I guess that it will indeed be okay for most situations.
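
Just to spell out how I understand the idea, here is a rough sketch in
Python. It is not Solr code and not the actual mapping file; the names
map_entities and SKIPPED are made up for illustration, and the entity
table comes from Python's standard html.entities module. It maps every
known named entity to its character, except lt, gt, and amp, which are
left as they are:

```python
import re
from html.entities import name2codepoint

# Hypothetical skip list: entities that must stay encoded so that no
# markup-significant character ("<", ">", "&") is ever produced.
SKIPPED = {"lt", "gt", "amp"}

# Matches named entities such as "&uuml;" (numeric references are ignored here).
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def map_entities(text):
    """Replace named entities with their characters, except those in SKIPPED."""
    def replace(match):
        name = match.group(1)
        # Leave skipped and unknown entities exactly as they appear.
        if name in SKIPPED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return _ENTITY.sub(replace, text)


if __name__ == "__main__":
    print(map_entities("M&uuml;ller &lt;b&gt; &amp; &auml;"))
```

In Solr itself the same effect would come from simply leaving the lt, gt,
and amp lines out of the mapping definition.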

Still, while that is a clever workaround, it doesn't change the fact
that the advertised functionality of the HTML stripper is broken.


I have now signed up for JIRA and created SOLR-1394 for this issue.


Thanks,
Anders.
