> Steve
>
> > -Original Message-
> > From: Mike Hugo [mailto:m...@piragua.com]
> > Sent: Tuesday, January 24, 2012 3:56 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: HTMLStripCharFilterFactory not working in Solr4?
> >
> > Thanks for the
lters have been working there
all along.)
Steve
> -Original Message-
> From: Mike Hugo [mailto:m...@piragua.com]
> Sent: Tuesday, January 24, 2012 3:56 PM
> To: solr-user@lucene.apache.org
> Subject: Re: HTMLStripCharFilterFactory not working in Solr4?
>
> Thanks
Thanks for the responses everyone.
Steve, the test method you provided also works for me. However, when I try
a more end to end test with the HTMLStripCharFilterFactory configured for a
field I am still having the same problem. I attached a failing unit test
and configuration to the following is
Oops, I didn't read carefully enough to see that you wanted those constructs
entirely stripped out.
Given that you're seeing numbers indexed, this strongly indicates an
escaping bug in the SolrJ client that must have been introduced at
some point.
I'll see if I can reproduce it in a unit test.
-
Try putting the HTMLStripCharFilterFactory before the StandardTokenizerFactory
instead of after it. I vaguely recall being burned by something like this
before.
-Michael
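
To illustrate why ordering could matter here, below is a minimal, self-contained sketch in plain Java. It does not use Lucene or Solr classes: `EntityOrderSketch`, `decodeNumericEntities`, and `tokenize` are hypothetical names, the regex-based decoder handles only decimal references, and the tokenizer is a naive stand-in rather than StandardTokenizer. It only shows that tokenizing text before its numeric character references are decoded can leave bare numbers behind as tokens, which would be consistent with the "numbers indexed" symptom reported in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityOrderSketch {
    // Matches decimal numeric character references such as &#174; (assumption: no hex forms).
    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d+);");

    /** Replaces each decimal numeric character reference with the character it names. */
    static String decodeNumericEntities(String input) {
        Matcher m = NUMERIC_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());                 // copy text before the reference
            out.appendCodePoint(Integer.parseInt(m.group(1)));  // append the decoded character
            last = m.end();
        }
        out.append(input, last, input.length());                // copy the remaining tail
        return out.toString();
    }

    /** Naive stand-in for a tokenizer: keeps runs of Unicode letters and digits. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String raw = "Bose&#174; &#8482;";

        // Decode first, then tokenize: the symbols are dropped as non-word characters.
        System.out.println(tokenize(decodeNumericEntities(raw))); // prints [Bose]

        // Tokenize the raw text: the digits inside the references survive as tokens.
        System.out.println(tokenize(raw)); // prints [Bose, 174, 8482]
    }
}
```

If the field's analysis chain ever sees the references undecoded, tokens such as 174 and 8482 are the kind of output one would expect, so checking where the character filter sits relative to the tokenizer seems like a reasonable first step.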
Hi Mike,
When I add the following test to TestHTMLStripCharFilterFactory.java on Solr
trunk, it passes:
public void testNumericCharacterEntities() throws Exception {
  final String text = "Bose&#174; &#8482;"; // |Bose® ™|
  HTMLStripCharFilterFactory htmlStripFactory = new HTMLStripCharFilterFactory()
Thanks for the response Yonik,
Interestingly enough, changing to the LegacyHTMLStripCharFilterFactory
does NOT solve the problem - in fact I get the same result
I can see that the LegacyHTMLStripCharFilterFactory is being applied at
startup:
Jan 24, 2012 1:25:29 PM org.apache.solr.util.plugin.
You can use LegacyHTMLStripCharFilterFactory to get the previous behavior.
See https://issues.apache.org/jira/browse/LUCENE-3690 for more details.
-Yonik
http://www.lucidimagination.com
On Tue, Jan 24, 2012 at 1:34 PM, Mike Hugo wrote:
> We recently updated to the latest build of Solr4 and eve