Ok. Thanks for the clarification. We will do the stripping before the
indexing.
On 11/06/07, Chris Hostetter [EMAIL PROTECTED] wrote:
: Ok. Is it possible to get back the content without the html tags?
Solr never does anything to modify the stored value of a field, so you'd
really need to
Ok. Is it possible to get back the content without the html tags?
On 08/06/07, Yonik Seeley [EMAIL PROTECTED] wrote:
On 6/8/07, Thierry Collogne [EMAIL PROTECTED] wrote:
I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer
with no luck.
[...]
Is this normal? Shouldn't
On 11-Jun-07, at 3:54 AM, Thierry Collogne wrote:
Ok. Is it possible to get back the content without the html tags?
Well, it isn't stored anywhere in Solr. It's best to think of lucene/
solr as two systems: the indexer applies a tokenization
transformation to the data and creates an
: Ok. Is it possible to get back the content without the html tags?
Solr never does anything to modify the stored value of a field, so you'd
really need to send Solr the value after stripping the HTML to get this to
work.
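
A minimal sketch of what stripping on the client side before indexing could look like, in Python with only the standard library. The helper name strip_html is hypothetical and this is not what Solr does internally; it only shows the idea of sending Solr an already-cleaned value so that the stored field comes back without tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for text between tags (this includes script/style bodies).
        self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by removed markup.
        return " ".join("".join(self._chunks).split())


def strip_html(raw):
    """Return raw with HTML tags removed and whitespace normalized."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()  # flush any buffered text
    return parser.text()


if __name__ == "__main__":
    # Hypothetical field value, cleaned before it is posted to Solr.
    print(strip_html('test <a href="test">link</a>\npost'))
```

The cleaned string would then go into the document sent to Solr in place of the raw HTML, so the stored value and the indexed tokens come from the same text.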
Internally, the HTMLStripWhitespaceTokenizerFactory does the HTML
Hello,
I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer
with no luck.
I have a field content that contains the following
<field name="content"><![CDATA[test <a href="test">link</a>
post]]></field>
When I do a search I get the following
result
On 6/8/07, Thierry Collogne [EMAIL PROTECTED] wrote:
I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer
with no luck.
[...]
Is this normal? Shouldn't the html code and the white spaces be removed from
the field?
For indexing purposes, yes. The stored field you get back