Can you enter the text on the Solr Admin UI Analysis page? Then you could
tell at which stage the issue occurs.
StandardTokenizer has a default token length limit of 255. You can override
it with the "maxTokenLength" attribute:
<tokenizer class="solr.StandardTokenizerFactory"
maxTokenLength="1024" />
See:
https://lucene.apache.org/core/4_2_0/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizerFactory.html
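As a rough sketch, here is how that override might look when applied to the
"text_en" field type from the original message. The value 1024 is only the
example value from above, not a recommendation, and setting it on both
analyzers is my assumption, so that index-time and query-time tokenization
stay consistent:

```xml
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- raise the token length limit from the default of 255 -->
    <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="1024"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- same limit at query time (assumed, to match the index analyzer) -->
    <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="1024"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Documents indexed before the change keep their old tokens until they are
reindexed.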
But the "#" characters sound like a bug.
-- Jack Krupansky
-----Original Message-----
From: Danny Watari
Sent: Tuesday, April 02, 2013 5:45 PM
To: solr-user@lucene.apache.org
Subject: Lengthy description is converted to hash symbols
Hi, I have a field that is defined to be of type "text_en". Occasionally, I
notice that lengthy strings are converted to hash symbols. Here is a
snippet of my field type:
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="description" type="text_en" indexed="true" stored="true"
required="false" />
Here is an example of the field's value:
<str
name="description">###############################################################################################################################################################################################################################################################</str>
Any ideas why this might be happening?