Then you'll have to scrub the data on the way in.
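If you'd rather do that inside Solr than in your indexing client, one option is an
update request processor chain. A rough sketch for solrconfig.xml (the chain name,
field name and pattern here are just placeholders, adjust for your setup):

  <updateRequestProcessorChain name="strip-tabs">
    <!-- Replace runs of tab characters with a single space before the doc is indexed -->
    <processor class="solr.RegexReplaceProcessorFactory">
      <str name="fieldName">description</str>
      <str name="pattern">\t+</str>
      <str name="replacement"> </str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

(Point your updates at that chain, e.g. with update.chain=strip-tabs.)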

Or change the field type to one based on KeywordTokenizer and use
PatternReplaceCharFilter(Factory) to get rid of the unwanted characters.
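For instance, a field type roughly like this in the schema behaves like a string
field (one token) while stripping the tabs at index and query time; the type name
and pattern are just an example:

  <fieldType name="string_no_tabs" class="solr.TextField" sortMissingLast="true">
    <analyzer>
      <!-- Strip tab characters from the value before tokenization -->
      <charFilter class="solr.PatternReplaceCharFilterFactory"
                  pattern="\t+" replacement=""/>
      <!-- Emit the whole value as a single token, like a string field -->
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>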

Best,
Erick

On Wed, Jul 12, 2017 at 7:07 PM, Zheng Lin Edwin Yeo
<edwinye...@gmail.com> wrote:
> The field I am bucketing on is indexed as a string field, and does not
> pass through any tokenizers.
>
> Regards,
> Edwin
>
> On 12 July 2017 at 21:52, Susheel Kumar <susheel2...@gmail.com> wrote:
>
>> I checked on 6.6 and don't see any such issue. I assume the field you are
>> bucketing on is a string/KeywordTokenizer field, not a text/analyzed field.
>>
>>
>> ===
>>
>> "facets":{
>>
>>     "count":5,
>>
>>     "myfacet":{
>>
>>       "buckets":[{
>>
>>           "val":"A\t\t\t",
>>
>>           "count":2},
>>
>>         {
>>
>>           "val":"L\t\t\t",
>>
>>           "count":1},
>>
>>         {
>>
>>           "val":"P\t\t\t",
>>
>>           "count":1},
>>
>>         {
>>
>>           "val":"Z\t\t\t",
>>
>>           "count":1}]}}}
>>
>> On Wed, Jul 12, 2017 at 2:31 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I would like to check: does the JSON facet output remove characters like
>> > \t from its output?
>> >
>> > Currently, we find that if the result is not in the last result set,
>> > characters like \t are removed from the output. However, if it is the
>> > last result set, the \t is not removed.
>> >
>> > As there is a discrepancy in the results being returned, is this
>> > considered a bug in the JSON facet output?
>> >
>> > I'm using Solr 6.5.1.
>> >
>> > Snapshot of output when \t is not removed:
>> >
>> >   "description":{
>> >         "buckets":[{
>> >            "val":"detaildescription\t\t\t\t",
>> >             "count":1}]},
>> >
>> > Snapshot of output when \t is removed:
>> >
>> >   "description":{
>> >         "buckets":[{
>> >            "val":"detaildescription        ",
>> >             "count":1}]},
>> >
>> > Regards,
>> > Edwin
>> >
>>
