[
https://issues.apache.org/jira/browse/SOLR-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yeo Zheng Lin updated SOLR-8334:
--------------------------------
Environment: Windows 8.1, Solr 5.3, ZooKeeper 3.4.6, jieba-analysis-1.0.0
(was: Windows 8.1, Solr 5.3, ZooKeeper 3.4.6)
> Highlighting content field problem when using JiebaTokenizerFactory
> -------------------------------------------------------------------
>
> Key: SOLR-8334
> URL: https://issues.apache.org/jira/browse/SOLR-8334
> Project: Solr
> Issue Type: Bug
> Components: highlighter, search
> Affects Versions: 5.3
> Environment: Windows 8.1, Solr 5.3, ZooKeeper 3.4.6,
> jieba-analysis-1.0.0
> Reporter: Yeo Zheng Lin
> Labels: patch
> Attachments: JiebaSegmenter.java
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> When I use the JiebaTokenizerFactory to index Chinese text in Solr, the
> segmentation works fine in the Analysis screen of the Solr Admin UI.
> However, when I use highlighting in Solr, the highlights are not placed
> correctly. For example, when I search for 自然环境与企业本身, it highlights
> 认<em>为自然环</em><em>境</em><em>与企</em><em>业本</em>身的
> Even when I search for an English word such as responsibility, it highlights
> <em> responsibilit<em>y.
> In short, the highlighting is consistently off by one character or space.
> This problem only happens in the content field, and not in any other field.
> I've made a minor modification to the code in JiebaSegmenter.java, and
> the highlighting seems to be fine now.
> Specifically, I created another int called offset2 in the process() method:
> int offset2 = 0;
> I then changed offset to offset2 in this part of the code in the
> process() method.
> The changes are in the attached file.
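
To illustrate the symptom described above, here is a minimal sketch of how token offsets that drift by one character move the highlight markers. It is not the Jieba code and not the attached patch: the class OffsetHighlightSketch, the whitespace tokenizer, and the shift parameter are all hypothetical stand-ins, used only to show why a highlighter that trusts token offsets wraps the wrong characters when those offsets are off by one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Solr or jieba-analysis.
class OffsetHighlightSketch {

    /** A token's text plus its start (inclusive) and end (exclusive) offsets in the source. */
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Splits on whitespace and records offsets. The shift argument is added to
     * every reported offset (clamped to the text bounds) to simulate a tokenizer
     * whose offset counter has drifted away from the real character position.
     */
    static List<Token> tokenize(String text, int shift) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int reportedStart = Math.max(0, Math.min(text.length(), start + shift));
            int reportedEnd = Math.max(0, Math.min(text.length(), i + shift));
            tokens.add(new Token(text.substring(start, i), reportedStart, reportedEnd));
        }
        return tokens;
    }

    /** Wraps the span reported by each token whose term equals the query in <em> tags. */
    static String highlight(String text, List<Token> tokens, String query) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Token t : tokens) {
            if (!t.term.equals(query)) {
                continue;
            }
            out.append(text, pos, t.start);
            out.append("<em>").append(text, t.start, t.end).append("</em>");
            pos = t.end;
        }
        out.append(text, pos, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "corporate responsibility";
        // Offsets that match the text: the whole word is wrapped.
        System.out.println(highlight(text, tokenize(text, 0), "responsibility"));
        // Offsets one character too small: the preceding space is wrapped and the last letter is left out.
        System.out.println(highlight(text, tokenize(text, -1), "responsibility"));
    }
}
```

In this sketch the term still matches the query, so the highlighter fires, but it wraps whatever characters the reported offsets point to. That is consistent with the report that analysis looks correct while the highlighted span is displaced by one position.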