Hi Dmitry,

It’s odd that the start and end offsets are the same. What do you see for the
start/end of ‘$’ if you take out the MappingCharFilterFactory (MCFF)?  (I think
it should be start:5, end:6.)

As for offsets “respecting the remapped token”, are you asking for offsets
to be set as if ‘dollarsign’ were part of the original text?  If so, there is
no setting that would do that - the intent is for offsets to map to the
*original* text.  You can work around this by performing the substitution prior
to Solr analysis, e.g. in an update processor like RegexReplaceProcessorFactory.
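
To illustrate the workaround, here is a minimal Python sketch (not Solr code)
of what substituting before analysis does to offsets. The helper names
substitute and whitespace_tokens are hypothetical, and the regex-based
tokenizer is only a simplified stand-in for a whitespace tokenizer; in Solr the
substitution itself would be done by the update processor.

```python
import re


def substitute(text):
    # Hypothetical stand-in for the pre-analysis substitution step:
    # replace each "$" with " dollarsign " before the text is analyzed.
    return text.replace("$", " dollarsign ")


def whitespace_tokens(text):
    # Simplified stand-in for a whitespace tokenizer.
    # Returns (term, start_offset, end_offset) for each run of non-space chars.
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


original = "test $ test2"
stored = substitute(original)  # this becomes the "original" text for analysis

for term, start, end in whitespace_tokens(stored):
    print(term, start, end)
```

Because the substituted string is what gets analyzed, the offsets for
‘dollarsign’ span the whole word, but they refer to positions in the
substituted text, not in the text as it was first submitted.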

Steve
www.lucidworks.com

> On Jun 18, 2015, at 3:07 AM, Dmitry Kan <solrexp...@gmail.com> wrote:
> 
> Hi,
> 
> It looks like MappingCharFilter sets start and end offset to the same
> value. Can this be affected by some setting?
> 
> For a string: test $ test2 and mapping "$" => " dollarsign " (we insert
> extra space to separate $ into its own token)
> 
> we get: http://snag.gy/eJT1H.jpg
> 
> Ideally, we would like to have start and end offset respecting the remapped
> token. Can this be achieved with settings?
> 
> -- 
> Dmitry Kan
> Luke Toolbox: http://github.com/DmitryKey/luke
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan
> SemanticAnalyzer: www.semanticanalyzer.info
