Re: Offset-Based Analysis

2023-02-22 Thread Mikhail Khludnev
> > Thanks again,
> Luke
>
> From: java-user@lucene.apache.org At: 02/22/23 02:38:30 UTC-5:00
> To: java-user@lucene.apache.org
> Subject: Re: Offset-Based Analysis
>
> Hello Luke.
>
> Using offsets seems really doubtful to me. What comes to my mind is
> pre-analyzed field

Re: Offset-Based Analysis

2023-02-22 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
@lucene.apache.org Subject: Re: Offset-Based Analysis Hello Luke. Using offsets seems really doubtful to me. What comes to my mind is pre-analyzed field https://solr.apache.org/guide/solr/latest/indexing-guide/external-files-processes.html#the-preanalyzedfield-type. Thus, external NLP service can provide ready

Re: Offset-Based Analysis

2023-02-21 Thread Mikhail Khludnev
Hello Luke. Using offsets seems really doubtful to me. What comes to my mind is pre-analyzed field https://solr.apache.org/guide/solr/latest/indexing-guide/external-files-processes.html#the-preanalyzedfield-type. Thus, external NLP service can provide ready-made tokens for straightforward indexing
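
To make the pre-analyzed field suggestion concrete, here is a minimal sketch of how offset-based NLP output might be turned into a value for Solr's PreAnalyzedField. It assumes the JSON layout described in the linked reference guide page (keys "v", "str", "tokens", and per-token "t", "s", "e", "i", "y"); please check that page before relying on it. The function name `to_preanalyzed` and the `(start, end, term, type)` annotation shape are hypothetical, chosen only for illustration, and are not taken from the thread.

```python
import json


def to_preanalyzed(text, annotations):
    """Build a PreAnalyzedField-style JSON string from offset annotations.

    `annotations` is an iterable of (start, end, term, type) tuples, where
    start/end are character offsets into `text`. This shape is an assumption
    about what an external NLP service might return.
    """
    tokens = []
    prev_start = None
    # Sort by offsets so position increments are assigned in text order.
    for start, end, term, typ in sorted(annotations, key=lambda a: (a[0], a[1])):
        if not (0 <= start <= end <= len(text)):
            raise ValueError(f"offsets out of range for term {term!r}")
        # A token that starts where the previous one started is stacked on
        # the same position (increment 0), like a synonym; otherwise it
        # advances by one position.
        increment = 0 if start == prev_start else 1
        tokens.append({"t": term, "s": start, "e": end, "i": increment, "y": typ})
        prev_start = start
    # "v" is the format version, "str" the stored text (assumed key names).
    return json.dumps({"v": "1", "str": text, "tokens": tokens})


text = "Lucene powers search"
annotations = [
    (0, 6, "lucene", "word"),
    (0, 6, "software", "entity"),  # hypothetical NLP-derived label
    (7, 13, "powers", "word"),
    (14, 20, "search", "word"),
]
value = to_preanalyzed(text, annotations)
print(value)
```

The resulting string would be sent as the field value for a field whose type is PreAnalyzedField. Stacking an NLP label on the same position as the word it covers is one possible design; it keeps the label tied to the original text span through the start and end offsets.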

Offset-Based Analysis

2023-02-21 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
Hi All, I am trying to enrich a Lucene-powered search index with data from various NLP systems that are distributed throughout my company. Ideally this internally-derived data could be tied back to specific positions of the original text. I’ve searched around and this is the closest t