GitHub user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/191
  
    @conniey 
    
    I have nearly finished the Highlighter on this branch:
    
    https://github.com/NightOwl888/lucenenet/tree/netcoremigration-highlighter
    
    There is still 1 failing test (which doesn't appear to be related to 
BreakIterator). I was able to make [hacky solutions for the first 3 
issues](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs#L289-L306)
 I mentioned 
[above](https://github.com/apache/lucenenet/pull/191#issuecomment-266510336), 
but after reading the link you provided I am convinced that none of them will 
suffice for production, and we will need a `RuleBasedBreakIterator` to be sure 
we have all of the breaking rules set up correctly for international support.
    
    You can pull this now if you wish - let me know if you need me to work on 
that failing test.
    
    icu-dotnet
    ----
    
    Some classes that I have already ported that you may wish to migrate into 
icu-dotnet:
    
    - 
[BreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/BreakIterator.cs)
 - Note that there is one change from the original: the Text property returns 
string instead of CharacterIterator. In hindsight, this change may not have 
been necessary.
    - 
[CharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/CharacterIterator.cs)
 - A GetTextAsString() method was added, primarily so we can get the text to 
pass on to icu-dotnet. The documentation hasn't yet been migrated from Java.
    - 
[StringCharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/StringCharacterIterator.cs)
 - Exactly like Java (but documentation not yet migrated).
    - 
[IcuBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs)
 - If you remove the hacks I put in to emulate the RuleBasedBreakIterator, it 
would probably be a useful addition to icu-dotnet.
    - 
[TestBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Tests.Highlighter/TestBreakIterator.cs)
 - A partial set of tests that verify the sentence- and word-breaking rules of 
RuleBasedBreakIterator.
    
    Note that CharacterIterator is a dependency of 
[CharArrayIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Analysis.Common/Analysis/Util/CharArrayIterator.cs),
 but since that is in Analysis.Common (which already depends on icu-dotnet), we 
can completely remove all of these classes from Lucene.Net once the 
functionality is available in icu-dotnet.
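
    For reference, here is a rough sketch of the Java-side contract these 
ports mirror, written against the standard `java.text` classes. The class name 
`BreakIteratorSketch` and the helpers `textAsString` and `sentences` are 
hypothetical names used only for illustration. `textAsString` is one possible 
way to get the behavior described for GetTextAsString() and is not 
necessarily how the port implements it.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: all names in this class are assumptions, not part of
// Lucene.Net or icu-dotnet.
final class BreakIteratorSketch {

    // Copies the iterator's range [beginIndex, endIndex) into a String and
    // restores the iterator's original position afterwards.
    static String textAsString(CharacterIterator it) {
        int saved = it.getIndex();
        StringBuilder sb = new StringBuilder(it.getEndIndex() - it.getBeginIndex());
        // Walk by index rather than testing for CharacterIterator.DONE, so a
        // literal '\uFFFF' in the text does not end the copy early.
        for (int i = it.getBeginIndex(); i < it.getEndIndex(); i++) {
            sb.append(it.setIndex(i));
        }
        it.setIndex(saved);
        return sb.toString();
    }

    // Splits text at the boundaries reported by the JDK sentence iterator.
    static List<String> sentences(String text) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        List<String> out = new ArrayList<>();
        int start = bi.first();
        for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        CharacterIterator it = new StringCharacterIterator("First sentence. Second one.");
        // Extract the text first, as would be needed before handing it to a
        // string-based API.
        String text = textAsString(it);
        for (String s : sentences(text)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

    Because consecutive boundaries are used as substring limits, the returned 
segments always concatenate back to the original text, which is a convenient 
property to check when comparing a port against the Java behavior.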


