rclabo commented on issue #793:
URL: https://github.com/apache/lucenenet/issues/793#issuecomment-1781023204

   Shad, thank you for that.   I feel like it just pulled me back into reality. 
 
   
   So I guess what you are saying is that we can't have a "stable" Lucene.NET 
release unless its dependencies are stable, and currently 
[Lucene.Net.ICU](https://lucenenet.apache.org/docs/4.8.0-beta00016/api/icu/overview.html)
 is a work in progress with a changing API surface.  
   
   Reading into that, [ICU4N](https://github.com/NightOwl888/ICU4N), which 
Lucene.Net.ICU depends on, is probably also a work in progress.  And it's 
certainly worth noting that ICU support is something the Java Lucene team got 
for free in the JDK that unfortunately isn't included in the .NET Framework 
(full or core).  Hence the need to create ICU4N to provide that support, a 
nontrivial endeavor in its own right.
   
   In using Lucene.NET to create a search index for an e-commerce marketplace, 
I've never hit any ICU-related functionality that was missing that I felt I 
needed.  Unfortunately, I have no prior history with ICU, so everything I've 
learned about it has been here on the Lucene.NET project.  So I guess for me, 
it's often an out-of-sight, out-of-mind portion of Lucene.  
   
   But when I review the docs for 
[Lucene.Net.ICU](https://lucenenet.apache.org/docs/4.8.0-beta00016/api/icu/overview.html)
 and see what's included, it feels very central to a search library and 
encompasses such basic functionality as finding word boundaries and line break 
boundaries.  While this seems trivial in languages like English, it's anything 
but trivial in languages like Chinese (要弄清楚如何分解中文單字是很困難的。) or Japanese 
(中国語で単語を区切る方法を理解するのは難しいです。).  
   
   Given that a great many of the developers using Lucene.NET only use it for 
English text, or other languages that use the Latin alphabet, it's easy to see 
how we can sometimes lose sight of what ICU is and why it's so important.  
Based on your post, I now better understand why Lucene.NET hasn't had a public 
release yet.  Still, it seems very unfortunate that such a stable product (at 
least for indexing Latin-script languages) has a current version label (beta) 
that doesn't indicate it's production-ready for those languages.  
   
   I'm with you 100% that doing a Frankenstein version of Lucene that 
incorporates features from different versions is a non-starter.  Being able to 
compare execution paths with a corresponding Java version is too valuable to 
give up.  


