Re: [PR] feat: Optimize SmartCn Dictionaries and Add Dictionary Loading Tests [lucenenet]

via GitHub Tue, 08 Apr 2025 02:24:39 -0700


NehanPathan commented on PR #1154:
URL: https://github.com/apache/lucenenet/pull/1154#issuecomment-2785801346


   
   ---
   
   ✅ **All suggested changes implemented.**  
   - Updated code comments to align with upstream Lucene and Lucene.NET 
standards.  
   - Restored previously removed comments where appropriate, and added `// LUCENENET` tags for custom logic or deviations.  
   - Lowercased `bigramdict.dct` filename to match upstream expectations and 
ensure compatibility on case-sensitive file systems.  
   - Added `[LuceneNetSpecific]` attribute to custom test classes.  
   
   🛠️ **About the `try-catch (EndOfStreamException)` block:**  
   We kept a minimal `try-catch` around only the critical `ReadInt32()` call inside the character loop, not around the whole loop, to handle `EndOfStreamException`. For our test data this exception is expected: it is thrown when the file simply ends early, not because the file is corrupt.  
   Without this handling, `DictionaryTests` fails because the test file contains fewer entries than the full GB2312 character set (6768). This matches how earlier Lucene.NET code handled missing characters gracefully: it stopped reading once the stream ended.  
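   
   For illustration, here is a minimal standalone sketch of that pattern. It is not the actual PR code: `LoadCounts`, `ExpectedEntries`, and the three-entry in-memory stream are hypothetical names and data made up for this example. Only `BinaryReader.ReadInt32()` and `EndOfStreamException` are taken from the description above.
   
   ```csharp
   using System;
   using System.IO;
   using System.Text;

   public static class TruncatedDictionaryDemo
   {
       // Assumption: the loader expects one 32-bit value per slot (6768 slots, per the comment above).
       public const int ExpectedEntries = 6768;

       // Reads up to ExpectedEntries values into counts and returns how many were actually present.
       public static int LoadCounts(Stream input, int[] counts)
       {
           using (var reader = new BinaryReader(input))
           {
               int loaded = 0;
               for (int i = 0; i < ExpectedEntries; i++)
               {
                   int count;
                   try
                   {
                       // Only this call is guarded, not the whole loop body.
                       count = reader.ReadInt32();
                   }
                   catch (EndOfStreamException)
                   {
                       // Reached end of file
                       break;
                   }
                   counts[i] = count;
                   loaded++;
               }
               return loaded;
           }
       }

       public static void Main()
       {
           // Build a deliberately short "dictionary" with only three entries.
           var stream = new MemoryStream();
           using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
           {
               writer.Write(5);
               writer.Write(0);
               writer.Write(12);
           }
           stream.Position = 0;

           var counts = new int[ExpectedEntries];
           int loaded = LoadCounts(stream, counts);
           Console.WriteLine($"Loaded {loaded} of {ExpectedEntries} entries");
       }
   }
   ```
   
   Guarding only the read keeps any exception thrown by the rest of the loop body visible, so a genuine bug in entry processing would not be mistaken for a short file.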
   
   All tests are passing now ✅  
   Let me know if any further clarification or changes are needed!
   
   ---
   
   🔁 **Force-pushed twice** for fine-tuning:
   
   1. First force-push: removed the inline comment `// Reached end of file — assume remaining chars are missing` above the `break;` line to reduce verbosity.
   2. Second force-push: re-added a shorter, more focused comment, `// Reached end of file`, which stays concise while still indicating the purpose of the break.
   
   This keeps the code clean while still making it clear to future maintainers why the loop may exit early.
   
   ---
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
