JapaneseReadingFormFilter cannot convert some hiragana to romaji

KURAMITSU-TAKUMA Thu, 07 Dec 2023 05:25:17 -0800

Hello,

I found a bug where some hiragana characters are not converted to romaji 
when using JapaneseReadingFormFilter.
(For example, “ぐ” is not converted to “gu”. I noticed this when a search 
for “ますきんぐ” did not return “マスキング”.) 
I believe the cause is that some hiragana entries in the kuromoji dictionary 
do not have an explicitly defined reading.


# Proposed Solution
How about adding a step to the getRomanization function that converts 
hiragana to katakana when it is detected?
https://github.com/apache/lucene/pull/12885
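
To illustrate the idea, here is a minimal sketch of the hiragana-to-katakana 
step. It is not the code from the pull request: the class name 
HiraganaToKatakanaSketch and the method toKatakana are hypothetical names 
chosen for this example. It relies only on the fact that the Unicode hiragana 
letters U+3041..U+3096 sit exactly 0x60 below their katakana counterparts.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class HiraganaToKatakanaSketch {
    private HiraganaToKatakanaSketch() {}

    // Returns a copy of s in which every hiragana letter (U+3041..U+3096)
    // is replaced by the corresponding katakana letter (U+30A1..U+30F6).
    // All other characters are copied unchanged.
    static String toKatakana(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '\u3041' && c <= '\u3096') {
                // The katakana block mirrors the hiragana block at +0x60.
                sb.append((char) (c + 0x60));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, toKatakana("ますきんぐ") yields "マスキング", so a reading that 
falls back to the hiragana surface form could be normalized to katakana before 
it is romanized.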

I apologize in advance if I have made any mistakes in the reporting process or 
procedures as this is my first time posting in this community.

--

Takuma Kuramitsu
