The alif and ayn can also be used as diacritic-like characters in Korean; this
is a known practice. But thanks anyway.

On May 24, 2012, at 9:30 AM, Charles Riley wrote:

> Hi Naomi,
> 
> I don't have a conclusive answer for you on this yet, but let me pick up on a 
> few points.
> 
> First, the apostrophe is probably handled because the 
> ICUCollationKeyFilterFactory ignores punctuation.
> 
> Alif isn't a diacritic but a letter, and its character properties are 
> handled as such, so it apparently falls outside the scope of what the 
> folding filter factory does unless the folding is tailored.
> 
> From the Solr wiki, this looks like a helpful rule of thumb:
> 
> "When To use a CharFilter vs a TokenFilter
> There are several pairs of CharFilters and TokenFilters that have related 
> (ie: MappingCharFilter and ASCIIFoldingFilter) or nearly identical 
> functionality (ie: PatternReplaceCharFilterFactory and 
> PatternReplaceFilterFactory) and it may not always be obvious which is the 
> best choice.
> 
> The ultimate decision depends largely on what Tokenizer you are using, and 
> whether you need to "out smart" it by preprocessing the stream of characters.
> 
> For example, maybe you have a tokenizer such as StandardTokenizer and you are 
> pretty happy with how it works overall, but you want to customize how some 
> specific characters behave.
> 
> In such a situation you could modify the rules and re-build your own 
> tokenizer with javacc, but perhaps it's easier to simply map some of the 
> characters before tokenization with a CharFilter."
> 
> 
> Charles    
> 
> On Tue, May 15, 2012 at 2:47 PM, Naomi Dushay <ndus...@stanford.edu> wrote:
> We are using the ICUFoldingFilterFactory with great success to fold 
> diacritics so searches with and without the diacritics get the same results.
> 
> We recently discovered we have some Korean records that use an alif diacritic 
> instead of an apostrophe, and this diacritic is NOT getting folded. Has 
> anyone experienced this for alif or ayn characters? Do you have a solution?
> 
> 
> - Naomi
> 
> --
> You received this message because you are subscribed to the Google Groups 
> "solrmarc-tech" group.
> To post to this group, send email to solrmarc-t...@googlegroups.com.
> To unsubscribe from this group, send email to 
> solrmarc-tech+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/solrmarc-tech?hl=en.
> 
> 
> 
> 
> -- 
> Charles L. Riley
> Catalog Librarian for Africana
> Sterling Memorial Library, Yale University
> <zenodo...@gmail.com>
> 203-432-7566
> 
> 
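The CharFilter idea quoted from the Solr wiki above (map specific characters before tokenization) can be illustrated outside Solr with a small Python sketch. This is not Solr configuration and not anything the posters confirmed using. The code points are an assumption: MARC-8 to Unicode conversion commonly yields U+02BC for alif and U+02BB for ayn, but the records in question may contain something else. The names `CHAR_MAP`, `premap`, and `tokenize` are hypothetical, and mapping ayn to an apostrophe is only one possible choice.

```python
# Assumed code points (not stated in the thread); check the actual records.
ALIF = "\u02bc"  # MODIFIER LETTER APOSTROPHE
AYN = "\u02bb"   # MODIFIER LETTER TURNED COMMA

# Hypothetical mapping table: rewrite both marks to a plain ASCII apostrophe,
# so they are treated the same way an apostrophe already is downstream.
CHAR_MAP = str.maketrans({ALIF: "'", AYN: "'"})


def premap(text):
    """Apply the character mapping to the raw text, before tokenization."""
    return text.translate(CHAR_MAP)


def tokenize(text):
    """Stand-in for a real analysis chain: map, lowercase, split on spaces."""
    return premap(text).lower().split()


if __name__ == "__main__":
    with_alif = "P\u02bcy\u014fngyang"   # romanized Korean written with alif
    with_apostrophe = "P'y\u014fngyang"  # same word written with an apostrophe
    # After pre-mapping, both spellings produce the same tokens.
    print(tokenize(with_alif) == tokenize(with_apostrophe))
```

In Solr the comparable step would presumably be a MappingCharFilter placed ahead of the tokenizer, with the two characters listed in its mapping file. Whether to map them to an apostrophe or remove them entirely depends on how apostrophes are treated in the rest of the analysis chain.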
