cfeck added a comment.
I have zero knowledge about Baloo, but I can add some comments regarding Unicode.

- The four ranges you used are all adjacent, so you could contract them to {0x4E00, 0x9FFF}.
- There are more ranges for CJK characters in the BMP; at least {0x3400, 0x4DBF} would be useful (I don't know if CJK users ever use the compatibility characters).
- To fully support the remaining CJK blocks in higher planes, the code would need to handle surrogate pairs.
- If Baloo doesn't handle CJK, it may not handle other non-Latin scripts either, so I suggest using QChar::category().

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11552

To: michaelh, #baloo, #frameworks, lbeltrame, bruns
Cc: cfeck, ashaposhnikov, michaelh, astippich, spoorun, nicolasfella, ngraham, alexeymin
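
To illustrate the range and surrogate-pair points from the comment, here is a minimal sketch in plain standard C++. It is not Qt code and not Baloo's actual implementation: the helper names isCjkIdeograph, nextCodePoint and countCjkIdeographs are hypothetical, and the Extension B range is included only as one example of a block outside the BMP. In Qt code, a check based on QChar::category() would likely replace the hand-written range table.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if cp lies in one of the CJK ideograph blocks
// discussed in the comment (plus Extension B as a higher-plane example).
bool isCjkIdeograph(char32_t cp)
{
    return (cp >= 0x4E00 && cp <= 0x9FFF)      // CJK Unified Ideographs
        || (cp >= 0x3400 && cp <= 0x4DBF)      // CJK Unified Ideographs Extension A
        || (cp >= 0x20000 && cp <= 0x2A6DF);   // Extension B, outside the BMP
}

// Hypothetical helper: decode the code point that starts at text[i] and
// advance i past it. A valid high/low surrogate pair is combined into one
// code point; an unpaired surrogate is returned unchanged.
char32_t nextCodePoint(const std::u16string &text, std::size_t &i)
{
    const char16_t hi = text[i++];
    if (hi >= 0xD800 && hi <= 0xDBFF && i < text.size()) {
        const char16_t lo = text[i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            ++i;
            return 0x10000
                + ((static_cast<char32_t>(hi) - 0xD800) << 10)
                + (static_cast<char32_t>(lo) - 0xDC00);
        }
    }
    return hi;
}

// Count CJK ideographs in a UTF-16 string, iterating by code point rather
// than by UTF-16 code unit so that higher-plane characters are seen whole.
std::size_t countCjkIdeographs(const std::u16string &text)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size();) {
        if (isCjkIdeograph(nextCodePoint(text, i))) {
            ++count;
        }
    }
    return count;
}
```

Iterating by code unit instead would see the two halves of a surrogate pair as separate values in 0xD800..0xDFFF, neither of which falls in any of the ranges, so higher-plane ideographs would be missed.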