On Fri, Jul 27, 2012 at 2:50 PM, Scott Lovenberg
<[email protected]> wrote:
> On Fri, Jul 27, 2012 at 3:42 PM, Scott Lovenberg
> <[email protected]> wrote:
>>
>>
>> On Fri, Jul 27, 2012 at 3:13 PM, Frediano Ziglio
>> <[email protected]> wrote:
>>>
>>> Hi,
>>>   I'm currently trying to support UTF-16 with characters not in plane 0.
>>>
>>> I have ended up with this patch. It is not against the latest
>>> kernel, but the problem still exists in the latest git kernel.
>>>
>>> wchar_t is currently 16 bits, so converting a UTF-8 encoded character
>>> not in plane 0 (>= 0x10000) to wchar_t (that is, calling char2uni) leads
>>> to an -EINVAL return. This patch detects UTF-8 in cifs_strtoUCS and adds
>>> special code that calls utf8_to_utf32 directly.
>>>
>>> Does this sound like a good patch or just a bad hack? Perhaps it would
>>> be better to change char2uni to convert to unicode_t (32 bits) instead
>>> of wchar_t, but a lot of code would probably have to be checked to make
>>> sure it does not lead to wrong conversions, overflows or other bad stuff.
>>>
>>> Is it worth working in this hacky way? I'd like to upstream this
>>> patch.


Terminology is confusing.   Refreshing my memory by looking at

http://en.wikipedia.org/wiki/Universal_Character_Set

are we talking about UTF-16 vs. UCS-2 (i.e. cases where a pair of
16-bit code units, a surrogate pair, is interpreted as one character)?

IIRC there are a few languages where this helps, at least
since Windows XP, when apparently it became more common,
and we have to support this in the kernel.
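
To make the UTF-16 vs. UCS-2 distinction concrete, here is a minimal
userspace sketch of how a code point above plane 0 becomes a pair of
16-bit units. It is not taken from the patch or from the kernel; the
EX_* names and ex_cp_to_utf16() are hypothetical and only illustrate
the arithmetic.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. */
#define EX_PLANE0_MAX       0xFFFFu    /* last code point UCS-2 can hold */
#define EX_PLANE_SIZE       0x10000u   /* first code point outside plane 0 */
#define EX_UNICODE_MAX      0x10FFFFu  /* largest valid code point */
#define EX_SURROGATE_FIRST  0xD800u    /* start of the surrogate range */
#define EX_SURROGATE_LAST   0xDFFFu    /* end of the surrogate range */
#define EX_SURROGATE_HIGH   0xD800u    /* tag of the leading unit */
#define EX_SURROGATE_LOW    0xDC00u    /* tag of the trailing unit */
#define EX_SURROGATE_BITS   10         /* payload bits per unit */
#define EX_SURROGATE_MASK   0x3FFu     /* low 10 payload bits */

/*
 * Encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or -1 if the
 * code point is out of range or is itself a surrogate value.
 */
static int ex_cp_to_utf16(uint32_t cp, uint16_t out[2])
{
	if (cp > EX_UNICODE_MAX)
		return -1;
	if (cp >= EX_SURROGATE_FIRST && cp <= EX_SURROGATE_LAST)
		return -1;

	if (cp <= EX_PLANE0_MAX) {
		/* Plane 0: UTF-16 and UCS-2 agree, one unit. */
		out[0] = (uint16_t)cp;
		return 1;
	}

	/* Outside plane 0: split the 20-bit offset into two units. */
	cp -= EX_PLANE_SIZE;
	out[0] = (uint16_t)(EX_SURROGATE_HIGH | (cp >> EX_SURROGATE_BITS));
	out[1] = (uint16_t)(EX_SURROGATE_LOW | (cp & EX_SURROGATE_MASK));
	return 2;
}
```

For example, U+1F600 comes out as the two units 0xD83D 0xDE00, which a
UCS-2-only reader would treat as two separate characters.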


> Just my $0.02, but there are a lot of magic numbers in this patch

Agreed.  The checks against 0x3F and against the maximum Unicode
value should use #defined constants, which are easier to read.
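
As a rough illustration of what that could look like, here is a
userspace sketch that decodes a 4-byte UTF-8 sequence using named
constants. It assumes the 0x3F in the patch is the 6-bit payload mask
of a UTF-8 continuation byte; the patch itself is not reproduced here,
and all EX_* names and ex_utf8_decode4() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. */
#define EX_UTF8_CONT_MASK     0xC0u      /* top two bits of a byte */
#define EX_UTF8_CONT_TAG      0x80u      /* 10xxxxxx marks a continuation */
#define EX_UTF8_CONT_PAYLOAD  0x3Fu      /* low 6 payload bits */
#define EX_UTF8_CONT_BITS     6          /* payload bits per continuation */
#define EX_UTF8_LEAD4_MASK    0xF8u      /* top five bits of a lead byte */
#define EX_UTF8_LEAD4_TAG     0xF0u      /* 11110xxx starts a 4-byte form */
#define EX_UTF8_LEAD4_PAYLOAD 0x07u      /* low 3 payload bits */
#define EX_PLANE_SIZE         0x10000u   /* first code point past plane 0 */
#define EX_UNICODE_MAX        0x10FFFFu  /* largest valid code point */

static int ex_is_cont(uint8_t b)
{
	return (b & EX_UTF8_CONT_MASK) == EX_UTF8_CONT_TAG;
}

/*
 * Decode exactly one 4-byte UTF-8 sequence.
 * Returns the code point, or -1 if the bytes are malformed, overlong,
 * or encode a value above the Unicode maximum.
 */
static int32_t ex_utf8_decode4(const uint8_t s[4])
{
	uint32_t cp;

	if ((s[0] & EX_UTF8_LEAD4_MASK) != EX_UTF8_LEAD4_TAG)
		return -1;
	if (!ex_is_cont(s[1]) || !ex_is_cont(s[2]) || !ex_is_cont(s[3]))
		return -1;

	cp = (uint32_t)(s[0] & EX_UTF8_LEAD4_PAYLOAD);
	cp = (cp << EX_UTF8_CONT_BITS) | (s[1] & EX_UTF8_CONT_PAYLOAD);
	cp = (cp << EX_UTF8_CONT_BITS) | (s[2] & EX_UTF8_CONT_PAYLOAD);
	cp = (cp << EX_UTF8_CONT_BITS) | (s[3] & EX_UTF8_CONT_PAYLOAD);

	/* A 4-byte form must encode something outside plane 0. */
	if (cp < EX_PLANE_SIZE || cp > EX_UNICODE_MAX)
		return -1;

	return (int32_t)cp;
}
```

With names like these, the range check reads as a comparison against
the Unicode maximum rather than against a bare hex literal.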



-- 
Thanks,

Steve