Hi All,
        I haven't heard any further feedback about the character set conversion 
issue, so I'm going to file an issue in our tracker to disable the ASCII/UTF-8 
conversion that's currently allowed (and silently treated as a no-op), and 
we'll revisit it later.

        Quincey

On May 3, 2011, at 7:16 AM, Quincey Koziol wrote:

> Hi Andrew,
> 
> On Apr 26, 2011, at 8:07 PM, Andrew Collette wrote:
> 
>> Hi,
>> 
>> I'm curious as to how HDF5 treats character-set information during
>> type conversion.  There are functions H5Tset_cset and H5Tget_cset in
>> the API.  What happens if I try to read data defined as H5T_CSET_ASCII
>> into a buffer defined as H5T_CSET_UTF8, and vice-versa?  What if the
>> buffer defined as ASCII contains characters > 127, but isn't UTF-8
>> compliant?  I ask because in practice I've noticed that H5T_CSET_ASCII
>> seems to be used to indicate data of an unknown encoding.
> 
>       Sorry for the delay in replying; I wanted to verify the library's 
> behavior, and it took a little while to find a gap in my schedule.
> 
>       Currently, the library neither converts the data nor fails when a 
> read/write operation involves two different character sets: it treats 
> the UTF-8 and ASCII string datatypes as identical (see the attached small 
> test program).
> 
>       However, I'm inclined to change that behavior and have the conversion 
> fail, so that application developers and users aren't surprised.  Then, once 
> we find out the correct behavior and can implement a bridge between the two 
> character sets (at least from ASCII to UTF-8), we can enable the proper 
> behavior.
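
As a rough illustration of what an ASCII-to-UTF-8 "bridge" might need to check: 7-bit ASCII is a strict subset of UTF-8, so a buffer tagged H5T_CSET_ASCII can be relabeled as UTF-8 without touching any bytes only if every byte is <= 127. The sketch below is plain C and does not call HDF5. The helper names buf_is_ascii and buf_is_utf8 are hypothetical, not library functions, and this is not a description of how the library does or will implement the check.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the HDF5 API): returns 1 if every
 * byte in buf is 7-bit ASCII, 0 otherwise. Such a buffer is also valid
 * UTF-8 with identical bytes. */
int buf_is_ascii(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (buf[i] > 0x7F)
            return 0;
    return 1;
}

/* Hypothetical helper: returns 1 if buf is well-formed UTF-8, rejecting
 * truncated sequences, overlong encodings, surrogates, and code points
 * above U+10FFFF. Returns 0 otherwise. */
int buf_is_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t n, j;        /* n = number of continuation bytes expected */
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest code point legal for this length */

        if (c <= 0x7F) {
            i++;            /* single-byte (ASCII) character */
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < n)
            return 0;       /* sequence runs past the end of the buffer */

        for (j = 1; j <= n; j++) {
            if ((buf[i + j] & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(buf[i + j] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFFUL || (cp >= 0xD800UL && cp <= 0xDFFFUL))
            return 0;       /* overlong, out of range, or surrogate */

        i += n + 1;
    }
    return 1;
}
```

Under these assumptions, a buffer labeled ASCII that contains bytes > 127 (for example Latin-1 text) fails buf_is_ascii and will usually fail buf_is_utf8 as well, which is the "unknown encoding" case from Andrew's question. A conversion path could refuse such data rather than pass it through unchanged.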
> 
>       How's that sound to people?
> 
>       Quincey
> 
> <test_utf8.c>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

