On Monday 08 May 2017 16:24:24 Pali Rohár wrote:
> On Monday 08 May 2017 15:13:28 Vladimir 'phcoder' Serbinenko wrote:
> > On Mon, Apr 10, 2017, 23:17 Pali Rohár <pali.ro...@gmail.com> wrote:
> > > char *outbuf, int normalize_utf8)
> >
> > Normalize isn't the right word. And it's not utf-8 but latin1 (called
> > compressed utf-16 by udf docs).
>
> Without this part of the patch, read_string() expects the input string
> to be either UTF-8 or UTF-16 in the compressed OSTA form. If the input
> string is marked with a leading 0x8 but contains an invalid UTF-8
> sequence (such as bytes 80-FF), it is treated as Latin1 and converted
> to UTF-8 on output. So input "\x80" is returned as "\xC2\x80".
>
> What I need is to skip this Latin1 --> UTF-8 conversion when the input
> is marked with a leading 0x8 and keep it in binary/raw/octet form. This
> is because older versions of mkudffs put a string into volset that does
> not conform to the OSTA spec. libblkid does not do that
> "\x80" --> "\xC2\x80" conversion either, so it is better to use the same
> algorithm for providing the UUID on a running system (blkid on Linux)
> and in the bootloader (Grub2).
>
> > Are you sure you handle utf-16 case correctly? What is the expected
> > behavior in those cases? Ideally you may want to just parse raw
> > string in caller
>
> If volsetid is stored according to the OSTA spec, then it is handled
> correctly (both UTF-8 and UTF-16).
>
> > > +  binpos = 16;
> > > +  for (i = 0; i < len; ++i)
> > > +    {
> > > +      if (!grub_isalnum (buf[i]))
> >
> > That looks real weird. What if first byte of UUID is 'a'? What if
> > alnum part contains non-English chars.

Hm... the check should rather be that buf[i] is a hexadecimal digit, not
an arbitrary letter. But in most cases it is a hexadecimal digit...

> > I have to admit I don't get what expected behaviour is. Can you
> > elaborate on this and enable UUID test in udf_test to check that
> > UUID matches blkid?
>
> According to the OSTA spec, the first 16 characters of volsetid are
> unique and the rest can be anything. The first 8 characters are the
> hexadecimal representation of a 32-bit timestamp and the remaining 8
> are implementation-defined (but still unique). Therefore we use those
> first 16 characters for generating the UUID. Again, some generators of
> UDF disks do not put a hexadecimal number there, but some garbage
> (sometimes not valid UTF-8...), so this code generates an alphanumeric
> UUID from the input, relying on the fact that in most cases the input
> is hexadecimal (and so is used as is).
>
> If you have another idea how to deal with this, let me know...

-- 
Pali Rohár
pali.ro...@gmail.com
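
P.S. To make the hexadecimal check concrete, here is a rough sketch in
plain C. It is not the patch itself and not GRUB API: both helper names
(is_hex_digit, volsetid_prefix_is_hex) are made up for illustration, and
the only assumption taken from the discussion is that the first 16
characters of volsetid are the part used for the UUID. It only tests
whether that prefix is purely hexadecimal; what to do when it is not is
left open, as in the thread.

```c
#include <stddef.h>

/* Hypothetical helper (not a GRUB function): ASCII-only test for a
   hexadecimal digit.  Unlike an alphanumeric test it rejects letters
   such as 'z' and every byte in the 0x80-0xFF range.  */
static int
is_hex_digit (unsigned char c)
{
  return (c >= '0' && c <= '9')
         || (c >= 'a' && c <= 'f')
         || (c >= 'A' && c <= 'F');
}

/* Hypothetical helper: return 1 if the first 16 bytes of BUF are all
   hexadecimal digits, 0 otherwise (including when LEN is below 16).  */
static int
volsetid_prefix_is_hex (const unsigned char *buf, size_t len)
{
  size_t i;

  if (len < 16)
    return 0;

  for (i = 0; i < 16; ++i)
    if (!is_hex_digit (buf[i]))
      return 0;

  return 1;
}
```

With this, a volsetid starting with "590f4a2c1b3d5e7f" passes, while one
starting with "590f4a2czzzzzzzz" does not, even though every character
of the latter is alphanumeric.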