Nicolas Williams wrote:
> On Wed, Apr 22, 2009 at 11:16:09AM -0700, Don Cragun wrote:
> > The standard does not currently specify a way to count (multi-byte)
> > characters even though this means tail output may start or end in the
> > middle of a multi-byte character when using the -c option.
> 
> How... painful.  Of course, for fixed width encodings and UTF-8/16 it

AFAIK Solaris doesn't have a UTF-16-based locale, and AFAIK UTF-16 can't
be supported by the POSIX multibyte API (at least I never saw it and
can't imagine how it should work) ...

> should be possible to automatically adjust the -c argument value so it
> starts at the start of a character, but that would require another
> argument.

Following the precedent of "wc" we could use "-C" (uppercase 'C') for
this purpose...
... but I am not sure whether it is possible for all encodings (e.g.
Shift-JIS, GBK, EUC etc.) to properly detect the start of a multibyte
character (anyone remember ISO-2022? =:-) ).
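
For UTF-8 at least, finding the start of a character looks easy, since
continuation bytes always have the bit pattern 10xxxxxx. A minimal sketch
in Python, assuming valid UTF-8 input; the function names are made up for
illustration and are not part of any existing tail implementation:

```python
def align_to_utf8_start(data: bytes, offset: int) -> int:
    """Move offset forward to the next UTF-8 character boundary."""
    # Continuation bytes look like 10xxxxxx; skip them until a lead
    # byte (or the end of the data) is reached.
    while offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset += 1
    return offset


def tail_bytes_utf8(data: bytes, count: int) -> bytes:
    """Return at most the last count bytes, never starting mid-character."""
    start = max(len(data) - count, 0)
    return data[align_to_utf8_start(data, start):]
```

Skipping forward means the output can be shorter than the requested byte
count; scanning backward instead would make it longer. Either way this
relies on UTF-8 being self-synchronizing and does not carry over to
encodings such as Shift-JIS or ISO-2022, where the role of a byte depends
on what precedes it.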

> Fortunately I bet tail -c ... is fairly uncommon.

There are several consumers in Solaris which use it.

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
