On Jul 29, 2014, at 7:41 PM, Ken Hornstein wrote:
> So here's the thing ... right now we (mostly) don't have to think about
> processing UTF-8. We get bytes in from decoding and squirt them out.
> There's no processing; we leave that up to the terminal to handle it.
> We're essentially UTF-8 ig
>And UTF-8 internally would give a third option. For a header, are three
>things wanted: the raw header, embedded linefeeds and all; a logical
>single-line version but with the =?ISO-8859-1? still present; and a
>decoded UTF-8 version of the previous.
>
>`subject' could go from the last of thos
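
For illustration only, the three header views described in the quoted text (raw with embedded linefeeds, unfolded single line with the =?ISO-8859-1? encoded-word still present, and decoded text) can be sketched with Python's standard `email.header` module. This is not nmh code; the function names `unfold`, `decode` and `header_views` and the sample header value are hypothetical.

```python
import re
from email.header import decode_header, make_header


def unfold(raw_value):
    """Logical single-line form: drop each line break that is
    immediately followed by a space or tab (RFC 5322 unfolding)."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw_value)


def decode(unfolded_value):
    """Decode RFC 2047 encoded-words into a Python (Unicode) string."""
    return str(make_header(decode_header(unfolded_value)))


def header_views(raw_value):
    """Return (raw, unfolded, decoded) for one header field value."""
    unfolded = unfold(raw_value)
    return raw_value, unfolded, decode(unfolded)


# Hypothetical sample value, folded across two lines.
sample = "=?ISO-8859-1?Q?Caf=E9?=\r\n menu"
raw, unfolded, decoded = header_views(sample)
utf8_bytes = decoded.encode("utf-8")  # the UTF-8 form of the decoded view
```

Keeping all three views lets a caller pick the one it needs without decoding again.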
>Bio's formatted I/O routines are fully utf-8 aware. E.g. when looking
>to consume or output a character they know how many octets are required
>to form any given unicode character. The upside is you never have to
>think at all about processing utf-8 -- it just happens.
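
As a rough illustration of what "knowing how many octets are required" means, the length of a UTF-8 sequence can be read from its first octet. The sketch below is Python, not Bio(3) itself (which is a C library), and the names `utf8_sequence_length` and `split_characters` are hypothetical.

```python
def utf8_sequence_length(lead_octet):
    """Number of octets in the UTF-8 sequence that starts with lead_octet."""
    if lead_octet < 0x80:
        return 1                      # ASCII
    if 0xC2 <= lead_octet <= 0xDF:
        return 2
    if 0xE0 <= lead_octet <= 0xEF:
        return 3
    if 0xF0 <= lead_octet <= 0xF4:
        return 4
    # Continuation octets (0x80-0xBF) and 0xC0, 0xC1, 0xF5-0xFF
    # can never begin a valid sequence.
    raise ValueError("invalid UTF-8 lead octet: 0x%02X" % lead_octet)


def split_characters(data):
    """Split UTF-8 bytes into a list of one-character strings."""
    chars = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        # bytes.decode() still validates the continuation octets.
        chars.append(data[i:i + n].decode("utf-8"))
        i += n
    return chars
```

A reader built this way consumes whole characters rather than single octets, which is the property the quoted text attributes to Bio's formatted I/O.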
So here's the thing ... r
On Jul 29, 2014, at 12:47 PM, Ken Hornstein wrote:
> Alright. So ... I'm trying to understand the gain of using Bio(3). I'm
> not seeing it yet, but I'm willing to be educated. Would you want to
> import tcs(1) as well?
Bio's formatted I/O routines are fully utf-8 aware. E.g. when looking t
On Sat, Jul 26, 2014 at 6:53 PM, Ken Hornstein wrote:
>>I would really like to see this, too. But, since there is no definition
>>of what a "canonicalized" header might look like, we would need to
>>proceed very carefully in defining the semantics around this. Header
>>parsing is riddled with co
>> You had mentioned this earlier, and I took a look at it then. I did not
>> see anything in Bio(3) that handled character transliteration.
>
>Bio doesn't. upasfs invoked tcs(1) to do the translations.
Alright. So ... I'm trying to understand the gain of using Bio(3). I'm
not seeing it yet, b
On Jul 29, 2014, at 4:26 AM, Ken Hornstein wrote:
> You had mentioned this earlier, and I took a look at it then. I did not
> see anything in Bio(3) that handled character transliteration.
Bio doesn't. upasfs invoked tcs(1) to do the translations.
Hi Ken,
> Well, I guess we could make it work both ways. Right now it's not
> really decoded before it hits the format engine. We could keep with
> that logic. Or if you wanted to convert it to ASCII ... well, I don't
> see a better option than converting it and substituting an appropriate
> ch
>The reason why I like the idea of using utf-8 internally is that it
>encompasses anything we can (reasonably) expect to see. It makes us
>"lossless" internally. If the display side of the environment can't
>handle things, well, too bad. We can't reprogram people's terminals
>or character sets f
On Mon, 28 Jul 2014 18:11:59 -0700, Lyndon Nerenberg writes:
>Do any of you have gear with the one true (:-)) byte order we could add to the
>build cluster? E.g. a pair of 32/64-bit Sparc boxes would be great. Or PPC.
>Or even MIPS-BE if anyone keeps that sort of thing running these days.
i c