On 2012-07-01, at 4:39 PM, Alex Shinn wrote:
> On Sun, Jul 1, 2012 at 10:19 PM, Marc Feeley <[email protected]> wrote:
>> The R5RS has the following sequence to sequence conversion procedures:
>>
>> list->string, and string->list
>> list->vector, and vector->list
>>
>> The R7RS is adding bytevector sequences, but it does not add the conversion
>> procedures:
>>
>> list->bytevector, and bytevector->list
>>
>> What is the rationale for this inconsistency?
>>
>> Moreover, the R7RS is adding only the first set of these conversion
>> procedures:
>>
>> vector->string, and string->vector
>> bytevector->string, and string->bytevector (not in R7RS)
>> vector->bytevector, and bytevector->vector (not in R7RS)
>
> Actually, we have the second, it's just named
> utf8->string and string->utf8 to emphasize the
> encoding used to convert to and from a bytevector.

Not really.  I expected bytevector->string to be equal to

  (lambda (bv) (list->string (map integer->char (bytevector->list bv))))

which I guess would correspond to a latin1->string procedure under your
naming scheme.
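
For concreteness, here is a minimal sketch of that byte-per-character
conversion in R7RS terms.  Note that bytevector->list is not in the R7RS
draft, so it is defined here as a helper, and latin1->string is a
hypothetical name used only for illustration.  It also assumes an
implementation whose characters cover at least code points 0 to 255.

```scheme
(import (scheme base))

;; Helper, not part of R7RS: collect the bytes of BV into a list,
;; walking from the last index down so no reverse is needed.
(define (bytevector->list bv)
  (let loop ((i (- (bytevector-length bv) 1)) (acc '()))
    (if (< i 0)
        acc
        (loop (- i 1) (cons (bytevector-u8-ref bv i) acc)))))

;; Hypothetical name: each byte becomes the character whose code point
;; equals the byte value (the Latin-1 interpretation).
(define (latin1->string bv)
  (list->string (map integer->char (bytevector->list bv))))
```

With these definitions, (latin1->string (bytevector 72 105)) gives "Hi".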

Concerning utf8->string and string->utf8, I dislike these procedures for many
reasons:

1) Very minor point: the official name for this encoding is UTF-8, so it should
be UTF-8->string and string->UTF-8.

2) The procedures specify in their names the character encoding to use.  But
there are oodles of character encodings, so for easy extensibility to other
encodings, it would be better to use a parameter as in (decode-string
bytevector 'UTF-8) and (encode-string string 'UTF-8) instead of oodles of
different procedures.

3) The main reason for character encodings is to perform I/O on byte-oriented
streams.  Yet the only procedures having to do with character encodings in R7RS
are utf8->string and string->utf8.  This seems wrong.  If textual output could
be performed on binary ports and the character encoding could be specified when
the port is opened (as was proposed in SRFI-91,
http://srfi.schemers.org/srfi-91/srfi-91.html, and implemented in Gambit), then
the procedures utf8->string and string->utf8 would be superfluous since they
could be defined easily like this:

  (define (string->utf8 s)
    (let ((port (open-output-bytevector 'UTF-8)))
      (display s port)
      (get-output-bytevector port)))
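
To make point 2 concrete, here is a rough sketch of what the parameterized
procedures might look like when layered on the existing R7RS procedures.
The names decode-string and encode-string come from the proposal above; the
dispatch on a symbol, the 'ISO-8859-1 case, and the error messages are
assumptions made for illustration, not a worked-out design.

```scheme
(import (scheme base))

;; Hypothetical: decode the bytes of BV into a string using ENCODING.
(define (decode-string bv encoding)
  (case encoding
    ((UTF-8) (utf8->string bv))
    ((ISO-8859-1)
     ;; One byte per character; the byte value is the code point.
     (let* ((n (bytevector-length bv))
            (s (make-string n)))
       (do ((i 0 (+ i 1)))
           ((= i n) s)
         (string-set! s i (integer->char (bytevector-u8-ref bv i))))))
    (else (error "decode-string: unsupported encoding" encoding))))

;; Hypothetical: encode STR into a fresh bytevector using ENCODING.
(define (encode-string str encoding)
  (case encoding
    ((UTF-8) (string->utf8 str))
    ((ISO-8859-1)
     (let* ((n (string-length str))
            (bv (make-bytevector n 0)))
       (do ((i 0 (+ i 1)))
           ((= i n) bv)
         (let ((code (char->integer (string-ref str i))))
           ;; Characters above 255 have no Latin-1 representation.
           (if (> code 255)
               (error "encode-string: character not representable" code)
               (bytevector-u8-set! bv i code))))))
    (else (error "encode-string: unsupported encoding" encoding))))
```

Supporting another encoding then means adding a clause to each case form
rather than a new pair of exported procedure names.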
Marc
_______________________________________________
Scheme-reports mailing list
[email protected]
http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports