On Mon, 21 Oct 2013 12:24:50 +0200 "Pascal J. Bourguignon" <p...@informatimago.com> wrote:
> When reading utf-8 or other unicode streams, invalid byte sequences can
> signal errors, be substituted by a given character, or be encoded into
> application reserved code points to be able to transparently transmit the
> invalid byte sequence.  Cf. clisp :INPUT-ERROR-ACTION parameter of
> ext:make-encoding (clisp encodings are external-format values).
> http://clisp.org/impnotes/encoding.html#make-encoding

I agree with the above. It is currently possible in ECL to handle UTF-8
decoding errors (ext:stream-decoding-error) with access to the octets of
the invalid sequence (ext:character-decoding-error-octets) and an
available restart (invoke-restart 'use-value ...). An application that
implements its own input and output is therefore free to recode the
invalid octets as Latin-1, or to implement "UTF8-B", at its discretion.

The advantage of native modes such as UTF8-B or UTF8-LATIN-1 would be
performance and simplicity in the cases where this behaviour is wanted,
but the default UTF-8 streams would definitely continue to signal
decoding errors explicitly.

If you also mean that CLISP can optionally do such conversions
transparently on request (or that its interface allows user code to do
this more efficiently), that is good to know, and I should look at its
implementation for ideas on how it presents that interface.

I've added the link above to my notes. Thanks a lot for your answer.
-- 
Matt

_______________________________________________
Ecls-list mailing list
Ecls-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ecls-list