On 12/04/2013 06:01 AM, Simon Sapin wrote:
Hi,
In response to:
https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-12-03#strfrom_utf8
Yes, error handling other than strict/fail requires allocation. I
suggest taking the pull request for the special case of non-allocating
strict UTF-8, and keeping error handling for a future, larger API that
also handles other encodings (and incremental processing):
OK.
https://github.com/mozilla/rust/pull/10701
https://github.com/mozilla/rust/wiki/Proposal-for-character-encoding-API
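To illustrate the allocation point, here is a minimal sketch in present-day Rust, not the 2013 API under discussion. It assumes today's `std::str::from_utf8` and `String::from_utf8_lossy`; the wrapper names `decode_strict` and `decode_replace` are made up for this example.

```rust
use std::borrow::Cow;
use std::str;

/// Strict decoding: validates the bytes and reborrows them as a &str.
/// Nothing is copied, so no allocation is needed.
fn decode_strict(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Replacing decoding: borrows when the input is already valid, but has to
/// build a new String as soon as an invalid sequence must be replaced.
fn decode_replace(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}
```

The strict version can hand back a slice of the input, while the replacing version returns a `Cow` because its output may differ from the input bytes.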
[On invalid UTF-8 bytes]
brson: One has a condition that lets you replace a bad character
I believe this is not implemented. The current not_utf8 condition lets
you do the entire decoding yourself.
You're right! I guess I was foreshadowing.
acrichto: We could truncate by default.
I am very much opposed to this. Truncating silently loses data
(potentially lots of it!). It should not be implemented, let alone be
the default.
I agree.
jack: In python, you have to specify how you want it transformed.
Truncate vs. replace with '?', etc. Maybe there should be an
alternate version that takes the transform.
pnkfelix: But doesn't work with slices...
jack: There's truncate, replace, and fail.
Python does not have truncate. It has ignore (skip invalid byte
sequences but continue with the rest of the input), strict (fail), and
replace (with � U+FFFD REPLACEMENT CHARACTER). You don’t have to
specify an error handler; strict is the default.
Ignore is bad IMO, as it silently loses data (although it’s not as bad
as truncate), though it could have uses I’m not thinking of right now.
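The three Python behaviours described above (strict, ignore, replace) could be sketched roughly as follows in present-day Rust. The `OnError` enum and `decode_with` function are hypothetical names for this example, not part of any proposed API. The sketch relies on `Utf8Error::valid_up_to` and `Utf8Error::error_len` from today's standard library.

```rust
use std::str;

/// Hypothetical error-handling modes, named after Python's.
#[derive(Clone, Copy)]
enum OnError {
    Strict,  // fail on the first invalid sequence
    Ignore,  // skip invalid sequences, keep the rest
    Replace, // substitute U+FFFD for each invalid sequence
}

fn decode_with(mut input: &[u8], on_error: OnError) -> Result<String, str::Utf8Error> {
    let mut out = String::new();
    loop {
        match str::from_utf8(input) {
            Ok(s) => {
                out.push_str(s);
                return Ok(out);
            }
            Err(e) => {
                if let OnError::Strict = on_error {
                    return Err(e);
                }
                // Everything before the error is known to be valid UTF-8.
                let (valid, rest) = input.split_at(e.valid_up_to());
                out.push_str(str::from_utf8(valid).unwrap());
                if let OnError::Replace = on_error {
                    out.push('\u{FFFD}');
                }
                match e.error_len() {
                    // Skip the invalid bytes and continue with the rest.
                    Some(n) => input = &rest[n..],
                    // The input ended in the middle of a sequence.
                    None => return Ok(out),
                }
            }
        }
    }
}
```

Unlike truncation, the ignore mode resumes after the bad bytes, so it loses only the invalid sequences rather than everything that follows them.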
Side note:
Regarding failing vs. returning an Option or Result: I’d be in favor
of only having the latter. Having two versions of the same API (foo()
and foo_opt()) is ugly, and it’s easy to get "value or fail" from an
Option with .unwrap().
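As a rough sketch of that side note in present-day Rust: a single function returns an Option, and each caller decides whether to handle the failure or unwrap it. The names `decode`, `decode_or_default` and `decode_or_panic` are hypothetical and exist only to show the two calling styles.

```rust
use std::str;

/// The only API entry point: failure is reported in the return value.
fn decode(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// A caller that handles invalid input itself.
fn decode_or_default(bytes: &[u8]) -> &str {
    decode(bytes).unwrap_or("<invalid>")
}

/// A caller that wants "value or fail" simply unwraps,
/// which panics if the input was not valid UTF-8.
fn decode_or_panic(bytes: &[u8]) -> &str {
    decode(bytes).unwrap()
}
```

Both calling styles are built on the same function, so the library does not need a separate failing variant.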
Thanks for your always valuable feedback, Simon!
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev