Re: [rust-dev] Proposed API for character encodings
On 21/09/2013 16:38, Olivier Renaud wrote:
> I'd expect this offset to be absolute. After all, the only thing that the programmer can do with this information at this point is to report it to the user; if the programmer wanted to handle the error, he could have done it by using a trap. A relative offset has no meaning outside of the processing loop, whereas an absolute offset can still be useful even outside of the program (if the source of the stream is a file, then an absolute offset will give the exact location of the error in the file).
>
> A counter is super cheap, I wouldn't worry about its cost. Actually, it just has to be incremented once for each call to 'feed'.

Well, to get the position inside a given chunk of input you still have to count individual bytes. (Maybe with Iterator::enumerate?) Unless maybe we do dirty pointer arithmetic…

If possible, I’d rather find a way to not have to pay that cost in the common case where the error handling is *not* abort and DecodeError is never used.

This is also a bit annoying as each implementation will have to repeat the counting logic, but maybe it’s still worth it.

> Note: for the encoder, you will have to specify whether the offset is a 'code point' count or a 'code unit' count.

Yes. I don’t know yet. If we do [1] and make the input generic it will probably have to be code points.

[1] https://mail.mozilla.org/pipermail/rust-dev/2013-September/005662.html

Otherwise, it may be preferable to match Str::slice and count UTF-8 bytes. (Which I suppose is what you call code units?)

--
Simon Sapin
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev
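The trade-off discussed in this message can be sketched in present-day Rust. This is only an illustration under stated assumptions: `AsciiDecoder`, its `consumed` field and the `offset` field of `DecodeError` are hypothetical names, not the API proposed in the thread. The decoder bumps one counter per call to `feed`, and the position inside the chunk is only added to it when an error is actually reported.

```rust
/// Hypothetical error type: `offset` is the absolute byte offset of the
/// first invalid byte, counted from the start of the whole stream.
#[derive(Debug, PartialEq)]
struct DecodeError {
    offset: usize,
}

/// Toy ASCII decoder fed in chunks (an assumption made for illustration).
struct AsciiDecoder {
    /// Bytes consumed by earlier successful calls to `feed`.
    consumed: usize,
}

impl AsciiDecoder {
    fn new() -> Self {
        AsciiDecoder { consumed: 0 }
    }

    /// Decodes one chunk into `out`. On success the counter is bumped once
    /// per call; on failure the in-chunk index of the bad byte is added to
    /// it to form an absolute offset.
    fn feed(&mut self, chunk: &[u8], out: &mut String) -> Result<(), DecodeError> {
        match chunk.iter().position(|&b| b >= 0x80) {
            None => {
                out.extend(chunk.iter().map(|&b| b as char));
                self.consumed += chunk.len();
                Ok(())
            }
            Some(i) => {
                // Keep the valid prefix, then report where decoding stopped.
                out.extend(chunk[..i].iter().map(|&b| b as char));
                Err(DecodeError { offset: self.consumed + i })
            }
        }
    }
}
```

Feeding `b"abc"` and then `b"de\xFFf"` would report offset 5, which is the position of the bad byte in the stream rather than in the second chunk.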
Re: [rust-dev] RFC: Syntax for raw string literals
Hi everyone,

Have we considered syntax similar to Ruby style heredocs? I particularly like the light looking syntax.

- The indentation of the block is determined by the indentation of the eos marker, keeping code flow natural.

    eos
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
    eos

- Brackets in the eos marker are flipped to allow [[[raw]]]
- eoseos causes a literal eos to be inserted.

For example a raw string

My main concern is that might be a common operator. Perhaps would be ok?

Thoughts?

On 21/09/2013 4:28 AM, Alex Crichton a...@crichton.co wrote:
> Of the 3, Lua's is probably the best, although it's a bit esoteric (using [[ and nary a quote in sight). I think an important thing to keep in mind is that the main reason behind creating a new form of literal is for things like:
>
> * Escapes in format! strings
> * Possible regular expression syntax (this also may be a syntax extension)
> * Typing literal Windows paths (escaping \ is hard)
> * Otherwise long literals which may contain quotes (like html text)
>
> With those in mind, although Lua's syntax is sufficient, is it nice to use? If the first thing I saw as an introduction to Rust was:
>
>     fn main() {
>         println!([[Hello, {}!]], world);
>     }
>
> I would be a little confused. Now the [[/]] aren't really necessary in this case, but I'm personally unsure of how usable [[/]] would be throughout the language.
>
> Raw literals in languages like C++ and Lua I think aren't intended to be used that often. Instead they should be used only when necessary, and you frequently don't see them in code. For Rust, the use cases which are the cause of this discussion are actually fairly common, and I'm not sure that we'd want to see [[/]] all over the place, although of course that's just my opinion :)
>
> Skimming back, I haven't seen a suggestion of the backtick character as a delimiter. Go takes this approach, and I don't believe that in Go you can have a backtick anywhere in a backtick literal, and otherwise what you see is what you get. It's at least something to consider, though.
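Two of the use cases listed in the quoted message (Windows paths and text containing quotes) can be illustrated with ordinary escaped literals next to raw ones. A caveat on the sketch: the `r"..."` and `r#"..."#` forms used here are the raw-string syntax that Rust later shipped, not something settled at this point in the thread, and the function names are made up for the example.

```rust
/// A Windows path written as an ordinary literal: every backslash is doubled.
fn escaped_path() -> &'static str {
    "C:\\Users\\rust\\notes.txt"
}

/// The same path as a raw literal (later Rust syntax): no escapes needed.
fn raw_path() -> &'static str {
    r"C:\Users\rust\notes.txt"
}

/// HTML text in an ordinary literal: each inner quote must be escaped.
fn escaped_html() -> &'static str {
    "<a href=\"index.html\">home</a>"
}

/// The same text as a raw literal; the `#` marks let it contain quotes.
fn raw_html() -> &'static str {
    r#"<a href="index.html">home</a>"#
}
```

Each pair denotes the same string; only the amount of escaping in the source differs.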
[rust-dev] RFC: Syntax for raw string literals
Oh right, that's fair enough. I think the indentation/escaping issues can be fixed; however, the newline issues you mentioned will still exist for strings split over multiple lines using this syntax.

Good luck!

Steven

On Monday, September 23, 2013, Kevin Ballard wrote:
> Heredocs are primarily intended for multiline strings. Raw strings are intended for strings that have no escapes. Raw strings typically allow newlines, but that is not their primary purpose (and in Rust, regular strings allow newlines anyway). Trying to use a heredoc syntax for raw strings is just a headache (because of indentation, and dealing with the first and/or trailing newline in the heredoc).
>
> -Kevin
>
> On Sep 22, 2013, at 11:52 AM, Artem Egorkine art...@gmail.com wrote:
>> I must be missing something about Ruby heredocs, but the indentation had always been a painful question about them (http://stackoverflow.com/questions/3772864/how-do-i-remove-leading-whitespace-chars-from-ruby-heredoc). Another thing, of course, is that they are by no means raw (which of course doesn't stop Rust from adopting their syntax for raw strings). I would just say that it would be nice to pick a syntax for raw strings that allows both single-line and multi-line raw strings to be represented easily.
>>
>> On Sep 22, 2013 1:00 PM, Steven Ashley ste...@ashley.net.nz wrote:
>>> Hi everyone,
>>>
>>> Have we considered syntax similar to Ruby style heredocs? I particularly like the light looking syntax.
>>>
>>> - The indentation of the block is determined by the indentation of the eos marker, keeping code flow natural.
>>>
>>>     eos
>>>     Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
>>>     eos
>>>
>>> - Brackets in the eos marker are flipped to allow [[[raw]]]
>>> - eoseos causes a literal eos to be inserted.
>>>
>>> For example a raw string
>>>
>>> My main concern is that might be a common operator. Perhaps would be ok?
>>>
>>> Thoughts?
[rust-dev] RFC: Syntax for raw string literals
I'm in favour of C++11 syntax.

On Monday, September 23, 2013, Steven Ashley wrote:
> Oh right, that's fair enough. I think the indentation/escaping issues can be fixed however the new line issues you mentioned will still exist for strings split over multiple lines using this syntax.
>
> Good luck!
>
> Steven
Re: [rust-dev] RFC: Syntax for raw string literals
On Thu, Sep 19, 2013 at 1:36 PM, Kevin Ballard ke...@sb.org wrote:
> One feature common to many programming languages that Rust lacks is raw string literals.

This is one of those things that I feel almost all languages get wrong, and probably mostly for historical reasons. IMO there should *only* be raw string literals on the syntax level. It seems extremely weird to me that languages have this second-level language that gets interpreted within a literal. That kind of higher level processing should be part of a formatting library (e.g. a macro like fmt), rather than an embedded language inside the literal syntax.

So, I think string literals should contain exactly what they contain in their source form, without any additional processing. If you want to express characters that are inconvenient to type, you can use control sequences and a (standard) formatting library to produce them.

--
Sebastian Sylvan
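As a rough sketch of what "processing in a library rather than in the literal syntax" could look like: the function below takes a string exactly as written in the source and expands control sequences afterwards. The name `unescape` and the particular set of sequences it understands are assumptions made for illustration; the message does not specify either.

```rust
/// Hypothetical library routine: expands a few backslash control sequences
/// found in `raw`. The language itself would treat `raw` as plain text.
fn unescape(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            Some('"') => out.push('"'),
            // Unknown sequence: keep it untouched rather than guess.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // A lone trailing backslash is kept as-is.
            None => out.push('\\'),
        }
    }
    out
}
```

With this split, the lexer never needs to know what `\n` means; only callers that ask for `unescape` pay for, or are surprised by, escape handling.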
Re: [rust-dev] RFC: Syntax for raw string literals
On 09/22/2013 05:40 PM, Kevin Ballard wrote:
> I've filed a summary of this conversation as an RFC issue on the GitHub issue tracker.
>
> https://github.com/mozilla/rust/issues/9411

I've used a variation of option 10 for my own configuration format's raw strings:

    delimraw textdelim

where delim was the equivalent of an identifier. If ` is a problem, then maybe using ' works too?

    'delimraw textdelim'
    'raw text'

-SL
Re: [rust-dev] RFC: Syntax for raw string literals
' doesn't work because 'delim is parsed as a lifetime.

-Kevin

On Sep 22, 2013, at 3:41 PM, SiegeLord slab...@aim.com wrote:
> On 09/22/2013 05:40 PM, Kevin Ballard wrote:
>> I've filed a summary of this conversation as an RFC issue on the GitHub issue tracker.
>>
>> https://github.com/mozilla/rust/issues/9411
>
> I've used a variation of the option 10 for my own configuration format's raw strings: delimraw textdelim Where delim was an equivalent of an identifier. If ` is a problem, then maybe using ' works too? 'delimraw textdelim' 'raw text'
>
> -SL
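To make Kevin's point concrete: a single quote followed by an identifier is already how Rust names a lifetime, so a token such as 'delim has an established meaning. A minimal example in present-day Rust, where the function name is made up for illustration:

```rust
/// `'delim` is a lifetime parameter here: it ties the returned slice to
/// the borrowed input. The name could be any identifier.
fn first_word<'delim>(text: &'delim str) -> &'delim str {
    text.split_whitespace().next().unwrap_or("")
}
```

A raw-string form that also started with 'delim would have to be told apart from this use by whatever follows the identifier.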
Re: [rust-dev] RFC: Syntax for raw string literals
On 09/22/2013 07:10 PM, Kevin Ballard wrote:
> ' doesn't work because 'delim is parsed as a lifetime.

The parser will have to be modified to support raw strings in any of their manifestations. Is it a fact that there is no possible parser that can differentiate between 'delim and 'delim ? I guess it'll give trouble to this current syntax 'fooblah, but it wouldn't be the first place in the grammar where a space was necessary to disambiguate between constructs ( comes to mind).

-SL
Re: [rust-dev] RFC: Syntax for raw string literals
It would require changing the rules for lifetimes, with no benefit (and no clear new rule to use anyway). 'foodelim is perfectly legal today, and I see no reason to change that.

-Kevin

On Sep 22, 2013, at 4:26 PM, SiegeLord slab...@aim.com wrote:
> On 09/22/2013 07:10 PM, Kevin Ballard wrote:
>> ' doesn't work because 'delim is parsed as a lifetime.
>
> The parser will have to be modified to support raw strings in any of their manifestations. Is it a fact that there is no possible parser that can differentiate between 'delim and 'delim ? I guess it'll give trouble to this current syntax 'fooblah, but it wouldn't be the first place in the grammar where a space was necessary to disambiguate between constructs ( comes to mind).
>
> -SL
Re: [rust-dev] RFC: Syntax for raw string literals
On 09/22/2013 07:45 PM, Kevin Ballard wrote:
> It would require changing the rules for lifetimes, with no benefit (and no clear new rule to use anyway). 'foodelim is perfectly legal today, and I see no reason to change that.

It's not as big a change as you make it out to be, but fair enough. Looking at the parser right now, it seems to me that implementing the leading 'R' in C++'s syntax will be just as difficult/easy as doing my delimstuffdelim proposal, so I'm sticking to that idea as my 'vote'.

If the C++ way is chosen, I'd suggest the following permutation of the delimiters, as I think it looks lighter (by virtue of using smaller characters):

    r'delimraw stringdelim'
    r'raw string'

-SL
Re: [rust-dev] RFC: Syntax for raw string literals
On Sep 22, 2013, at 5:27 PM, SiegeLord slab...@aim.com wrote:
> On 09/22/2013 07:45 PM, Kevin Ballard wrote:
>> It would require changing the rules for lifetimes, with no benefit (and no clear new rule to use anyway). 'foodelim is perfectly legal today, and I see no reason to change that.
>
> It's not as big a change as you make it out to be, but fair enough. Looking at the parser right now, it seems to me that implementing the leading 'R' in C++'s syntax will be just as difficult/easy as doing my delimstuffdelim proposal so I'm sticking to that idea as my 'vote'.

With C++11 syntax, `Rfoo` is very obviously the start of a raw string. With your syntax, what about `addfoo`? Is that obviously the start of a raw string, or did the user just forget to type the ( in their function call? They may look the same to a lexer, but I think that being very clear about what starts the raw string is beneficial for reading.

> If C++ way is chosen, I'd suggest the following permutation of the delimiters, as I think it looks lighter (by virtue of using smaller characters): r'delimraw stringdelim' r'raw string'

I'd really rather not overload the meaning of the ' character, if at all possible. Right now it's used for lifetimes and character literals. Expanding it to also be used in string literals just feels like unnecessary overloading. We already have a perfectly good " that means string literal.

I suppose you could flip that to rdelim'raw string'delim or r'raw string'. I just don't see why that's any better than R"delim(raw string)delim" or R"(raw string)". Especially in the r'raw string' case, having lots of little tick marks in a row takes more effort to visually distinguish.

I suppose r"(raw string)" is an option, but if we're that close to C++11 we may as well just go whole hog and be consistent with their syntax.

-Kevin
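For readers unfamiliar with the C++11 form being weighed here, R"delim(raw string)delim", the following is a small scanner for it written in present-day Rust. It is a simplified sketch, not a proposal for the Rust lexer: the name `scan_raw` and its return shape are assumptions, and the delimiter checks are looser than the real C++11 rules.

```rust
/// Scans a C++11-style raw string, R"delim(body)delim", at the start of
/// `src`. Returns the body and the number of bytes consumed, or `None`
/// if `src` does not start with a complete raw string.
fn scan_raw(src: &str) -> Option<(&str, usize)> {
    // The `R"` prefix is what marks the start of the literal.
    let rest = src.strip_prefix("R\"")?;
    // Everything up to the first `(` is the (possibly empty) delimiter.
    let open = rest.find('(')?;
    let delim = &rest[..open];
    // Simplified validity check for the delimiter.
    if delim
        .chars()
        .any(|c| c == '"' || c == ')' || c == '\\' || c.is_whitespace())
    {
        return None;
    }
    let body_start = open + 1;
    // The literal ends at the first `)delim"`; nothing inside is escaped.
    let closer = format!("){}\"", delim);
    let body_len = rest[body_start..].find(closer.as_str())?;
    let body = &rest[body_start..body_start + body_len];
    // 2 bytes for the `R"` prefix, then delimiter + `(`, body, and closer.
    let consumed = 2 + body_start + body_len + closer.len();
    Some((body, consumed))
}
```

The scanner only commits to "this is a raw string" after seeing the `R"` prefix, which matches the readability argument above: the start of the literal is unambiguous before any delimiter is read.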