Hello,

While this is indeed possible, we would then not be able to leverage the fact that 7-bit encoded strings can be copied without conversion when going out on a P/Invoke with "Ansi" settings (which, in Mono, we have overloaded to mean "utf-8").
And Unix is predominantly a utf-8 friendly world. Hence, that encoding is better for our purposes.

Miguel

On Thu, Jul 28, 2016 at 3:33 AM, Jonathan Gilbert <[email protected]> wrote:

> Another thought: It would make more sense for the single-byte encoding to
> be ISO-8859-1 (Latin-1) than ASCII, because ASCII is either constrained to
> 128 code points or, most typically, extended by code page 437 on North
> American computers. The extended form (which, of course, cannot be assumed
> to be code page 437 in the local encoding) requires a look-up table to
> convert to/from Unicode, whereas Latin-1 simply is the first 256 code
> points of Unicode, making the conversion a simple cast between
> System.Char/wchar_t and byte.
>
> Thanks,
>
> Jonathan Gilbert
>
> On Thu, Jul 28, 2016 at 2:15 AM, Jonathan Gilbert <[email protected]>
> wrote:
>
>> Phew :-) I must have gotten the wrong idea from this:
>> http://www.mono-project.com/docs/advanced/runtime/docs/ascii-strings/#disabling-fixed-on-strings
>>
>> Thanks,
>>
>> Jonathan Gilbert
>>
>> On Thu, Jul 28, 2016 at 12:06 AM, Miguel de Icaza <
>> [email protected]> wrote:
>>
>>> Hello Jonathan,
>>>
>>> I personally think it is a terrible idea to make Mono completely unable
>>> to run code that compiles and runs just fine on Microsoft's .NET framework.
>>> Could get_OffsetToStringData be made to convert the ASCII
>>> representation back to UCS-2 on-the-fly for that edge case where the code
>>> actually uses the fixed (char *ptr = str) pattern? It's not a very
>>> common pattern, so the overhead of the conversion, while defeating the
>>> purpose of using that pattern in the first place, would affect only the
>>> tiniest minority of code.
>>>
>>>
>>> If this were to become a standard part of Mono, that would have to be
>>> done.
>>>
>>> The reason it is not done in the current patch is that we needed to
>>> identify all the spots with issues so they could be adjusted to deal with
>>> the two encodings, purely a bootstrapping side effect.
>>>
>>> And we need the spots adjusted, so we do not needlessly create duplicate
>>> strings on demand, otherwise one of the benefits of this work (reduce
>>> memory pressure) would go out the window.
>>>
>>> If this were the direction taken, it might be nice also to provide a way
>>> to force an ASCII-capable string to be UCS-2 anyway, in case there are
>>> people who want the fixed (char *ptr = str) pattern to remain
>>> performant -- perhaps an environment variable? Obviously we wouldn't want
>>> the Mono runtime to scan the environment block every time it allocates a
>>> string, so perhaps it could do the check & cache the result once on
>>> startup, and then allow some innocuous method that's already doing a lot of
>>> work, such as string.IsInterned, to re-check it. This avoids adding
>>> Mono-specific API, so that code written to be aware of Mono's peculiarity
>>> still runs just fine on other frameworks.
>>>
>>>
>>> Something like that.
>>>
>>> Miguel.
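
To make the trade-off in this thread concrete, here is a minimal C sketch. It is not taken from the Mono patch, and every function name in it is hypothetical. It illustrates the two properties being weighed: a 7-bit string is already valid UTF-8 and can be passed to native code untouched, while a Latin-1 string widens to UTF-16 by a plain cast but needs re-encoding before it can be handed out as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when every byte is 7-bit ASCII.
 * Such a buffer is already valid UTF-8, so it could be copied (or
 * passed) to native code without any conversion. */
static int is_7bit(const uint8_t *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i] & 0x80)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: widen Latin-1 bytes to UTF-16 code units.
 * Latin-1 is the first 256 code points of Unicode, so each byte
 * maps to its code unit by a simple cast, with no look-up table. */
static void latin1_to_utf16(const uint8_t *src, size_t n, uint16_t *dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)src[i];
}

/* Hypothetical helper: re-encode Latin-1 as UTF-8.
 * Bytes >= 0x80 expand to two bytes, so a Latin-1 buffer cannot be
 * handed out as UTF-8 unchanged. dst must hold at least 2 * n bytes.
 * Returns the number of bytes written. */
static size_t latin1_to_utf8(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] < 0x80) {
            dst[out++] = src[i];
        } else {
            dst[out++] = (uint8_t)(0xC0 | (src[i] >> 6));
            dst[out++] = (uint8_t)(0x80 | (src[i] & 0x3F));
        }
    }
    return out;
}
```

With the bytes 'c', 'a', 'f', 0xE9 as input, is_7bit reports false, the UTF-16 widening is a per-byte cast, and the UTF-8 form grows from four bytes to five. That growth is the conversion cost a 7-bit-only representation avoids on the "Ansi" P/Invoke path.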
_______________________________________________
Mono-devel-list mailing list
[email protected]
http://lists.dot.net/mailman/listinfo/mono-devel-list
