Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/04/12 02:03, Timon Gehr wrote:
> On 02/03/2012 11:08 AM, Artur Skawina wrote:
>> On 02/03/12 00:20, Jonathan M Davis wrote:
>>> in is pointless on value types. All it does is make the function parameter const, which really doesn't do much for you, and in some instances, is really annoying. Personally, I see no point in using in unless the parameter is a reference type, and even then, it's often a bad idea with reference types, because in is really const scope, and the scope is problematic if you want to return anything from that variable. It's particularly problematic with arrays, since it's frequently desirable to return slices of them, and scope (and therefore in) would prevent that. It's useful in some instances (particularly with delegates), but I'd use in _very_ sparingly. It's almost always more trouble than it's worth IMHO.
>>
>> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.
>
> I totally agree. Most function arguments are not escaped. However, it is nice that the shortest storage class, 'in', implies scope.

There are currently two problems with using in:

a) the one mentioned, where using in/scope means you can't (or shouldn't be able to) pass the thing to another function w/o scope marked args, and
b) in implies const, which is a problem because you may want to reassign the argument - a perfectly safe thing to do.

With const itself you can use parentheses to limit its scope to not include the reference itself; the problematic case is the builtin string alias, ie int f(in string s); would have to allow reassigning 's' inside the function.

Semi-related quiz:

    immutable(char)[] a = "a";
    const(char)[] b = "b";
    auto aa = a ~ a;
    auto bb = b ~ b;
    auto ab = a ~ b;
    writeln("aa: ", typeid(aa), " bb: ", typeid(bb), " ab: ", typeid(ab));

And the question is: How many people, who have not already been bitten by this, will give the correct answer to: "What will this program print?"?

There should have been another class, in addition to immutable/const, say uniq. For cases where an expression results in new unique objects. This class implicitly converts to any of const/immutable and mutates to the new type. IOW

    string a = "a";
    char[] b = "b".dup;
    auto c = a ~ b;   // typeid(c) == (uniq(char)[])
    string d = c;     // Fine, c is unique and can be safely treated as a string.
                      // But, from now on, c is (immutable(char)[]) so:
    char[] e = c;     // Fails.

    // And the other way:
    auto f = a ~ b;
    char[] g = f;     // OK
    string h = f;     // Fails, as f is now a (char[])

No need for unsafe-looking casts, just so that the compiler accepts perfectly safe code, like: string c = ab;, which would currently fail if used in the above quiz, and has to be written as string c = cast(string)ab;. [1]

artur

[1] Using a helper template is not different from adding a comment, it only serves to document /why/ the programmer had to do something, which is only a workaround for a language/compiler limitation. Compiler because at least the simple cases could be silently fixed in a backward compatible way (by not disallowing safe conversions). Language because uniq would also be useful when the programmer knows it applies, but the compiler can't figure it out by itself.
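Problem (b) above, and the remark about using parentheses with const, can be illustrated with a minimal sketch. The function names countSpaces and countSpacesIn are hypothetical and only serve the illustration; the point is that a const(char)[] parameter can be rebound inside the function, while an in string parameter cannot.

```d
import std.stdio : writeln;

// Tail-const parameter: the characters are const, but the slice itself
// is a mutable local, so the function may rebind `s` to a shorter slice.
size_t countSpaces(const(char)[] s)
{
    size_t n = 0;
    while (s.length)
    {
        if (s[0] == ' ')
            ++n;
        s = s[1 .. $]; // reassigning the parameter is allowed here
    }
    return n;
}

// `in` makes the whole parameter const, so `s` cannot be rebound,
// even though rebinding a string would be perfectly safe.
size_t countSpacesIn(in string s)
{
    static assert(!__traits(compiles, s = s[1 .. $]));
    size_t n = 0;
    foreach (c; s)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    writeln(countSpaces("a b c"));
    writeln(countSpacesIn("a b c"));
}
```

Both functions compute the same result; only the first is allowed to consume its argument in place.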
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/04/2012 06:55 PM, Artur Skawina wrote:
> [...]
> Semi-related quiz:
>
>     immutable(char)[] a = "a";
>     const(char)[] b = "b";
>     auto aa = a ~ a;
>     auto bb = b ~ b;
>     auto ab = a ~ b;
>     writeln("aa: ", typeid(aa), " bb: ", typeid(bb), " ab: ", typeid(ab));
>
> And the question is: How many people, who have not already been bitten by this, will give the correct answer to: "What will this program print?"?

I think this is covered in this issue: http://d.puremagic.com/issues/show_bug.cgi?id=7311

But feel free to open a more specific enhancement/bug report.

> There should have been another class, in addition to immutable/const, say uniq. For cases where an expression results in new unique objects. This class implicitly converts to any of const/immutable and mutates to the new type.
> [...]
> No need for unsafe-looking casts, just so that the compiler accepts perfectly safe code, like: string c = ab;, which would currently fail if used in the above quiz, and has to be written as string c = cast(string)ab;.

I am certain we'll get something like this eventually, once the compiler bug count has shrunk sufficiently. It is a natural thing to add.
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/04/12 22:20, Timon Gehr wrote:
> On 02/04/2012 06:55 PM, Artur Skawina wrote:
>> Semi-related quiz:
>> [...]
>> And the question is: How many people, who have not already been bitten by this, will give the correct answer to: "What will this program print?"?
>
> I think this is covered in this issue: http://d.puremagic.com/issues/show_bug.cgi?id=7311
>
> But feel free to open a more specific enhancement/bug report.

Apparently, there's already a bug open for everything. I'm not sure if it's a good or bad thing. :)

I don't think there's one correct answer here - you're right that unique const does not really make sense. But is mutable (non-const) really better than immutable? It depends, sometimes you will want one, sometimes the other. I first ran into this while doing a custom string class - there it was the cause of the one and only cast - (string ~ const(char)[]) can obviously still be a string, but the compiler won't accept it without a cast. I'm not sure how often you'll want the result of concatenation to be mutable, compared to immutable.

Anyway, the result really is unique, not mutable, const or immutable, at least until it is converted to one of those, hence the solution described below.

>> There should have been another class, in addition to immutable/const, say uniq. For cases where an expression results in new unique objects. This class implicitly converts to any of const/immutable and mutates to the new type.
>> [...]
>
> I am certain we'll get something like this eventually, once the compiler bug count has shrunk sufficiently. It is a natural thing to add.

It would be great if as many of the /language/ issues were fixed and documented as soon as possible. They don't even have to be implemented. What i'm afraid of is what will happen once dmd no longer is the
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/04/2012 11:23 PM, Artur Skawina wrote:
> [...]
> I don't think there's one correct answer here - you're right that unique const does not really make sense. But is mutable (non-const) really better than immutable? It depends, sometimes you will want one, sometimes the other. I first ran into this while doing a custom string class - there it was the cause of the one and only cast - (string ~ const(char)[]) can obviously still be a string, but the compiler won't accept it without a cast. I'm not sure how often you'll want the result of concatenation to be mutable, compared to immutable.
>
> Anyway, the result really is unique, not mutable, const or immutable, at least until it is converted to one of those, hence the solution described below.

Well, string = string ~ const(char)[] and char[] = string ~ const(char)[] should work, regardless of the type of immutable[] ~ const[]. The compiler can track the uniqueness of the data at the expression level without actually introducing a type modifier. There is precedent: Array literals are covariant, because it is safe.

Do you want to open the enhancement or should I do it? (I really thought I already had an issue open for this...)

> There should have been another class, in addition to immutable/const, say uniq. For cases where an expression results in new unique objects. This class implicitly converts to any of const/immutable and mutates to the new type.
> [...]
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/04/12 23:44, Timon Gehr wrote:
> On 02/04/2012 11:23 PM, Artur Skawina wrote:
>> [...]
>> Anyway, the result really is unique, not mutable, const or immutable, at least until it is converted to one of those, hence the solution described below.
>
> Well, string = string ~ const(char)[] and char[] = string ~ const(char)[] should work, regardless of the type of immutable[] ~ const[]. The compiler can track the uniqueness of the data at the expression level without actually introducing a type modifier. There is precedent: Array literals are covariant, because it is safe.
>
> Do you want to open the enhancement or should I do it? (I really thought I already had an issue open for this...)

Please do; i've been avoiding filing dmd bugs, because i'm only using gdc, not dmd. (so i can't even easily check if the issue still exists in the current tree)

artur
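For readers who want to try the quiz from earlier in the thread, here is a small self-contained sketch. It prints the inferred types rather than stating them, since the answer depends on the compiler version. The helpers joinAsString and joinAsMutable are hypothetical names; they show one cast-free workaround (an explicit copy with .idup or .dup), which costs an extra allocation and is not the uniqueness tracking proposed above.

```d
import std.stdio : writeln;

// Cast-free way to get a string out of (string ~ const(char)[]):
// copy the result into fresh immutable storage.
string joinAsString(string a, const(char)[] b)
{
    auto ab = a ~ b;
    return ab.idup;
}

// The mutable counterpart: copy the result into fresh mutable storage.
char[] joinAsMutable(string a, const(char)[] b)
{
    auto ab = a ~ b;
    return ab.dup;
}

void main()
{
    immutable(char)[] a = "a";
    const(char)[] b = "b";
    auto aa = a ~ a;
    auto bb = b ~ b;
    auto ab = a ~ b;

    // string ~ string stays a string; the other two are the quiz.
    static assert(is(typeof(aa) == string));
    writeln("aa: ", typeid(aa), " bb: ", typeid(bb), " ab: ", typeid(ab));

    string s = joinAsString(a, b);
    char[] m = joinAsMutable(a, b);
    m[0] = 'x'; // fine, m owns its own copy
    writeln(s, " ", m);
}
```

The copies are exactly the overhead the uniq idea would avoid, because a freshly concatenated array has no other references yet.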
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/03/12 00:20, Jonathan M Davis wrote:
> in is pointless on value types. All it does is make the function parameter const, which really doesn't do much for you, and in some instances, is really annoying. Personally, I see no point in using in unless the parameter is a reference type, and even then, it's often a bad idea with reference types, because in is really const scope, and the scope is problematic if you want to return anything from that variable. It's particularly problematic with arrays, since it's frequently desirable to return slices of them, and scope (and therefore in) would prevent that. It's useful in some instances (particularly with delegates), but I'd use in _very_ sparingly. It's almost always more trouble than it's worth IMHO.

BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.

If it isn't obvious why - GC. The compiler can optimize the cases where it knows a newly allocated object can't escape and reduce or omit the GC overhead. And yes, it can also do this automatically - but that requires analyzing the whole call chain, which is a) not always possible and b) much more expensive.

artur
Re: Segment violation (was Re: Why I could not cast string to int?)
On 03-02-2012 11:08, Artur Skawina wrote:
> On 02/03/12 00:20, Jonathan M Davis wrote:
>> [...]
>
> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.
>
> If it isn't obvious why - GC. The compiler can optimize the cases where it knows a newly allocated object can't escape and reduce or omit the GC overhead. And yes, it can also do this automatically - but that requires analyzing the whole call chain, which is a) not always possible and b) much more expensive.
>
> artur

It is not that simple. If the class's constructor passes 'this' off to some arbitrary code, this optimization breaks completely. You would need whole-program analysis to have the slightest hope of doing this optimization correctly.

--
- Alex
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/03/12 11:41, Artur Skawina wrote:
> On 02/03/12 11:21, Alex Rønne Petersen wrote:
>> On 03-02-2012 11:08, Artur Skawina wrote:
>>> [...]
>>> If it isn't obvious why - GC. The compiler can optimize the cases where it knows a newly allocated object can't escape and reduce or omit the GC overhead. And yes, it can also do this automatically - but that requires analyzing the whole call chain, which is a) not always possible and b) much more expensive.
>>
>> It is not that simple. If the class's constructor passes 'this' off to some arbitrary code, this optimization breaks completely. You would need whole-program analysis to have the slightest hope of doing this optimization correctly.
>
> It's about enabling the optimization for as much code as possible. And probably the most interesting cases are strings/arrays - the GC overhead can be huge if you do a lot of concatenation etc.
>
> Would marking the ctor as scope (similarly to const or pure) work for your case? (it is reasonable to expect that the compiler checks this by itself; it's per-type, so not nearly as expensive as analyzing the flow)

Actually, passing 'this' to some arbitrary code isn't a problem, unless the code in question has the esc annotation, in which case you need to mark the ctor (or any other method) as esc too; that will turn off the optimization, for this struct/class, obviously.

That's why scope needs to be the default - mixing it with code that does not guarantee that the object does not escape does not really work - you cannot call anything not marked with scope with an already scoped object. Which means you need to mark practically every function argument as scope - this doesn't scale well.

artur
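The remark that GC overhead can be huge with a lot of concatenation can be made concrete with a small sketch. It does not implement the scope-based optimization discussed here; it swaps in a different, library-level technique, std.array.appender, which builds the result in one growing buffer instead of allocating an intermediate string per concatenation. The function names are hypothetical.

```d
import std.array : appender;
import std.conv : to;
import std.stdio : writeln;

// Naive version: every `~` creates a new temporary array that the GC
// must later collect, even though none of them escapes the loop.
string joinNumbersNaive(int count)
{
    string result;
    foreach (i; 0 .. count)
    {
        if (i > 0)
            result = result ~ ", ";
        result = result ~ to!string(i);
    }
    return result;
}

// Appender version: pieces are written into a single buffer that grows
// geometrically, so far fewer intermediate arrays are allocated.
string joinNumbers(int count)
{
    auto buf = appender!string();
    foreach (i; 0 .. count)
    {
        if (i > 0)
            buf.put(", ");
        buf.put(to!string(i));
    }
    return buf.data;
}

void main()
{
    writeln(joinNumbersNaive(5));
    writeln(joinNumbers(5));
}
```

Both functions return the same text. The to!string call still allocates in either version, so this only reduces the garbage and does not eliminate it.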
Re: Segment violation (was Re: Why I could not cast string to int?)
On 03-02-2012 11:41, Artur Skawina wrote:
> On 02/03/12 11:21, Alex Rønne Petersen wrote:
>> [...]
>> It is not that simple. If the class's constructor passes 'this' off to some arbitrary code, this optimization breaks completely. You would need whole-program analysis to have the slightest hope of doing this optimization correctly.
>
> It's about enabling the optimization for as much code as possible. And probably the most interesting cases are strings/arrays - the GC overhead can be huge if you do a lot of concatenation etc.
>
> Would marking the ctor as scope (similarly to const or pure) work for your case? (it is reasonable to expect that the compiler checks this by itself; it's per-type, so not nearly as expensive as analyzing the flow)
>
> artur

Well, you would have to mark methods as scope too, as they could be passing off 'this' as well. It's probably doable that way, but explicit annotations kind of suck. :(

--
- Alex
Re: Segment violation (was Re: Why I could not cast string to int?)
On Friday, February 03, 2012 11:08:54 Artur Skawina wrote:
> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.

That would destroy slicing. I'm firmly of the opinion that scope should be used sparingly.

- Jonathan M Davis
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/03/12 13:06, Jonathan M Davis wrote:
> On Friday, February 03, 2012 11:08:54 Artur Skawina wrote:
>> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.
>
> That would destroy slicing. I'm firmly of the opinion that scope should be used sparingly.

Well, not doing it destroys performance. [1] It's a trade-off.

Also, i don't know if "destroy slicing" is accurate. Things like 'string f(string s) { return s[1..$]; }' need to continue to work; the object does not really escape from the POV of f(), but the caller has to assume it's not dead after returning from the function. Doing this by default for any functions returning refs that could potentially hold on to the passed object would make things work. For the cases where the called function knows that it will always return unique objects the signature could look like 'new string f(string s);', but that's only an optimization.

Any other problematic slicing use, that i'm not thinking of right now?

artur

[1] I had a case, where turning on logging in some code made the program unusable, because instead of IIRC ~40s it took 40+ minutes, at which point i gave up and killed it...

The profile looked like this:

    37.62%  uint gc.gcx.Gcx.fullcollect(void*)
    20.47%  uint gc.gcbits.GCBits.test(uint)
    13.80%  uint gc.gcbits.GCBits.testSet(uint)
    10.15%  pure nothrow @safe bool std.uni.isGraphical(dchar)
     3.33%  _D3std5array17__T8AppenderTAyaZ8Appender10__T3putTwZ3putMF
     2.78%  0x11025a
     2.13%  _D3std6format65__T13formatElementTS3std5array17__T8Appende
     1.64%  _D3std6format56__T10formatCharTS3std5array17__T8AppenderTA
     1.37%  pure @safe uint std.utf.encode(ref char[4], dchar)
     0.88%  void* gc.gcx.GC.malloc(uint, uint, uint*)
     0.50%  pure nothrow @safe bool std.uni.binarySearch2(dchar, immutable(dchar[2][]))
     0.45%  void gc.gcbits.GCBits.set(uint)
     0.37%  void gc.gcbits.GCBits.clear(uint)
     0.34%  __divdi3

That shows several problems, but even after fixing the obvious ones (inlining GCBits, making std.uni.isGraphical sane (this, btw, reduced its cost to ~1%)) GC still takes up most of the time (not remembering the details, but certainly 50%, it could have been 80%). Some slowdown from the IO and formatting is expected, but spending most cycles on GC is not reasonable, when most objects never leave the scope (in this case it was just strings passed to writeln etc IIRC).
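The slicing case from this message, 'string f(string s) { return s[1..$]; }', can be sketched as follows. The name dropFirst is hypothetical and the function assumes a non-empty argument. The sketch only shows why the caller must treat the argument as still alive after the call: the returned slice points into the same memory.

```d
import std.stdio : writeln;

// Returns a slice of its argument. No copy is made, so the result
// aliases the caller's data. Assumes s is non-empty.
string dropFirst(string s)
{
    return s[1 .. $];
}

void main()
{
    string original = "hello";
    string tail = dropFirst(original);

    assert(tail == "ello");
    // The slice starts one character into the original buffer.
    assert(tail.ptr == original.ptr + 1);

    writeln(tail);
}
```

Under a scope-by-default rule, this is the kind of function whose parameter would have to be allowed to outlive the call through the return value.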
Re: Segment violation (was Re: Why I could not cast string to int?)
Jonathan M Davis:

> in is pointless on value types. All it does is make the function parameter const, which really doesn't do much for you, and in some instances, is really annoying.

Having const value types is useful because you can't change them later inside the method. This helps you avoid bugs like:

    void foo(int n) {
        // uses n here
        // modifies n here by mistake
        // uses n here again, assuming it's the 'real' n argument
    }

When you program you think of arguments as the inputs of your algorithm, so if you mutate them by mistake, this sometimes causes bugs if later you assume they are still the real inputs of your algorithm.

Generally in D code all variables that can be const/immutable should be const/immutable, unless this causes problems, is impossible, or causes significant performance troubles. This avoids some bugs, helps DMD optimize better (I have seen this), and helps the person who reads the code to understand it better (because he/she/shi is free to focus on just the mutable variables).

It's better to have const function arguments, unless this is not possible, or in the uncommon situations where a mutable input helps you optimize your algorithm better (especially if the profiler has told you so).

> Personally, I see no point in using in unless the parameter is a reference type, and even then, it's often a bad idea with reference types, because in is really const scope, and the scope is problematic if you want to return anything from that variable.

Think of returning a part of a mutable input argument as an optimization, to be used when you know you need the extra speed. Otherwise, where performance is not a problem, it's often safer to return a const value or to return something new created inside the function/method. This programming style avoids many mistakes (it's useful in Java coding too).

From what I've seen, in my D code only a small percentage of the program lines need to be optimized and use C-style coding. For most of the lines of code a more functional D style is enough, and safer. The idea is mutability where needed, and a bit more functional-style everywhere else :-)

Bye,
bearophile
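The point about const value parameters can be illustrated with a minimal sketch. The function name sumTo is hypothetical and not from the thread; the sketch only assumes that a const parameter cannot be reassigned inside the function body.

```d
import std.stdio;

// Hypothetical helper: sums 1..n. The parameter is const, so the
// function body cannot reassign it by accident.
int sumTo(const int n)
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    // n = 0; // rejected by the compiler: n is const
    return total;
}

void main()
{
    writeln(sumTo(4)); // prints 10
}
```

Uncommenting the assignment to n should turn the "modified by mistake" bug described above into a compile-time error.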
Re: Segment violation (was Re: Why I could not cast string to int?)
Artur Skawina:

> Would marking the ctor as scope (similarly to const or pure) work for your case? (it is reasonable to expect that the compiler checks this by itself; it's per-type, so not nearly as expensive as analyzing the flow)

Maybe this is a topic worth discussing in the main D newsgroup (and maybe later worth an enhancement request).

Bye,
bearophile
Re: Segment violation (was Re: Why I could not cast string to int?)
On 03/02/12 00:14, bearophile wrote:

> xancorreu:
>> But you only put a in in recFactorial function argument. What this mean? **Why** this is more efficient than mine?
>
> It wasn't meant to improve performance. in turns a function argument to input only (and eventually scoped too). Generally when you program in D2 it's a good practice to use immutability where you can and where this doesn't cause other performance or typing problems. Immutability avoids bugs, allows a stronger purity (and I have seen DMD is often able to compile a little more efficient program if you use immutability/constants everywhere they are a good fit). So 95% of the arguments of your program are better tagged with in.
>
> Bye, bearophile

Mmm. Thanks. It reminds me of val in Scala ;-) I'll note it for optimizations.
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/03/2012 11:08 AM, Artur Skawina wrote:

> On 02/03/12 00:20, Jonathan M Davis wrote:
>> in is pointless on value types. All it does is make the function parameter const, which really doesn't do much for you, and in some instances, is really annoying. Personally, I see no point in using in unless the parameter is a reference type, and even then, it's often a bad idea with reference types, because in is really const scope, and the scope is problematic if you want to return anything from that variable. It's particularly problematic with arrays, since it's frequently desirable to return slices of them, and scope (and therefore in) would prevent that. It's useful in some instances (particularly with delegates), but I'd use in _very_ sparingly. It's almost always more trouble than it's worth IMHO.
>
> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.

I totally agree. Most function arguments are not escaped. However, it is nice that the shortest storage class, 'in', implies scope.

> If it isn't obvious why - GC. The compiler can optimize the cases where it knows a newly allocated object can't escape and reduce or omit the GC overhead. And yes, it can also do this automatically - but that requires analyzing the whole call chain, which is a) not always possible and b) much more expensive.
>
> artur

Any optimization that relies on alias analysis potentially benefits from 'scope'.
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/03/2012 01:06 PM, Jonathan M Davis wrote:

> On Friday, February 03, 2012 11:08:54 Artur Skawina wrote:
>> BTW, scope should have been the default for *all* reference type function arguments, with an explicit modifier, say esc, required to let the thing escape. It's an all-or-nothing thing, just like immutable strings - not using it everywhere is painful, but once you switch everything over you get the benefits.
>
> That would destroy slicing. I'm firmly of the opinion that scope should be used sparingly.
>
> - Jonathan M Davis

It could be enabled on a per-type basis.
Re: Segment violation (was Re: Why I could not cast string to int?)
Timon Gehr: However, it is nice that the shortest storage class, 'in', implies scope. I'd like to ask this to be valid, to shorten my code: alias immutable imm; Is this silly? Bye, bearophile
Re: Segment violation (was Re: Why I could not cast string to int?)
bearophile bearophileh...@lycos.com wrote in message news:jgi3jn$2o6p$1...@digitalmars.com...

> I'd like to ask this to be valid, to shorten my code:
>
>     alias immutable imm;
>
> Is this silly?

Yes =) immutable might be more characters than you want to type, but at this point it's extremely unlikely it will be changed or a synonym will be added. You can always define something like this:

    template imm(T) { alias immutable T imm; }

    imm!int cantchangethis = ...;
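The same idea can be sketched with the shorthand alias template form. This is only an illustration: it assumes a compiler recent enough to accept that shorthand, and the variable name answer is made up for the example.

```d
// Alias template: imm!T is shorthand for immutable(T).
alias imm(T) = immutable(T);

void main()
{
    imm!int answer = 42;
    // The declared variable really has type immutable(int).
    static assert(is(typeof(answer) == immutable(int)));
    // answer = 43; // rejected by the compiler: answer is immutable
}
```

This keeps the short spelling at the use site without asking for a new keyword.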
Re: Segment violation (was Re: Why I could not cast string to int?)
xancorreu:

> I get segment violation error with ./factorial 40 How can I resolve it?

You are having a stack overflow. DMD currently doesn't print a good message because of this regression that is being worked on:
http://d.puremagic.com/issues/show_bug.cgi?id=6088

On Windows with DMD you increase the stack like this:

    dmd -L/STACK:1 -run test2.d 40 result.txt

If it still overflows, increase the stack some more. But it will take a long time to compute the result even with the latest 2.058head with improved GC, because the algorithm you have used to compute the factorial is very bad.

I have rewritten your code like this:

    import std.stdio, std.bigint, std.conv, std.exception;

    BigInt recFactorial(in int n) {
        if (n == 0)
            return BigInt(1);
        else
            return BigInt(n) * recFactorial(n - 1);
    }

    void main(string[] args) {
        if (args.length != 2) {
            writeln("Factorial requires a number.");
        } else {
            try {
                writeln(recFactorial(to!int(args[1])));
            } catch (ConvException e) {
                writeln("Error");
            }
        }
    }

Note the usage of ConvException: it's a very good practice to never use a generic "gotta catch them all" expression, because it leads to hiding other bugs in your code, and this is a source of trouble.

Bye,
bearophile
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/02/12 19:30, bearophile wrote:

> xancorreu:
>> I get segment violation error with ./factorial 40 How can I resolve it?
>
> You are having a stack overflow. DMD currently doesn't print a good message because of this regression that is being worked on:
> http://d.puremagic.com/issues/show_bug.cgi?id=6088
>
> On Windows with DMD you increase the stack like this:
>
>     dmd -L/STACK:1 -run test2.d 40 result.txt
>
> If it still overflows, increase the stack some more. But it will take a long time to compute the result even with the latest 2.058head with improved GC, because the algorithm you have used to compute the factorial is very bad.
>
> I have rewritten your code like this:
>
>     import std.stdio, std.bigint, std.conv, std.exception;
>
>     BigInt recFactorial(in int n) {
>         if (n == 0)
>             return BigInt(1);
>         else
>             return BigInt(n) * recFactorial(n - 1);
>     }
>
>     void main(string[] args) {
>         if (args.length != 2) {
>             writeln("Factorial requires a number.");
>         } else {
>             try {
>                 writeln(recFactorial(to!int(args[1])));
>             } catch (ConvException e) {
>                 writeln("Error");
>             }
>         }
>     }
>
> Note the usage of ConvException: it's a very good practice to never use a generic "gotta catch them all" expression, because it leads to hiding other bugs in your code, and this is a source of trouble.
>
> Bye, bearophile

Thank you very much for recode this. But you only put a in in recFactorial function argument. What this mean? **Why** this is more efficient than mine?

For the other hand, how can increase the stack in linux?

Thanks,
Xan.
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/02/2012 08:04 PM, xancorreu wrote:

> On 02/02/12 19:30, bearophile wrote:
>> xancorreu:
>>> I get segment violation error with ./factorial 40 How can I resolve it?
>>
>> You are having a stack overflow. DMD currently doesn't print a good message because of this regression that is being worked on:
>> http://d.puremagic.com/issues/show_bug.cgi?id=6088
>>
>> On Windows with DMD you increase the stack like this:
>>
>>     dmd -L/STACK:1 -run test2.d 40 result.txt
>>
>> If it still overflows, increase the stack some more. But it will take a long time to compute the result even with the latest 2.058head with improved GC, because the algorithm you have used to compute the factorial is very bad.
>>
>> I have rewritten your code like this:
>>
>>     import std.stdio, std.bigint, std.conv, std.exception;
>>
>>     BigInt recFactorial(in int n) {
>>         if (n == 0)
>>             return BigInt(1);
>>         else
>>             return BigInt(n) * recFactorial(n - 1);
>>     }
>>
>>     void main(string[] args) {
>>         if (args.length != 2) {
>>             writeln("Factorial requires a number.");
>>         } else {
>>             try {
>>                 writeln(recFactorial(to!int(args[1])));
>>             } catch (ConvException e) {
>>                 writeln("Error");
>>             }
>>         }
>>     }
>>
>> Note the usage of ConvException: it's a very good practice to never use a generic "gotta catch them all" expression, because it leads to hiding other bugs in your code, and this is a source of trouble.
>>
>> Bye, bearophile
>
> Thank you very much for recode this. But you only put a in in recFactorial function argument. What this mean? **Why** this is more efficient than mine?

It is not. He just added some stylistic changes that don't change the code's semantics in any way.

> For the other hand, how can increase the stack in linux?
>
> Thanks,
> Xan.

I don't know, but it is best to just rewrite the code so that it does not use recursion. (This kind of problem is exactly the reason why any language standard should mandate tail call optimization.)
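A minimal sketch of the "rewrite it without recursion" suggestion, assuming std.bigint as in bearophile's version. The name iterFactorial is hypothetical and not from the thread.

```d
import std.bigint, std.stdio;

// Loop-based factorial: uses constant stack space no matter how large n is.
BigInt iterFactorial(in int n)
{
    auto result = BigInt(1);
    foreach (i; 2 .. n + 1)
        result *= i;
    return result;
}

void main()
{
    writeln(iterFactorial(20)); // prints 2432902008176640000
}
```

Because there are no nested calls, the stack size no longer limits how large an argument the program can handle; only the cost of the BigInt multiplications remains.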
Re: Segment violation (was Re: Why I could not cast string to int?)
On Thu, Feb 02, 2012 at 10:55:06PM +0100, Timon Gehr wrote:

> On 02/02/2012 08:04 PM, xancorreu wrote:
> [...]
>> For the other hand, how can increase the stack in linux?
> [...]
> I don't know, but it is best to just rewrite the code so that it does not use recursion. (This kind of problem is exactly the reason why any language standard should mandate tail call optimization.)

Doesn't help badly-chosen implementations like:

    int fib(int n) {
        if (n <= 2) return 1;
        else return fib(n-2) + fib(n+1);
    }

There's not much the compiler can do to offset programmers choosing the wrong algorithm for the job. It can't replace educating programmers to not implement things a certain way unless they have to.

T

--
Philosophy: how to make a career out of daydreaming.
Re: Segment violation (was Re: Why I could not cast string to int?)
On Thu, Feb 02, 2012 at 02:47:22PM -0800, H. S. Teoh wrote:
[...]

>     int fib(int n) {
>         if (n <= 2) return 1;
>         else return fib(n-2) + fib(n+1);

[...]

Ugh. That should be fib(n-1), not fib(n+1). But no matter, such a thing shouldn't ever be actually written and compiled, as Andrei says. :)

T

--
Change is inevitable, except from a vending machine.
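For contrast with the doubly recursive definition, here is a hedged sketch of a linear-time version that keeps the same convention (fib(1) == fib(2) == 1). The name fibIter and the choice of ulong are assumptions made for the example.

```d
import std.stdio;

// Iterative Fibonacci: one addition per step instead of an
// exponentially growing tree of recursive calls.
ulong fibIter(in uint n)
{
    ulong a = 1, b = 1; // fib(1) and fib(2)
    foreach (_; 2 .. n)
    {
        immutable next = a + b;
        a = b;
        b = next;
    }
    return b;
}

void main()
{
    writeln(fibIter(10)); // prints 55
}
```

This is the kind of algorithm choice the post is about: no compiler optimization turns the naive version into this one.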
Re: Segment violation (was Re: Why I could not cast string to int?)
xancorreu:

> But you only put a in in recFactorial function argument. What this mean? **Why** this is more efficient than mine?

It wasn't meant to improve performance. in turns a function argument to input only (and eventually scoped too). Generally when you program in D2 it's a good practice to use immutability where you can and where this doesn't cause other performance or typing problems. Immutability avoids bugs, allows a stronger purity (and I have seen DMD is often able to compile a little more efficient program if you use immutability/constants everywhere they are a good fit). So 95% of the arguments of your program are better tagged with in.

Bye,
bearophile
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/02/2012 11:47 PM, H. S. Teoh wrote:

> On Thu, Feb 02, 2012 at 10:55:06PM +0100, Timon Gehr wrote:
>> On 02/02/2012 08:04 PM, xancorreu wrote:
>> [...]
>>> For the other hand, how can increase the stack in linux?
>> [...]
>> I don't know, but it is best to just rewrite the code so that it does not use recursion. (This kind of problem is exactly the reason why any language standard should mandate tail call optimization.)
>
> Doesn't help badly-chosen implementations like:
>
>     int fib(int n) {
>         if (n <= 2) return 1;
>         else return fib(n-2) + fib(n+1);
>     }

This is not a tail-recursive function. And neither is recFactorial, my bad. Anyway, my point was that the compiler should not generate code that blows up on a (in principle) perfectly sane implementation.

> There's not much the compiler can do to offset programmers choosing the wrong algorithm for the job.

Agreed.

> It can't replace educating programmers to not implement things a certain way unless they have to.
>
> T

Or unless they feel like it.

    LList!ulong fib(){
        LList!ulong r;
        r=cons(st(1UL),cons(st(1UL),lz(()=>zipWith((Lazy!ulong a, Lazy!ulong b)=>lz(()=>a+b),r,r.tail)())));
        return r;
    }
Re: Segment violation (was Re: Why I could not cast string to int?)
Timon Gehr: This is not a tail-recursive function. And neither is recFactorial, my bad. Anyway, my point was that the compiler should not generate code that blows up on a (in principle) perfectly sane implementation. Is it possible to create a function attribute like @tail_recursive that produces a compile error if you apply it to a function that's not tail-recursive? Bye, bearophile
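To make the distinction concrete, here is a sketch of a factorial written in accumulator style, where the recursive call is the last thing the function does. The name factAcc and the default argument are assumptions for illustration. Whether a given compiler actually turns the tail call into a loop depends on the compiler and its optimization settings; the language does not promise it.

```d
import std.bigint, std.stdio;

// Accumulator-style factorial: the recursive call is in tail position,
// because nothing remains to be done after it returns.
BigInt factAcc(in int n, BigInt acc = BigInt(1))
{
    if (n <= 1)
        return acc;
    return factAcc(n - 1, acc * n);
}

void main()
{
    writeln(factAcc(10)); // prints 3628800
}
```

In recFactorial, by contrast, the multiplication happens after the recursive call returns, so each call has to keep its stack frame alive.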
Re: Segment violation (was Re: Why I could not cast string to int?)
On Thursday, February 02, 2012 18:14:25 bearophile wrote:

> xancorreu:
>> But you only put a in in recFactorial function argument. What this mean? **Why** this is more efficient than mine?
>
> It wasn't meant to improve performance. in turns a function argument to input only (and eventually scoped too). Generally when you program in D2 it's a good practice to use immutability where you can and where this doesn't cause other performance or typing problems. Immutability avoids bugs, allows a stronger purity (and I have seen DMD is often able to compile a little more efficient program if you use immutability/constants everywhere they are a good fit). So 95% of the arguments of your program are better tagged with in.

in is pointless on value types. All it does is make the function parameter const, which really doesn't do much for you, and in some instances, is really annoying.

Personally, I see no point in using in unless the parameter is a reference type, and even then, it's often a bad idea with reference types, because in is really const scope, and the scope is problematic if you want to return anything from that variable. It's particularly problematic with arrays, since it's frequently desirable to return slices of them, and scope (and therefore in) would prevent that.

It's useful in some instances (particularly with delegates), but I'd use in _very_ sparingly. It's almost always more trouble than it's worth IMHO.

- Jonathan M Davis
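The slicing concern can be sketched as follows. The function firstWord is hypothetical; it returns a slice of its argument, which is exactly the kind of escape that a scope (and therefore in) parameter is meant to forbid. How strictly a compiler enforces this depends on its version and flags, so treat this as an illustration of the idea rather than a statement about any particular release.

```d
import std.stdio;

// Returns a slice of its argument, so the parameter must be allowed
// to escape. It is therefore declared const, not in/scope.
const(char)[] firstWord(const(char)[] s)
{
    foreach (i, c; s)
        if (c == ' ')
            return s[0 .. i]; // a view into the caller's array, no copy
    return s;
}

void main()
{
    writeln(firstWord("hello world")); // prints hello
}
```

Here const still gives the "input only" guarantee for the characters, while leaving the function free to hand back part of the input.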
Re: Segment violation (was Re: Why I could not cast string to int?)
On Thursday, February 02, 2012 18:17:36 bearophile wrote:

> Timon Gehr:
>> This is not a tail-recursive function. And neither is recFactorial, my bad. Anyway, my point was that the compiler should not generate code that blows up on a (in principle) perfectly sane implementation.
>
> Is it possible to create a function attribute like @tail_recursive that produces a compile error if you apply it to a function that's not tail-recursive?

I suspect that Walter would feel the same way about that as he feels about something like marking functions as inline - i.e. that sort of thing should be left up to the compiler to optimize or not as it sees appropriate.

- Jonathan M Davis
Re: Segment violation (was Re: Why I could not cast string to int?)
On Fri, Feb 03, 2012 at 12:10:01AM +0100, Timon Gehr wrote:
[...]

>     LList!ulong fib(){
>         LList!ulong r;
>         r=cons(st(1UL),cons(st(1UL),lz(()=>zipWith((Lazy!ulong a, Lazy!ulong b)=>lz(()=>a+b),r,r.tail)())));
>         return r;
>     }

Whoa. A caching recursive definition of fibonacci. Impressive! Now I wonder if we can do this with the Ackermann function... ;-)

T

--
640K ought to be enough -- Bill G., 1984.
The Internet is not a primary goal for PC usage -- Bill G., 1995.
Linux has no impact on Microsoft's strategy -- Bill G., 1999.
Re: Segment violation (was Re: Why I could not cast string to int?)
On 02/02/2012 03:10 PM, Timon Gehr wrote:

>     LList!ulong fib(){
>         LList!ulong r;
>         r=cons(st(1UL),cons(st(1UL),lz(()=>zipWith((Lazy!ulong a, Lazy!ulong b)=>lz(()=>a+b),r,r.tail)())));
>         return r;
>     }

Sorry, wrong newsgroup. alt.comp.lang.perl is around the corner. :p

Ali
Re: Segment violation (was Re: Why I could not cast string to int?)
On Thu, Feb 02, 2012 at 03:26:52PM -0800, Ali Çehreli wrote:

> On 02/02/2012 03:10 PM, Timon Gehr wrote:
>>     LList!ulong fib(){
>>         LList!ulong r;
>>         r=cons(st(1UL),cons(st(1UL),lz(()=>zipWith((Lazy!ulong a, Lazy!ulong b)=>lz(()=>a+b),r,r.tail)())));
>>         return r;
>>     }
>
> Sorry, wrong newsgroup. alt.comp.lang.perl is around the corner. :p

[...]

You mean alt.comp.lang.lisp. :-)

T

--
Why waste time learning, when ignorance is instantaneous? -- Hobbes, from Calvin & Hobbes