Re: avoid toLower in std.algorithm.sort compare alias
Am Sun, 22 Apr 2012 09:23:45 +0200 schrieb "Jay Norwood" : > On Sunday, 22 April 2012 at 06:26:42 UTC, Jonathan M Davis wrote: > > > > You can look at the code. It checks each of the characters in > > place. Unlike > > toLower, it doesn't need to generate a new string. But as far > > as the > > comparison goes, they're the same - hence that line in the docs. > > > > - Jonathan M Davis > > ok, I did look at the code just now, and I'll sleep better > knowing that it doesn't do the whole string conversion. I > misunderstood your pseudo-code to mean that two lower case > strings were being created prior to the compare. > > However, icmp code does appear to call the toLower conversion on > both characters without first comparing the characters for > equality, which misses the chance to do a simple compare that > would avoid the two calls. /- check for equality :) v cmp!"a != b && std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2) -- Marco
Re: avoid toLower in std.algorithm.sort compare alias
On Tuesday, April 24, 2012 12:24:44 Regan Heath wrote: > On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer > > wrote: > > While dealing with unicode in my std.stream rewrite, I've found that > > hand-decoding dchars is way faster than using library calls. > > After watching Andrei's talk on generic and generative programming I have > to ask, which routines are you avoiding .. it seems we need to make them > as good as the hand coded code you've written... In general, when operating on strings generically, you up having to treat them as ranges of dchar and decode everything, but there are a lot of cases where you can special-case algorithms for narrow strings and avoid decoding them. Phobos does this a lot (though it can probably do a better job of it in a number of places), so by using functions from there rather than rolling your own, the problem is reduced, but any time that you're doing a lot of generic string processing, there's a decent chance that you're going to have to special case some stuff for arrays of char, wchar, and dchar in order to fully optimize it. And I don't think that there's really a way out of that beyond having a lot of functions already available (and already optimized) to do a lot of the string processing for you. There's a definite tension between genericity and effciency in the case of string processing - due primarily to variable length encodings. - Jonathan M Davis
Re: avoid toLower in std.algorithm.sort compare alias
On Tuesday, 24 April 2012 at 14:54:48 UTC, Steven Schveighoffer wrote: On Tuesday, 24 April 2012 at 11:24:44 UTC, Regan Heath wrote: After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written... from memory (don't have the code in front of me right now), it was std.uni.decode, and using foreach(dchar d; str) (which cannot be inlined currently). IIRC, std.uni.decode was not being inlined. So I tried hand-inlining it (I also discovered some optimizations it was not using), and it made a huge difference. BTW, you can check out my github branch of phobos named new-io2, look at the textstream struct to see what I've inlined. -Steve
Re: avoid toLower in std.algorithm.sort compare alias
On Tuesday, 24 April 2012 at 11:24:44 UTC, Regan Heath wrote: On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer wrote: While dealing with unicode in my std.stream rewrite, I've found that hand-decoding dchars is way faster than using library calls. After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written... from memory (don't have the code in front of me right now), it was std.uni.decode, and using foreach(dchar d; str) (which cannot be inlined currently). IIRC, std.uni.decode was not being inlined. So I tried hand-inlining it (I also discovered some optimizations it was not using), and it made a huge difference. In regards to this discussion, I think icmp can also be improved when run on a char array, by doing a byte comparison (no dchar decoding) until it finds a difference. That might be a huge speedup. Right now, all dchars are being decoded, and translated to the toLower counterpart. It may have an opposite effect, however, if there are a lot of strings that are equivalent when ignoring case, but not exactly the same. -Steve
Re: avoid toLower in std.algorithm.sort compare alias
On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer wrote: While dealing with unicode in my std.stream rewrite, I've found that hand-decoding dchars is way faster than using library calls. After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written... R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Re: avoid toLower in std.algorithm.sort compare alias
On Mon, 23 Apr 2012 09:49:50 -0400, Jay Norwood wrote: On Monday, 23 April 2012 at 11:27:40 UTC, Steven Schveighoffer wrote: I think using std.string.icmp is the best solution. I would expect it to outperform even schwartz sort. -Steve icmp took longer... added about 1 sec vs 0.3 sec (for schwartzSort ) to the program execution time. bool myComp(string x, string y) { return std.string.icmp(x,y)<0; } std.algorithm.sort!(myComp)(dirs); finished! time:1396 ms Well, that's surprising :) Perhaps there's some room for improvement in icmp or std.uni.toLower. There may be some constructs that are preventing inlining (enforce is the worst offender). While dealing with unicode in my std.stream rewrite, I've found that hand-decoding dchars is way faster than using library calls. -Steve
Re: avoid toLower in std.algorithm.sort compare alias
On Monday, 23 April 2012 at 11:27:40 UTC, Steven Schveighoffer wrote: I think using std.string.icmp is the best solution. I would expect it to outperform even schwartz sort. -Steve icmp took longer... added about 1 sec vs 0.3 sec (for schwartzSort ) to the program execution time. bool myComp(string x, string y) { return std.string.icmp(x,y)<0; } std.algorithm.sort!(myComp)(dirs); finished! time:1396 ms
Re: avoid toLower in std.algorithm.sort compare alias
On Sat, 21 Apr 2012 19:24:56 -0400, Jay Norwood wrote: While playing with sorting the unzip archive entries I tried use of the last example in http://dlang.org/phobos/std_algorithm.html#sort std.algorithm.sort!("toLower(a.name) < toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); It was terribly slow for sorting the 34k entries in my test case. I'd guess it is doing the toLower call on both strings for every compare. It was much, much faster to add an entries.lowerCaseName = std.string.toLower(entries.name) foreach entry prior to the sort execution and then use std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName ",std.algorithm.SwapStrategy.stable)(entries); The difference was on the order of 10 secs vs no noticeable delay when executing the sort operation in the debugger. I'll point out what I haven't seen yet: the issue is not so much toLower being called on every comparison, but more that toLower allocates (and then throws away!). I think using std.string.icmp is the best solution. I would expect it to outperform even schwartz sort. Note, to answer your question elsewhere, the comment is accurate, std.uni.toLower(a) is a function that accepts a dchar, not a string. What the comment is saying is that for the "ranges" (i.e. strings) given, it runs the given comparison on the std.uni.toLower() result for each element (i.e. dchar). -Steve
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, 22 April 2012 at 00:36:19 UTC, bearophile wrote: Performing the toLower every time the cmp function is called doesn't change the O complexity. In Phobos there is an alternative sorting (Schwartzian sorting routime) that applies a function to each item before sorting them, usually is much slower than the normal sorting, but maybe this time it's convenient. Bye, bearophile The shwartzSort works fine. Thanks, Jay std.algorithm.schwartzSort!(std.string.toLower, "a < b")(dirs); G:\d\rmdir2\rmdir2\rmdir2\Debug>rmdir2 removing: g:\tz finished! time:699 ms
Re: avoid toLower in std.algorithm.sort compare alias
On Saturday, 21 April 2012 at 23:24:57 UTC, Jay Norwood wrote: While playing with sorting the unzip archive entries I tried use of the last example in http://dlang.org/phobos/std_algorithm.html#sort std.algorithm.sort!("toLower(a.name) < toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); It was terribly slow for sorting the 34k entries in my test case. I'd guess it is doing the toLower call on both strings for every compare. It was much, much faster to add an entries.lowerCaseName = std.string.toLower(entries.name) foreach entry prior to the sort execution and then use std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName ",std.algorithm.SwapStrategy.stable)(entries); The difference was on the order of 10 secs vs no noticeable delay when executing the sort operation in the debugger. Good point. Perhaps this should be added in the documentation of std.algorithm ? It's easy to get trapped.
Re: avoid toLower in std.algorithm.sort compare alias
On 2012-04-22 01:24, Jay Norwood wrote: While playing with sorting the unzip archive entries I tried use of the last example in http://dlang.org/phobos/std_algorithm.html#sort std.algorithm.sort!("toLower(a.name) < toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); It was terribly slow for sorting the 34k entries in my test case. I'd guess it is doing the toLower call on both strings for every compare. It was much, much faster to add an entries.lowerCaseName = std.string.toLower(entries.name) foreach entry prior to the sort execution and then use std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName ",std.algorithm.SwapStrategy.stable)(entries); The difference was on the order of 10 secs vs no noticeable delay when executing the sort operation in the debugger. Perhaps a function that does case folding would be better in this case. But as far as I know Phobos doesn't have a function for that. -- /Jacob Carlborg
Re: toLower() and Unicode are incomplete was: Re: avoid toLower in std.algorithm.sort compare alias
On 22.04.2012 5:43, Ali Çehreli wrote: On 04/21/2012 04:24 PM, Jay Norwood wrote: > While playing with sorting the unzip archive entries I tried use of the > last example in http://dlang.org/phobos/std_algorithm.html#sort > > std.algorithm.sort!("toLower(a.name) < > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); Stealing this thread to point out that converting a letter to upper or lower case cannot be done without knowing the writing system. Phobos's toLower() documentation currently says: "Returns a string which is identical to s except that all of its characters are lowercase (in unicode, not just ASCII)." Oh, come on. This function wasn't updated for ages. I bet this wording here is intact since unicode 4.0 ;) Unicode cannot define the conversions of at least the following letters without knowing the actual alphabet that the text is written in: - Lowercase of I is ı in some alphabets[*] and i in many others. - Uppercase of i is İ in some alphabets[*] and I in many others. Fair point. The list however is not that long and a system may choose to support this or not (changing behavior based on writing system is called tailoring I believe). Ali [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc. -- Dmitry Olshansky
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, 22 April 2012 at 06:26:42 UTC, Jonathan M Davis wrote: You can look at the code. It checks each of the characters in place. Unlike toLower, it doesn't need to generate a new string. But as far as the comparison goes, they're the same - hence that line in the docs. - Jonathan M Davis ok, I did look at the code just now, and I'll sleep better knowing that it doesn't do the whole string conversion. I misunderstood your pseudo-code to mean that two lower case strings were being created prior to the compare. However, icmp code does appear to call the toLower conversion on both characters without first comparing the characters for equality, which misses the chance to do a simple compare that would avoid the two calls.
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, April 22, 2012 08:20:13 Jay Norwood wrote: > The comment below worries me a little bit about std.string.icmp. > What if they are two 1MB strings that differ in he first > character? Does it really convert both strings to lower case > before comparing the first character? > > http://dlang.org/phobos/std_string.html#icmp > > "Technically, icmp(r1, r2) is equivalent to > cmp!"std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2). " You can look at the code. It checks each of the characters in place. Unlike toLower, it doesn't need to generate a new string. But as far as the comparison goes, they're the same - hence that line in the docs. - Jonathan M Davis
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, 22 April 2012 at 02:29:45 UTC, Jonathan M Davis wrote: Regardless of whether it's the Big(O) complexity or the constant factor that's the problem here, clearly there's enough additional overhead that it's causing problems for Jay's particular case. It's also the sort of thing that can be easy to miss and then end up wondering why your code is so slow (if it actually matters in your particular situation). - Jonathan M Davis I haven't looked at strncmpi code, but I suspect it is a lot more efficient. For example, in comparing AbbbCdddEfffXabcdEfgh AbbbCdddEfffYabcdEfgh it is not necessary to do case conversion on anything except X and Y, and if isUpper(X)==isUpper(Y) then X and Y can be compared without conversion, and since X and Y are not equal the remaining characters don't have to be converted. The comment below worries me a little bit about std.string.icmp. What if they are two 1MB strings that differ in he first character? Does it really convert both strings to lower case before comparing the first character? http://dlang.org/phobos/std_string.html#icmp "Technically, icmp(r1, r2) is equivalent to cmp!"std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2). "
Re: avoid toLower in std.algorithm.sort compare alias
On Saturday, April 21, 2012 18:26:42 H. S. Teoh wrote: > Actually, I don't think the nested loops affect Big-O complexity at all. > The O(l) complexity (where l = string length) is already inherent in the > string comparison "str < otherStr". Adding more loops over the strings > doesn't change the Big-O complexity (you just make O(l) into 3*O(l) > which is the same as O(l)). If the loop is really nested, then it will increase the Big(O) complexity, whereas if it's parallel to a similar loop, it will just increase the constant factor. Which it really is in this case, I don't know without sitting down and sketching out the exact algorithm, but certainly upon a first glance, it was my conclusion that the loops were nested. If they're not, then they're not. Regardless of whether it's the Big(O) complexity or the constant factor that's the problem here, clearly there's enough additional overhead that it's causing problems for Jay's particular case. It's also the sort of thing that can be easy to miss and then end up wondering why your code is so slow (if it actually matters in your particular situation). - Jonathan M Davis
Re: toLower() and Unicode are incomplete was: Re: avoid toLower in std.algorithm.sort compare alias
On Saturday, April 21, 2012 18:43:23 Ali Çehreli wrote: > On 04/21/2012 04:24 PM, Jay Norwood wrote: > > While playing with sorting the unzip archive entries I tried use of the > > last example in http://dlang.org/phobos/std_algorithm.html#sort > > > > std.algorithm.sort!("toLower(a.name) < > > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); > > Stealing this thread to point out that converting a letter to upper or > lower case cannot be done without knowing the writing system. Phobos's > toLower() documentation currently says: "Returns a string which is > identical to s except that all of its characters are lowercase (in > unicode, not just ASCII)." > > Unicode cannot define the conversions of at least the following letters > without knowing the actual alphabet that the text is written in: > > - Lowercase of I is ı in some alphabets[*] and i in many others. > > - Uppercase of i is İ in some alphabets[*] and I in many others. > > Ali > > [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc. toLower and toUpper get pretty screwing with unicode. I don't know enough about non-English alphabets to know what affects what, but at minimum, there are a number of cases where toLower does not reverse toUpper (and vice versa). Rather, it converts the character into yet another letter. So, toLower to toUpper with unicode and definitely a bit iffy. I suppose that they do the job if you call them enough on the string that it doesn't change anymore, but I don't know. I also don't know how they act with regards to the various alphabets and how their implementation was decided upon. IIRC, Walter wrote them, and I'm sure that they're based on the unicode standard, but what that amounts to, I don't know. - Jonathan M Davis
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, April 22, 2012 03:47:30 Jay Norwood wrote: > On Saturday, 21 April 2012 at 23:54:26 UTC, Jonathan M Davis > > wrote: > > Yeah. toLower would be called on both strings on _every_ > > compare. And since > > that involves a loop, that would make the overall call to sort > > an order of > > magnitude worse than if you didn't call toLower at all. I'm not > > sure if it's > > an order of magnitude worse than your solution though, since > > it's been a while > > since I studied Big(O) notation (doing the conversion only once > > is still more > > expensive than not converting, but I'm not sure how much more > > expensive - it > > might cost less than sort such that it actually doesn't matter > > as for as > > Big(O) goes though). > > > > - Jonathan M Davis > > use of toLower in the sort expression adds around 11.2 secs > overhead to a 0.3 sec operation which reads and sorts the 34k > directory entries in this 2GB layout on the ssd drive, so it > isn't an option for me. > > finished! time:312 ms > > finished! time:11598 ms > > std.algorithm.sort!("toLower(a) < > toLower(b)",std.algorithm.SwapStrategy.stable)(dirs); > //std.algorithm.sort!("a < b", > std.algorithm.SwapStrategy.stable)(dirs); It wasn't saying that it _was_ an option. I was just trying to point out the likely reason why it's so bad with toLower - algorithmically-speaking. This definitely appears to be a case where doing some extra computation ahead of time will save you a lot later. - Jonathan M Davis
Re: avoid toLower in std.algorithm.sort compare alias
On Saturday, 21 April 2012 at 23:54:26 UTC, Jonathan M Davis wrote: Yeah. toLower would be called on both strings on _every_ compare. And since that involves a loop, that would make the overall call to sort an order of magnitude worse than if you didn't call toLower at all. I'm not sure if it's an order of magnitude worse than your solution though, since it's been a while since I studied Big(O) notation (doing the conversion only once is still more expensive than not converting, but I'm not sure how much more expensive - it might cost less than sort such that it actually doesn't matter as for as Big(O) goes though). - Jonathan M Davis use of toLower in the sort expression adds around 11.2 secs overhead to a 0.3 sec operation which reads and sorts the 34k directory entries in this 2GB layout on the ssd drive, so it isn't an option for me. finished! time:312 ms finished! time:11598 ms std.algorithm.sort!("toLower(a) < toLower(b)",std.algorithm.SwapStrategy.stable)(dirs); //std.algorithm.sort!("a < b", std.algorithm.SwapStrategy.stable)(dirs);
toLower() and Unicode are incomplete was: Re: avoid toLower in std.algorithm.sort compare alias
On 04/21/2012 04:24 PM, Jay Norwood wrote: > While playing with sorting the unzip archive entries I tried use of the > last example in http://dlang.org/phobos/std_algorithm.html#sort > > std.algorithm.sort!("toLower(a.name) < > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); Stealing this thread to point out that converting a letter to upper or lower case cannot be done without knowing the writing system. Phobos's toLower() documentation currently says: "Returns a string which is identical to s except that all of its characters are lowercase (in unicode, not just ASCII)." Unicode cannot define the conversions of at least the following letters without knowing the actual alphabet that the text is written in: - Lowercase of I is ı in some alphabets[*] and i in many others. - Uppercase of i is İ in some alphabets[*] and I in many others. Ali [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc.
Re: avoid toLower in std.algorithm.sort compare alias
On Sat, Apr 21, 2012 at 05:45:35PM -0700, Jonathan M Davis wrote: > On Saturday, April 21, 2012 20:36:18 bearophile wrote: > > Jonathan M Davis: > > > I'm not sure if it's an order of magnitude worse than your > > > solution though, since it's been a while since I studied Big(O) > > > notation (doing the conversion only once is still more expensive > > > than not converting, but I'm not sure how much more expensive - it > > > might cost less than sort such that it actually doesn't matter as > > > for as Big(O) goes though). > > > > Performing the toLower every time the cmp function is called doesn't change > > the O complexity. > > Yes it does. It adds a loop to each comparison (two loops actually, > but since they're not nested, Big(O) only cares about the one), since > toLower has to loop over all of the characters. As sort loops over > each of the strings to compare them for moving them into a sorted > position or not, it calls toLower, which adds a nested loop, so it > increases the Big(O) complexity. Something like this > > foreach(str; strings) > str < otherStr; > > becomes > > foreach(str; strings) > { > foreach(dchar c; str) > //lower characters > > foreach(dchar c; otherStr) > //lower characters > > str < otherStr; > } > > though that's obviously very pseudo-code-ish and not exactly what sort > does. Regardless, those extra loops when the comparison happens are > nested and therefore increase the Big(O) complexity of the overall > algorithm. [...] Actually, I don't think the nested loops affect Big-O complexity at all. The O(l) complexity (where l = string length) is already inherent in the string comparison "str < otherStr". Adding more loops over the strings doesn't change the Big-O complexity (you just make O(l) into 3*O(l) which is the same as O(l)). However, always remember that Big-O notation hides constant factors and terms. These factors are negligible given a sufficiently large problem size, but for small problem sizes, the hidden constants are very much significant. An O(n log n) algorithm may actually take n*log(10*n) steps or 1000*n*log(n) steps; for large enough n, they're approximately the same, but when n is small, the 1000 has a very significant hit on observed performance. That's why optimizing the constant factors is sometimes necessary (provided you've already minimized the big-O complexity to the full, since otherwise you're just pinching pennies yet freely spending $100 bills). Inner loop optimization, like strength reduction, loop invariant hoisting, etc., are examples of where constant factors get optimized. If you have a loop: real x; for (i=0; i < n; i++) { // bigComplexCalculation is independent of i real n = bigComplexCalculation(x); // make use of n in some way } Then moving the bigComplexCalculation() call outside the loop improves the observed performance significantly, even though you're not changing the big-O complexity: a small change from 10*n to 8*n means a 20% improvement in observed performance, even though the algorithm still degrades linearly with problem size just like before. T -- Microsoft is to operating systems & security ... what McDonalds is to gourmet cooking.
Re: avoid toLower in std.algorithm.sort compare alias
On Saturday, April 21, 2012 20:36:18 bearophile wrote: > Jonathan M Davis: > > I'm not sure if it's > > an order of magnitude worse than your solution though, since it's been a > > while since I studied Big(O) notation (doing the conversion only once is > > still more expensive than not converting, but I'm not sure how much more > > expensive - it might cost less than sort such that it actually doesn't > > matter as for as Big(O) goes though). > > Performing the toLower every time the cmp function is called doesn't change > the O complexity. Yes it does. It adds a loop to each comparison (two loops actually, but since they're not nested, Big(O) only cares about the one), since toLower has to loop over all of the characters. As sort loops over each of the strings to compare them for moving them into a sorted position or not, it calls toLower, which adds a nested loop, so it increases the Big(O) complexity. Something like this foreach(str; strings) str < otherStr; becomes foreach(str; strings) { foreach(dchar c; str) //lower characters foreach(dchar c; otherStr) //lower characters str < otherStr; } though that's obviously very pseudo-code-ish and not exactly what sort does. Regardless, those extra loops when the comparison happens are nested and therefore increase the Big(O) complexity of the overall algorithm. - Jonathan M Davis
Re: avoid toLower in std.algorithm.sort compare alias
Jonathan M Davis: > I'm not sure if it's > an order of magnitude worse than your solution though, since it's been a > while > since I studied Big(O) notation (doing the conversion only once is still more > expensive than not converting, but I'm not sure how much more expensive - it > might cost less than sort such that it actually doesn't matter as for as > Big(O) goes though). Performing the toLower every time the cmp function is called doesn't change the O complexity. In Phobos there is an alternative sorting (Schwartzian sorting routime) that applies a function to each item before sorting them, usually is much slower than the normal sorting, but maybe this time it's convenient. The performance improvement in the OP message is large, maybe it's a problem of memory allocations of the converted strings... More info on the original code is needed to give better answers. Bye, bearophile
Re: avoid toLower in std.algorithm.sort compare alias
On Sunday, April 22, 2012 01:24:56 Jay Norwood wrote: > While playing with sorting the unzip archive entries I tried use > of the last example in > http://dlang.org/phobos/std_algorithm.html#sort > > std.algorithm.sort!("toLower(a.name) < > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); > > It was terribly slow for sorting the 34k entries in my test > case. I'd guess it is doing the toLower call on both strings for > every compare. > > It was much, much faster to add an entries.lowerCaseName = > std.string.toLower(entries.name) foreach entry prior to the sort > execution and then use > > std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName > ",std.algorithm.SwapStrategy.stable)(entries); > > The difference was on the order of 10 secs vs no noticeable delay > when executing the sort operation in the debugger. Yeah. toLower would be called on both strings on _every_ compare. And since that involves a loop, that would make the overall call to sort an order of magnitude worse than if you didn't call toLower at all. I'm not sure if it's an order of magnitude worse than your solution though, since it's been a while since I studied Big(O) notation (doing the conversion only once is still more expensive than not converting, but I'm not sure how much more expensive - it might cost less than sort such that it actually doesn't matter as for as Big(O) goes though). - Jonathan M Davis
avoid toLower in std.algorithm.sort compare alias
While playing with sorting the unzip archive entries I tried use of the last example in http://dlang.org/phobos/std_algorithm.html#sort std.algorithm.sort!("toLower(a.name) < toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries); It was terribly slow for sorting the 34k entries in my test case. I'd guess it is doing the toLower call on both strings for every compare. It was much, much faster to add an entries.lowerCaseName = std.string.toLower(entries.name) foreach entry prior to the sort execution and then use std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName ",std.algorithm.SwapStrategy.stable)(entries); The difference was on the order of 10 secs vs no noticeable delay when executing the sort operation in the debugger.