Re: D for the Win
On Sunday, 24 August 2014 at 09:24:55 UTC, Jacob Carlborg wrote:
> On 2014-08-24 10:53, bearophile wrote:
>> Dicebot:
>>> In reddit thread one of commenters complained about D performance and linked this benchmark :
>>
>> That benchmark found a small performance bug in ldc2, that I reported, but I think it's not yet fixed.
>
> The numbers in the benchmark have just been updated. DMD is behind C. GDC is the fastest of all and LDC is ahead of Clang but behind GCC. Seems pretty good to me.

I did some analysis to find out which changes made the difference. Here are my results.

1. Disabling the GC - insignificant
2. Liberal use of `immutable` - insignificant
3. Decorating functions with @trusted, @safe, nothrow, pure - insignificant
4. Using C's random number generator for both D and C - insignificant
5. Using C's floor instead of D's floor - very significant (why?)
6. This change (https://github.com/nsf/pnoise/commit/baadfe20c7ae6aa900cb0e4188aa9d20bea95918) - very significant

Mike
Re: D for the Win
On Sun, 24 Aug 2014 12:51:10 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:

ps. 6. This change (https://github.com/nsf/pnoise/commit/baadfe20c7ae6aa900cb0e4188aa9d20bea95918) with GDC has no effect at all.
Re: D for the Win
On Sunday, 24 August 2014 at 13:13:58 UTC, ketmar via Digitalmars-d-announce wrote:
> On Sun, 24 Aug 2014 12:51:10 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
>
> ps. 6. This change (https://github.com/nsf/pnoise/commit/baadfe20c7ae6aa900cb0e4188aa9d20bea95918) with GDC has no effect at all.

If I undo all of Edmund Smith's changes from today, use C's floor, and remove all the excessive function attributes, I get this:
http://dpaste.dzfl.pl/1b564efb423e

=== gcc -O3: 0.141484117 seconds time
=== D (dmd): 0.446634464 seconds time
=== D (ldc2): 0.191059330 seconds time
=== D (gdc): 0.226455762 seconds time

Then, if I add only change #6 above and remove the excessive function attributes, I get this:
http://dpaste.dzfl.pl/f525adab909c

=== gcc -O3: 0.137815809 seconds time
=== D (dmd): 0.480525196 seconds time
=== D (ldc2): 0.139659135 seconds time
=== D (gdc): 0.131637220 seconds time

Approaching twice as fast for GDC. That's significant to me.

Also, all those optimization flags should already be on with -O3. Here are the flags I'm using:

    gcc -std=c99 -O3 -o bin_test_c_gcc test.c -lm
    dmd -ofbin_test_d_dmd -O -noboundscheck -inline -release test.d
    ldc2 -O3 -ofbin_test_d_ldc test.d -release
    gdc -O3 -o bin_test_d_gdc test.d -frelease

Maybe I'll make a pull request for it. I don't think users should have to decorate their code like a Christmas tree and use a bunch of special compiler flags to get a well-behaved binary.

Mike
Re: D for the Win
Mike:
> Then I add change only #6 above, and remove the excessive function attributes,

> Maybe I'll make a pull request for it. I don't think users should have to decorate their code like a Christmas tree

I don't agree, function attributes are not excessive, they are idiomatic in D.

Bye,
bearophile
Re: D for the Win
On Sun, 24 Aug 2014 13:44:07 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:

hm. for my GDC 4.9.1. git HEAD #6 has no effect at all.

p.s. it's unfair to specify -msse3 -mfpmath=sse for gcc and not for gdc. gdc can use these flags too! (yeah, the effect is great: sse3 variant is ~2.5 times faster).
Re: D for the Win
On Sun, 24 Aug 2014 13:44:07 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:

p.s. what i did is this:

    auto tm = Timer();
    tm.start;
    foreach (_; 0..100) {
      auto n2d = Noise2DContext(0);
      foreach (i; 0..100) {
        foreach (y; 0..256) {
          foreach (x; 0..256) {
            auto v = n2d.get(x * 0.1f, y * 0.1f) * 0.5f + 0.5f;
            pixels[y*256+x] = v;
          }
        }
      }
    }
    tm.stop;
    writeln(tm.toString);

Timer is my simple timer class which uses MonoTime to measure intervals.

this shows ~22 seconds for both variants, with #6 and without #6. and 57 seconds for variants without sse3 flags. ;-)
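For readers who want to reproduce this kind of measurement, here is a minimal sketch of what such a timer could look like. The `Timer` struct below is a hypothetical stand-in built on `core.time.MonoTime`, not ketmar's actual class, and the loop in `main` is placeholder work rather than the pnoise benchmark.

```d
import core.time : Duration, MonoTime;
import std.stdio : writeln;

/// Hypothetical stand-in for the Timer mentioned above (an assumption,
/// not the class used in the post).
struct Timer
{
    private MonoTime begin;   // point on the monotonic clock where start() was called
    private Duration total;   // accumulated time over all start()/stop() pairs

    /// Remember the current monotonic time.
    void start() { begin = MonoTime.currTime; }

    /// Add the time elapsed since the last start() to the running total.
    void stop() { total += MonoTime.currTime - begin; }

    /// Duration formats itself in human-readable units.
    string toString() const { return total.toString(); }
}

void main()
{
    auto tm = Timer();
    tm.start();
    float sum = 0;
    foreach (i; 0 .. 1_000_000)   // placeholder workload
        sum += i * 0.1f;
    tm.stop();
    writeln(tm.toString(), " (sum = ", sum, ")");
}
```

A monotonic clock is the usual choice here because it is not affected by wall-clock adjustments during the run.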
Re: D for the Win
On Sunday, 24 August 2014 at 14:04:22 UTC, ketmar via Digitalmars-d-announce wrote:
> On Sun, 24 Aug 2014 13:44:07 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
>
> p.s. what i did is this:
>
>     auto tm = Timer();
>     tm.start;
>     foreach (_; 0..100) {
>       auto n2d = Noise2DContext(0);
>       foreach (i; 0..100) {
>         foreach (y; 0..256) {
>           foreach (x; 0..256) {
>             auto v = n2d.get(x * 0.1f, y * 0.1f) * 0.5f + 0.5f;
>             pixels[y*256+x] = v;
>           }
>         }
>       }
>     }
>     tm.stop;
>     writeln(tm.toString);
>
> Timer is my simple timer class which uses MonoTime to measure intervals.
>
> this shows ~22 seconds for both variants, with #6 and without #6. and 57 seconds for variants without sse3 flags. ;-)

I'm guessing the dependency is probably due to our configure/build of GDC. I'm using Arch Linux 64's default GDC from their repository. Perhaps it's configured in a way that has these optimizations on by default. It probably should.

Mike
Re: D for the Win
On Sunday, 24 August 2014 at 14:09:03 UTC, Mike wrote:
> I'm guessing the dependency is probably due to our configure/build of GDC. I'm using Arch Linux 64's default GDC from their repository. Perhaps it's configured in a way that has these optimizations on by default. It probably should.

dependency -- discrepancy
Re: Fix #2529: explicit protection package #3651
On Sunday, 24 August 2014 at 02:22:41 UTC, Dicebot wrote:
> Well difference is that internal substring in the fully qualified name that is much more likely to tell user he is better to not touch it. However, original Kagamin proposal of embedding it into module names themselves does address that if you are ok with resulting ugliness.

Both ways it's a convention, and I don't see why such a convention should exist in the first place: member protection has nothing to do with the module name, which reflects the module's functionality.

On Sunday, 24 August 2014 at 02:34:01 UTC, Dicebot wrote:
>> It can be a philosophical matter, but in my experience grouping by functionality is genuine, and access is an afterthought, so grouping by access doesn't work in the long term, as code naturally (spontaneously) groups by functionality.
>
> That does contradict your statement that any stuff used in parent packages must go up the hierarchy which is exactly grouping by access :)

Access there means protection, usage reflects functionality.

> I can probably give a more practical example I have just recently encountered when re-designing one of our internal library packages. It was a serialization package and sub-packages defined different (de)serialization models, some of those defining special wrapper types for resulting data that enforce certain semantics specific to that serialization model via the type system. It is not surprising that most internal details of such wrappers were kept package protected as exposing those to user could have easily violated all assumptions (but was needed to implement (de)serialization efficiently). It has become more complicated when meta-serializers have been added to parent package - templated stuff that took any of sub-package implementations and added some functionality (like versioning support) on top. Being a generic thing, it resides in the higher level serialization package, but to be implemented efficiently it needs access to those internal wrapper fields.
> Moving wrapper modules to parent package is not an option here because those are closely related to specific serialization model. Exposing stuff as public is not an option because anyone not deeply familiar with package implementation can easily break all type system guarantees that way. Moving meta-serializers to sub-packages is quite a code duplication. Right now I keep that stuff public and say in docs "please don't use this", which is hardly a good solution.

1. Making the wrapper protected will preclude writing a new serializer.
2. Using wrapper methods can be meaningless without a serializer.
3. A serializer may just not expose the wrapper; then the user will have no way to access it.
4. .net has quite a lot of things like http://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.configuredtaskawaitable%28v=vs.110%29.aspx and nothing explodes even though .net programmers are believed to be really stupid and evil. It's a virtue of Stackoverflow Driven Development.

On Sunday, 24 August 2014 at 02:39:40 UTC, Dicebot wrote:
>> On Saturday, 23 August 2014 at 09:00:30 UTC, Kagamin wrote:
>> What is difficult to find? With flat structure you have all files right before your eyes. If you need std.datetime.systime module, you open std/datetime/systime.d file - that's the reason of needlessly restricting code structure with modules as if one size fits all.
>
> It is the same reasoning as with deep filesystem hierarchies and, well, any data hierarchies - once the element (module / file) count becomes bigger than ~dozen you only really notice things you know to look for. Contrary to that deeply nested categorized hierarchies are easy to casually search through if you don't know exact module name - just iteratively pick whatever package fits the theme until you find what you want.

I'm afraid hierarchies make things harder to find. If you don't know what is where, a flat organization will present you with everything readily available right before your eyes.
With a deep hierarchy you're left with abstract or poorly chosen categories at every hierarchy level, so effectively you have to walk the entire hierarchy, which is much more tedious than scrolling a flat list of modules, viewing ten modules per scroll. A badly named list of modules (like what we have now in phobos) scales well up to 100; a well-named list is much more manageable: if you need xml, you already know it's near the end of the list - it's easy to find even among 1000 files - you don't ever need to scroll the entire list. If it's not there, where do you go? There's no obvious answer. So even a shallow hierarchy is more troublesome than a flat list of 1000 modules. I don't believe a hierarchy will magically solve navigation problems just because it has folders.

> I remember coding a bit in C#/.NET platform ages ago - it was totally possible to find relevant modules without even looking in docs, just using auto-complete through suggested package
Re: D for the Win
On 24 Aug 2014 14:09, ketmar via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Sun, 24 Aug 2014 12:51:10 +0000 Mike via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
>> 5. Using C's floor instead of D's floor - very significant (why?)
>
> gcc/clang inlines floorf(). gdc generates calls to floor() in both cases, C floor() is just faster. i.e. gdc fails to see that floor() can be converted to intrinsic. the same thing with DMD i believe.

That's because floor isn't an intrinsic. The crippling speed issue was the fact that floor computed and returned at real precision. On recent (sandybridge?) CPUs, it was found that x87 does more ill than good. So I changed it to a template in Phobos (and did some nice tidy-ups in the process). This will be pulled down in the 2.066 merge.

Speed improvements were discussed in the PR and in the original pnoise thread. Though it's very likely that a hand-optimised SSE3 assembly implementation in C's mathlib might still be faster.

Iain.
Re: D for the Win
On Sun, 24 Aug 2014 16:16:43 +0100 Iain Buclaw via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> That's because floor isn't an intrinsic. The crippling speed issue was the fact that floor computed and returned at real precision.

i'm testing on x86, and the difference between 'call floorf' and inlining is significant. gcc inlines floorf() call, and gdc does not. i don't know anything about x86_64 though.
Re: D for the Win
On 24 Aug 2014 16:26, ketmar via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Sun, 24 Aug 2014 16:16:43 +0100 Iain Buclaw via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
>> That's because floor isn't an intrinsic. The crippling speed issue was the fact that floor computed and returned at real precision.
>
> i'm testing on x86, and the difference between 'call floorf' and inlining is significant. gcc inlines floorf() call, and gdc does not.

Inline is not quite correct. Floor is a function recognised by the compiler, so if the backend knows an instruction for it, it will favour that intrinsic over calling an external function.

Iain
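To make the two call paths in this exchange concrete, here is a small sketch that computes the same value once through the C library's `floorf` and once through Phobos's `floor`. The helper names `fracC` and `fracD` are made up for illustration. The sketch shows only that both paths agree on the result; it does not measure speed, and which precision Phobos's `floor` works at for a `float` argument depends on the Phobos release in use.

```d
import core.stdc.math : floorf;   // C library single-precision floor
import std.math : floor;          // Phobos floor
import std.stdio : writeln;

/// Fractional part via C's floorf; the computation stays in float.
/// (Hypothetical helper name, for illustration only.)
float fracC(float x) { return x - floorf(x); }

/// Same computation via Phobos; the explicit cast covers releases where
/// floor returns a wider type than float.
float fracD(float x) { return cast(float)(x - floor(x)); }

void main()
{
    // Inputs chosen to be exactly representable, so both paths agree exactly.
    foreach (x; [2.75f, -0.25f, 5.0f])
        writeln(x, ": ", fracC(x), " ", fracD(x));
}
```

Whether either call turns into a single instruction or an external call is up to the compiler and target flags, which is the point under discussion above; inspecting the generated assembly would be the way to check.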
QTD upgrades to dmd 2.066.
I have made a fork of QTD, the Qt bindings for D, to get it to work with dmd 2.066: https://github.com/remy-j-a-moueza/qtd/tree/dmd-2.066

I have only gotten it to build on Linux (32-bit Ubuntu 14.04), with nearly no testing. I haven't updated the examples either, so lots of them do not compile.

That said, it seems that QTD was quite stable prior to dmd 2.061; I hope that these last upgrades can help it come back from obsolescence.
Re: QTD upgrades to dmd 2.066.
On Sunday, 24 August 2014 at 20:27:59 UTC, Rémy Mouëza wrote:
> I have made a fork of QTD, the Qt bindings for D, to get it to work with dmd 2.066: https://github.com/remy-j-a-moueza/qtd/tree/dmd-2.066
>
> I have only made it to build on Linux (32 bit Ubuntu 14.04) with nearly no testing. I haven't updated the examples either, so lots of them are not compiling.
>
> That said, it seems that QTD was quite stable prior to dmd 2.061; I hope that these last upgrades can help it get back from obsolescence.

Nice work. By the way, there may be a few commits from the repo https://github.com/qtd-developers/qtd that never made it back to Bitbucket (which you apparently forked from) that you might be interested in.