Re: DUB release candidate 0.9.24-rc.3 ready for testing
On Monday, 14 September 2015 at 11:45:13 UTC, Sönke Ludwig wrote: If no regressions show up in this RC, the final release will be made on the upcoming Sunday. The main additions are support for SDLang [1] package recipes [2] and a vastly improved "dub describe". Download: http://code.dlang.org/download Change log: https://github.com/D-Programming-Language/dub/blob/master/CHANGELOG.md [1]: http://sdl.ikayzo.org/display/SDL/Home [2]: http://code.dlang.org/package-format?lang=sdl It would be great if https://github.com/D-Programming-Language/dub/pull/638 were merged; it contains multiple fixes for being able to use LDC. One of the commits there is controversial, but it wouldn't be necessary if DUB didn't pass multiple -march flags to compilers (DMD supports redundant -m32/-m64; LDC doesn't).
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Example output might be useful for you to see as well: 10009.1.1:5.2e-02_13: 16 10014.1.1:2.9e-03_11: 44 10017.1.1:4.1e-02_13: 16 10026.1.1:5.8e-03_12: 27 10027.1.1:6.6e-04_13: 16 10060.1.1:2.7e-03_14: 2 10061.1.1:5.1e-07_13: 41 Worth noting is that it is essentially impossible to predict how many "hits"/records there are for each query; it varies wildly, from 0 to 1000+ in some cases.
DUB release candidate 0.9.24-rc.3 ready for testing
If no regressions show up in this RC, the final release will be made on the upcoming Sunday. The main additions are support for SDLang [1] package recipes [2] and a vastly improved "dub describe". Download: http://code.dlang.org/download Change log: https://github.com/D-Programming-Language/dub/blob/master/CHANGELOG.md [1]: http://sdl.ikayzo.org/display/SDL/Home [2]: http://code.dlang.org/package-format?lang=sdl
Re: DUB release candidate 0.9.24-rc.3 ready for testing
BTW, it's rc.4, not rc.3.
Re: DUB release candidate 0.9.24-rc.3 ready for testing
On 14.09.2015 at 13:59, ponce wrote: It would be great if https://github.com/D-Programming-Language/dub/pull/638 were merged; it contains multiple fixes for being able to use LDC. One of the commits there is controversial, but it wouldn't be necessary if DUB didn't pass multiple -march flags to compilers (DMD supports redundant -m32/-m64; LDC doesn't). It's probably a good idea to leave this for a quick follow-up release; some questions are still open and the fix will require some more thorough testing, which would further delay this release.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:44:22 UTC, Edwin van Leeuwen wrote: Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068). Hi Edwin, thanks for your quick reply! I'm using v2.068.1; I actually got inspired to try this out after skimming the changelog :). Regarding whether it is IO-bound: I actually expected it would be, but both the Python and the D version consume 100% CPU while running, and just copying the file around only takes a few seconds (cf. 15-20 sec runtime for the two programs). There's bound to be some aggressive file caching going on, but I figure that would rather normalize program runtimes at lower times after running them a few times, and I see nothing indicating that.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc that are known to generate faster executables. Andrea s/also/even
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc that are known to generate faster executables. Andrea
Re: Operator overloading or alternatives to expression templates
On Sunday, 13 September 2015 at 17:16:40 UTC, Daniel N wrote: int opCmp(Foo rhs) { return (id > rhs.id) - (id < rhs.id); } IMO, subtracting boolean values is bad code style, it's better to be explicit about your intention: (id > rhs.id ? 1 : 0) - (id < rhs.id ? 1 : 0)
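For anyone following along, here is a small self-contained sketch showing that the two spellings behave identically (the Foo struct mirrors the one quoted above; explicitCmp is a hypothetical free function added only for illustration):

```d
import std.stdio;

struct Foo
{
    int id;

    // Branchless form from the thread: bools implicitly convert to int,
    // so this yields -1, 0, or 1.
    int opCmp(Foo rhs) const
    {
        return (id > rhs.id) - (id < rhs.id);
    }
}

// The more explicit spelling suggested as better style:
int explicitCmp(int a, int b)
{
    return (a > b ? 1 : 0) - (a < b ? 1 : 0);
}

void main()
{
    assert(Foo(1) < Foo(2));
    assert(Foo(3) > Foo(2));
    assert(Foo(2).opCmp(Foo(2)) == 0);
    assert(explicitCmp(1, 2) == -1);
    assert(explicitCmp(3, 2) == 1);
    writeln("both forms agree");
}
```

Any reasonable optimiser should compile both forms to the same code, so the choice really is one of style.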
Speeding up text file parser (BLAST tabular format)
Hi, This is my first post on the Dlang forums and I don't have a lot of experience with D (yet). I mainly code bioinformatics stuff in Python in my day-to-day job, but I've been toying with D for a couple of years now. I had this idea that it'd be fun to write a parser for a text-based tabular data format I tend to read a lot in my programs, but I was a bit stumped that the D implementation I created was slower than my Python version. I tried running `dmd -profile` on it but didn't really understand what I can do to make it go faster. I guess there are some unnecessary dynamic array extensions being made, but I can't figure out how to do without them; maybe someone can help me out? I tried making the examples as small as possible.

Here's the D code: http://dpaste.com/2HP0ZVA Here's my Python code for comparison: http://dpaste.com/0MPBK67

Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192 GB of RAM), the D version runs in about 20 seconds and the Python version in less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O.

The file being parsed is the output of a DNA/protein sequence mapping algorithm called BLAT, but the tabular output format is originally known from the famous BLAST algorithm. Here's a short example of what the input files look like: http://dpaste.com/017N58F The format is TAB-delimited: query, target, percent_identity, alignment_length, mismatches, gaps, query_start, query_end, target_start, target_end, e-value, bitscore. In the example the output is sorted by query, but this cannot be assumed to hold true for all cases. The input file varies in size from several hundred megabytes to several gigabytes (10+ GiB).
A brief explanation of what the code does:

1. Parse each line.
2. Only accept records with percent_identity >= min_identity (90.0) and alignment_length >= min_matches (10).
3. Store all such records as tuples (in the D code this is a struct) in an array in an associative array indexed by 'query'.
4. For each query, remove any records with percent_id more than 5 percentage points below the highest value observed for that query.
5. Write results to stdout (in my real code the data is subject to further downstream processing).

This was all just for me learning to do some basic stuff in D, e.g. file handling, streaming data from disk, etc. I'm really curious what I can do to improve the D code. My original idea was that maybe I should compile the performance-critical parts of my Python codebase to D and call them with PyD or something, but now I'm not so sure anymore. Help and suggestions appreciated!
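For readers without access to the dpaste links, here is a minimal sketch of the steps described above. The Hit fields, variable names, and sample records are assumptions for illustration; they are not the original code:

```d
import std.algorithm : filter, map, max, reduce;
import std.array : array, split;
import std.conv : to;
import std.stdio;

struct Hit { string target; double pid; int length; }

void main()
{
    enum minIdentity = 90.0;
    enum minMatches = 10;
    enum maxPidDiff = 5.0;

    // A few fake BLAST-tabular records (TAB-delimited), in place of stdin.
    auto lines = [
        "q1\tt1\t99.0\t50\t0\t0\t1\t50\t1\t50\t1e-20\t100",
        "q1\tt2\t91.0\t50\t0\t0\t1\t50\t1\t50\t1e-10\t80",
        "q1\tt3\t80.0\t50\t0\t0\t1\t50\t1\t50\t1e-5\t60", // fails identity cutoff
        "q2\tt1\t95.0\t5\t0\t0\t1\t5\t1\t5\t1e-2\t20",    // fails length cutoff
    ];

    // Steps 1-3: parse, filter on the two thresholds, group by query.
    Hit[][string] hitlists;
    foreach (line; lines) // in the real program: File(...).byLine or stdin.byLine
    {
        auto f = line.split("\t");
        auto pid = f[2].to!double;
        auto len = f[3].to!int;
        if (pid >= minIdentity && len >= minMatches)
            hitlists[f[0]] ~= Hit(f[1], pid, len);
    }

    // Steps 4-5: per query, keep hits within maxPidDiff of the best pid.
    foreach (query, hits; hitlists)
    {
        auto best = hits.map!(h => h.pid).reduce!max;
        auto kept = hits.filter!(h => h.pid >= best - maxPidDiff).array;
        writefln("%s: %s", query, kept.length); // prints "q1: 1"
    }
}
```

With the sample data, only q1's 99.0% hit survives: the 91.0% hit is more than 5 points below the best, and q2's only record fails the length cutoff.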
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:50:03 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 12:44:22 UTC, Edwin van Leeuwen wrote: Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068). Hi Edwin, thanks for your quick reply! I'm using v2.068.1; I actually got inspired to try this out after skimming the changelog :). Regarding whether it is IO-bound: I actually expected it would be, but both the Python and the D version consume 100% CPU while running, and just copying the file around only takes a few seconds (cf. 15-20 sec runtime for the two programs). There's bound to be some aggressive file caching going on, but I figure that would rather normalize program runtimes at lower times after running them a few times, and I see nothing indicating that. Two things that you could try: First, hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also, the filter.array.length is quite expensive. You could use count instead. import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff));
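Putting both suggestions together in a runnable sketch (the Hit struct and the numbers are invented for illustration):

```d
import std.algorithm : count;
import std.stdio;

struct Hit { double pid; }

void main()
{
    Hit[][string] hitlists;
    hitlists["q1"] = [Hit(99.0), Hit(95.0), Hit(80.0)];
    hitlists["q2"] = [Hit(91.0)];

    // Iterating key and value together avoids a second AA lookup
    // for every key, which byKey followed by indexing would do.
    foreach (query, hits; hitlists)
    {
        double maxPid = 0;
        foreach (h; hits)
            if (h.pid > maxPid)
                maxPid = h.pid;

        // count walks the range once with no intermediate allocation,
        // unlike filter(...).array.length.
        auto n = hits.count!(h => h.pid >= maxPid - 5.0);
        writefln("%s: %s", query, n);
    }
}
```

For q1 this counts 2 hits (99.0 and 95.0; 80.0 is more than 5 points below the best), without ever building the filtered array.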
Re: Why does reverse also flips my other dynamic array?
Thanks for the clarification.
[Issue 15044] [Reg 2.068.0] destroy might leak memory
https://issues.dlang.org/show_bug.cgi?id=15044 --- Comment #3 from Kenji Hara--- (In reply to Martin Nowak from comment #2) > Any idea how to solve this @Kenji? > We could try to do semantic3 for buildOpAssign later, but then we'd have to > add a special case to op_overload and gag assign usage or so. I think currently there's no other way than to treat the generated opAssign function specially. I'll post a PR soon. --
Re: Adjacent Pairs Range
On Monday, 14 September 2015 at 05:37:05 UTC, Sebastiaan Koppe wrote: What about using zip and a slice? Slicing requires a RandomAccessRange (Array). This is too restrictive. We want to change operations such as adjacentTuples with for example map and reduce without the need for temporary copies of the whole range. This is the thing about D's standard library. Read up on D's range concepts.
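One lazy way to get adjacent pairs without random access, in the spirit of the discussion above, is to zip the range with itself shifted by one element (a sketch; not the poster's actual adjacentTuples implementation):

```d
import std.algorithm : equal, map;
import std.range : dropOne, zip;
import std.stdio;
import std.typecons : tuple;

void main()
{
    auto r = [1, 2, 3, 4];

    // zip(r, r.dropOne) pairs each element with its successor lazily,
    // so no temporary copy of the whole range is made.
    auto pairs = zip(r, r.dropOne);
    assert(pairs.equal([tuple(1, 2), tuple(2, 3), tuple(3, 4)]));

    // It composes with further range operations, e.g. pairwise differences:
    auto diffs = pairs.map!(p => p[1] - p[0]);
    assert(diffs.equal([1, 1, 1]));
    writeln("ok");
}
```

This only needs a forward range (for a non-array range you would zip a .save'd copy), which is exactly why avoiding the slicing requirement matters for composability.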
Re: Adjacent Pairs Range
On Monday, 14 September 2015 at 10:45:52 UTC, Per Nordlöw wrote: restrictive. We want to change operations such as Correction: We want to *chain* operations such as...
Re: Magicport - where it is ?
On 14/09/2015 7:24 PM, Temtaime wrote: Hi ! I wonder if there's a repo with magicport that was used to convert dmd. I have a big library written in C++ and wanna try convert it to D. Or is magicport closed and there's no chance to get it ? Thanks for a reply. The latest version of magicport is in the dmd repo history, right before it was deleted. https://github.com/D-Programming-Language/dmd/tree/last-cdmd Good luck with your conversion!
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: Hi, Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192GB of RAM), the D version runs in about 20 seconds and the Python version less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O. Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068).
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:17:51 UTC, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth. Works for me: http://dpaste.dzfl.pl/a692281f7a80
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:14:18 UTC, John Colvin wrote: what system are you on? What are the error messages you are getting? I really appreciate your will to try to help me out. This is what ldd shows on the latest binary release of LDC on my machine. I'm on a Red Hat Enterprise Linux 6.6 system.

[boulund@terra ~]$ ldd ~/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
linux-vdso.so.1 => (0x7fff623ff000)
libconfig.so.8 => /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/libconfig.so.8 (0x7f7f716e1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7f7f714a3000)
libdl.so.2 => /lib64/libdl.so.2 (0x7f7f7129f000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0032cde0)
libm.so.6 => /lib64/libm.so.6 (0x7f7f7101a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0032cca0)
libc.so.6 => /lib64/libc.so.6 (0x7f7f70c86000)
/lib64/ld-linux-x86-64.so.2 (0x7f7f718ec000)

As you can see it lacks something related to GLIBC, but I'm not sure how to fix that.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:15:25 UTC, Laeeth Isharc wrote: I picked up D to start learning maybe a couple of years ago. I found Ali's book, Andrei's book, github source code (including for Phobos), and asking here to be the best resources. The docs make perfect sense when you have got to a certain level (or perhaps if you have a computer sciencey background), but can be tough before that (though they are getting better). You should definitely take a look at the dlangscience project organized by John Colvin and others. If you like ipython/jupyter also see his pydmagic - write D inline in a notebook. I saw the dlangscience project on GitHub the other day. I've yet to venture deeper. The inlining of D in jupyter notebooks sure is cool, but I'm not sure it's very useful for me, Python feels more succinct for notebook use. Still, I really appreciate the effort put into that, it's really cool! You may find this series of posts interesting too - another bioinformatics guy migrating from Python: http://forum.dlang.org/post/akzdstfiwwzfeoudh...@forum.dlang.org I'll have a look at that series of posts, thanks for the heads-up! Unfortunately I haven't time to read your code, and others will do better. But do you use .reserve() ? Also these are a nice fast container library based on Andrei Alexandrescu's allocator: https://github.com/economicmodeling/containers Not familiar with .reserve(), nor Andrei's allocator library. I'll put that in the stuff-to-read-about-queue for now. :) Thanks for your tips!
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:58:33 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:37:18 UTC, John Colvin wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here Why would -inline be particularly likely to make a big difference in this case? I'm trying to learn, but I don't see what inlining could be done in this case. Range-based code like you are using leads to *huge* numbers of function calls to get anything done. The advantage of inlining is twofold: 1) you don't have to pay the cost of the function call itself and 2) often more optimisation can be done once a function is inlined. Anyway I think it's a good idea to test it against gdc and ldc that are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code. Why is that? I would love to learn to understand why they could be expected to perform much better on this kind of code. Because they are much better at inlining. dmd is quick to compile your code and is most up-to-date, but ldc and gdc will produce somewhat faster code in almost all cases, sometimes dramatically faster.
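To make the inlining point concrete, here is a toy comparison (invented for illustration, not from the thread's code): the range pipeline below expands into many small calls to empty/front/popFront plus the lambdas, which an optimising compiler can flatten into something equivalent to the hand-written loop.

```d
import std.algorithm : filter, map, sum;

// Range-based version: lots of tiny function calls unless inlined.
int pipeline(int[] xs)
{
    return xs.filter!(x => x % 2 == 0).map!(x => x * x).sum;
}

// What a good inliner can effectively reduce it to.
int manual(int[] xs)
{
    int total = 0;
    foreach (x; xs)
        if (x % 2 == 0)
            total += x * x;
    return total;
}

void main()
{
    auto xs = [1, 2, 3, 4, 5, 6];
    assert(pipeline(xs) == manual(xs));
    assert(pipeline(xs) == 56); // 4 + 16 + 36
}
```

The two functions compute the same thing; the performance gap between them on a given compiler is a rough measure of how well that compiler inlines range code.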
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:55:50 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:10:50 UTC, Edwin van Leeuwen wrote: Two things that you could try: First hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also the filter.array.length is quite expensive. You could use count instead. import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff)); I didn't know that hitlists.byKey was that expensive, that's just the kind of feedback I was hoping for. I'm just grasping for straws in the online documentation when I want to do things. With my Python background it feels as if I can still get things that work that way. I picked up D to start learning maybe a couple of years ago. I found Ali's book, Andrei's book, github source code (including for Phobos), and asking here to be the best resources. The docs make perfect sense when you have got to a certain level (or perhaps if you have a computer sciencey background), but can be tough before that (though they are getting better). You should definitely take a look at the dlangscience project organized by John Colvin and others. If you like ipython/jupyter also see his pydmagic - write D inline in a notebook. You may find this series of posts interesting too - another bioinformatics guy migrating from Python: http://forum.dlang.org/post/akzdstfiwwzfeoudh...@forum.dlang.org I realize the filter.array.length thing is indeed expensive. I find it especially horrendous that the code I've written needs to allocate a big dynamic array that will most likely be cut down quite drastically in this step. Unfortunately I haven't figured out a good way to do this without storing the intermediary results since I cannot know if there might be yet another hit for any encountered "query" since the input file might not be sorted. 
But the main reason I didn't just count the values like you suggest is actually that I need the filtered hits in later downstream analysis. The filtered hits for each query are used as input to a lowest common ancestor algorithm on the taxonomic tree (of life). Unfortunately I haven't time to read your code, and others will do better. But do you use .reserve() ? Also these are a nice fast container library based on Andrei Alexandrescu's allocator: https://github.com/economicmodeling/containers
Re: shared array?
On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: Personally, when I make a strong claim about something and find that I am wrong (the claim that D needs to scan every pointer), I take a step back and consider my view rather than pressing harder. It's beautiful to be wrong because through recognition of error, growth. If recognition. The claim is correct: you need to follow every pointer that through some indirection may lead to a pointer that may point into the GC heap. Not doing so will lead to unverified memory unsafety. Given one was written by one (very smart) student for his PhD thesis, and that as I understand it that formed the basis of Sociomantic's concurrent garbage collector (correct me if I am wrong), and that this is being ported to D2, and whether or not it is released, success will spur others to follow - it strikes As it has been described, it is fork() based and unsuitable for the typical use case. provided one understands the situation. Poking holes at things without taking any positive steps to fix them is understandable for people that haven't a choice about their situation, but in my experience is rarely effective in making the world better. Glossing over issues that needs attention is not a good idea. It wastes other people's time. I am building my own libraries, also for memory management with move semantics etc.
chaining chain Result and underlying object of chain
chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:17:51 UTC, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth. std.algorithm.iteration.joiner?
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:28:41 UTC, John Colvin wrote: Yup, glibc is too old for those binaries. What does "ldd --version" say? It says "ldd (GNU libc) 2.12". Hmm... The most recent version in RHEL's repo is "2.12-1.166.el6_7.1", which is what is installed. Can this be side-loaded without too much hassle and manual effort?
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 02:34:41PM +, Fredrik Boulund via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 14:18:58 UTC, John Colvin wrote: > >Range-based code like you are using leads to *huge* numbers of > >function calls to get anything done. The advantage of inlining is > >twofold: 1) you don't have to pay the cost of the function call > >itself and 2) often more optimisation can be done once a function is > >inlined. > > Thanks for that explanation! Now that you mention it it makes perfect > sense. I never considered it, but of course *huge* numbers of > function calls to e.g. next() and other range-methods will be made. > > >Because there are much better at inlining. dmd is quick to compile > >your code and is most up-to-date, but ldc and gdc will produce > >somewhat faster code in almost all cases, sometimes very dramatically > >much faster. > > Sure sounds like I could have more fun with LDC and GDC on my system > in addition to DMD :). If performance is a problem, the first thing I'd recommend is to use a profiler to find out where the hotspots are. (More often than not, I have found that the hotspots are not where I expected them to be; sometimes a 1-line change to an unanticipated hotspot can result in a huge performance boost.) The next thing I'd try is to use gdc instead of dmd. ;-) IME, code produced by `gdc -O3` is at least 20-30% faster than code produced by `dmd -O -inline`. Sometimes the difference can be up to 40-50%, depending on the kind of code you're compiling. T -- Lottery: tax on the stupid. -- Slashdotter
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc that are known to generate faster executables. Andrea Thanks for the suggestions! I'm not too familiar with compiled languages like this; I've mainly written small programs in D and run them via `rdmd` in a scripting-language fashion. I'll read up on what the different compile flags do (I knew about -O, but I'm not sure what the others do). Unfortunately I cannot get LDC working on my system. It seems to fail finding some shared library when I download the binary release, and I can't figure out how to make it compile. I haven't really given GDC a try yet. I'll see what I can do. Running the original D code I posted before with the flags you suggested reduced the runtime by about 2 seconds on average.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:37:18 UTC, John Colvin wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here Why would -inline be particularly likely to make a big difference in this case? I'm trying to learn, but I don't see what inlining could be done in this case. Anyway I think it's a good idea to test it against gdc and ldc that are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code. Why is that? I would love to learn to understand why they could be expected to perform much better on this kind of code.
Re: shared array?
On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote: Only the stack and the GC heap get scanned unless you tell the GC about memory that was allocated by malloc or some other mechanism. malloced memory won't be scanned by default. So, if you're using the GC minimally and coding in a way that doesn't involve needing to tell the GC about a bunch of malloced memory, then the GC won't have all that much to scan. And while the amount of memory that the GC has to scan does affect the speed of a collection, in general, the less memory that's been allocated by the GC, the faster a collection is. Idiomatic D code uses the stack heavily and allocates very little on the GC heap. ... So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get. - Jonathan M Davis Thank you for your posts on this (and the others), Jonathan. I appreciate your taking the time to write so carefully and thoroughly, and I learn a lot from reading your work.
Re: how do I check if a member of a T has a member ?
On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:09:57 UTC, wobbles wrote: Use __traits(compiles, date.second)? Thanks. This works: static if (__traits(compiles, { T bar; bar.date.hour;})) pragma(msg,"hour"); else pragma(msg,"nohour"); can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. Someone ought to write a tutorial showing how to use the good stuff we have to solve real problems. E.g. an annotated baby-steps version of Andrei's allocator talk. I can't do it as too much on my plate.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:50:22 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: [...] Thanks for the suggestions! I'm not too familiar with compiled languages like this; I've mainly written small programs in D and run them via `rdmd` in a scripting-language fashion. I'll read up on what the different compile flags do (I knew about -O, but I'm not sure what the others do). Unfortunately I cannot get LDC working on my system. It seems to fail finding some shared library when I download the binary release, and I can't figure out how to make it compile. I haven't really given GDC a try yet. I'll see what I can do. Running the original D code I posted before with the flags you suggested reduced the runtime by about 2 seconds on average. what system are you on? What are the error messages you are getting?
Re: std.allocator ready for some abuse
On Saturday, 12 September 2015 at 06:45:16 UTC, bitwise wrote: [...] Alternatively, GC.addRange() could return a value indicating whether or not the range had actually been added (for the first time) and should be removed. Bit Maybe the solution is as simple as specifying the state of memory that should be received from an untyped allocator. It could be explicitly stated that an untyped allocator should give out raw memory, and should not initialize its contents or add any ranges to the GC. While it may seem obvious for a C++ allocator not to initialize its contents, I think this makes sense in the presence of a GC. I would appreciate some feedback on this. Bit
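For context, a minimal sketch of what GC.addRange does for memory obtained from an untyped (here: malloc-based) allocator; the sizes and the stored pointer are invented for illustration:

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

void main()
{
    enum n = 16;
    auto p = cast(void**) malloc(n * (void*).sizeof);
    scope (exit) free(p);

    // Tell the GC this malloc'd block may contain pointers into its heap.
    // Without this, an object referenced only from *p could be collected
    // while still in use.
    GC.addRange(p, n * (void*).sizeof);
    scope (exit) GC.removeRange(p); // runs before the free above (LIFO)

    p[0] = cast(void*) new int(42); // GC-allocated, kept alive via the range
}
```

This is exactly the bookkeeping the post argues should be the container's responsibility rather than the untyped allocator's, since only the container knows whether the memory will hold pointers.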
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right?
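A self-contained sketch of the suggested fix (the nested Date/Bar/T structs are invented to stand in for Laeeth's types): hasMember takes a *type* as its first argument, so a nested member needs typeof().

```d
import std.stdio;
import std.traits : hasMember;

struct Date { int hour; }
struct Bar  { Date date; }
struct T    { Bar bar; }

void main()
{
    // hasMember wants a type, hence typeof() around the member chain:
    static assert(hasMember!(typeof(T.bar.date), "hour"));
    static assert(!hasMember!(typeof(T.bar.date), "lightyear"));

    // The __traits(compiles, ...) approach used earlier in the thread
    // also works, just less declaratively:
    static assert(__traits(compiles, { T t; auto h = t.bar.date.hour; }));
    writeln("ok");
}
```

All the checks run at compile time, so the program compiles only if the member exists.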
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:25:04 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:14:18 UTC, John Colvin wrote: what system are you on? What are the error messages you are getting? I really appreciate your will to try to help me out. This is what ldd shows on the latest binary release of LDC on my machine. I'm on a Red Hat Enterprise Linux 6.6 system.

[boulund@terra ~]$ ldd ~/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
linux-vdso.so.1 => (0x7fff623ff000)
libconfig.so.8 => /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/libconfig.so.8 (0x7f7f716e1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7f7f714a3000)
libdl.so.2 => /lib64/libdl.so.2 (0x7f7f7129f000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0032cde0)
libm.so.6 => /lib64/libm.so.6 (0x7f7f7101a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0032cca0)
libc.so.6 => /lib64/libc.so.6 (0x7f7f70c86000)
/lib64/ld-linux-x86-64.so.2 (0x7f7f718ec000)

As you can see it lacks something related to GLIBC, but I'm not sure how to fix that. Yup, glibc is too old for those binaries. What does "ldd --version" say?
Re: DUB release candidate 0.9.24-rc.3 ready for testing
On 09/14/2015 07:45 AM, Sönke Ludwig wrote: If no regressions show up in this RC, the final release will be made on the upcoming Sunday. The main additions are support for SDLang [1] package recipes [2] and a vastly improved "dub describe". This one really should be included so it doesn't end up as a breaking change later on: https://github.com/D-Programming-Language/dub/pull/644 And not having this one would be problematic for me, although you mentioned a possible quick follow-up release, which I guess would be fine if really, really necessary: https://github.com/D-Programming-Language/dub/pull/633
Re: chaining chain Result and underlying object of chain
On Monday 14 September 2015 16:17, Laeeth Isharc wrote: > chain doesn't seem to compile if I try and chain a chain of two > strings and another string. > > what should I use instead? Please show code, always. A simple test works for me: import std.algorithm: equal; import std.range: chain; void main() { auto chain1 = chain("foo", "bar"); auto chain2 = chain(chain1, "baz"); assert(equal(chain2, "foobarbaz")); }
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:40:29 UTC, H. S. Teoh wrote: If performance is a problem, the first thing I'd recommend is to use a profiler to find out where the hotspots are. (More often than not, I have found that the hotspots are not where I expected them to be; sometimes a 1-line change to an unanticipated hotspot can result in a huge performance boost.) I agree with you on that. I used Python's cProfile module to find the performance bottleneck in the Python version I posted, and shaved 8-10 seconds off the runtime by removing an extraneous str.split() I had missed. I tried using the built-in profiler in DMD on the D program but to no avail. I couldn't really make any sense of the output, other than that there were enormous amounts of calls to lots of functions I couldn't find a way to remove from the code. Here's a paste of the trace output from the version I posted in the original post: http://dpaste.com/1AXPK9P The next thing I'd try is to use gdc instead of dmd. ;-) IME, code produced by `gdc -O3` is at least 20-30% faster than code produced by `dmd -O -inline`. Sometimes the difference can be up to 40-50%, depending on the kind of code you're compiling. Yes, it really seems that gdc or ldc is the way to go.
Re: FancyPars
Nice one, thanks for the info. I just used Pegged to generate an API for a JSON REST service at compile time so I'm still geeking out about the power of D at compile time, but I'm always interested in parsers. On Mon, Sep 14, 2015 at 10:50 AM, Bastiaan Veelo via Digitalmars-d-announce wrote: > On Monday, 6 July 2015 at 09:22:51 UTC, Per Nordlöw wrote: > >> >> How does its design and use differ from Pegged? >> > > FWIW, this is what I learned from my first acquaintance with FancyPars > (the OP having signalled not to be available for questions). My conclusions > may be wrong though. > > Running dub produces a vibe.d web server demonstrating the capabilities of > FancyPars. This was a bit confusing at first because being a web-app seemed > central to the design of FancyPars, but I think it is not. Anyway, the > first page shows a large edit field containing an example grammar, and a > button "Generate AST". Clicking this button brings up the second page > containing D code for the lexer and parser for the given grammar, type > definitions for the nodes of the AST, as well as code for printing the AST. > > Understanding the source of FancyPars is challenging because the core > source, example vibe.d application source and supporting code, as well as > generated lexer/parser code are all contained in the same directory and > committed in the repository. > > The syntax for the grammar definition is different from Pegged, and seems > to be inspired by D. It supports a hierarchical structure. It looks > powerful, but is undocumented. The example grammar looks like this: > > ASTNode { > Identifier @internal { > [a-zA-Z_][] identifier > } > > Group @parent { > Identifier name, ? 
"@" : Identifier[] annotations : "@", "{", > PatternElement[] elements : "," / Group[] groups, > "}" > } > > PatternElement @internal { > > AlternativeElement @noFirst { > PatternElement[] alternatives : "/" > } > > LexerElement { > > StringElement { > "\"", char[] string_, "\"" > } > > NamedChar { > "char", ? "[]" : bool isArray, Identifier name > } > > CharRange @internal { > char rangeBegin, ? "-" : char RangeEnd > } > > RangeElement { > "[", CharRange[] ranges, "]" > } > > LookbehindElement { > "?lb", "(", StringElement str, ")" > } > > NotElement { > "!", LexerElement ce > } > > } > > NamedElement { > Identifier type, ? "[]" : bool isArray, Identifier name, > ? bool isArray : ? ":" : StringElement lst_sep > } > > ParenElement { > "(", PatternElement[] elements : ",", ")" > } > > FlagElement { > "bool", Identifier flag_name > } > > QueryElement { > "?", "bool", Identifier flag_name, ":", PatternElement elem > } > > OptionalElement { > "?", LexerElement[] ce : ",", ":", PatternElement elem > } > > } > } > > > Its announced support for left-recursion is interesting, and I may decide > to play a bit further with it. My objective would be to see if an Extended > Pascal to D translating compiler would be feasible. > > Cheers, > Bastiaan Veelo. >
Re: Operator overloading or alternatives to expression templates
On Friday, 11 September 2015 at 19:41:41 UTC, Martin Nowak wrote: Does anyone have a different idea how to make a nice query language? db.get!Person.where!(p => p.age > 21 && p.name == "Peter") In our last project we took the following approach: `auto q = query.builder!Person.age!">"(20).name("Peter");` In the query-builders there was a lot of ugly concatenating sql stuff. But there was just one guy writing that, everybody else was using the high-level functions. It allows for a lot of checking, not only on types but also on logic (e.g. age!">"(20).age!"<"(10) was caught at runtime). Plus, very unittestable. It worked so well that once the query-builders were written, all the constraints, sorting and ordering parameters could be passed straight from the http api down to the query-builders. Still, you would have to write the query-builders for each object you want to query, although that is something you could do in a DSL.
Re: shared array?
On Monday, 14 September 2015 at 08:57:07 UTC, Ola Fosheim Grøstad wrote: On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote: So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get. What D needs is some way for a static analyzer to be certain that a pointer does not point to a specific GC heap. And that means language changes... one way or the other. Without language changes it becomes very difficult to reduce the amount of memory scanned without sacrificing memory safety. Personally, when I make a strong claim about something and find that I am wrong (the claim that D needs to scan every pointer), I take a step back and consider my view rather than pressing harder. It's beautiful to be wrong because through recognition of error, growth. If recognition. And I don't think a concurrent GC is realistic given the complexity and performance penalties. The same people who complain about GC would not accept performance hits on pointer-writes. That would essentially make D and Go too similar IMO. Given one was written by one (very smart) student for his PhD thesis, and that as I understand it that formed the basis of Sociomantic's concurrent garbage collector (correct me if I am wrong), and that this is being ported to D2, and whether or not it is released, success will spur others to follow - it strikes me as a problematic claim to make that developing one isn't realistic unless one is deeply embedded in the nitty gritty of the problem (because theory and practice are more different in practice than they are in theory!) There is etcimon's work too (at research stage). Don't underestimate too how future corporate support combined with an organically growing community may change what's possible. Andy Smith gave his talk based on his experience at one of the largest and well-run hedge funds. 
An associate who sold a decent-sized marketing group got in contact to thank me for posting links on D as it helped him implement a machine-learning problem better. And if I look at what's in front of me, I really am not aware of a better solution to the needs I have, which I am pretty sure are needs that are more generally shared - corporate inertia may be a nuisance but it is also a source of opportunity for others. In response to your message earlier where you suggested that Sociomantic was an edge case of little relevance for the rest of us: I made that point in response to the claim that D had no place for such purposes. It's true that being able to do something doesn't mean it is a good idea, but really having seen them speak and looked at the people they hire, I really would be surprised if they do not know what they are doing. (I would say the same if they had never been bought). And they say that using D has significantly lowered their costs compared to their competitors. It's what I have been finding, too, dealing with data sets that are for now by no means 'big' but will be soon enough. It's also a human group phenomenon that it's very difficult to do something for the first time, and the more people that follow, the easier it is for others. So the edge case of yesteryear shall be the best practice of the future. One sees this also with allocators, where Andrei's library is already beginning to be integrated in different projects. I had never even heard of D two years ago and had taken close to a twenty-year break from doing a lot of programming. But it wasn't difficult to pick up and use effectively. Clearly, latency and performance hits are different things, and the category of people who care about performance is only a partial intersection of those who care about latency. 
Part of what I do involves applying the principle of contrarian thinking, and I can say that it is very useful, and not just in the investment world: http://www.amazon.com/The-Contrary-Thinking-Humphrey-Neill/dp/087004110X On the other hand, there is also the phenomenon of just being contrary. One sometimes has the impression that some people like to argue for the sake of it. Nothing wrong with that, provided one understands the situation. Poking holes at things without taking any positive steps to fix them is understandable for people that haven't a choice about their situation, but in my experience is rarely effective in making the world better.
Re: shared array?
On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: An associate who sold a decent sized marketing group Should read marketmaking. Making prices in listed equity options.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:10:50 UTC, Edwin van Leeuwen wrote: Two things that you could try: First hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also the filter.array.length is quite expensive. You could use count instead. import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff)); I didn't know that hitlists.byKey was that expensive, that's just the kind of feedback I was hoping for. I'm just grasping for straws in the online documentation when I want to do things. With my Python background it feels as if I can still get things that work that way. I realize the filter.array.length thing is indeed expensive. I find it especially horrendous that the code I've written needs to allocate a big dynamic array that will most likely be cut down quite drastically in this step. Unfortunately I haven't figured out a good way to do this without storing the intermediary results, since I cannot know whether there might be yet another hit for any encountered "query"; the input file might not be sorted. But the main reason I didn't just count the values like you suggest is actually that I need the filtered hits in later downstream analysis. The filtered hits for each query are used as input to a lowest common ancestor algorithm on the taxonomic tree (of life).
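Edwin's two suggestions combined look roughly like this. Note this is only a sketch: the Hit struct and the threshold values are hypothetical stand-ins for the types in the linked dpaste, and count avoids allocating the intermediate array that filter.array.length builds:

```d
import std.algorithm : count;

struct Hit { double pid; }  // hypothetical stand-in for the thread's Hit struct

void main()
{
    double max_pid = 99.0, max_pid_diff = 5.0;
    Hit[][string] hitlists;
    hitlists["10009.1.1"] = [Hit(99.0), Hit(97.5), Hit(90.0)];

    // iterating key/value pairs directly avoids the extra AA lookup that
    // the byKey + index pattern incurs
    foreach (query, ref hits; hitlists)
    {
        // counts qualifying hits without allocating an intermediate array
        auto n = hits.count!(h => h.pid >= (max_pid - max_pid_diff));
        assert(n == 2);  // 99.0 and 97.5 pass the 94.0 cutoff; 90.0 does not
    }
}
```

As the post above notes, this only helps when the count itself is the goal; keeping the filtered hits for downstream analysis still requires storing them somewhere.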
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:18:58 UTC, John Colvin wrote: Range-based code like you are using leads to *huge* numbers of function calls to get anything done. The advantage of inlining is twofold: 1) you don't have to pay the cost of the function call itself and 2) often more optimisation can be done once a function is inlined. Thanks for that explanation! Now that you mention it, it makes perfect sense. I never considered it, but of course *huge* numbers of function calls to e.g. popFront() and other range methods will be made. Because they are much better at inlining. dmd is quick to compile your code and is most up-to-date, but ldc and gdc will produce somewhat faster code in almost all cases, sometimes dramatically faster. Sure sounds like I could have more fun with LDC and GDC on my system in addition to DMD :).
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 09:09:27 UTC, Per Nordlöw wrote: Is there a reason why such a common thing isn't already in Phobos? If not what about adding it to std.typecons : asTuple I guess nobody's really needed that functionality before. It might be an interesting addition to std.array.
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 08:56:43 UTC, Per Nordlöw wrote: BTW: What about .tupleof? Isn't that what should be used here? I don't believe .tupleof works for arrays.
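Lacking .tupleof for arrays, a mixin-based helper can expand a static array's elements into separate function arguments at compile time. This is only a sketch of one possible approach, not an existing Phobos API; argList and apply are hypothetical names:

```d
import std.conv : to;

// builds the string "arr[0], arr[1], ..." at compile time
private string argList(size_t N)()
{
    string s;
    foreach (i; 0 .. N)
        s ~= (i ? ", " : "") ~ "arr[" ~ i.to!string ~ "]";
    return s;
}

// calls fun with each element of the static array as a separate argument
auto apply(alias fun, T, size_t N)(T[N] arr)
{
    mixin("return fun(" ~ argList!N() ~ ");");
}

int sum3(int a, int b, int c) { return a + b + c; }

void main()
{
    int[3] xs = [1, 2, 3];
    assert(apply!sum3(xs) == 6);
}
```

The array length N is deduced by IFTI from the static array argument, so call sites stay clean.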
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here. Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code.
Re: Operator overloading or alternatives to expression templates
On Friday, 11 September 2015 at 19:41:41 UTC, Martin Nowak wrote: AFAIK expression templates are the primary choice to implement SIMD and matrix libraries. And I still have [this idea](http://dpaste.dzfl.pl/cd375ac594cf) of implementing a nice query language for ORMs. While expression templates are used in many matrix libraries, there has also been some move away from them. Blaze uses so-called smart expression templates because of what they view as performance limitations of expression templates. I think the general idea is to reduce the creation of temporaries and to order the calculations so as to minimize the size of the problem (like in the multiplication A*B*c, doing A*(B*c) instead of (A*B)*c) http://arxiv.org/pdf/1104.1729.pdf At least the comparison operators are really limiting, e.g. it's not possible to efficiently implement logical indexing. vec[vec < 15], vec[vec == 15], vec[(vec != 15) & (vec > 10)] I love the logical indexing.
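For reference, since opCmp and the comparison operators must return a scalar, vec[vec < 15] has no direct D equivalent; the closest range-based substitute (a sketch of the general idiom, not a claim about any particular matrix library) is filter, though it yields a lazy range rather than a NumPy/Matlab-style boolean mask:

```d
import std.algorithm : equal, filter;

void main()
{
    auto vec = [3, 20, 15, 7, 42];

    // vec[vec < 15]
    assert(vec.filter!(x => x < 15).equal([3, 7]));

    // vec[(vec != 15) & (vec > 10)]
    assert(vec.filter!(x => x != 15 && x > 10).equal([20, 42]));
}
```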
Re: Creating a DLL with an ActiveX interface.
On Monday, 14 September 2015 at 15:20:50 UTC, Adam D. Ruppe wrote: On Monday, 14 September 2015 at 15:14:05 UTC, Taylor Hillegeist wrote: Gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? I think the doc copy/pasted a typo there. It should be `core.sys.windows.com`. I've done some COM stuff with D before, getting it callable from vbscript and jscript. Can you tell me what steps you're using (in some detail, like are you using IE? or some other thing?) to test your thing? then I can try to make a working example that passes it and share that. So, actually I am using NI LabVIEW to interact with my DLL. I imagine even getting hold of that would be troublesome or expensive. But I'm pretty savvy on that end. But for me it's more about how to expose a COM interface to the rest of the system through a DLL.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:31:33 UTC, anonymous wrote: On Monday 14 September 2015 16:17, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Please show code, always. A simple test works for me: import std.algorithm: equal; import std.range: chain; void main() { auto chain1 = chain("foo", "bar"); auto chain2 = chain(chain1, "baz"); assert(equal(chain2, "foobarbaz")); } Sorry - was exhausted yesterday when I had the code there, or would have posted. I was trying to use the same variable eg auto chain1 = chain("foo", "bar"); chain1 = chain(chain1, "baz"); Realized that in this case it was much simpler just to use the delegate version of toString and sink (which I had forgotten about). But I wondered what to do in other cases. It may be that the type of chain1 and chain2 don't mix.
Re: chaining chain Result and underlying object of chain
On Monday 14 September 2015 17:01, Laeeth Isharc wrote: >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); > > Realized that in this case it was much simpler just to use the > delegate version of toString and sink (which I had forgotten > about). But I wondered what to do in other cases. It may be > that the type of chain1 and chain2 don't mix. Yes, the types don't match. The result types of most range functions depend on the argument types. Let's say chain("foo", "bar") has the type ChainResult!(string, string). Then chain(chain("foo", "bar"), "baz") has the type ChainResult!(ChainResult!(string, string), string). Those are different and not compatible. You can get the same type by: a) being eager: import std.array: array; auto chain1 = chain("foo", "bar").array; chain1 = chain(chain1, "baz").array; (At that point you could of course just work with the strings directly, using ~ and ~=.) b) being classy: import std.range.interfaces; InputRange!dchar chain1 = inputRangeObject(chain("foo", "bar")); chain1 = inputRangeObject(chain(chain1, "baz")); Those have performance implications, of course. Being eager means allocating the whole thing, and possibly intermediate results. Being classy means allocating objects for the ranges (could possibly put them on the stack), and it means indirections.
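Option (b), assembled into a self-contained example for anyone following along:

```d
import std.algorithm : equal;
import std.range : chain;
import std.range.interfaces : InputRange, inputRangeObject;

void main()
{
    // the class-based wrapper erases the concrete Result type,
    // so reassigning the same variable type-checks
    InputRange!dchar r = inputRangeObject(chain("foo", "bar"));
    r = inputRangeObject(chain(r, "baz"));
    assert(equal(r, "foobarbaz"));
}
```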
Re: Creating a DLL with an ActiveX interface.
On Monday, 14 September 2015 at 15:14:05 UTC, Taylor Hillegeist wrote: Gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? I think the doc copy/pasted a typo there. It should be `core.sys.windows.com`. I've done some COM stuff with D before, getting it callable from vbscript and jscript. Can you tell me what steps you're using (in some detail, like are you using IE? or some other thing?) to test your thing? then I can try to make a working example that passes it and share that.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 15:30:14 UTC, Ali Çehreli wrote: On 09/14/2015 08:01 AM, Laeeth Isharc wrote: > I was trying to use the same variable eg > >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); [...] > It may be that the type of chain1 > and chain2 don't mix. Exactly. I was going to recommend using pragma(msg, typeof(chain1)) to see what they are but it looks like chain()'s return type is not templatized. (?) pragma(msg, typeof(chain1)); pragma(msg, typeof(chain2)); Prints Result Result instead of something like (hypothetical) ChainResult!(string, string) ChainResult!(ChainResult!(string, string), string) Ali It is templated, but by means of its enclosing function being templated, which doesn't end up in the name.
Re: chaining chain Result and underlying object of chain
On 09/14/2015 08:01 AM, Laeeth Isharc wrote: > I was trying to use the same variable eg > >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); [...] > It may be that the type of chain1 > and chain2 don't mix. Exactly. I was going to recommend using pragma(msg, typeof(chain1)) to see what they are but it looks like chain()'s return type is not templatized. (?) pragma(msg, typeof(chain1)); pragma(msg, typeof(chain2)); Prints Result Result instead of something like (hypothetical) ChainResult!(string, string) ChainResult!(ChainResult!(string, string), string) Ali
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 14:21:12 UTC, John Colvin wrote: On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right? Ahh. Probably that was why (I will check it shortly). Why do I need to do a typeof? What kind of thing is T.bar.date before the typeof given that T is a type?
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:35:26 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:28:41 UTC, John Colvin wrote: Yup, glibc is too old for those binaries. What does "ldd --version" say? It says "ldd (GNU libc) 2.12". Hmm... The most recent version in RHEL's repo is "2.12-1.166.el6_7.1", which is what is installed. Can this be side-loaded without too much hassle and manual effort? I've had nothing but trouble when using different versions of libc. It would be easier to do this instead: http://wiki.dlang.org/Building_LDC_from_source I'm running a build of LDC git HEAD right now on an old server with 2.11, I'll upload the result somewhere once it's done if it might be useful
Re: Operator overloading or alternatives to expression templates
On 09/14/2015 03:47 PM, Sebastiaan Koppe wrote: > > In our last project we took the following approach: > > `auto q = query.builder!Person.age!">"(20).name("Peter");` Interesting idea.
Creating a DLL with an ActiveX interface.
So, I've looked at this topic of COM, OLE, and ActiveX, and found myself confused. http://dlang.org/interface.html gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? And I can't find any documentation on core\stdc on the standard library page. http://wiki.dlang.org/Win32_DLLs_in_D points to "The Sample Code" under COM, but I find that confusing. So here is what I desire; you guys can let me know how dumb it is. I want a DLL with an ActiveX interface that contains a small bit of data, a string for example. And this is what I want to happen: (caller sends ActiveX object with string) -> (my DLL written in D manipulates string) -> (caller gets a different string) I call on the wisdom of the community to help me in this. Thanks!
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 15:04:00 UTC, Laeeth Isharc wrote: On Monday, 14 September 2015 at 14:21:12 UTC, John Colvin wrote: On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right? Ahh. Probably that was why (I will check it shortly). Why do I need to do a typeof? What kind of thing is T.bar.date before the typeof given that T is a type? T.bar.date is just a symbol. If you tried to actually access it then it would have to be a compile-time construct or be a static member/method, but it's perfectly OK to ask what type it has or what size it has. The simple story: hasMember takes a type as its first argument. T.bar.date isn't a type, it's a member of a member of a type. To find out what type it is, use typeof.
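A minimal illustration of the symbol-vs-type distinction, with hypothetical structs standing in for the T from the thread:

```d
import std.traits : hasMember;
import std.datetime : DateTime;

struct Bar { DateTime date; }  // hypothetical, mirrors the thread's T.bar.date
struct T { Bar bar; }

void main()
{
    // T.bar.date is a symbol; typeof turns it into the type DateTime,
    // which is what hasMember's first argument must be
    static assert(hasMember!(typeof(T.bar.date), "hour"));      // DateTime has .hour
    static assert(!hasMember!(typeof(T.bar.date), "lightyear"));
}
```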
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:54:34 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:40:29 UTC, H. S. Teoh wrote: I agree with you on that. I used Python's cProfile module to find the performance bottleneck in the Python version I posted, and shaved off 8-10 seconds of runtime on an extraneous str.split() I had missed. I tried using the built-in profiler in DMD on the D program but to no avail. I couldn't really make any sense of the output other than that were enormous amounts of calls to lots of functions I couldn't find a way to remove from the code. Here's a paste of the trace output from the version I posted in the original post: http://dpaste.com/1AXPK9P See this link for clarification on what the columns/numbers in the profile file mean http://forum.dlang.org/post/f9gjmo$2gce$1...@digitalmars.com It is still difficult to parse though. I myself often use sysprof (only available on linux), which automatically ranks by time spent.
Canvas in Gtk connected to D?
Is there a way to do a canvas in GTK3 so that I can use chart.js, and connect this to D? See, in something similar, a guy named Julien Wintz figured out that Qt's QQuickWidget acts much like the webkit Canvas object, and thus was able to port chart.js to that widget. This allows one to use Qt + QQuickWidget + D (or any Qt-supported language for that matter) to draw charts using Javascript, using the chart.js documentation. What's also fascinating about this is that it's fairly lightweight -- Julien's solution doesn't use Chromium (or another webkit implementation) to make it work. (It should be noted, however, that QQuickWidget uses OpenGL.) Likewise, it would be great if I could do something similar in GTK3. See, I like D, and I'm getting somewhere with it with GTK3, but doing static charts like I see with chart.js is important for my use of this language.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: On 15/09/15 12:30 AM, Fredrik Boulund wrote: [...] A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 1) You don't need to convert char[] to string via to. No. Too much. Cast it. Not a good idea in general. Much better to ask for a string in the first place by using byLine!(immutable(char), immutable(char)). Alternatively just use char[] throughout.
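A sketch of the safe alternatives to casting byLine's buffer: .idup copies out of the reused buffer, and byLineCopy hands out fresh strings directly (the file name here is a throwaway stand-in):

```d
import std.file : remove, write;
import std.stdio : File;

void main()
{
    write("tmp_demo.txt", "alpha\nbeta\n");  // tiny stand-in input file
    scope (exit) remove("tmp_demo.txt");

    string[] lines;
    foreach (line; File("tmp_demo.txt").byLine())
        lines ~= line.idup;  // copy: byLine reuses its internal char[] buffer

    // alternatively, byLineCopy yields fresh strings and skips the idup:
    // foreach (line; File("tmp_demo.txt").byLineCopy()) lines ~= line;

    assert(lines == ["alpha", "beta"]);
}
```

Casting the buffer to string instead would leave every stored "string" aliasing memory that the next iteration overwrites.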
Re: Creating a DLL with an ActiveX interface.
On Monday, 14 September 2015 at 15:44:36 UTC, Taylor Hillegeist wrote: So, Actually I am using NI LabVIEW to interact with my DLL. I imagine even getting hold of of that would troublesome or expensive. Ah, all right. Here's a SO thing (followed up by email then copy/pasted there) I did for someone else with a hello COM dll: http://stackoverflow.com/questions/19937521/what-are-options-to-communicate-between-vb6-com-server-on-windows-xp-and-python The main thing is dealing with a bug in Windows XP. If you are on Vista or above, you can skip that and hopefully just look at the example zip here: http://arsdnet.net/dcode/com.zip That's almost two years old now but should still basically work (I will try next time I have a free hour on Windows too) and hopefully get you started.
[Issue 15044] [REG2.068.0] destroy might leak memory
https://issues.dlang.org/show_bug.cgi?id=15044 Kenji Hara changed: What|Removed |Added Keywords||pull, rejects-valid Component|druntime|dmd Summary|[Reg 2.068.0] destroy might |[REG2.068.0] destroy might |leak memory |leak memory --- Comment #4 from Kenji Hara --- https://github.com/D-Programming-Language/dmd/pull/5075 --
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 04:13:12PM +, Edwin van Leeuwen via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 14:54:34 UTC, Fredrik Boulund wrote: > >[...] I tried using the built-in profiler in DMD on the D program but > >to no avail. I couldn't really make any sense of the output other > >than that were enormous amounts of calls to lots of functions I > >couldn't find a way to remove from the code. Here's a paste of the > >trace output from the version I posted in the original post: > >http://dpaste.com/1AXPK9P > > > > See this link for clarification on what the columns/numbers in the > profile file mean > http://forum.dlang.org/post/f9gjmo$2gce$1...@digitalmars.com > > It is still difficult to parse though. I myself often use sysprof > (only available on linux), which automatically ranks by time spent. Dmd's profiler has some limitations, especially if you're doing something that's CPU bound for a long time (its internal counters are not wide enough and may overflow -- I have run into this before and it made it unusable for me). I highly recommend using `gdc -pg` with gprof. T -- Only boring people get bored. -- JM
Re: Speeding up text file parser (BLAST tabular format)
On 15/09/15 12:30 AM, Fredrik Boulund wrote: Hi, This is my first post on Dlang forums and I don't have a lot of experience with D (yet). I mainly code bioinformatics-stuff in Python on my day-to-day job, but I've been toying with D for a couple of years now. I had this idea that it'd be fun to write a parser for a text-based tabular data format I tend to read a lot of in my programs, but I was a bit stumped that the D implementation I created was slower than my Python-version. I tried running `dmd -profile` on it but didn't really understand what I can do to make it go faster. I guess there's some unnecessary dynamic array extensions being made but I can't figure out how to do without them, maybe someone can help me out? I tried making the examples as small as possible. Here's the D code: http://dpaste.com/2HP0ZVA Here's my Python code for comparison: http://dpaste.com/0MPBK67 Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192GB of RAM), the D version runs in about 20 seconds and the Python version less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O. The file being parsed is the output of a DNA/protein sequence mapping algorithm called BLAT, but the tabular output format is originally known from the famous BLAST algorithm. Here's a short example of what the input files look like: http://dpaste.com/017N58F The format is TAB-delimited: query, target, percent_identity, alignment_length, mismatches, gaps, query_start, query_end, target_start, target_end, e-value, bitscore In the example the output is sorted by query, but this cannot be assumed to hold true for all cases. The input file varies in range from several hundred megabytes to several gigabytes (10+ GiB). 
A brief explanation on what the code does: Parse each line, Only accept records with percent_identity >= min_identity (90.0) and alignment_length >= min_matches (10), Store all such records as tuples (in the D code this is a struct) in an array in an associative array indexed by 'query', For each query, remove any records with percent_id less than 5 percentage points less than the highest value observed for that query, Write results to stdout (in my real code the data is subject to further downstream processing) This was all just for me learning to do some basic stuff in D, e.g. file handling, streaming data from disk, etc. I'm really curious what I can do to improve the D code. My original idea was that maybe I should compile the performance critical parts of my Python codebase to D and call them with PyD or something, but now I'm not so sure any more. Help and suggestions appreciated! A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 1) You don't need to convert char[] to string via to. No. Too much. Cast it. 2) You don't need byKey, use foreach key, value syntax. That way you won't go around modifying things unnecessarily. Ok, I disabled GC + reserved a bunch of memory. It probably won't help much actually. In fact may make it fail so keep that in mind. Humm what else. I'm worried about that first foreach. I don't think it needs to exist as it does. I believe an input range would be far better. Use a buffer to store the Hit[]'s. Have a subset per set of them. If the first foreach is an input range, then things become slightly easier in the second. Now you can turn that into its own input range. Also that .array usage concerns me. Many an allocation there! Hence why the input range should be the return from it. The last foreach is, let's assume, dummy. Keep in mind, stdout is expensive here. DO NOT USE. If you must buffer output then do it in large quantities. 
Based upon what I can see, you are definitely not able to use your CPUs to the max. There is no way that is the limiting factor here. Maybe your usage of a core is. But not the CPUs themselves. The thing is, you cannot use multiple threads on that first foreach loop to speed things up. No. That needs to happen all on one thread. Instead after that thread you need to push the result into another. Perhaps, per thread one lock (mutex) + buffer for hits. Go round robin over all the threads. If a mutex is still locked, you'll need to wait. In this situation a locked mutex means all your worker threads are working. So you can't do anything more (anyway). Of course after all this, the HDD may still be getting hit too hard. In which case I would recommend memory mapping it. Which should allow the OS to more efficiently handle reading it into memory. But you'll need to rework .byLine for that. Wow that was a lot at 4:30am! So don't take it too seriously. I'm sure somebody else will rip that to shreds!
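For reference, the per-line parsing step this thread keeps coming back to can be sketched like this; the Hit struct and the field subset are simplified stand-ins for the code in the dpaste links, not the poster's actual implementation:

```d
import std.algorithm : splitter;
import std.array : array;
import std.conv : to;

struct Hit { string target; double pid; int matches; }  // simplified stand-in

// pulls out the fields the thread actually uses:
// query, target, percent_identity, alignment_length
Hit parseLine(const(char)[] line, out string query)
{
    auto fields = line.splitter('\t').array;
    query = fields[0].idup;  // copy out of byLine's reusable buffer
    return Hit(fields[1].idup, fields[2].to!double, fields[3].to!int);
}

void main()
{
    string q;
    auto h = parseLine("10009.1.1\tref42\t92.5\t37", q);
    assert(q == "10009.1.1");
    assert(h.target == "ref42" && h.pid == 92.5 && h.matches == 37);
}
```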
[Issue 15044] [REG2.068.0] destroy might leak memory
https://issues.dlang.org/show_bug.cgi?id=15044 --- Comment #5 from github-bugzi...@puremagic.com --- Commits pushed to stable at https://github.com/D-Programming-Language/dmd https://github.com/D-Programming-Language/dmd/commit/8e4676303a688ce3a034b38508b5e5b8c7bfa7e0 fix Issue 15044 - destroy might leak memory Defer semantic3 running of generated `opAssign` function, and add special error gagging mechanism in `FuncDeclaration.semantic3()` to hide errors from its body. https://github.com/D-Programming-Language/dmd/commit/af4d3a4158f95d1720b42e8027ae2aead90c7a4f Merge pull request #5075 from 9rnsr/fix15044 [REG2.068.0] Issue 15044 - destroy might leak memory --
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 I believe that should be: foreach (query, ref value; hitlists) Since an assignment is happening there..?
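For anyone following along, the difference `ref` makes on an associative-array foreach can be shown with a small standalone example: without `ref`, the loop variable is a copy and assigning to it does not touch the stored slice.

```d
void main()
{
    int[][string] hitlists = ["a": [1, 2, 3]];

    foreach (query, value; hitlists)
        value = value[0 .. 1];          // modifies only a copy of the slice
    assert(hitlists["a"].length == 3);  // AA unchanged

    foreach (query, ref value; hitlists)
        value = value[0 .. 1];          // modifies the slice stored in the AA
    assert(hitlists["a"].length == 1);
}
```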
Beta D 2.068.2-b2
The second beta for the 2.068.2 point release fixes a regression with destroy that could result in a memory leak [¹]. http://downloads.dlang.org/pre-releases/2.x/2.068.2/ -Martin [¹]: https://issues.dlang.org/show_bug.cgi?id=15044
[Issue 15060] New: Can't load a D shared library first, then load a C shared library
https://issues.dlang.org/show_bug.cgi?id=15060 Issue ID: 15060 Summary: Can't load a D shared library first, then load a C shared library Product: D Version: D2 Hardware: All OS: Mac OS X Status: NEW Severity: major Priority: P1 Component: druntime Assignee: nob...@puremagic.com Reporter: alil...@gmail.com Repost of a message in https://issues.dlang.org/show_bug.cgi?id=14824 also reported here with a $50 bounty since I originally found it with LDC, but it also happens with DMD: https://github.com/ldc-developers/ldc/issues/1071 DMD version = 2.068.1 OS = OS X 10.10.4

== Setup ==

Here is the host program source. What it does is, for each dynlib on the command line: load the dynlib, call the VSTPluginMain function if it exists, then unload it.

- ldvst.cpp -

#include <cstdio>
#include <cstring>
#include <vector>
#include <dlfcn.h>

typedef __cdecl void* (*VSTPluginMain_t)(void*);

int main(int argc, char** argv)
{
    std::vector<char*> dllPaths;
    if (argc < 2)
    {
        printf("usage: ldvst [-lazy] \n");
        return 1;
    }
    bool lazy = false;
    for (int i = 1; i < argc; ++i)
    {
        char* arg = argv[i];
        if (strcmp(arg, "-lazy") == 0)
            lazy = true;
        else if (strcmp(arg, "-now") == 0)
            lazy = false;
        else
            dllPaths.push_back(arg);
    }
    for (int i = 0; i < dllPaths.size(); ++i)
    {
        char* dllPath = dllPaths[i];
        printf("dlopen(%s)\n", dllPath);
        void* handle = dlopen(dllPath, lazy ? RTLD_LAZY : RTLD_NOW);
        if (handle == NULL)
        {
            printf("error: dlopen of %s failed\n", dllPath);
            return 2;
        }
        VSTPluginMain_t VSTPluginMain = (VSTPluginMain_t) dlsym(handle, "VSTPluginMain");
        printf("dlsym returned %p\n", (void*)VSTPluginMain);
        if (VSTPluginMain != NULL)
        {
            void* result = VSTPluginMain(NULL);
            printf("VSTPluginMain returned %p\n", result);
        }
        printf("dlclose(%s)\n\n", dllPath);
        dlclose(handle);
    }
    return 0;
}

- The host is compiled with: $ clang++ ldvst.cpp -o ldvst-cpp

Here is the whole dynlib source:

- distort.d -

extern(C) void* VSTPluginMain(void* hostCallback)
{
    import core.runtime;
    Runtime.initialize();
    import std.stdio;
    import core.stdc.stdio;
    printf("Hello !\n");
    Runtime.terminate();
    return null;
}

--

This dynlib can be built with ldc: $ ldc2 -shared -oflibdistort.so -g -w -I. distort.d -relocation-model=pic or with dmd: $ dmd -c -ofdistort.o -debug -g -w -version=Have_distort -I. distort.d -fPIC $ dmd -oflibdistort.so distort.o -shared -g For the purpose of demonstration you need another C dynlib, for example /System/Library/Frameworks/Cocoa.framework/Cocoa on OS X.

== How to reproduce ==

The bug triggers when calling:

$ ldvst-cpp libdistort.so => works
$ ldvst-cpp libdistort.so libdistort.so => works
$ ldvst-cpp /System/Library/Frameworks/Cocoa.framework/Cocoa libdistort.so => works
$ ldvst-cpp libdistort.so /System/Library/Frameworks/Cocoa.framework/Cocoa => FAIL, and that's precisely the case that happens in production :(
$ ldvst-cpp /System/Library/Frameworks/Cocoa.framework/Cocoa libdistort.so /System/Library/Frameworks/Cocoa.framework/Cocoa => works

In other words, if the host program loads a D dynlib first, then a C dynlib, the second dlopen fails. This is pernicious since the host program would scan my program, dlclose it successfully, and then break another program loaded afterwards. As of today, this is the only thing between me and customers; all other bugs can be worked around, but this one I'm not sure about. --
Re: Release D 2.068.1
On Monday, 14 September 2015 at 17:51:59 UTC, Martin Nowak wrote: What platform are you on? I'm on OS X, using the homebrew version of DMD. And homebrew is telling me that I have 2.068.1 installed $ brew install dmd Warning: dmd-2.068.1 already installed $ dmd --version DMD64 D Compiler v2.068 Copyright (c) 1999-2015 by Digital Mars written by Walter Bright And if I check $ which dmd /usr/local/bin/dmd Then if I check the link /usr/local/bin/dmd -> ../Cellar/dmd/2.068.1/bin/dmd
Re: shared array?
On Monday, September 14, 2015 01:12:02 Ola Fosheim Grostad via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 00:41:28 UTC, Jonathan M Davis > wrote: > > Regardless, idiomatic D involves a lot more stack allocations > > than you often get even in C++, so GC usage tends to be low in > > Really? I use VLAs in my C++ (a C extension) and use very few > mallocs after init. In C++ even exceptions can be put outside the > heap. Just avoid STL after init and you're good. From what I've seen of C++ and understand of typical use cases from other folks, that's not at all typical of C++ usage (though there's enough people using C++ across a wide enough spectrum of environments and situations that there's obviously going to be quite a wide spread of what folks do with it). A lot of C++ folks use classes heavily, frequently allocating them on the heap. Major C++ libraries such as Qt certainly are designed with the idea that you're going to be allocating as the program runs. And C++ historically has been touted fairly heavily by many folks as an OO language, in which case, inheritance (and therefore heap allocation) are used heavily by many programs. And with C++11/14, more mechanisms for safely handling memory have been added, thereby further encouraging the use of certain types of heap allocations in your typical C++ program - e.g. make_shared has become the recommended way to allocate memory in most cases. And while folks who are trying to get the bare-metal performance that stuff like games requires may avoid it, most folks are going to use the STL quite a bit. And if they aren't, they're probably using similar classes from a 3rd party library such as Qt. It's the folks who are in embedded environments or who have much more restrictive performance requirements who are more likely to avoid the STL or do stuff like avoid heap allocations after the program has been initialized.
So, a _lot_ of C++ code uses the heap quite heavily, and I expect that very little of it tries to allocate everything up front. I, for one, have never worked on an application where that even made sense aside from something very small. I know that applications like that definitely exist, but from everything I've seen, I'd expect them to be the exception to the rule rather than the norm. Regardless, idiomatic D promotes ranges, which naturally help reduce heap allocation. It also means using structs heavily and classes sparingly (though there are plenty of cases where inheritance is required and thus classes get used). And while arrays/strings get allocated on the heap, slicing seriously reduces how often they need to be copied in memory, which reduces heap allocations. So, idiomatic D encourages programs to be written in a way that keeps heap allocation to a minimum. The big place that it happens in most D programs is probably strings, but slicing helps considerably with that, and even some of the string stuff can be made to live on the stack rather than the heap (which is what a lot of Walter's recent work in Phobos has been for - making string-based stuff work as lazy ranges), reducing heap allocations for strings even further. C++ on the other hand does not have such idioms as the norm or as promoted in any serious way. So, programmers are much more likely to use idioms that involve a lot of heap allocations, and the language and standard library don't really have much to promote idioms that avoid heap allocations (and std::string definitely isn't designed to avoid copying). You can certainly do it - and many do - but since it's not really what's promoted by the language or standard library, it's less likely to happen in your average program. It's much more likely to be done by folks who avoid the STL.
So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. - Jonathan M Davis
Re: reading file byLine
On Monday, 14 September 2015 at 18:36:54 UTC, Meta wrote: As an aside, you should use `sort()` instead of the parentheses-less `sort`. The reason for this is that doing `arr.sort` invokes the old builtin array sorting which is terribly slow, whereas `import std.algorithm; arr.sort()` uses the much better sorting algorithm defined in Phobos. Thanks for pointing that out.
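A minimal illustration of the advice above (in 2015-era D the parenthesis-less form still compiled and silently picked the old builtin sort property):

```d
import std.algorithm : sort;

void main()
{
    int[] arr = [3, 1, 2];
    arr.sort();                  // std.algorithm.sort - fast, in place
    assert(arr == [1, 2, 3]);
    // In 2015-era D, `arr.sort;` without parens invoked the deprecated
    // builtin array-sort property instead of the Phobos function.
}
```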
Re: Passing Arguments on in Variadic Functions
On Monday 14 September 2015 21:59, jmh530 wrote: > This approach gives the correct result, but dmd won't deduce the > type of the template. So for instance, the second to the last > line of the unit test requires explicitly stating the types. I > may as well use the alternate version that doesn't use the > variadic function (which is simple for this trivial example, but > maybe not more generally). You can use a variadic template instead: import std.algorithm : sum; auto test(R, E ...)(R r, E e) { return sum(r, e); } unittest { int[] x = [10, 5, 15, 20, 30]; assert(test(x) == 80); assert(test(x, 0f) == 80f); assert(test(x, 0f) == 80f); }
Re: Passing Arguments on in Variadic Functions
On Monday, 14 September 2015 at 19:59:18 UTC, jmh530 wrote: In R, it is easy to have some optional inputs labeled as ... and then pass all those optional inputs in to another function. I was trying to get something similar to work in a templated D function, but I couldn't quite get the same behavior. What I have below is what I was able to get working. You generally want to avoid the varargs and instead use variadic templates. The syntax is similar but a bit different: R test(R, E, Args...)(Args args) { static if(Args.length == 0) { /* no additional arguments */ } else return sum(args); } or whatever you actually need. But what this does is take any number of arguments of various types but makes that length and type available at compile time for static if inspection. The args represents the whole list and you can loop over it, convert it to an array with `[args]` (if they are all compatible types), or pass it to another function as a group like I did here. This is the way writeln is implemented, btw.
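A runnable completion of that sketch. The function below is an assumed variant (fixed `int[]` element type, no `R`/`E` parameters) just to show the pack being summed as a whole and built into an array with `[args]`:

```d
import std.algorithm : sum;

// assumed variant of the sketch above: a slice plus any number of extra ints
auto test(Args...)(int[] r, Args args)
{
    auto total = sum(r);
    static if (Args.length > 0)
        total += sum([args]);   // [args] builds an array from the pack
    return total;
}

void main()
{
    int[] x = [10, 5, 15, 20, 30];
    assert(test(x) == 80);
    assert(test(x, 5, 15) == 100);
}
```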
Re: Passing Arguments on in Variadic Functions
Thanks to you both. This works perfectly.
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 08:07:45PM +, Kapps via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 18:31:38 UTC, H. S. Teoh wrote: > >I decided to give the code a spin with `gdc -O3 -pg`. Turns out that > >the hotspot is in std.array.split, contrary to expectations. :-) > >Here are the first few lines of the gprof output: > > > >[...] > > Perhaps using the new rangified splitter instead of split would help. I tried it. It was slower, surprisingly. I didn't dig deeper into why. T -- I see that you JS got Bach.
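For readers comparing the two: both are real Phobos functions. `split` eagerly allocates an array of slices on every call, while `splitter` returns a lazy range, which is why the result above is surprising:

```d
import std.algorithm : equal, splitter;
import std.array : split;

void main()
{
    auto line = "a\tb\tc";

    auto eager = line.split('\t');      // allocates a string[] right now
    assert(eager == ["a", "b", "c"]);

    auto lazyR = line.splitter('\t');   // no allocation until consumed
    assert(equal(lazyR, ["a", "b", "c"]));
}
```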
[Issue 15059] New: D Bug Tracker graph disappeared
https://issues.dlang.org/show_bug.cgi?id=15059 Issue ID: 15059 Summary: D Bug Tracker graph disappeared Product: D Version: D2 Hardware: x86 OS: Mac OS X Status: NEW Severity: enhancement Priority: P1 Component: dlang.org Assignee: nob...@puremagic.com Reporter: alil...@gmail.com This page used to have a nice graph of D bugs and resolution. http://dlang.org/bugstats.php --
Re: Canvas in Gtk connected to D?
On Monday, 14 September 2015 at 19:56:57 UTC, Justin Whear wrote: Mike, as this is really a GTK3 question and not specific to D (if GTK will let you do it in C, you can do it in D), you might have better success asking the GTK forum (gtkforums.com). Another avenue of research would be to look at CEF (D bindings here: http://code.dlang.org/packages/derelict-cef) and see if that will integrate with your toolkit. Unfortunately derelict-cef is still alpha and not documented well yet. The developer is working on a book on another project, I hear. I'll ask on the GTK forums what they currently recommend for doing static charts in GTK3.
Re: shared array?
On Monday, 14 September 2015 at 20:54:55 UTC, Jonathan M Davis wrote: On Monday, September 14, 2015 01:12:02 Ola Fosheim Grostad via Digitalmars-d-learn wrote: On Monday, 14 September 2015 at 00:41:28 UTC, Jonathan M Davis wrote: > Regardless, idiomatic D involves a lot more stack > allocations than you often get even in C++, so GC usage > tends to be low in Really? I use VLAs in my C++ (a C extension) and use very few mallocs after init. In C++ even exceptions can be put outside the heap. Just avoid STL after init and you're good. From what I've seen of C++ and understand of typical use cases from other folks, that's not at all typical of C++ usage (though there's enough people using C++ across a wide enough spectrum of environments and situations that there's obviously going to be quite a wide spread of what folks do with it). A lot of C++ folks use classes heavily, frequently allocating them on the heap. Dude, my C++ programs are all static ringbuffers and stack allocations. :) It varies a lot. Some C++ programmers turn off everything runtime related and use it as a better C. When targeting mobile you have to be careful about wasting memory... types of heap allocations in your typical C++ program - e.g. make_shared has become the recommended way to allocate memory in most cases I use unique_ptr with a custom deallocator (custom freelist), so it can be done outside the heap. :) And while folks who are trying to get the bare metal performance that some stuff like games require, most folks are going to use the STL quite a bit. I use std::array. And my own array view type to reference it. Array_view is coming in C++17, I think. Kinda like D slices. STL/string/iostream is for me primarily useful for init and testing... such as Qt. It's the folks who are in embedded environments or who have much more restrictive performance requirements who are more likely to avoid the STL or do stuff like avoid heap allocations after the program has been initialized.
Mobile audio/graphics... So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. I don't think there is a C++ community ;-) I think C++ programmers are quite different based on what they do and when they started using it. I only use it where performance/latency matters. C++ is too annoying (time consuming) for full-blown apps IMHO. Classes are easy to stack allocate though, no need to heap allocate most of the time. Lambdas in C++ are often just stack-allocated objects, so not so different from D's "ranges" (iterators) anyhow. I don't see my own programs suffer from C++isms anyway...
Re: Release D 2.068.1
On Monday, 14 September 2015 at 20:14:45 UTC, Jack Stouffer wrote: On Monday, 14 September 2015 at 17:51:59 UTC, Martin Nowak wrote: What platform are you on? I'm on OS X, using the homebrew version of DMD. And homebrew is telling me that I have 2.068.1 installed $ brew install dmd Warning: dmd-2.068.1 already installed $ dmd --version DMD64 D Compiler v2.068 Copyright (c) 1999-2015 by Digital Mars written by Walter Bright And if I check $ which dmd /usr/local/bin/dmd Then if I check the link /usr/local/bin/dmd -> ../Cellar/dmd/2.068.1/bin/dmd Yeah, I get this too. Same with 2.068.2-b2
Re: Release D 2.068.1
On Monday, 14 September 2015 at 20:14:45 UTC, Jack Stouffer wrote: On Monday, 14 September 2015 at 17:51:59 UTC, Martin Nowak wrote: What platform are you on? I'm on OS X, using the homebrew version of DMD. And homebrew is telling me that I have 2.068.1 installed Well I guess it's a bug in the homebrew script then. Nobody is setting the VERSION file and there is no git repo to query. https://github.com/Homebrew/homebrew/blob/f8b0ff3ef63e60a1da17ec8d8e68d949b1cebc27/Library/Formula/dmd.rb#L50
Re: shared array?
On Monday, September 14, 2015 14:19:30 Ola Fosheim Grøstad via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: > The claim is correct: you need to follow every pointer that > through some indirection may lead to a pointer that may point > into the GC heap. Not doing so will lead to unverified memory > unsafety. > > > Given one was written by one (very smart) student for his PhD > > thesis, and that as I understand it that formed the basis of > > Sociomantic's concurrent garbage collector (correct me if I am > > wrong), and that this is being ported to D2, and whether or not > > it is released, success will spur others to follow - it strikes > > As it has been described, it is fork() based and unsuitable for > the typical use case. I'm not sure why it wouldn't be suitable for the typical use case. It's quite performant. It would still not be suitable for many games and environments that can't afford to stop the world for more than a few milliseconds, but it brings the stop the world time down considerably, making the GC more suitable for more environments than it would be now, and I'm not aware of any serious downsides to it on a *nix system. Its Achilles heel is Windows. On *nix, forking is cheap, but on Windows, it definitely isn't. So, a different mechanism would be needed to make the concurrent GC work on Windows, and I don't know if Windows really provides the necessary tools to do that, though I know that some folks were looking into it at least at the time of Leandro's talk. So, we're either going to need to figure out how to get the concurrent GC working on Windows via some mechanism other than fork, or Windows is going to need a different solution to get that kind of improvement out of the GC. - Jonathan M Davis
Convert array to tupled array easily?
I created the following code that some of you have already seen. It's sort of a multiple-value AA with self-tracking. The problem is that for some value types, such as delegates, the comparison is identical (basically when the delegate is the same). To solve that problem, I'd like to try and turn the Value into tuples of the Value and the address of the SingleStore wrapper (which should be unique). e.g., public Tuple!(TValue, void*)[][TKey] Store; Then I'll simply compare the value and the address stored with the this (inside SingleStore) instead of just this. Of course, this requires somewhat of a rewrite of the code; trying it produced all kinds of errors (I tried to fix up all the references and correlated variables but it's still a mess, especially with D's error messages). It shouldn't be that much trouble though. Essentially, wherever I access the value, I want to instead use the value from the tuple (a single indirection). Probably not that easy though?

import std.stdio;
import std.concurrency;
extern (C) int getch();
import std.string;
import std.concurrency;
import core.time;
import core.thread;
import std.container.array;
import std.typecons;

public class SingleStore(TKey, TValue)
{
    public TValue Value;
    public TKey Key;
    public TValue[][TKey] Store;

    // Duplicate entries will be removed together as there is no way to distinguish them
    public auto Remove()
    {
        import std.algorithm;
        if (Value == null || Key == null) return;
        int count = 0;
        for (int i = 0; i < Store[Key].length; i++)
        {
            auto c = Store[Key][i];
            if (c == Value)
            {
                count++;
                Store[Key][i] = null; // Set to null to release any references if necessary
                swap(Store[Key][i], Store[Key][max(0, Store[Key].length - count)]);
                i = i - 1;
            }
        }
        if (count == 1 && Store[Key].length == 1)
        {
            Store[Key] = null;
            Store.remove(Key);
        }
        else
            Store[Key] = Store[Key][0 .. max(0, Store[Key].length - count)];
        Value = null;
        Key = null;
    }

    public static auto New(TKey k, TValue v, ref TValue[][TKey] s)
    {
        auto o = new SingleStore!(TKey, TValue)(k, v);
        o.Store = s;
        return o;
    }

    private this(TKey k, TValue v)
    {
        Key = k;
        Value = v;
    }
}

// Creates a static Associative Array that stores multiple values per key. The object returned by New can then be used to remove the key/value without having to remember them specifically.
public mixin template ObjectStore(TKey, TValue)
{
    // The object store. It is static. Mix the template into different types to create different types of stores. All objects of that type are then in the same store.
    public static TValue[][TKey] Store;

    public static auto New(TKey k, TValue v)
    {
        (Store[k]) ~= v;
        auto o = SingleStore!(TKey, TValue).New(k, v, Store);
        return o;
    }

    public string ToString() { return "asdf"; }
}

alias dg = int delegate(int);
//alias dg = string;

class MyStore
{
    mixin ObjectStore!(string, dg);
    //mixin ObjectStore!(string, string);
}

void main()
{
    auto k = "x";
    dg d1 = (int x) { return x; };
    dg d2 = (int x) { return x; };
    dg d3 = d1;
    dg d4 = (int x) { return 3*x; };
    /*
    dg d1 = "a1"; dg d2 = "a2"; dg d3 = "a3"; dg d4 = "a4";
    */
    auto s = MyStore.New(k, d1);
    writeln(MyStore.Store[k].length);
    auto s1 = MyStore.New(k, d2);
    writeln(MyStore.Store[k].length);
    auto s2 = MyStore.New(k, d3);
    writeln(MyStore.Store[k].length);
    auto s3 = MyStore.New(k, d4);
    writeln(MyStore.Store[k].length);
    //auto x = MyStore.Store[k][0](3);
    //writeln("-" ~ x);
    s1.Remove();
    writeln(MyStore.Store[k].length);
    s2.Remove();
    writeln(MyStore.Store[k].length);
    s.Remove();
    writeln(MyStore.Store[k].length);
    s3.Remove();
    getch();
}
Re: shared array?
On Monday, 14 September 2015 at 20:34:03 UTC, Jonathan M Davis wrote: I'm not sure why it wouldn't be suitable for the typical use case. It's quite performant. It would still not be suitable for many games and environments that can't afford to stop the world for more than a few milliseconds, but it brings the stop the world time down considerably, making the GC more suitable for more environments than it would be now, and I'm not aware of any serious downsides to it on a *nix system. For me, a concurrent GC implies interactive applications or webservices that are memory constrained/diskless. You cannot prevent triggering actions that write all over memory during collection without taking special care, like avoiding RC. A fork can potentially double memory consumption. A GC by itself uses ~2x memory; with fork you have to plan for 3-4x. In the cloud you pay for extra RAM. So configuring the app to a fixed-size memory heap that matches the instance RAM capacity is useful. With fork you just have to play it safe and halve the heap size. So more collections and less utilized RAM per dollar with fork. Only testing will show the effect, but it does not sound promising for my use cases.
Re: shared array?
On Monday, 14 September 2015 at 20:54:55 UTC, Jonathan M Davis wrote: So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. - Jonathan M Davis Fully agreed. C++ in the wild often makes lots of copies of data structures, sometimes by mistake (like a std::vector passed by value instead of by ref). When you copy an aggregate by mistake, every field itself gets copied, etc. Copies, copies, copies everywhere.
Re: GC performance: collection frequency
http://dlang.org/changelog/2.067.0.html#gc-options On Mon, 14 Sep 2015 12:25:06 -0700 "H. S. Teoh via Digitalmars-d" wrote: > On Mon, Sep 14, 2015 at 07:19:53PM +0000, Jonathan M Davis via > Digitalmars-d wrote: [...] > > Isn't there some amount of configuration that can currently be done > > via environment variables? Or was that just something that someone > > had done in one of the GC-related dconf talks that never made it > > into druntime proper? It definitely seemed like a good idea in any > > case. > [...] > > If it's undocumented, it's as good as not existing as far as end users > are concerned. :-) I didn't see anything mentioned in core.memory's > docs, nor in dlang.org's page on the GC, nor on the wiki's GC page. > > > T >
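For the record, the 2.067 changelog linked above documents runtime GC configuration via the `--DRT-gcopt` switch that any D program accepts (the program name and option values here are placeholders):

```shell
# Enable GC profiling output and tune the initial pool size for a D program.
./myprog "--DRT-gcopt=profile:1 minPoolSize:16"
```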
[Issue 15056] [REG2.068.1] Unstored RAII struct return yields bogus error: "cannot mix core.std.stdlib.alloca() and exception handling"
https://issues.dlang.org/show_bug.cgi?id=15056 --- Comment #5 from Kenji Hara --- It's intentionally introduced in: https://github.com/D-Programming-Language/dmd/pull/5003 To fix wrong-code issue 14708. Today, on Win64 and all Posix platforms, dmd uses an exception handling table, and it doesn't support using alloca in a function that contains a try-finally statement. --
Re: Operator overloading or alternatives to expression templates
On Monday, 14 September 2015 at 18:17:05 UTC, Adam D. Ruppe wrote: On Monday, 14 September 2015 at 13:47:10 UTC, Sebastiaan Koppe wrote: `auto q = query.builder!Person.age!">"(20).name("Peter");` I confess that I'm not really paying attention to this thread, but I can't help but think plain old literal: `Person.age > 20 && Person.name = 'Peter'` is nicer. You can still CT check that by parsing the string if you want. It is definitely nicer, but this is also a contrived use-case. What happens when you have a couple of joins, how would you write that? Or inner queries? Suppose we wanted to fetch all users who didn't place an order in the last year, it would be as simple as calling `.lastOrderDate!"<"(lastYear)`. It would do the join with the order table on the appropriate column. Granted, someone has to write `lastOrderDate()`.
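A hedged sketch of how such a fluent builder can be put together with `opDispatch`; the `Builder` type, `where` method, and condition encoding are invented for illustration and are not any real library's API:

```d
import std.array : join;
import std.conv : to;

// Hypothetical fluent condition builder: b.age!">"(20).name("Peter")
struct Builder
{
    string[] conds;

    // name!"op"(value), or name(value) for equality via the default op
    auto opDispatch(string field, string op = "=", V)(V value)
    {
        conds ~= field ~ " " ~ op ~ " " ~ value.to!string;
        return this;
    }

    string where()
    {
        return conds.length ? "WHERE " ~ conds.join(" AND ") : "";
    }
}

void main()
{
    auto q = Builder().age!">"(20).name("Peter");
    assert(q.where() == "WHERE age > 20 AND name = Peter");
}
```

Each call appends one condition, so `lastOrderDate!"<"(lastYear)` from the post would slot into the same scheme; the join logic it implies is left out of this sketch.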
Re: Convert array to tupled array easily?
On 09/14/2015 04:23 PM, Prudence wrote: > To solve that problem, I'd like to try and turn the Value into Tuples of > the Value and the address of the SingleStore wrapper(which should be > unique). > > e.g., > public Tuple!(TValue, void*)[][TKey] Store; After changing that, I methodically dealt with compilation errors. A total of 6 changes were sufficient: $ diff before.d after.d 24c24 < public TValue[][TKey] Store; --- > public Tuple!(TValue, void*)[][TKey] Store; 36c36 < if (c == Value) --- > if (c[0] == Value) 39c39 < Store[Key][i] = null; // Set to null to release any references if necessary --- > Store[Key][i][1] = null; // Set to null to release any references if necessary 58c58 < public static auto New(TKey k, TValue v, ref TValue[][TKey] s) --- > public static auto New(TKey k, TValue v, ref Tuple!(TValue, void*)[][TKey] s) 77c77 < public static TValue[][TKey] Store; --- > public static Tuple!(TValue, void*)[][TKey] Store; 81c81 < (Store[k]) ~= v; --- > (Store[k]) ~= tuple(v, cast(void*)null); Ali
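For reference, the `c[0]`/`[1]` indexing used in the diff works like this (a minimal standalone example with the same `Tuple!(TValue, void*)` element shape, using `string` as a stand-in value type):

```d
import std.typecons : Tuple, tuple;

void main()
{
    alias Entry = Tuple!(string, void*);
    Entry[][string] store;

    store["k"] ~= tuple("value", cast(void*) null);

    // c[0] is the TValue, c[1] is the stored address
    auto c = store["k"][0];
    assert(c[0] == "value");
    assert(c[1] is null);

    // fields can also be assigned individually, as the diff does with [1]
    store["k"][0][1] = null;
}
```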
Re: Implement the "unum" representation in D ?
On Wednesday, 22 July 2015 at 20:41:42 UTC, jmh530 wrote: On Wednesday, 22 July 2015 at 19:28:41 UTC, Andrei Alexandrescu wrote: On 7/13/15 1:20 AM, Nick B wrote: All we can do now, with our limited resources, is to keep an eye on developments and express cautious interest. If someone able and willing comes along with a unum library for D, that would be great. The book has quite a bit of Mathematica code at the end. A first pass at a unum library could probably just involve porting that to D. Here is a Python port of the Mathematica code which is that much closer to D: https://github.com/jrmuizel/pyunum Regardless of whether the representation used follows the unum bitwise format proposed in the book, having the semantics of interval arithmetic that keeps proper account of amount of precision, and exact values vs. open/closed ended intervals on the extended real line, would be quite valuable for verified computation and for how it enables simple root finding / optimization / differential equation solving via search, and could build on hardware floats directly and/or an extended-precision float library such as MPFR to keep performance close to raw floats. The bitwise format is intended to fit as much data as possible through the time-energy bottleneck between main memory and the processor. This would clearly be of great value if the format were supported in hardware but of less clear value otherwise. The semantic improvements would be quite welcome regardless. They go much further than best practices with floats to put error bounds on computations, handle over/underflow and infinity correctly, and keep algorithms correct by default.
We might also consider small semantic improvements if not rigidly bound by the proposed formats and the need to implement in hardware, such as having division by an interval including zero return the union of the results from the intervals on either side of the zero as well as NaN (1-D intervals are strictly represented as the convex hull of at most two points in the proposed format, so an interval such as this made of a sentinel value unified with two disjoint intervals, though it has better semantics, is not supported). Anthony
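The division semantics discussed above can be made concrete with a toy (non-unum) closed-interval type: dividing by an interval straddling zero naturally yields two one-sided pieces, which a single convex interval must widen all the way to the extended real line.

```d
// Toy closed interval; an illustration of the semantics discussed above,
// not the unum format itself.
struct Ival { double lo, hi; }

// Divide a positive interval a by b where b straddles zero: the exact
// result is the union of two disjoint unbounded intervals.
Ival[2] divideAcrossZero(Ival a, Ival b)
{
    assert(a.lo > 0 && b.lo < 0 && b.hi > 0);
    // e.g. [1,2] / [-1,1]  ->  (-inf, -1]  and  [1, +inf)
    return [Ival(-double.infinity, a.lo / b.lo),
            Ival(a.lo / b.hi, double.infinity)];
}

void main()
{
    auto parts = divideAcrossZero(Ival(1, 2), Ival(-1, 1));
    assert(parts[0].hi == -1 && parts[1].lo == 1);
}
```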
Re: Canvas in Gtk connected to D?
On Monday, 14 September 2015 at 21:57:23 UTC, Mike McKee wrote: I'll ask in the GTK Forums what they recommend as the most recently recommended approach for doing static charts in GTK3. BTW, the gtkforums.com site doesn't just let anyone in. I'm still waiting on an admin to approve me. :(
Re: Implement the "unum" representation in D ?
On Saturday, 11 July 2015 at 03:02:24 UTC, Nick B wrote: On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote: Hi everyone. John Gustafson Will be presenting a Keynote on Thursday 27th February at 11:00 am The abstract is here: http://openparallel.com/multicore-world-2014/speakers/john-gustafson/ There is also a excellent background paper, (PDF - 64 pages) which can be found here: FYI John Gustafson book is now out: It can be found here: http://www.amazon.com/End-Error-Computing-Chapman-Computational/dp/1482239868/ref=sr_1_1?s=books=UTF8=1436582956=1-1=John+Gustafson=1436583212284=093TDC82KFP9Y4S5PXPY Here is one of the reviewers comments: 9 of 9 people found the following review helpful This book is revolutionary By David Jefferson on April 18, 2015 This book is revolutionary. That is the only way to describe it. I have been a professional computer science researcher for almost 40 years, and only once or twice before have I seen a book that is destined to make such a profound change in the way we think about computation. It is hard to imagine that after 70 years or so of computer arithmetic that there is anything new to say about it, but this book reinvents the subject from the ground up, from the very notion of finite precision numbers to their bit-level representation, through the basic arithmetic operations, the calculation of elementary functions, all the way to the fundamental methods of numerical analysis, including completely new approaches to expression calculation, root finding, and the solution of differential equations. On every page from the beginning to the end of the book there are surprises that just astonished me, making me re-think material that I thought had been settled for decades. The methods described in this book are profoundly different from all previous treatments of numerical methods. Unum arithmetic is an extension of floating point arithmetic, but mathematically much cleaner. It never does rounding, so there is no rounding error. 
It handles what in floating point arithmetic is called "overflow" and "underflow" in a far more natural and correct way that makes them normal rather than exceptional. It also handles exceptional values (NaN, +infinity, -infinity) cleanly and consistently. Those contributions alone would have been a profound contribution. But the book does much more. One of the reasons I think the book is revolutionary is that unum-based numerical methods can effortlessly provide provable bounds on the error in numerical computation, something that is very rare for methods based on floating point calculations. And the bounds are generally as tight as possible (or as tight as you want them), rather than the useless or trivial bounds as often happens with floating point methods or even interval arithmetic methods. Another reason I consider the book revolutionary is that many of the unum-based methods are cleanly parallelizable, even for problems that are normally considered to be unavoidably sequential. This was completely unexpected. A third reason is that in most cases unum arithmetic uses fewer bits, and thus less power, storage, and bandwidth (the most precious resources in today’s computers) than the comparable floating point calculation. It hard to believe that we get this advantage in addition to all of the others, but it is amply demonstrated in the book. Doing efficient unum arithmetic takes more logic (e.g. transistors) than comparable floating point arithmetic does, but as the author points out, transistors are so cheap today that that hardly matters, especially when compared to the other benefits. Some of the broader themes of the book are counterintuitive to people like me with advanced conventional training, so that I have to re-think everything I “knew” before. For example, the discussion of just what it means to “solve” an equation numerically is extraordinarily thought provoking.
Another example is the author’s extended discussion of how calculus is not the best inspiration for computational numerical methods, even for problems that would seem to absolutely require calculus-based thinking, such as the solution of ordinary differential equations. Not only is the content of the book brilliant, but so is the presentation. The text is so well written, a mix of clarity, precision, and reader friendliness, that it is a pure pleasure to read, rather than the dense struggle that mathematical textbooks usually require of the reader. But in addition, almost every page has full color graphics and diagrams that are completely compelling in their ability to clearly communicate the ideas. I cannot think of any technical book I have ever seen that is so beautifully illustrated all the way through. I should add that I read the Kindle edition on an iPad, and for once Amazon did not screw up the presentation of a technical book, at least for this platform. It is
[Issue 15058] [VisualD] A way to specify Debugging Current Directory from within the .visualdproj
https://issues.dlang.org/show_bug.cgi?id=15058 --- Comment #2 from ponce --- Well, that would be very confusing; it's quite common to save project files, even generated ones. Better to abandon the idea of hiding .visualdproj than to have that. It's not a huge must-have anyway. --
[Issue 15045] [Reg 2.069-devel] hasElaborateCopyConstructor is true for struct with opDispatch
https://issues.dlang.org/show_bug.cgi?id=15045 Kenji Hara changed: What|Removed |Added Keywords||pull --- Comment #4 from Kenji Hara --- (In reply to Martin Nowak from comment #3) > Oh yes, that's a much better solution. The semantics of those builtin member > functions don't make sense with forwarding (through alias this or for > opDispatch) anyhow. Implemented compiler fix: https://github.com/D-Programming-Language/dmd/pull/5077 --
Re: Speeding up text file parser (BLAST tabular format)
On 15/09/15 5:41 AM, NX wrote: On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: A lot of this hasn't been covered, I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 I believe that should be: foreach (query, ref value; hitlists) Since an assignment is happening there..? Probably.
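A minimal sketch of the point NX is making (the associative array and names here are invented for illustration, not taken from the dpaste): iterating an associative array without `ref` gives you a copy of each value, so assignments are lost; with `ref` the writes land in the table.

```d
import std.stdio;

void main()
{
    int[string] hitlists = ["q1": 0, "q2": 0];

    // Without `ref`, `value` would be a copy and the increment
    // would be discarded. With `ref`, the table itself is updated.
    foreach (query, ref value; hitlists)
        value += 1;

    writeln(hitlists); // both counters are now 1
}
```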
[Issue 15058] New: [VisualD] A way to specify Debugging Current Directory from within the .visualdproj
https://issues.dlang.org/show_bug.cgi?id=15058 Issue ID: 15058 Summary: [VisualD] A way to specify Debugging Current Directory from within the .visualdproj Product: D Version: D2 Hardware: x86_64 OS: Windows Status: NEW Severity: enhancement Priority: P1 Component: visuald Assignee: nob...@puremagic.com Reporter: alil...@gmail.com In DUB we'd like to hide the multiple .visualdproj in a hidden .dub directory; it helps with clutter in projects with many packages (only the .sln stays visible, which is handy). https://github.com/D-Programming-Language/dub/pull/680 But the option "Debugging/Working Directory" is stored in the generated .suo file, and the default is "." where this change would require "..". Is there, or should there be, an option within the .visualdproj to store that current working directory? --
[Issue 15058] [VisualD] A way to specify Debugging Current Directory from within the .visualdproj
https://issues.dlang.org/show_bug.cgi?id=15058 Rainer Schuetze changed: What|Removed |Added CC||r.sagita...@gmx.de --- Comment #1 from Rainer Schuetze --- Visual D stored it in the project file at the beginning, but that's usually not what you want. You might still be able to change the default by adding an element to the project settings, as it is still read from the project: https://github.com/D-Programming-Language/visuald/blob/master/visuald/config.d#L1371 It won't survive the next saving of the project, though, but maybe that's ok. --
Re: D + Dub + Sublime +... build/run in terminal?
On Sunday, 13 September 2015 at 10:00:13 UTC, SuperLuigi wrote: Just wondering if anyone here might know how I can accomplish this... basically I'm editing my D code in Sublime using the Dkit plugin to access DCD, which so far is more reliable than monodevelop's autocomplete, though I do need to reset the server pretty often... but that's neither here nor there... Sublime is great but its little output panel sucks... it won't update stdout text until the end of the program run, so I'd like to open newly built programs in a linux terminal. So I installed a terminal plugin: https://github.com/wbond/sublime_terminal . I can open the terminal, great. I can build with dub, great. Now I need to put them together... somehow... When building, sublime looks at a file it has called D.sublime-build, and this file looks like the following:

{
    "cmd": ["dmd", "-g", "-debug", "$file"],
    "file_regex": "^(.*?)\\(([0-9]+),?([0-9]+)?\\): (.*)",
    "selector": "source.d",
    "variants": [
        { "name": "Run", "cmd": ["rdmd", "-g", "-debug", "$file"] },
        { "name": "dub", "working_dir": "$project_path", "cmd": ["dub"] }
    ]
}

I'm pretty sure I need to edit the last variant to pop open the terminal at the project path and run dub... but whatever I put in messes up and I just get errors... Has anyone done this before and can give me a clue as to what I can do to get this to work? Are you on Windows or Linux? If the latter, which terminal do you use? A reference for lxterminal is here: http://manpages.ubuntu.com/manpages/precise/man1/lxterminal.1.html In which case I think if you replace the call to dub with a call to lxterminal and stick --command dub at the end it might work. Can't try here. It will be a similar thing for Windows.
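Following the lxterminal suggestion above, the "dub" variant might be rewritten like this. This is an untested sketch: the variant name is invented, --command is taken from the lxterminal man page linked above, and it assumes lxterminal is actually the terminal in use.

```json
{
    "name": "dub (in terminal)",
    "working_dir": "$project_path",
    "cmd": ["lxterminal", "--command", "dub"]
}
```

One caveat with this approach: the terminal window will close as soon as dub finishes, so the program output may vanish immediately unless the command is wrapped in a shell that waits for a keypress.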
Re: Combining Unique type with concurrency module
On Monday, 14 September 2015 at 00:11:07 UTC, Ali Çehreli wrote: On 09/13/2015 09:09 AM, Alex wrote: > I'm new to this forum so, please excuse me in advance for > asking silly questions. Before somebody else says it: There are no silly questions. :) > struct std.typecons.Unique!(S).Unique is not copyable because it is > annotated with @disable I have made the code compile and work (without any thread synchronization at all). See the comments with [Ali] annotations: import std.stdio; import std.concurrency; import std.typecons; void spawnedFunc2(Tid ownerTid) { /* [Ali] Aside: ownerTid is already and automatically * available. You don't need to pass it in explicitly. */ receive( /* [Ali] The compilation error comes from Variant, which * happens to be the catch all type for concurrency * messages. Unfortunately, there are issues with that * type. * * Although implemented as a pointer, according to * Variant, a 'ref' is not a pointer. (I am not sure * whether this one is a Variant issue or a language * issue.) * * Changing the message to a pointer to a shared * object: */ (shared(Unique!S) * urShared) { /* [Ali] Because the expression ur.i does not work on * a shared object, we will hack it to unshared * first. */ auto ur = cast(Unique!S*)urShared; writeln("Received the number ", ur.i); } ); send(ownerTid, true); } static struct S { int i; this(int i){this.i = i;} } Unique!S produce() { // Construct a unique instance of S on the heap Unique!S ut = new S(5); // Implicit transfer of ownership return ut; } void main() { Unique!S u1; u1 = produce(); auto childTid2 = spawn(&spawnedFunc2, thisTid); /* [Ali] Cast it to shared so that it passes to the other * side. Unfortunately, there is no guarantee that this * object is not used by more than one thread. */ send(childTid2, cast(shared(Unique!S*))&u1); /* [Ali] We must wait to ensure that u1 is not destroyed * before all workers have finished their tasks. 
*/ import core.thread; thread_joinAll(); writeln("Successfully printed number."); } Note that thread synchronization is still the programmer's responsibility. > I'm aware of the fact that my u1 struct can't be copied, but I don't > intend to do so. Correct. > As stated in the docs, I want to lend the struct to the > other thread (by using ref), being sure that any other thread can't > access the struct while it is processed by the first one. There is a misconception. Unique guarantees that the object will not be copied. It does not provide any guarantee that only one thread will access the object. It is possible to write a type that acquires a lock during certain operations, but Unique isn't that type. > Is such a thing possible? > Thanks in advance. > Alex Ali Thanks for answering! Do you have a hint on how to create such a type? The needed operation is "onPassingTo" another thread. So the idea is to create a resource which is not really shared (a question of definition, I think), as it should be accessible only from one thread at a time. But there is a "main" thread, from which the resource can be lent to "worker" threads, and there are "worker" threads, where only one worker can have the resource at a given time. On my own, the next possibility I would try is something with reference counting and checking how many references exist; depending on this number, allow or disallow accessing the reference again. By the way, synchronizing by hand is ok. I don't know how important that is, but the idea is that synchronization happens very rarely: as the lending process acquires and releases resources automatically, and the next thread can acquire the resource after a release, synchronization should not be expected systematically but only at some strange time points... I can't even give an example of such times now... maybe only at the end of the program, to let all workers end their existence.
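A very rough, hypothetical sketch of the kind of type Ali alludes to ("a type that acquires a lock during certain operations"). `Lendable` and `lend` are invented names, and this only serializes access with a mutex; it does not implement the full hand-off protocol Alex describes, but it could be a starting point.

```d
import core.sync.mutex;

// Hypothetical wrapper (not a Phobos type): at most one thread
// at a time gets access to the wrapped payload.
struct Lendable(T)
{
    private T* payload;
    private Mutex m;

    this(T* p)
    {
        payload = p;
        m = new Mutex;
    }

    // Blocks until no other thread holds the resource, runs `dg`
    // with exclusive access, then releases it on scope exit.
    void lend(scope void delegate(ref T) dg)
    {
        m.lock();
        scope (exit) m.unlock();
        dg(*payload);
    }
}
```

A worker would then call `resource.lend((ref s) { /* use s */ });` and be guaranteed exclusive access for the duration of the delegate, which is closer to the "lending" semantics Alex wants than Unique alone provides.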