Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Example output might be useful for you to see as well:

10009.1.1:5.2e-02_13: 16
10014.1.1:2.9e-03_11: 44
10017.1.1:4.1e-02_13: 16
10026.1.1:5.8e-03_12: 27
10027.1.1:6.6e-04_13: 16
10060.1.1:2.7e-03_14: 2
10061.1.1:5.1e-07_13: 41

Worth noting is that it is essentially impossible to predict how many "hits"/records there are for each query; it varies wildly, from 0 to 1000+ in some cases.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:44:22 UTC, Edwin van Leeuwen wrote: Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068). Hi Edwin, thanks for your quick reply! I'm using v2.068.1; I actually got inspired to try this out after skimming the changelog :). Regarding whether it is IO-bound: I actually expected it would be, but both the Python and the D version consume 100% CPU while running, and just copying the file around only takes a few seconds (cf. 15-20 sec runtime for the two programs). There's bound to be some aggressive file caching going on, but I figure that would rather normalize program runtimes at lower times after running them a few times, and I see nothing indicating that.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also, if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea s/also/even
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also, if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea
Speeding up text file parser (BLAST tabular format)
Hi, This is my first post on the Dlang forums and I don't have a lot of experience with D (yet). I mainly code bioinformatics stuff in Python in my day-to-day job, but I've been toying with D for a couple of years now. I had this idea that it'd be fun to write a parser for a text-based tabular data format I tend to read a lot in my programs, but I was a bit stumped that the D implementation I created was slower than my Python version. I tried running `dmd -profile` on it but didn't really understand what I can do to make it go faster. I guess there are some unnecessary dynamic array extensions being made, but I can't figure out how to do without them; maybe someone can help me out? I tried making the examples as small as possible. Here's the D code: http://dpaste.com/2HP0ZVA Here's my Python code for comparison: http://dpaste.com/0MPBK67 Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192GB of RAM), the D version runs in about 20 seconds and the Python version in less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O. The file being parsed is the output of a DNA/protein sequence mapping algorithm called BLAT, but the tabular output format is originally known from the famous BLAST algorithm. Here's a short example of what the input file looks like: http://dpaste.com/017N58F The format is TAB-delimited: query, target, percent_identity, alignment_length, mismatches, gaps, query_start, query_end, target_start, target_end, e-value, bitscore In the example the output is sorted by query, but this cannot be assumed to hold true for all cases. The input file size varies from several hundred megabytes to several gigabytes (10+ GiB).
A brief explanation of what the code does: parse each line; only accept records with percent_identity >= min_identity (90.0) and alignment_length >= min_matches (10); store all such records as tuples (in the D code this is a struct) in an array in an associative array indexed by 'query'; for each query, remove any records with percent_id more than 5 percentage points below the highest value observed for that query; write results to stdout (in my real code the data is subject to further downstream processing). This was all just for me learning to do some basic stuff in D, e.g. file handling, streaming data from disk, etc. I'm really curious what I can do to improve the D code. My original idea was that maybe I should compile the performance-critical parts of my Python codebase to D and call them with PyD or something, but now I'm not so sure anymore. Help and suggestions appreciated!
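For reference, the steps described above can be sketched in D roughly like this. This is a minimal sketch, not the poster's actual code from the dpaste links: the file name, struct layout, and constants are placeholder assumptions based on the column description above.

```d
import std.stdio : File, writefln;
import std.array : split;
import std.conv : to;
import std.algorithm : filter, map, reduce, max;

struct Hit { string target; double pid; int length; }

void main()
{
    enum minIdentity = 90.0, minMatches = 10, maxPidDiff = 5.0;
    Hit[][string] hitlists;

    // Parse each TAB-delimited line, keeping only records above the cutoffs
    foreach (line; File("blat_output.tsv").byLine) {
        auto f = line.split("\t");
        auto pid = f[2].to!double;
        auto len = f[3].to!int;
        if (pid >= minIdentity && len >= minMatches)
            hitlists[f[0].idup] ~= Hit(f[1].idup, pid, len);  // idup: byLine reuses its buffer
    }

    // Per query, drop hits more than 5 percentage points below the best pid
    foreach (query, hits; hitlists) {
        auto best = hits.map!(h => h.pid).reduce!max;
        auto kept = hits.filter!(h => h.pid >= best - maxPidDiff);
        writefln("%s: %s", query, kept);
    }
}
```

Note the `.idup` calls: `byLine` reuses an internal buffer, so any slices stored past the current iteration must be duplicated.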
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:50:03 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 12:44:22 UTC, Edwin van Leeuwen wrote: Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068). Hi Edwin, thanks for your quick reply! I'm using v2.068.1; I actually got inspired to try this out after skimming the changelog :). Regarding whether it is IO-bound: I actually expected it would be, but both the Python and the D version consume 100% CPU while running, and just copying the file around only takes a few seconds (cf. 15-20 sec runtime for the two programs). There's bound to be some aggressive file caching going on, but I figure that would rather normalize program runtimes at lower times after running them a few times, and I see nothing indicating that. Two things that you could try: First, hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also, the filter.array.length is quite expensive. You could use count instead: import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff));
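Those two suggestions look roughly like this in practice. A small self-contained sketch: the `Hit` struct, the toy data, and the cutoff are made up for illustration.

```d
import std.algorithm : count, map, reduce, max;
import std.stdio : writefln;

struct Hit { double pid; }

void main()
{
    // toy stand-in for the real hitlists associative array
    Hit[][string] hitlists = [
        "q1": [Hit(99.0), Hit(93.0), Hit(90.5)],
        "q2": [Hit(95.0)],
    ];
    enum maxPidDiff = 5.0;

    // key and value in one pass, instead of byKey plus an AA lookup per key
    foreach (query, hits; hitlists) {
        auto maxPid = hits.map!(h => h.pid).reduce!max;
        // count walks the range lazily, avoiding the allocation
        // that filter(...).array.length would make
        auto n = hits.count!(h => h.pid >= maxPid - maxPidDiff);
        writefln("%s: %s of %s hits kept", query, n, hits.length);
    }
}
```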
Re: Why does reverse also flips my other dynamic array?
Thanks for the clarification.
Re: Adjacent Pairs Range
On Monday, 14 September 2015 at 05:37:05 UTC, Sebastiaan Koppe wrote: What about using zip and a slice? Slicing requires a RandomAccessRange (Array). This is too restrictive. We want to change operations such as adjacentTuples with for example map and reduce without the need for temporary copies of the whole range. This is the thing about D's standard library. Read up on D's range concepts.
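For completeness, one slice-free way to get adjacent pairs is to zip a range with a copy of itself advanced by one element; this only needs a forward range, not random access. A sketch — `adjacentPairs` is a hypothetical helper name, not a Phobos function:

```d
import std.range : zip, dropOne;
import std.algorithm : equal;
import std.typecons : tuple;

auto adjacentPairs(R)(R r)
{
    // dropOne takes the range by value, so for forward ranges
    // this pairs each element with its successor without slicing
    return zip(r, r.dropOne);
}

void main()
{
    auto pairs = adjacentPairs([1, 2, 3, 4]);
    assert(pairs.equal([tuple(1, 2), tuple(2, 3), tuple(3, 4)]));
}
```

Because the result is itself a lazy range, it chains with map, reduce, etc. without temporary copies.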
Re: Adjacent Pairs Range
On Monday, 14 September 2015 at 10:45:52 UTC, Per Nordlöw wrote: restrictive. We want to change operations such as Correction: We want to *chain* operations such as...
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: Hi, Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192GB of RAM), the D version runs in about 20 seconds and the Python version in less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O. Sounds like this program is actually IO bound. In that case I would not really expect an improvement by using D. What is the CPU usage like when you run this program? Also, which dmd version are you using? I think there were some performance improvements for file reading in the latest version (2.068)
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:17:51 UTC, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth. Works for me: http://dpaste.dzfl.pl/a692281f7a80
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:14:18 UTC, John Colvin wrote: what system are you on? What are the error messages you are getting? I really appreciate your willingness to help me out. This is what ldd shows on the latest binary release of LDC on my machine. I'm on a Red Hat Enterprise Linux 6.6 system.

[boulund@terra ~]$ ldd ~/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
linux-vdso.so.1 => (0x7fff623ff000)
libconfig.so.8 => /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/libconfig.so.8 (0x7f7f716e1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7f7f714a3000)
libdl.so.2 => /lib64/libdl.so.2 (0x7f7f7129f000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0032cde0)
libm.so.6 => /lib64/libm.so.6 (0x7f7f7101a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0032cca0)
libc.so.6 => /lib64/libc.so.6 (0x7f7f70c86000)
/lib64/ld-linux-x86-64.so.2 (0x7f7f718ec000)

As you can see it lacks something related to GLIBC, but I'm not sure how to fix that.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:15:25 UTC, Laeeth Isharc wrote: I picked up D to start learning maybe a couple of years ago. I found Ali's book, Andrei's book, github source code (including for Phobos), and asking here to be the best resources. The docs make perfect sense when you have got to a certain level (or perhaps if you have a computer sciencey background), but can be tough before that (though they are getting better). You should definitely take a look at the dlangscience project organized by John Colvin and others. If you like ipython/jupyter also see his pydmagic - write D inline in a notebook. I saw the dlangscience project on GitHub the other day. I've yet to venture deeper. The inlining of D in jupyter notebooks sure is cool, but I'm not sure it's very useful for me, Python feels more succinct for notebook use. Still, I really appreciate the effort put into that, it's really cool! You may find this series of posts interesting too - another bioinformatics guy migrating from Python: http://forum.dlang.org/post/akzdstfiwwzfeoudh...@forum.dlang.org I'll have a look at that series of posts, thanks for the heads-up! Unfortunately I haven't time to read your code, and others will do better. But do you use .reserve() ? Also these are a nice fast container library based on Andrei Alexandrescu's allocator: https://github.com/economicmodeling/containers Not familiar with .reserve(), nor Andrei's allocator library. I'll put that in the stuff-to-read-about-queue for now. :) Thanks for your tips!
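For reference, the `.reserve()` mentioned above pre-allocates capacity for a dynamic array so that repeated appends don't keep triggering GC reallocations. A minimal illustration:

```d
void main()
{
    int[] a;
    a.reserve(1_000_000);   // ask the GC for capacity up front
    foreach (i; 0 .. 1_000_000)
        a ~= i;             // appends now typically proceed without reallocating
    assert(a.capacity >= 1_000_000);
}
```

This is a natural fit for the parser above, where a per-query array grows by appending an unpredictable number of hits.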
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:58:33 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:37:18 UTC, John Colvin wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also, if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here Why would -inline be particularly likely to make a big difference in this case? I'm trying to learn, but I don't see what inlining could be done in this case. Range-based code like you are using leads to *huge* numbers of function calls to get anything done. The advantage of inlining is twofold: 1) you don't have to pay the cost of the function call itself and 2) often more optimisation can be done once a function is inlined. Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code. Why is that? I would love to learn to understand why they could be expected to perform much better on this kind of code. Because they are much better at inlining. dmd is quick to compile your code and is most up-to-date, but ldc and gdc will produce somewhat faster code in almost all cases, sometimes dramatically faster.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:55:50 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:10:50 UTC, Edwin van Leeuwen wrote: Two things that you could try: First hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also the filter.array.length is quite expensive. You could use count instead. import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff)); I didn't know that hitlists.byKey was that expensive, that's just the kind of feedback I was hoping for. I'm just grasping for straws in the online documentation when I want to do things. With my Python background it feels as if I can still get things that work that way. I picked up D to start learning maybe a couple of years ago. I found Ali's book, Andrei's book, github source code (including for Phobos), and asking here to be the best resources. The docs make perfect sense when you have got to a certain level (or perhaps if you have a computer sciencey background), but can be tough before that (though they are getting better). You should definitely take a look at the dlangscience project organized by John Colvin and others. If you like ipython/jupyter also see his pydmagic - write D inline in a notebook. You may find this series of posts interesting too - another bioinformatics guy migrating from Python: http://forum.dlang.org/post/akzdstfiwwzfeoudh...@forum.dlang.org I realize the filter.array.length thing is indeed expensive. I find it especially horrendous that the code I've written needs to allocate a big dynamic array that will most likely be cut down quite drastically in this step. Unfortunately I haven't figured out a good way to do this without storing the intermediary results since I cannot know if there might be yet another hit for any encountered "query" since the input file might not be sorted. 
But the main reason I didn't just count the values like you suggest is actually that I need the filtered hits in later downstream analysis. The filtered hits for each query are used as input to a lowest common ancestor algorithm on the taxonomic tree (of life). Unfortunately I haven't time to read your code, and others will do better. But do you use .reserve() ? Also these are a nice fast container library based on Andrei Alexandrescu's allocator: https://github.com/economicmodeling/containers
Re: shared array?
On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: Personally, when I make a strong claim about something and find that I am wrong (the claim that D needs to scan every pointer), I take a step back and consider my view rather than pressing harder. It's beautiful to be wrong because through recognition of error, growth. If recognition. The claim is correct: you need to follow every pointer that through some indirection may lead to a pointer that may point into the GC heap. Not doing so will lead to unverified memory unsafety. Given one was written by one (very smart) student for his PhD thesis, and that as I understand it that formed the basis of Sociomantic's concurrent garbage collector (correct me if I am wrong), and that this is being ported to D2, and whether or not it is released, success will spur others to follow - it strikes As it has been described, it is fork() based and unsuitable for the typical use case. provided one understands the situation. Poking holes at things without taking any positive steps to fix them is understandable for people that haven't a choice about their situation, but in my experience is rarely effective in making the world better. Glossing over issues that needs attention is not a good idea. It wastes other people's time. I am building my own libraries, also for memory management with move semantics etc.
chaining chain Result and underlying object of chain
chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:17:51 UTC, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Laeeth. std.algorithm.iteration.joiner?
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:28:41 UTC, John Colvin wrote: Yup, glibc is too old for those binaries. What does "ldd --version" say? It says "ldd (GNU libc) 2.12". Hmm... The most recent version in RHEL's repo is "2.12-1.166.el6_7.1", which is what is installed. Can this be side-loaded without too much hassle and manual effort?
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 02:34:41PM +, Fredrik Boulund via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 14:18:58 UTC, John Colvin wrote: > >Range-based code like you are using leads to *huge* numbers of > >function calls to get anything done. The advantage of inlining is > >twofold: 1) you don't have to pay the cost of the function call > >itself and 2) often more optimisation can be done once a function is > >inlined. > > Thanks for that explanation! Now that you mention it, it makes perfect > sense. I never considered it, but of course *huge* numbers of > function calls to e.g. next() and other range methods will be made. > > >Because they are much better at inlining. dmd is quick to compile > >your code and is most up-to-date, but ldc and gdc will produce > >somewhat faster code in almost all cases, sometimes dramatically > >faster. > > Sure sounds like I could have more fun with LDC and GDC on my system > in addition to DMD :). If performance is a problem, the first thing I'd recommend is to use a profiler to find out where the hotspots are. (More often than not, I have found that the hotspots are not where I expected them to be; sometimes a 1-line change to an unanticipated hotspot can result in a huge performance boost.) The next thing I'd try is to use gdc instead of dmd. ;-) IME, code produced by `gdc -O3` is at least 20-30% faster than code produced by `dmd -O -inline`. Sometimes the difference can be up to 40-50%, depending on the kind of code you're compiling. T -- Lottery: tax on the stupid. -- Slashdotter
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also, if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea Thanks for the suggestions! I'm not too familiar with compiled languages like this; I've mainly written small programs in D and run them via `rdmd` in a scripting-language fashion. I'll read up on what the different compile flags do (I knew about -O, but I'm not sure what the others do). Unfortunately I cannot get LDC working on my system. It seems to fail finding some shared library when I download the binary release, and I can't figure out how to make it compile. I haven't really given GDC a try yet. I'll see what I can do. Running the original D code I posted before with the flags you suggested reduced the runtime by about 2 seconds on average.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:37:18 UTC, John Colvin wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also, if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here Why would -inline be particularly likely to make a big difference in this case? I'm trying to learn, but I don't see what inlining could be done in this case. Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code. Why is that? I would love to learn to understand why they could be expected to perform much better on this kind of code.
Re: shared array?
On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote: Only the stack and the GC heap get scanned unless you tell the GC about memory that was allocated by malloc or some other mechanism. malloced memory won't be scanned by default. So, if you're using the GC minimally and coding in a way that doesn't involve needing to tell the GC about a bunch of malloced memory, then the GC won't have all that much to scan. And while the amount of memory that the GC has to scan does affect the speed of a collection, in general, the less memory that's been allocated by the GC, the faster a collection is. Idiomatic D code uses the stack heavily and allocates very little on the GC heap. ... So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get. - Jonathan M Davis Thank you for your posts on this (and the others), Jonathan. I appreciate your taking the time to write so carefully and thoroughly, and I learn a lot from reading your work.
Re: how do I check if a member of a T has a member ?
On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:09:57 UTC, wobbles wrote: Use __traits(compiles, date.second)? Thanks. This works: static if (__traits(compiles, { T bar; bar.date.hour; })) pragma(msg, "hour"); else pragma(msg, "nohour"); can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. Someone ought to write a tutorial showing how to use the good stuff we have to solve real problems. E.g. an annotated baby-steps version of Andrei's allocator talk. I can't do it as I have too much on my plate.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:50:22 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: [...] Thanks for the suggestions! I'm not too familiar with compiled languages like this; I've mainly written small programs in D and run them via `rdmd` in a scripting-language fashion. I'll read up on what the different compile flags do (I knew about -O, but I'm not sure what the others do). Unfortunately I cannot get LDC working on my system. It seems to fail finding some shared library when I download the binary release, and I can't figure out how to make it compile. I haven't really given GDC a try yet. I'll see what I can do. Running the original D code I posted before with the flags you suggested reduced the runtime by about 2 seconds on average. what system are you on? What are the error messages you are getting?
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date","hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right?
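A small sketch of the difference (the struct names here are made up for illustration): `hasMember` takes a *type* as its first argument, so the member's type has to be extracted with `typeof` first, which is likely why passing `T.bar.date` directly didn't work:

```d
import std.traits : hasMember;

struct Date { int hour; }
struct Bar  { Date date;  }
struct T    { Bar  bar;   }

// hasMember!(T.bar.date, "hour") fails: the first argument must be a type.
// typeof on the nested member gives the type hasMember needs:
static assert(hasMember!(typeof(T.bar.date), "hour"));

// typeof also recovers the member's own type directly:
static assert(is(typeof(T.bar.date.hour) == int));

void main() {}
```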
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:25:04 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:14:18 UTC, John Colvin wrote: what system are you on? What are the error messages you are getting? I really appreciate your willingness to help me out. This is what ldd shows on the latest binary release of LDC on my machine. I'm on a Red Hat Enterprise Linux 6.6 system.

[boulund@terra ~]$ ldd ~/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
/home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/ldc2)
linux-vdso.so.1 => (0x7fff623ff000)
libconfig.so.8 => /home/boulund/apps/ldc2-0.16.0-alpha2-linux-x86_64/bin/libconfig.so.8 (0x7f7f716e1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7f7f714a3000)
libdl.so.2 => /lib64/libdl.so.2 (0x7f7f7129f000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0032cde0)
libm.so.6 => /lib64/libm.so.6 (0x7f7f7101a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0032cca0)
libc.so.6 => /lib64/libc.so.6 (0x7f7f70c86000)
/lib64/ld-linux-x86-64.so.2 (0x7f7f718ec000)

As you can see it lacks something related to GLIBC, but I'm not sure how to fix that. Yup, glibc is too old for those binaries. What does "ldd --version" say?
Re: chaining chain Result and underlying object of chain
On Monday 14 September 2015 16:17, Laeeth Isharc wrote: > chain doesn't seem to compile if I try and chain a chain of two > strings and another string. > > what should I use instead? Please show code, always. A simple test works for me:

import std.algorithm: equal;
import std.range: chain;

void main()
{
    auto chain1 = chain("foo", "bar");
    auto chain2 = chain(chain1, "baz");
    assert(equal(chain2, "foobarbaz"));
}
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:40:29 UTC, H. S. Teoh wrote: If performance is a problem, the first thing I'd recommend is to use a profiler to find out where the hotspots are. (More often than not, I have found that the hotspots are not where I expected them to be; sometimes a 1-line change to an unanticipated hotspot can result in a huge performance boost.) I agree with you on that. I used Python's cProfile module to find the performance bottleneck in the Python version I posted, and shaved off 8-10 seconds of runtime from an extraneous str.split() I had missed. I tried using the built-in profiler in DMD on the D program, but to no avail. I couldn't really make any sense of the output, other than that there were enormous numbers of calls to lots of functions I couldn't find a way to remove from the code. Here's a paste of the trace output from the version I posted in the original post: http://dpaste.com/1AXPK9P The next thing I'd try is to use gdc instead of dmd. ;-) IME, code produced by `gdc -O3` is at least 20-30% faster than code produced by `dmd -O -inline`. Sometimes the difference can be up to 40-50%, depending on the kind of code you're compiling. Yes, it really seems that gdc or ldc is the way to go.
Re: shared array?
On Monday, 14 September 2015 at 08:57:07 UTC, Ola Fosheim Grøstad wrote: On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote: So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get. What D needs is some way for a static analyzer to be certain that a pointer does not point to a specific GC heap. And that means language changes... one way or the other. Without language changes it becomes very difficult to reduce the amount of memory scanned without sacrificing memory safety. Personally, when I make a strong claim about something and find that I am wrong (the claim that D needs to scan every pointer), I take a step back and consider my view rather than pressing harder. It's beautiful to be wrong because through recognition of error, growth. If recognition. And I don't think a concurrent GC is realistic given the complexity and performance penalties. The same people who complain about GC would not accept performance hits on pointer-writes. That would essentially make D and Go too similar IMO. Given one was written by one (very smart) student for his PhD thesis, and that as I understand it that formed the basis of Sociomantic's concurrent garbage collector (correct me if I am wrong), and that this is being ported to D2, and whether or not it is released, success will spur others to follow - it strikes me as a problematic claim to make that developing one isn't realistic unless one is deeply embedded in the nitty gritty of the problem (because theory and practice are more different in practice than they are in theory!) There is etcimon's work too (at research stage). Don't underestimate too how future corporate support combined with an organically growing community may change what's possible. Andy Smith gave his talk based on his experience at one of the largest and well-run hedge funds. 
An associate who sold a decent sized marketing group got in contact to thank me for posting links on D as it helped him implement a machine-learning problem better. And if I look at what's in front of me, I really am not aware of a better solution to the needs I have, which I am pretty sure are needs that are more generally shared - corporate inertia may be a nuisance but it is also a source of opportunity for others. In response to your message earlier where you suggested that Sociomantic was an edge case of little relevance for the rest of us. I made that point in response to the claim that D had no place for such purposes. It's true that being able to do something doesn't mean it is a good idea, but really having seen them speak and looked at the people they hire, I really would be surprised if they do not know what they are doing. (I would say the same if they had never been bought). And they say that using D has significantly lowered their costs compared to their competitors. It's what I have been finding, too, dealing with data sets that are for now by no means 'big' but will be soon enough. It's also a human group phenomenon that it's very difficult to do something for the first time, and the more people that follow, the easier it is for others. So the edge case of yesteryear shall be the best practice of the future. One sees this also with allocators, where Andrei's library is already beginning to be integrated in different projects. I had never even heard of D two years ago and had approaching a twenty year break from doing a lot of programming. But they weren't difficult to pick up and use effectively. Clearly, latency and performance hits are different things, and the category of people who care about performance is only a partial intersection of those who care about latency. 
Part of what I do involves applying the principle of contrarian thinking, and I can say that it is very useful, and not just in the investment world: http://www.amazon.com/The-Contrary-Thinking-Humphrey-Neill/dp/087004110X On the other hand, there is also the phenomenon of just being contrary. One sometimes has the impression that some people like to argue for the sake of it. Nothing wrong with that, provided one understands the situation. Poking holes at things without taking any positive steps to fix them is understandable for people that haven't a choice about their situation, but in my experience is rarely effective in making the world better.
Re: shared array?
On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: An associate who sold a decent sized marketing group Should read marketmaking. Making prices in listed equity options.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:10:50 UTC, Edwin van Leeuwen wrote: Two things that you could try: First hitlists.byKey can be expensive (especially if hitlists is big). Instead use: foreach( key, value ; hitlists ) Also the filter.array.length is quite expensive. You could use count instead. import std.algorithm : count; value.count!(h => h.pid >= (max_pid - max_pid_diff)); I didn't know that hitlists.byKey was that expensive, that's just the kind of feedback I was hoping for. I'm just grasping for straws in the online documentation when I want to do things. With my Python background it feels as if I can still get things that work that way. I realize the filter.array.length thing is indeed expensive. I find it especially horrendous that the code I've written needs to allocate a big dynamic array that will most likely be cut down quite drastically in this step. Unfortunately I haven't figured out a good way to do this without storing the intermediary results since I cannot know if there might be yet another hit for any encountered "query" since the input file might not be sorted. But the main reason I didn't just count the values like you suggest is actually that I need the filtered hits in later downstream analysis. The filtered hits for each query are used as input to a lowest common ancestor algorithm on the taxonomic tree (of life).
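As a rough, compilable sketch of both of Edwin's suggestions combined (the Hit struct, the field names, and the cutoff values here are simplified stand-ins for the code under discussion in the thread):

```d
import std.algorithm : count, map, reduce, max;

// simplified stand-in for the Hit struct described in the thread
struct Hit { float pid; }

void main()
{
    Hit[][string] hitlists;
    hitlists["q1"] = [Hit(99.0f), Hit(96.0f), Hit(85.0f)];

    immutable float max_pid_diff = 5.0f;

    // foreach (key, value; aa) walks the table once, no byKey plus lookup per key
    foreach (query, hits; hitlists)
    {
        immutable float max_pid = hits.map!(h => h.pid).reduce!max;
        // count walks the range lazily; no intermediate array as with filter.array.length
        auto n = hits.count!(h => h.pid >= (max_pid - max_pid_diff));
        assert(n == 2); // 99.0 and 96.0 survive the 94.0 cutoff for q1
    }
}
```

This only helps when a count is all that's needed downstream; as Fredrik notes below, the filtered hits themselves are used later, in which case some form of storage is unavoidable.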
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:18:58 UTC, John Colvin wrote: Range-based code like you are using leads to *huge* numbers of function calls to get anything done. The advantage of inlining is twofold: 1) you don't have to pay the cost of the function call itself and 2) often more optimisation can be done once a function is inlined. Thanks for that explanation! Now that you mention it, it makes perfect sense. I never considered it, but of course *huge* numbers of function calls to e.g. next() and other range methods will be made. Because they are much better at inlining. dmd is quick to compile your code and is most up-to-date, but ldc and gdc will produce somewhat faster code in almost all cases, sometimes dramatically faster. Sure sounds like I could have more fun with LDC and GDC on my system in addition to DMD :).
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 09:09:27 UTC, Per Nordlöw wrote: Is there a reason why such a common thing isn't already in Phobos? If not what about adding it to std.typecons : asTuple I guess nobody's really needed that functionality before. It might be an interesting addition to std.array.
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 08:56:43 UTC, Per Nordlöw wrote: BTW: What about .tupleof? Isn't that what should be used here? I don't believe .tupleof works for arrays.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 13:05:32 UTC, Andrea Fontana wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] Also if the problem probably is I/O related, have you tried with: -O -inline -release -noboundscheck ? -inline in particular is likely to have a strong impact here. Anyway I think it's a good idea to test it against gdc and ldc, which are known to generate faster executables. Andrea +1 I would expect ldc or gdc to strongly outperform dmd on this code.
Re: Creating a DLL with a ActiveX interface.
On Monday, 14 September 2015 at 15:20:50 UTC, Adam D. Ruppe wrote: On Monday, 14 September 2015 at 15:14:05 UTC, Taylor Hillegeist wrote: Gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? I think the doc copy/pasted a typo there. It should be `core.sys.windows.com`. I've done some COM stuff with D before, getting it callable from vbscript and jscript. Can you tell me what steps you're using (in some detail, like are you using IE? or some other thing?) to test your thing? then I can try to make a working example that passes it and share that. So, actually I am using NI LabVIEW to interact with my DLL. I imagine even getting hold of that would be troublesome or expensive. But I'm pretty savvy on that end. For me it's more about how to expose a COM interface to the rest of the system through a DLL.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 14:31:33 UTC, anonymous wrote: On Monday 14 September 2015 16:17, Laeeth Isharc wrote: chain doesn't seem to compile if I try and chain a chain of two strings and another string. what should I use instead? Please show code, always. A simple test works for me: import std.algorithm: equal; import std.range: chain; void main() { auto chain1 = chain("foo", "bar"); auto chain2 = chain(chain1, "baz"); assert(equal(chain2, "foobarbaz")); } Sorry - was exhausted yesterday when I had the code there, or would have posted. I was trying to use the same variable eg auto chain1 = chain("foo", "bar"); chain1 = chain(chain1, "baz"); Realized that in this case it was much simpler just to use the delegate version of toString and sink (which I had forgotten about). But I wondered what to do in other cases. It may be that the type of chain1 and chain2 don't mix.
Re: chaining chain Result and underlying object of chain
On Monday 14 September 2015 17:01, Laeeth Isharc wrote: >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); > > Realized that in this case it was much simpler just to use the > delegate version of toString and sink (which I had forgotten > about). But I wondered what to do in other cases. It may be > that the type of chain1 and chain2 don't mix. Yes, the types don't match. The result types of most range functions depend on the argument types. Let's say chain("foo", "bar") has the type ChainResult!(string, string). Then chain(chain("foo", "bar"), "baz") has the type ChainResult! (ChainResult!(string, string), string). Those are different and not compatible. You can get the same type by: a) being eager: import std.array: array; auto chain1 = chain("foo", "bar").array; chain1 = chain(chain1, "baz").array; (At that point you could of course just work with the strings directly, using ~ and ~=.) b) being classy: import std.range.interfaces; InputRange!dchar chain1 = inputRangeObject(chain("foo", "bar")); chain1 = inputRangeObject(chain(chain1, "baz")); Those have performance implications, of course. Being eager means allocating the whole thing, and possibly intermediate results. Being classy means allocating objects for the ranges (could possibly put them on the stack), and it means indirections.
Re: Creating a DLL with a ActiveX interface.
On Monday, 14 September 2015 at 15:14:05 UTC, Taylor Hillegeist wrote: Gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? I think the doc copy/pasted a typo there. It should be `core.sys.windows.com`. I've done some COM stuff with D before, getting it callable from vbscript and jscript. Can you tell me what steps you're using (in some detail, like are you using IE? or some other thing?) to test your thing? then I can try to make a working example that passes it and share that.
Re: chaining chain Result and underlying object of chain
On Monday, 14 September 2015 at 15:30:14 UTC, Ali Çehreli wrote: On 09/14/2015 08:01 AM, Laeeth Isharc wrote: > I was trying to use the same variable eg > >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); [...] > It may be that the type of chain1 > and chain2 don't mix. Exactly. I was going to recommend using pragma(msg, typeof(chain1)) to see what they are but it looks like chain()'s return type is not templatized. (?) pragma(msg, typeof(chain1)); pragma(msg, typeof(chain2)); Prints Result Result instead of something like (hypothetical) ChainResult!(string, string) ChainResult!(ChainResult!(string, string), string) Ali It is templated, but by means of its enclosing function being templated, which doesn't end up in the name.
Re: chaining chain Result and underlying object of chain
On 09/14/2015 08:01 AM, Laeeth Isharc wrote: > I was trying to use the same variable eg > >auto chain1 = chain("foo", "bar"); >chain1 = chain(chain1, "baz"); [...] > It may be that the type of chain1 > and chain2 don't mix. Exactly. I was going to recommend using pragma(msg, typeof(chain1)) to see what they are but it looks like chain()'s return type is not templatized. (?) pragma(msg, typeof(chain1)); pragma(msg, typeof(chain2)); Prints Result Result instead of something like (hypothetical) ChainResult!(string, string) ChainResult!(ChainResult!(string, string), string) Ali
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 14:21:12 UTC, John Colvin wrote: On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right? Ahh. Probably that was why (I will check it shortly). Why do I need to do a typeof? What kind of thing is T.bar.date before the typeof, given that T is a type?
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:35:26 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:28:41 UTC, John Colvin wrote: Yup, glibc is too old for those binaries. What does "ldd --version" say? It says "ldd (GNU libc) 2.12". Hmm... The most recent version in RHEL's repo is "2.12-1.166.el6_7.1", which is what is installed. Can this be side-loaded without too much hassle and manual effort? I've had nothing but trouble when using different versions of libc. It would be easier to do this instead: http://wiki.dlang.org/Building_LDC_from_source I'm running a build of LDC git HEAD right now on an old server with 2.11, I'll upload the result somewhere once it's done if it might be useful
Creating a DLL with a ActiveX interface.
So, I've looked at this topic of COM, OLE, and ActiveX, and found myself confused. http://dlang.org/interface.html Gives a short example but the code doesn't compile for me. core\stdc\windows\com.d seems to be missing? And I can't find any documentation on core\stdc on the standard library page. http://wiki.dlang.org/Win32_DLLs_in_D Points to "The Sample Code" under COM, but I find that confusing. So here is what I desire; you guys can let me know how dumb it is. I want a DLL with an ActiveX interface that contains a small bit of data, a string for example. And this is what I want to happen: (caller sends ActiveX object with string) -> (my DLL written in D manipulates string) -> (caller gets a different string) I call on the wisdom of the community to help me in this. Thanks!
Re: how do I check if a member of a T has a member ?
On Monday, 14 September 2015 at 15:04:00 UTC, Laeeth Isharc wrote: On Monday, 14 September 2015 at 14:21:12 UTC, John Colvin wrote: On Monday, 14 September 2015 at 14:05:01 UTC, Laeeth Isharc wrote: On Sunday, 13 September 2015 at 17:34:11 UTC, BBasile wrote: On Sunday, 13 September 2015 at 17:24:20 UTC, Laeeth Isharc wrote: [...] can't you use 'hasMember' (either with __traits() or std.traits.hasMember)? It's more idiomatic than checking if it's compilable. I'll check again in a bit, but I seem to recall hasMember didn't work. I would like to get the type of a member of a type, and I think hasMember!(T.bar.date, "hour") didn't work for that. Possibly it does work and I messed it up somehow, or it doesn't work and there is a more elegant way. You mean hasMember!(typeof(T.bar.date), "hour"), right? Ahh. Probably that was why (I will check it shortly). Why do I need to do a typeof? What kind of thing is T.bar.date before the typeof, given that T is a type? T.bar.date is just a symbol. If you tried to actually access it then it would have to be a compile-time construct or be a static member/method, but it's perfectly OK to ask what type it has or what size it has. The simple story: hasMember takes a type as its first argument. T.bar.date isn't a type, it's a member of a member of a type. To find out what type it is, use typeof.
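John's explanation as a self-contained sketch; Foo, Bar, and Date are made-up types for illustration only:

```d
import std.traits : hasMember;

// made-up types purely for illustration
struct Date { int hour; }
struct Bar  { Date date; }
struct Foo  { Bar bar; }

// Foo.bar.date is a symbol, not a type; wrap it in typeof to get
// the type that hasMember expects as its first argument
static assert(hasMember!(typeof(Foo.bar.date), "hour"));
static assert(!hasMember!(typeof(Foo.bar.date), "minute"));

// typeof also answers the original question directly:
// what type does that nested member have?
static assert(is(typeof(Foo.bar.date) == Date));

void main() {}
```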
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 14:54:34 UTC, Fredrik Boulund wrote: On Monday, 14 September 2015 at 14:40:29 UTC, H. S. Teoh wrote: I agree with you on that. I used Python's cProfile module to find the performance bottleneck in the Python version I posted, and shaved 8-10 seconds of runtime off an extraneous str.split() I had missed. I tried using the built-in profiler in DMD on the D program but to no avail. I couldn't really make any sense of the output other than that there were enormous amounts of calls to lots of functions I couldn't find a way to remove from the code. Here's a paste of the trace output from the version I posted in the original post: http://dpaste.com/1AXPK9P See this link for clarification on what the columns/numbers in the profile file mean: http://forum.dlang.org/post/f9gjmo$2gce$1...@digitalmars.com It is still difficult to parse though. I myself often use sysprof (only available on Linux), which automatically ranks by time spent.
Canvas in Gtk connected to D?
Is there a way to do a canvas in GTK3 so that I can use chart.js, and connect this to D? See, in something similar, a guy named Julien Wintz figured out that Qt's QQuickWidget acts much like the webkit Canvas object, and thus was able to port chart.js to that widget. This allows one to use Qt + QQuickWidget + D (or any Qt-supported language for that matter) to draw charts using Javascript, using the chart.js documentation. What's also fascinating about this is that it's fairly lightweight -- Julien's solution doesn't use Chromium (or another webkit implementation) to make it work. (It should be noted, however, that QQuickWidget uses OpenGL.) Likewise, it would be great if I could do something similar in GTK3. See, I like D, and I'm getting somewhere with it with GTK3, but doing static charts like I see with chart.js is important for my use of this language.
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: On 15/09/15 12:30 AM, Fredrik Boulund wrote: [...] A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 1) You don't need to convert char[] to string via to. No. Too much. Cast it. Not a good idea in general. Much better to ask for a string in the first place by using byLine!(immutable(char), immutable(char)). Alternatively just use char[] throughout.
Re: Creating a DLL with a ActiveX interface.
On Monday, 14 September 2015 at 15:44:36 UTC, Taylor Hillegeist wrote: So, actually I am using NI LabVIEW to interact with my DLL. I imagine even getting hold of that would be troublesome or expensive. Ah, all right. Here's a SO thing (followed up by email then copy/pasted there) I did for someone else with a hello COM dll: http://stackoverflow.com/questions/19937521/what-are-options-to-communicate-between-vb6-com-server-on-windows-xp-and-python The main thing is dealing with a bug in Windows XP. If you are on Vista or above, you can skip that and hopefully just look at the example zip here: http://arsdnet.net/dcode/com.zip That's almost two years old now but should still basically work (I will try next time I have a free hour on Windows too) and hopefully get you started.
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 04:13:12PM +, Edwin van Leeuwen via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 14:54:34 UTC, Fredrik Boulund wrote: > >[...] I tried using the built-in profiler in DMD on the D program but > >to no avail. I couldn't really make any sense of the output other > >than that were enormous amounts of calls to lots of functions I > >couldn't find a way to remove from the code. Here's a paste of the > >trace output from the version I posted in the original post: > >http://dpaste.com/1AXPK9P > > > > See this link for clarification on what the columns/numbers in the > profile file mean > http://forum.dlang.org/post/f9gjmo$2gce$1...@digitalmars.com > > It is still difficult to parse though. I myself often use sysprof > (only available on linux), which automatically ranks by time spent. Dmd's profiler has some limitations, especially if you're doing something that's CPU bound for a long time (its internal counters are not wide enough and may overflow -- I have run into this before and it made it unusable for me). I highly recommend using `gdc -pg` with gprof. T -- Only boring people get bored. -- JM
Re: Speeding up text file parser (BLAST tabular format)
On 15/09/15 12:30 AM, Fredrik Boulund wrote: Hi, This is my first post on Dlang forums and I don't have a lot of experience with D (yet). I mainly code bioinformatics-stuff in Python on my day-to-day job, but I've been toying with D for a couple of years now. I had this idea that it'd be fun to write a parser for a text-based tabular data format I tend to read a lot of in my programs, but I was a bit stumped that the D implementation I created was slower than my Python version. I tried running `dmd -profile` on it but didn't really understand what I can do to make it go faster. I guess there's some unnecessary dynamic array extensions being made, but I can't figure out how to do without them; maybe someone can help me out? I tried making the examples as small as possible. Here's the D code: http://dpaste.com/2HP0ZVA Here's my Python code for comparison: http://dpaste.com/0MPBK67 Using a small test file (~550 MB) on my machine (2x Xeon(R) CPU E5-2670 with RAID6 SAS disks and 192GB of RAM), the D version runs in about 20 seconds and the Python version in less than 16 seconds. I've repeated runs at least thrice when testing. This holds true even if the D version is compiled with -O. The file being parsed is the output of a DNA/protein sequence mapping algorithm called BLAT, but the tabular output format is originally known from the famous BLAST algorithm. Here's a short example of what the input files look like: http://dpaste.com/017N58F The format is TAB-delimited: query, target, percent_identity, alignment_length, mismatches, gaps, query_start, query_end, target_start, target_end, e-value, bitscore In the example the output is sorted by query, but this cannot be assumed to hold true for all cases. The input file varies in size from several hundred megabytes to several gigabytes (10+ GiB).
A brief explanation of what the code does: Parse each line. Only accept records with percent_identity >= min_identity (90.0) and alignment_length >= min_matches (10). Store all such records as tuples (in the D code this is a struct) in an array in an associative array indexed by 'query'. For each query, remove any records with percent_id more than 5 percentage points below the highest value observed for that query. Write results to stdout (in my real code the data is subject to further downstream processing). This was all just for me learning to do some basic stuff in D, e.g. file handling, streaming data from disk, etc. I'm really curious what I can do to improve the D code. My original idea was that maybe I should compile the performance-critical parts of my Python codebase to D and call them with PyD or something, but now I'm not so sure any more. Help and suggestions appreciated! A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 1) You don't need to convert char[] to string via to. No. Too much. Cast it. 2) You don't need byKey, use the foreach key, value syntax. That way you won't go around modifying things unnecessarily. Ok, I disabled the GC + reserved a bunch of memory. It probably won't help much actually. In fact it may make it fail, so keep that in mind. Hmm, what else. I'm worried about that first foreach. I don't think it needs to exist as it does. I believe an input range would be far better. Use a buffer to store the Hit[]'s. Have a subset per set of them. If the first foreach is an input range, then things become slightly easier in the second. Now you can turn that into its own input range. Also that .array usage concerns me. Many an allocation there! Hence why the input range should be the return from it. The last foreach, let's assume, is a dummy. Keep in mind, stdout is expensive here. DO NOT USE. If you must buffer output then do it in large quantities.
Based upon what I can see, you are definitely not able to use your CPUs to the max. There is no way that is the limiting factor here. Maybe your usage of a core is. But not the CPUs themselves. The thing is, you cannot use multiple threads on that first foreach loop to speed things up. No. That needs to happen all on one thread. Instead, after that thread you need to push the result into another. Perhaps, per thread, one lock (mutex) + buffer for hits. Go round robin over all the threads. If a mutex is still locked, you'll need to wait. In this situation a locked mutex means all your worker threads are working. So you can't do anything more (anyway). Of course after all this, the HDD may still be getting hit too hard. In which case I would recommend memory mapping it. Which should allow the OS to more efficiently handle reading it into memory. But you'll need to rework .byLine for that. Wow, that was a lot at 4:30am! So don't take it too seriously. I'm sure somebody else will rip it to shreds!
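The memory-mapping idea at the end can be sketched with Phobos's std.mmfile; splitting the mapped bytes on newlines stands in for byLine. The filename here is hypothetical and error handling is omitted:

```d
import std.mmfile : MmFile;
import std.algorithm : splitter;

void main()
{
    // the OS pages the file in on demand; no per-line read() calls
    auto mmf = new MmFile("blast_output.tab"); // hypothetical input file
    auto contents = cast(const(char)[]) mmf[];

    size_t lines = 0;
    // splitter yields lazy, allocation-free slices into the mapped memory
    foreach (line; contents.splitter('\n'))
        ++lines;
}
```

One design note: slices produced this way point into the mapping, so they are only valid while the MmFile object is alive; anything kept for later (e.g. the query strings) would need to be duplicated.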
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 I believe that should be: foreach (query, ref value; hitlists) Since an assignment is happening there, no?
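The ref matters because without it the loop variable is a copy of the stored slice, so reassigning it would not change what the associative array holds. A tiny sketch, with a simplified int[][string] standing in for the thread's hitlists:

```d
void main()
{
    int[][string] hitlists;
    hitlists["q1"] = [1, 2, 3, 4];

    // without ref, `value = value[0 .. 2]` would only rebind a local copy
    // and hitlists["q1"] would still have length 4 afterwards
    foreach (query, ref value; hitlists)
        value = value[0 .. 2];

    assert(hitlists["q1"].length == 2);
}
```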
Re: shared array?
On Monday, September 14, 2015 01:12:02 Ola Fosheim Grostad via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 00:41:28 UTC, Jonathan M Davis > wrote: > > Regardless, idiomatic D involves a lot more stack allocations > > than you often get even in C++, so GC usage tends to be low in > > Really? I use VLAs in my C++ (a C extension) and use very few > mallocs after init. In C++ even exceptions can be put outside the > heap. Just avoid STL after init and you're good. From what I've seen of C++ and understand of typical use cases from other folks, that's not at all typical of C++ usage (though there's enough people using C++ across a wide enough spectrum of environments and situations that there's obviously going to be quite a wide spread of what folks do with it). A lot of C++ folks use classes heavily, frequently allocating them on the heap. Major C++ libraries such as Qt certainly are designed with the idea that you're going to be allocating as the program runs. And C++ historically has been touted fairly heavily by many folks as an OO language, in which case, inheritance (and therefore heap allocation) are used heavily by many programs. And with C++11/14, more mechanisms for safely handling memory have been added, thereby further encouraging the use of certain types of heap allocations in your typical C++ program - e.g. make_shared has become the recommended way to allocate memory in most cases. And while folks who are trying to get the bare-metal performance that some stuff like games require may avoid it, most folks are going to use the STL quite a bit. And if they aren't, they're probably using similar classes from a 3rd party library such as Qt. It's the folks who are in embedded environments or who have much more restrictive performance requirements who are more likely to avoid the STL or do stuff like avoid heap allocations after the program has been initialized.
So, a _lot_ of C++ code uses the heap quite heavily, and I expect that very little of it tries to allocate everything up front. I, for one, have never worked on an application where that even made sense aside from something very small. I know that applications like that definitely exist, but from everything I've seen, I'd expect them to be the exception to the rule rather than the norm. Regardless, idiomatic D promotes ranges, which naturally help reduce heap allocation. It also means using structs heavily and classes sparingly (though there are plenty of cases where inheritance is required and thus classes get used). And while arrays/strings get allocated on the heap, slicing seriously reduces how often they need to be copied in memory, which reduces heap allocations. So, idiomatic D encourages programs to be written in a way that keeps heap allocation to a minimum. The big place that it happens in most D programs is probably strings, but slicing helps considerably with that, and even some of the string stuff can be made to live on the stack rather than the heap (which is what a lot of Walter's recent work in Phobos has been for - making string-based stuff work as lazy ranges), reducing heap allocations for strings even further. C++ on the other hand does not have such idioms as the norm or promoted in any serious way. So, programmers are much more likely to use idioms that involve a lot of heap allocations, and the language and standard library don't really have much to promote idioms that avoid heap allocations (and std::string definitely isn't designed to avoid copying). You can certainly do it - and many do - but since it's not really what's promoted by the language or standard library, it's less likely to happen in your average program. It's much more likely to be done by folks who avoid the STL.
So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. - Jonathan M Davis
Re: reading file byLine
On Monday, 14 September 2015 at 18:36:54 UTC, Meta wrote: As an aside, you should use `sort()` instead of the parentheses-less `sort`. The reason for this is that doing `arr.sort` invokes the old built-in array sorting, which is terribly slow, whereas `import std.algorithm; arr.sort()` uses the much better sorting algorithm defined in Phobos. Thanks for pointing that out.
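Meta's distinction, as a compilable sketch:

```d
import std.algorithm : sort;

void main()
{
    auto arr = [3, 1, 2];
    // arr.sort;   // without parens: the old built-in array sort property (slow)
    arr.sort();    // with parens: std.algorithm.sort, the fast Phobos version
    assert(arr == [1, 2, 3]);
}
```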
Re: Passing Arguments on in Variadic Functions
On Monday 14 September 2015 21:59, jmh530 wrote: > This approach gives the correct result, but dmd won't deduce the > type of the template. So for instance, the second to the last > line of the unit test requires explicitly stating the types. I > may as well use the alternate version that doesn't use the > variadic function (which is simple for this trivial example, but > maybe not more generally). You can use a variadic template instead: import std.algorithm : sum; auto test(R, E ...)(R r, E e) { return sum(r, e); } unittest { int[] x = [10, 5, 15, 20, 30]; assert(test(x) == 80); assert(test(x, 0f) == 80f); }
Re: Passing Arguments on in Variadic Functions
On Monday, 14 September 2015 at 19:59:18 UTC, jmh530 wrote: In R, it is easy to have some optional inputs labeled as ... and then pass all those optional inputs on to another function. I was trying to get something similar to work in a templated D function, but I couldn't quite get the same behavior. What I have below is what I was able to get working. You want to generally avoid the varargs and instead use variadic templates. The syntax is similar but a bit different: R test(R, E, Args...)(Args args) { static if (Args.length == 0) { /* no additional arguments */ } else return sum(args); } or whatever you actually need. But what this does is take any number of arguments of various types and make their length and types available at compile time for static if inspection. The args represents the whole list, and you can loop over it, convert it to an array with `[args]` (if they are all compatible types), or pass it to another function as a group like I did here. This is the way writeln is implemented, btw.
Re: Passing Arguments on in Variadic Functions
Thanks to you both. This works perfectly.
Re: Speeding up text file parser (BLAST tabular format)
On Mon, Sep 14, 2015 at 08:07:45PM +, Kapps via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 18:31:38 UTC, H. S. Teoh wrote: > >I decided to give the code a spin with `gdc -O3 -pg`. Turns out that > >the hotspot is in std.array.split, contrary to expectations. :-) > >Here are the first few lines of the gprof output: > > > >[...] > > Perhaps using the new rangified splitter instead of split would help. I tried it. It was slower, surprisingly. I didn't dig deeper into why. T -- I see that you JS got Bach.
Re: Canvas in Gtk connected to D?
On Monday, 14 September 2015 at 19:56:57 UTC, Justin Whear wrote: Mike, as this is really a GTK3 question and not specific to D (if GTK will let you do it in C, you can do it in D), you might have better success asking the GTK forum (gtkforums.com). Another avenue of research would be to look at CEF (D bindings here: http://code.dlang.org/packages/ derelict-cef) and see if that will integrate with your toolkit. Unfortunately derelict-cef is still alpha and not documented well yet. The developer is working on a book on another project, I hear. I'll ask in the GTK Forums what they recommend as the most recently recommended approach for doing static charts in GTK3.
Re: shared array?
On Monday, 14 September 2015 at 20:54:55 UTC, Jonathan M Davis wrote: On Monday, September 14, 2015 01:12:02 Ola Fosheim Grostad via Digitalmars-d-learn wrote: On Monday, 14 September 2015 at 00:41:28 UTC, Jonathan M Davis wrote: > Regardless, idiomatic D involves a lot more stack > allocations than you often get even in C++, so GC usage > tends to be low in Really? I use VLAs in my C++ (a C extension) and use very few mallocs after init. In C++ even exceptions can be put outside the heap. Just avoid STL after init and you're good. From what I've seen of C++ and understand of typical use cases from other folks, that's not at all typical of C++ usage (though there's enough people using C++ across a wide enough spectrum of environments and situations that there's obviously going to be quite a wide spread of what folks do with it). A lot of C++ folks use classes heavily, frequently allocating them on the heap. Dude, my c++ programs are all static ringbuffers and stack allocations. :) It varies a lot. Some c++ programmers turn off everything runtime related and use it as a better C. When targeting mobile you have to be careful about wasting memory... types of heap allocations in your typical C++ program - e.g. make_shared has become the recommended way to allocate memory in most cases I use unique_ptr with a custom deallocator (custom freelist), so it can be done outside the heap. :) And while folks who are trying to get the bare metal performance that some stuff like games require, most folks are going to use the STL quite a bit. I use std::array. And my own array view type to reference it. array_view is coming to C++17 I think. Kinda like D slices. STL/string/iostream is for me primarily useful for init and testing... such as Qt. It's the folks who are in embedded environments or who have much more restrictive performance requirements who are more likely to avoid the STL or do stuff like avoid heap allocations after the program has been initialized.
Mobile audio/graphics... So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. I don't think there is a C++ community ;-) I think C++ programmers are quite different based on what they do and when they started using it. I only use it where performance/latency matters. C++ is too annoying (time consuming) for full-blown apps IMHO. Classes are easy to stack allocate though, no need to heap allocate most of the time. Lambdas in C++ are often just stack-allocated objects, so not so different from D's "ranges" (iterators) anyhow. I don't see my own programs suffer from C++isms anyway...
Re: shared array?
On Monday, September 14, 2015 14:19:30 Ola Fosheim Grøstad via Digitalmars-d-learn wrote: > On Monday, 14 September 2015 at 13:56:16 UTC, Laeeth Isharc wrote: > The claim is correct: you need to follow every pointer that > through some indirection may lead to a pointer that may point > into the GC heap. Not doing so will lead to unverified memory > unsafety. > > > Given one was written by one (very smart) student for his PhD > > thesis, and that as I understand it that formed the basis of > > Sociomantic's concurrent garbage collector (correct me if I am > > wrong), and that this is being ported to D2, and whether or not > > it is released, success will spur others to follow - it strikes > > As it has been described, it is fork() based and unsuitable for > the typical use case. I'm not sure why it wouldn't be suitable for the typical use case. It's quite performant. It would still not be suitable for many games and environments that can't afford to stop the world for more than a few milliseconds, but it brings the stop-the-world time down considerably, making the GC suitable for more environments than it is now, and I'm not aware of any serious downsides to it on a *nix system. Its Achilles heel is Windows. On *nix, forking is cheap, but on Windows, it definitely isn't. So, a different mechanism would be needed to make the concurrent GC work on Windows, and I don't know if Windows really provides the necessary tools to do that, though I know that some folks were looking into it at least at the time of Leandro's talk. So, we're either going to need to figure out how to get the concurrent GC working on Windows via some mechanism other than fork, or Windows is going to need a different solution to get that kind of improvement out of the GC. - Jonathan M Davis
Convert array to tupled array easily?
I created the following code that some of you have already seen. It's sort of a multiple value AA array with self tracking. The problem is that for some value types, such as delegates, the comparison is identical (basically when the delegate is the same). To solve that problem, I'd like to try and turn the Value into Tuples of the Value and the address of the SingleStore wrapper (which should be unique). e.g., public Tuple!(TValue, void*)[][TKey] Store; then I'll simply compare the value and the stored address of `this` (inside SingleStore) instead of just the value. Of course, this requires somewhat of a rewrite of the code. Trying it produced all kinds of errors; I tried to fix up all the references and correlated variables but it's still a mess, especially with D's error messages. It shouldn't be that much trouble though. Essentially, wherever I access the value, I want to instead use the value from the tuple (a single indirection). Probably not that easy though?

    import std.stdio;
    import std.concurrency;
    extern (C) int getch();
    import std.string;
    import std.concurrency;
    import core.time;
    import core.thread;
    import std.container.array;
    import std.typecons;

    public class SingleStore(TKey, TValue)
    {
        public TValue Value;
        public TKey Key;
        public TValue[][TKey] Store;

        // Duplicate entries will be removed together as there is no way to distinguish them
        public auto Remove()
        {
            import std.algorithm;
            if (Value == null || Key == null) return;
            int count = 0;
            for (int i = 0; i < Store[Key].length; i++)
            {
                auto c = Store[Key][i];
                if (c == Value)
                {
                    count++;
                    Store[Key][i] = null; // Set to null to release any references if necessary
                    swap(Store[Key][i], Store[Key][max(0, Store[Key].length - count)]);
                    i = i - 1;
                }
            }

            if (count == 1 && Store[Key].length == 1)
            {
                Store[Key] = null;
                Store.remove(Key);
            }
            else
                Store[Key] = Store[Key][0 .. max(0, Store[Key].length - count)];

            Value = null;
            Key = null;
        }

        public static auto New(TKey k, TValue v, ref TValue[][TKey] s)
        {
            auto o = new SingleStore!(TKey, TValue)(k, v);
            o.Store = s;
            return o;
        }

        private this(TKey k, TValue v)
        {
            Key = k;
            Value = v;
        }
    }

    // Creates a static Associative Array that stores multiple values per key.
    // The object returned by New can then be used to remove the key/value
    // without having to remember them specifically.
    public mixin template ObjectStore(TKey, TValue)
    {
        // The object store. It is static. Mixin the template into its different
        // types to create different types of stores. All objects of that type
        // are then in the same store.
        public static TValue[][TKey] Store;

        public static auto New(TKey k, TValue v)
        {
            (Store[k]) ~= v;
            auto o = SingleStore!(TKey, TValue).New(k, v, Store);
            return o;
        }

        public string ToString() { return "asdf"; }
    }

    alias dg = int delegate(int);
    //alias dg = string;

    class MyStore
    {
        mixin ObjectStore!(string, dg);
        //mixin ObjectStore!(string, string);
    }

    void main()
    {
        auto k = "x";
        dg d1 = (int x) { return x; };
        dg d2 = (int x) { return x; };
        dg d3 = d1;
        dg d4 = (int x) { return 3*x; };
        /*
        dg d1 = "a1";
        dg d2 = "a2";
        dg d3 = "a3";
        dg d4 = "a4";
        */

        auto s = MyStore.New(k, d1);
        writeln(MyStore.Store[k].length);
        auto s1 = MyStore.New(k, d2);
        writeln(MyStore.Store[k].length);
        auto s2 = MyStore.New(k, d3);
        writeln(MyStore.Store[k].length);
        auto s3 = MyStore.New(k, d4);
        writeln(MyStore.Store[k].length);

        //auto x = MyStore.Store[k][0](3);
        //writeln("-" ~ x);

        s1.Remove();
        writeln(MyStore.Store[k].length);
        s2.Remove();
        writeln(MyStore.Store[k].length);
        s.Remove();
        writeln(MyStore.Store[k].length);
        s3.Remove();

        getch();
    }
Re: shared array?
On Monday, 14 September 2015 at 20:34:03 UTC, Jonathan M Davis wrote: I'm not sure why it wouldn't be suitable for the typical use case. It's quite performant. It would still not be suitable for many games and environments that can't afford to stop the world for more than a few milliseconds, but it brings the stop the world time down considerably, making the GC more suitable for more environments than it would be now, and I'm not aware of any serious downsides to it on a *nix system. For me concurrent gc implies interactive applications or webservices that are memory constrained/diskless. You cannot prevent triggering actions that write all over memory during collection without taking special care, like avoiding RC. A fork can potentially double memory consumption. The GC by itself uses ~2x memory; with fork you have to plan for 3-4x. In the cloud you pay for extra RAM. So configuring the app to a fixed sized memory heap that matches the instance RAM capacity is useful. With fork you just have to play it safe and halve the heap size. So more collections and less utilized RAM per dollar with fork. Only testing will show the effect, but it does not sound promising for my use cases.
Re: shared array?
On Monday, 14 September 2015 at 20:54:55 UTC, Jonathan M Davis wrote: So, you _can_ have low heap allocation in a C++ program, and many people do, but from what I've seen, that really isn't the norm across the C++ community in general. - Jonathan M Davis Fully agreed, C++ in the wild often make lots of copies of data structure, sometimes by mistake (like std::vector passed by value instead of ref). When you copy an aggregate by mistake, every field itself gets copied etc. Copies copies copies everywhere.
Re: Convert array to tupled array easily?
On 09/14/2015 04:23 PM, Prudence wrote: > To solve that problem, I'd like to try and turn the Value into Tuples of > the Value and the address of the SingleStore wrapper(which should be > unique). > > e.g., > public Tuple!(TValue, void*)[][TKey] Store; After changing that, I methodically dealt with compilation errors. A total of 6 changes were sufficient:

    $ diff before.d after.d
    24c24
    < public TValue[][TKey] Store;
    ---
    > public Tuple!(TValue, void*)[][TKey] Store;
    36c36
    < if (c == Value)
    ---
    > if (c[0] == Value)
    39c39
    < Store[Key][i] = null; // Set to null to release any references if necessary
    ---
    > Store[Key][i][1] = null; // Set to null to release any references if necessary
    58c58
    < public static auto New(TKey k, TValue v, ref TValue[][TKey] s)
    ---
    > public static auto New(TKey k, TValue v, ref Tuple!(TValue, void*)[][TKey] s)
    77c77
    < public static TValue[][TKey] Store;
    ---
    > public static Tuple!(TValue, void*)[][TKey] Store;
    81c81
    < (Store[k]) ~= v;
    ---
    > (Store[k]) ~= tuple(v, cast(void*)null);

Ali
Re: Canvas in Gtk connected to D?
On Monday, 14 September 2015 at 21:57:23 UTC, Mike McKee wrote: I'll ask in the GTK Forums what they recommend as the most recently recommended approach for doing static charts in GTK3. BTW, the gtkforums.com site doesn't just let anyone in. I'm still waiting on an admin to approve me. :(
Re: Speeding up text file parser (BLAST tabular format)
On 15/09/15 5:41 AM, NX wrote: On Monday, 14 September 2015 at 16:33:23 UTC, Rikki Cattermole wrote: A lot of this hasn't been covered I believe. http://dpaste.dzfl.pl/f7ab2915c3e1 I believe that should be: foreach (query, ref value; hitlists) since an assignment is happening there...? Probably.
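A minimal sketch of why the `ref` matters when iterating an associative array's values (the data here is made up just to illustrate the point):

```d
import std.stdio;

void main()
{
    int[][string] hitlists = ["q1": [1, 2, 3]];

    // Without ref, `value` is a local copy of the stored slice:
    // reassigning it does not write back into the AA.
    foreach (query, value; hitlists)
        value = value[0 .. 1];
    assert(hitlists["q1"].length == 3);

    // With ref, the assignment updates the slice stored in the AA.
    foreach (query, ref value; hitlists)
        value = value[0 .. 1];
    assert(hitlists["q1"].length == 1);

    writeln("ref writes back into the AA");
}
```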
Re: D + Dub + Sublime +... build/run in terminal?
On Sunday, 13 September 2015 at 10:00:13 UTC, SuperLuigi wrote: Just wondering if anyone here might know how I can accomplish this... basically I'm editing my D code in Sublime using the Dkit plugin to access DCD which so far is more reliable than monodevelop's autocomplete but I do need to reset the server pretty often... but that's neither here nor there... Sublime is great but its little output panel sucks... won't update stdout text until the end of program run so I'd like to open newly built programs in a linux terminal. So I installed a terminal plugin https://github.com/wbond/sublime_terminal . I can open the terminal, great. I can build with dub, great. Now I need to put them together... somehow... When building, sublime looks at a file it has called D.sublime-build, and this file looks like the following:

    {
        "cmd": ["dmd", "-g", "-debug", "$file"],
        "file_regex": "^(.*?)\\(([0-9]+),?([0-9]+)?\\): (.*)",
        "selector": "source.d",
        "variants": [
            {
                "name": "Run",
                "cmd": ["rdmd", "-g", "-debug", "$file"]
            },
            {
                "name": "dub",
                "working_dir": "$project_path",
                "cmd": ["dub"]
            }
        ]
    }

I'm pretty sure I need to edit the last variant to pop open the terminal at the project path and run dub... but whatever I put in messes up and I just get errors... Has anyone done this before and can give me a clue as to what I can do to get this to work? Are you on Windows or Linux? If the latter, which terminal do you use? A reference for lxterminal is here: http://manpages.ubuntu.com/manpages/precise/man1/lxterminal.1.html In which case I think if you replace the call to dub with a call to lxterminal and stick --command dub at the end it might work. Can't try here. Will be a similar thing for Windows.
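If lxterminal's `--command` option behaves as the manpage linked above describes, the dub variant might look something like this (an untested sketch; the exact flag handling is an assumption):

```json
{
    "name": "dub (terminal)",
    "working_dir": "$project_path",
    "cmd": ["lxterminal", "--command", "dub"]
}
```

The terminal window will close as soon as dub exits, so wrapping the command in a shell that waits for a keypress may be needed to read the output.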
Re: Combining Unique type with concurrency module
On Monday, 14 September 2015 at 00:11:07 UTC, Ali Çehreli wrote: On 09/13/2015 09:09 AM, Alex wrote: > I'm new to this forum so, please excuse me in advance for > asking silly questions. Before somebody else says it: There are no silly questions. :) > struct std.typecons.Unique!(S).Unique is not copyable because it is > annotated with @disable I have made the code compile and work (without any thread synchronization at all). See the comments with [Ali] annotations:

    import std.stdio;
    import std.concurrency;
    import std.typecons;

    void spawnedFunc2(Tid ownerTid)
    {
        /* [Ali] Aside: ownerTid is already and automatically
         * available. You don't need to pass it in explicitly. */
        receive(
            /* [Ali] The compilation error comes from Variant, which
             * happens to be the catch-all type for concurrency
             * messages. Unfortunately, there are issues with that
             * type.
             *
             * Although implemented as a pointer, according to
             * Variant, a 'ref' is not a pointer. (I am not sure
             * whether this one is a Variant issue or a language
             * issue.)
             *
             * Changing the message to a pointer to a shared
             * object: */
            (shared(Unique!S) * urShared) {
                /* [Ali] Because the expression ur.i does not work on
                 * a shared object, we will hack it to unshared
                 * first. */
                auto ur = cast(Unique!S*)urShared;
                writeln("Received the number ", ur.i);
            }
        );

        send(ownerTid, true);
    }

    static struct S
    {
        int i;
        this(int i) { this.i = i; }
    }

    Unique!S produce()
    {
        // Construct a unique instance of S on the heap
        Unique!S ut = new S(5);
        // Implicit transfer of ownership
        return ut;
    }

    void main()
    {
        Unique!S u1;
        u1 = produce();

        auto childTid2 = spawn(&spawnedFunc2, thisTid);

        /* [Ali] Cast it to shared so that it passes to the other
         * side. Unfortunately, there is no guarantee that this
         * object is not used by more than one thread. */
        send(childTid2, cast(shared(Unique!S*)) &u1);

        /* [Ali] We must wait to ensure that u1 is not destroyed
         * before all workers have finished their tasks. */
        import core.thread;
        thread_joinAll();

        writeln("Successfully printed number.");
    }

Note that thread synchronization is still the programmer's responsibility. > I'm aware of the fact, that my u1 struct can't be copied, but I don't > intend to do so. Correct. > As in the docu stated, I want to lend the struct to the > other thread (by using ref), being sure, that any other thread can't > access the struct during it is processed by the first one. There is a misconception. Unique guarantees that the object will not be copied. It does not provide any guarantee that only one thread will access the object. It is possible to write a type that acquires a lock during certain operations but Unique isn't that type. > Is such a thing possible? > Thanks in advance. > Alex Ali Thanks for answering! Do you have a hint how to create such a type? The needed operation is "onPassingTo" another thread. So the idea is to create a resource, which is not really shared (a question of definition, I think), as it should be accessible only from one thread at a time. But there is a "main" thread, from which the resource can be lent to "worker" threads, and there are "worker" threads, where only one worker can have the resource at a given time. On my own, the next possibility I would try is something with RefCounting: checking how many references exist and, depending on this number, allowing or disallowing access to the reference again. By the way, synchronizing by hand is ok. Don't know how important that is, but the idea is that synchronization appears very rarely: as the lending process acquires and releases resources automatically and the next thread can acquire the resource after a release, the synchronization should not be expected systematically but only at some strange time points... I can't even give an example of such times now... maybe only at the end of the program, to let all workers end their existence.
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 05:18:00 UTC, Nordlöw wrote: If I have a static array `x` defined as enum N = 3; int[N] x; how do I pass its elements into a variadic function f(T...)(T xs) if (T.length >= 3) ? You could turn it into a Tuple and use the `expand` method to get a TypeTuple (AliasSeq).

    import std.typecons;
    import std.typetuple;
    import std.stdio;

    template genTypeList(T, size_t n)
    {
        static if (n <= 1)
        {
            alias genTypeList = T;
        }
        else
        {
            alias genTypeList = TypeTuple!(T, genTypeList!(T, n - 1));
        }
    }

    auto asTuple(T, size_t n)(ref T[n] arr)
    {
        return Tuple!(genTypeList!(T, n))(arr);
    }

    void test(T...)(T xs)
    {
        writeln("Length: ", T.length, ", Elements: ", xs);
    }

    void main()
    {
        int[5] a = [0, 1, 2, 3, 4];
        test(a);                // Length: 1, Elements: [0, 1, 2, 3, 4]
        test(a.asTuple.expand); // Length: 5, Elements: 01234
    }
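For what it's worth, the hand-rolled genTypeList recursion can be replaced by Phobos' `Repeat` where `std.meta` is available (a sketch; assumes a compiler release newer than the 2.068 discussed in this thread):

```d
import std.meta : Repeat;
import std.typecons : Tuple;
import std.stdio;

// Same asTuple idea; Repeat!(n, T) builds the (T, T, ..., T) list.
auto asTuple(T, size_t n)(ref T[n] arr)
{
    return Tuple!(Repeat!(n, T))(arr);
}

void main()
{
    int[3] a = [1, 2, 3];
    writeln(a.asTuple.expand); // prints the three elements in sequence
}
```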
Re: Combining Unique type with concurrency module
On Monday, 14 September 2015 at 00:11:07 UTC, Ali Çehreli wrote: There is a misconception. Unique guarantees that the object will not be copied. It does not provide any guarantee that only one thread will access the object. It is possible to write a type that acquires a lock during certain operations but Unique isn't that type. By intention Unique means more than just "no copies" - it also means "only one reference at a single point of time" which, naturally, leads to implicit moving (not sharing!) between threads. However, AFAIK there are still ways to break that rule with the existing Unique implementation and, of course, std.concurrency was never patched for special Unique support (it should be).
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: Hi, This is my first post on Dlang forums and I don't have a lot of experience with D (yet). I mainly code bioinformatics-stuff in Python on my day-to-day job, but I've been toying with D for a couple of years now. I had this idea that it'd be fun to write a parser for a text-based tabular data format I tend to read a lot of in my programs, but I was a bit stomped that the D implementation I created was slower than my Python-version. I tried running `dmd -profile` on it but didn't really understand what I can do to make it go faster. I guess there's some unnecessary dynamic array extensions being made but I can't figure out how to do without them, maybe someone can help me out? I tried making the examples as small as possible. Here's the code D code: http://dpaste.com/2HP0ZVA Here's my Python code for comparison: http://dpaste.com/0MPBK67 clip I am going to go off the beaten path here. If you really want speed for a file like this one way of getting that is to read the file in as a single large binary array of ubytes (or in blocks if its too big) and parse the lines yourself. Should be fairly easy with D's array slicing. I looked at the format and it appears that lines are quite simple and use a limited subset of the ASCII chars. If that is in fact true then you should be able to speed up reading using this technique. If you can have UTF8 chars in there, or if the format can be more complex than that shown in your example, then please ignore my suggestion.
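A rough sketch of the approach described above: slurp the file as raw bytes and let `splitter` hand out slices into the buffer, so no per-line or per-field copies are made (the `countDataLines` helper is made up for illustration):

```d
import std.algorithm : splitter;
import std.file : read;
import std.stdio;

// Counts non-empty lines by slicing into `data`; fields could be
// sliced the same way with line.splitter('\t').
size_t countDataLines(char[] data)
{
    size_t n;
    foreach (line; data.splitter('\n'))
        if (line.length != 0)
            ++n;
    return n;
}

void main(string[] args)
{
    // Reinterpreting the bytes as char[] is only safe for plain
    // ASCII input, per the caveat above.
    auto data = cast(char[]) read(args[1]);
    writeln(countDataLines(data), " lines");
}
```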
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 17:51:43 UTC, CraigDillabaugh wrote: On Monday, 14 September 2015 at 12:30:21 UTC, Fredrik Boulund wrote: [...] I am going to go off the beaten path here. If you really want speed for a file like this one way of getting that is to read the file in as a single large binary array of ubytes (or in blocks if its too big) and parse the lines yourself. Should be fairly easy with D's array slicing. my favourite for streaming a file: enum chunkSize = 4096; File(fileName).byChunk(chunkSize).map!"cast(char[])a".joiner()
Re: Speeding up text file parser (BLAST tabular format)
I decided to give the code a spin with `gdc -O3 -pg`. Turns out that the hotspot is in std.array.split, contrary to expectations. :-) Here are the first few lines of the gprof output:

    -snip-
    Each sample counts as 0.01 seconds.
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
    19.77      0.43     0.43 76368364     0.00     0.00  nothrow @safe int std.array.split!(char[]).split(char[]).__foreachbody2(ref ulong, ref dchar)
    16.74      0.79     0.36                             nothrow void gc.gc.Gcx.mark(void*, void*, int)
    10.23      1.01     0.22                             _aApplycd2
     8.14      1.18     0.18                             _d_arrayappendcTX
     4.19      1.27     0.09                             nothrow ulong gc.gc.Gcx.fullcollect()
    -snip-

As you can see, std.array.split takes up almost 20% of the total running time. The surprising (or not-so-surprising?) second runner-up for hot spot is actually the GC's marking algorithm. I'm guessing this is likely because of the extensive use of small GC-allocated arrays (splitting into strings, and the like). The 3rd entry, _aApplycd2, appears to be the druntime implementation of the foreach loop, so I'm not sure what could be done about it. Anyway, just for kicks, I tried various ways of reducing the cost of std.array.split, but didn't get very far. Replacing it with std.regex.split didn't help. Looking at its implementation, while it does allocate a new array each time, it also slices over the input when constructing the substrings, so it didn't seem as inefficient as I first thought. Next I tried disabling the GC with core.memory.GC.disable(). Immediately, I got a 20% performance boost. Of course, running with a disabled GC will soon eat up all memory and crash, which isn't an option in real-world usage, so the next best solution is to manually schedule GC collection cycles, say call GC.collect() every n iterations of the parsing loop, for some value of n.
I tried implementing a crude version of this (see code below), and found that manually calling GC.collect() even as frequently as once every 5000 loop iterations (for a 500,000 line test input file) still gives about 15% performance improvement over completely disabling the GC. Since most of the arrays involved here are pretty small, the frequency could be reduced to once every 50,000 iterations and you'd pretty much get the 20% performance boost for free, and still not run out of memory too quickly. Note that all measurements were done with `gdc -O3`. I did try the original code with `dmd -O`, and found that it was 20% slower than the gdc version, so I didn't look further. Anyway, here's the code with the manual GC collection schedule (I modified main() slightly so that I could easily test the code with different input files, so you have to specify the input filename as an argument to the program when running it):

    ---snip---
    #!/usr/bin/env rdmd
    // Programmed in the D language
    // Fredrik Boulund 2015-09-09
    // Modified by H. S. Teoh 2015-09-14

    import core.memory; // for GC control
    import std.stdio;
    import std.array;
    import std.conv;
    import std.algorithm;

    struct Hit
    {
        string target;
        float pid;
        int matches, mismatches, gaps, qstart, qstop, tstart, tstop;
        double evalue, bitscore;
    }

    enum manualGcCount = 5_000;
    ulong ticksToCollect = manualGcCount;

    void gcTick()
    {
        if (--ticksToCollect == 0)
        {
            GC.collect();
            ticksToCollect = manualGcCount;
        }
    }

    void custom_parse(string filename)
    {
        float min_identity = 90.0;
        int min_matches = 10;
        Hit[][string] hitlists;

        foreach (record; filename
                         .File
                         .byLine
                         .map!split
                         .filter!(l => l[2].to!float >= min_identity &&
                                       l[3].to!int >= min_matches))
        {
            hitlists[record[0].to!string] ~= Hit(record[1].to!string,
                record[2].to!float, record[3].to!int, record[4].to!int,
                record[5].to!int, record[6].to!int, record[7].to!int,
                record[8].to!int, record[9].to!int, record[10].to!double,
                record[11].to!double);
            gcTick();
        }

        foreach (query; hitlists.byKey)
        {
            float max_pid = reduce!((a,b) => max(a, b.pid))(-double.max, hitlists[query]);
            float max_pid_diff = 5.00;
            hitlists[query] = hitlists[query]
                .filter!(h => h.pid >= (max_pid - max_pid_diff))
                .array();
            writeln(query, ": ",
Re: reading file byLine
On Monday, 7 September 2015 at 10:25:09 UTC, deed wrote: Right, it's like

    int x = 3;
    // x + 5;       // Just an expression evaluated to 8,
                    // but what do you want to do with it?
                    // It won't affect your program and the
                    // compiler will give you an error.
    int y = x + 5;  // But you can assign the expression to
                    // a new variable
    x = x + 5;      // or you can assign it back
    writeln(x);     // or you can pass it to a function.

    // For your case:
    int[] arr = [1, 2, 3, 2, 1, 4];
    arr.sort;         // Operating on arr in place -> arr itself is mutated
    arr.writeln;      // [1, 1, 2, 2, 3, 4]
    arr.uniq;         // Not operating on arr, it's like the expression
                      // x + 5 (but no compiler error is given).
    arr.uniq.writeln; // [1, 2, 3, 4] (Expression passed to writeln)
    arr.writeln;      // [1, 1, 2, 2, 3, 4] (Not affected)

    int[] newArr = arr.uniq.array; // Expression put into a new array assigned to newArr
    newArr.writeln;   // [1, 2, 3, 4]
    arr.writeln;      // Still the sorted array. [1, 1, 2, 2, 3, 4]

    arr = arr.uniq.array; // Now arr is assigned the uniq array
    arr.writeln;      // [1, 2, 3, 4]

You need to know whether the function will mutate your array; sort does, while uniq doesn't. If you want to do things requiring mutation, but still want your original data unchanged, you can duplicate the data with .dup before the mutating operations, like this:

    int[] data = [1, 2, 2, 1];
    int[] uniqData = data.dup.sort.uniq.array;
    data.writeln;     // [1, 2, 2, 1] Unchanged, a duplicate was sorted.
    uniqData.writeln; // [1, 2]

As an aside, you should use `sort()` instead of the parentheses-less `sort`. The reason for this is that doing `arr.sort` invokes the old builtin array sorting which is terribly slow, whereas `import std.algorithm; arr.sort()` uses the much better sorting algorithm defined in Phobos.
Re: Canvas in Gtk connected to D?
On Mon, 14 Sep 2015 17:05:16 +, Mike McKee wrote: > Is there a way to do a canvas in GTK3 so that I can use chart.js, Mike, as this is really a GTK3 question and not specific to D (if GTK will let you do it in C, you can do it in D), you might have better success asking the GTK forum (gtkforums.com). Another avenue of research would be to look at CEF (D bindings here: http://code.dlang.org/packages/ derelict-cef) and see if that will integrate with your toolkit.
Passing Arguments on in Variadic Functions
In R, it is easy to have some optional inputs labeled as ... and then pass all those optional inputs on to another function. I was trying to get something similar to work in a templated D function, but I couldn't quite get the same behavior. What I have below is what I was able to get working. This approach gives the correct result, but dmd won't deduce the type of the template. So for instance, the second to the last line of the unit test requires explicitly stating the types. I may as well use the alternate version that doesn't use the variadic function (which is simple for this trivial example, but maybe not more generally). Do I have any options to get it more similar to the way R does things?

    import std.algorithm : sum;
    import core.vararg;

    auto test(R)(R r)
    {
        return sum(r);
    }

    auto test(R, E)(R r, ...)
    {
        if (_arguments.length == 0)
            return test(r);
        else
        {
            auto seed = va_arg!(E)(_argptr);
            return sum(r, seed);
        }
    }

    auto test_alt(R, E)(R r, E seed)
    {
        return sum(r, seed);
    }

    unittest
    {
        int[] x = [10, 5, 15, 20, 30];
        assert(test(x) == 80);
        assert(test!(int[], float)(x, 0f) == 80f);
        assert(test_alt(x, 0f) == 80f);
    }
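One alternative worth trying: template variadics instead of `core.vararg`. IFTI then deduces everything at the call site, which is about as close to R's `...` forwarding as D gets (a sketch along those lines, not the original poster's code):

```d
import std.algorithm : sum;

// `Args...` captures any trailing arguments and forwards them on;
// no explicit instantiation is needed at the call site.
auto test(R, Args...)(R r, Args args)
{
    static if (Args.length == 0)
        return sum(r);
    else
        return sum(r, args);
}

unittest
{
    int[] x = [10, 5, 15, 20, 30];
    assert(test(x) == 80);
    assert(test(x, 0f) == 80f); // float seed deduced automatically
}
```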
Re: Speeding up text file parser (BLAST tabular format)
On Monday, 14 September 2015 at 18:31:38 UTC, H. S. Teoh wrote: I decided to give the code a spin with `gdc -O3 -pg`. Turns out that the hotspot is in std.array.split, contrary to expectations. :-) Here are the first few lines of the gprof output: [...] Perhaps using the new rangified splitter instead of split would help.
Re: Canvas in Gtk connected to D?
Am 14.09.2015 um 19:05 schrieb Mike McKee: > Is there a way to do a canvas in GTK3 so that I can use chart.js, and > connect this to D? See, in something similar, a guy named Julien Wintz > figured out that Qt's QQuickWidget acts much like the webkit Canvas > object, and thus was able to port chart.js to that widget. This allows > one to use Qt + QQuickWidget + D (or any Qt-supported language for that > matter) to draw charts using Javascript, using the chart.js > documentation. What's also fascinating about this is that it's fairly > lightweight -- Julien's solution doesn't use Chromium (or other webkit > implementation) to make it work. (It should be noted, however, that > QQuickWidget uses OpenGL, however.) Likewise, it would be great if I > could do something similar in GTK3. > > See, I like D, and I'm getting somewhere with it with GTK3, but doing > static charts like I see with chart.js is important for my use of this > language. > I'm guessing the port was so easy because the QQuickWidgets have access to the V8 javascript engine (from Chromium) that is included with Qt. As GTK doesn't have a javascript engine integrated, a straightforward port of chart.js won't work.
Re: Combining Unique type with concurrency module
On 09/14/2015 12:07 AM, Alex wrote: > Do you have a hint how to create such a type? The needed operation is > "onPassingTo" another thread. So the idea is to create a resource, which > is not really shared (a question of definition, I think), as it should > be accessible only from one thread at a time. > But there is a "main" thread, from which the resource can be lent to > "worker" threads and there are "worker" threads, where only one worker > can have the resource at a given time. Here is an unpolished solution that enforces that the thread that is using it is really its owner:

    struct MultiThreadedUnique(T)
    {
        Tid currentOwner;
        Unique!T u;

        this(Unique!T u)
        {
            this.u = u.release();
            this.currentOwner = thisTid;
        }

        void enforceRightOwner()
        {
            import std.exception;
            import std.string;
            enforce(currentOwner == thisTid,
                    format("%s is the owner; not %s", currentOwner, thisTid));
        }

        ref Unique!T get()
        {
            enforceRightOwner();
            return u;
        }

        void giveTo(Tid newOwner)
        {
            enforceRightOwner();
            currentOwner = newOwner;
        }
    }

The entire program that I tested it with:

    import std.stdio;
    import std.concurrency;
    import std.typecons;

    void spawnedFunc2(Tid ownerTid)
    {
        receive(
            (shared(MultiThreadedUnique!S) * urShared) {
                auto ur = cast(MultiThreadedUnique!S*)urShared;
                writeln("Received the number ", ur.get().i);
                ur.giveTo(ownerTid);
            }
        );

        send(ownerTid, true);
    }

    static struct S
    {
        int i;
        this(int i) { this.i = i; }
    }

    Unique!S produce()
    {
        // Construct a unique instance of S on the heap
        Unique!S ut = new S(5);
        // Implicit transfer of ownership
        return ut;
    }

    struct MultiThreadedUnique(T)
    {
        Tid currentOwner;
        Unique!T u;

        this(Unique!T u)
        {
            this.u = u.release();
            this.currentOwner = thisTid;
        }

        void enforceRightOwner()
        {
            import std.exception;
            import std.string;
            enforce(currentOwner == thisTid,
                    format("%s is the owner; not %s", currentOwner, thisTid));
        }

        ref Unique!T get()
        {
            enforceRightOwner();
            return u;
        }

        void giveTo(Tid newOwner)
        {
            enforceRightOwner();
            currentOwner = newOwner;
        }
    }

    void main()
    {
        MultiThreadedUnique!S u1 = produce();

        auto childTid2 = spawn(&spawnedFunc2, thisTid);
        u1.giveTo(childTid2);
        send(childTid2, cast(shared(MultiThreadedUnique!S*)) &u1);

        import core.thread;
        thread_joinAll();

        writeln("Successfully printed number.");

        auto u2 = ();
    }

Ali
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 07:05:23 UTC, Meta wrote: You could turn it into a Tuple and use the `expand` method to get a TypeTuple (AliasSeq). import std.typecons; import std.typetuple; import std.stdio; template genTypeList(T, size_t n) { static if (n <= 1) { alias genTypeList = T; } else { alias genTypeList = TypeTuple!(T, genTypeList!(T, n - 1)); } } auto asTuple(T, size_t n)(ref T[n] arr) { return Tuple!(genTypeList!(T, n))(arr); } void test(T...)(T xs) { writeln("Length: ", T.length, ", Elements: ", xs); } void main() { int[5] a = [0, 1, 2, 3, 4]; test(a);//Length: 1, Elements: [0, 1, 2, 3, 4] test(a.asTuple.expand); //Length: 5, Elements: 01234 } Is there a reason why such a common thing isn't already in Phobos? If not what about adding it to std.typecons : asTuple
Re: Combining Unique type with concurrency module
On Monday, 14 September 2015 at 00:11:07 UTC, Ali Çehreli wrote: send(childTid2, cast(shared(Unique!S*))); And yeah this violates the idea of Unique. Sadly, I am not aware of any way to prohibit taking address of an aggregate.
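A tiny illustration of the point: disabling the postblit forbids copies, but taking the address still compiles, since unary `&` is not overloadable or disable-able in D:

```d
struct S
{
    int i;
    @disable this(this); // copying is forbidden...
}

void main()
{
    S s = S(1);
    S* p = &s;           // ...but address-of still works
    assert(p.i == 1);
}
```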
Re: shared array?
On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote: So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get. What D needs is some way for a static analyzer to be certain that a pointer does not point to a specific GC heap. And that means language changes... one way or the other. Without language changes it becomes very difficult to reduce the amount of memory scanned without sacrificing memory safety. And I don't think a concurrent GC is realistic given the complexity and performance penalties. The same people who complain about GC would not accept performance hits on pointer-writes. That would essentially make D and Go too similar IMO.
Re: Passing Elements of A Static Array as Function Parameters
On Monday, 14 September 2015 at 07:05:23 UTC, Meta wrote: You could turn it into a Tuple and use the `expand` method to get a TypeTuple (AliasSeq). import std.typecons; import std.typetuple; import std.stdio; template genTypeList(T, size_t n) { static if (n <= 1) { alias genTypeList = T; } else { alias genTypeList = TypeTuple!(T, genTypeList!(T, n - 1)); } } auto asTuple(T, size_t n)(ref T[n] arr) { return Tuple!(genTypeList!(T, n))(arr); } BTW: What about .tupleof? Isn't that what should be used here?