Re: Looking for a Code Review of a Bioinformatics POC

2020-06-12 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 12, 2020 at 12:11:44PM +0000, duck_tape via Digitalmars-d-learn wrote: > On Friday, 12 June 2020 at 12:02:19 UTC, duck_tape wrote: > > For speedups with getting my hands dirty: > > - Does writef and company flush on every line? I still haven't found > > the source of this. writef, et

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-12 Thread duck_tape via Digitalmars-d-learn
On Friday, 12 June 2020 at 12:02:19 UTC, duck_tape wrote: For speedups with getting my hands dirty: - Does writef and company flush on every line? I still haven't found the source of this. - It looks like I could use {f}printf if I really wanted to: https://forum.dlang.org/post/hzcjbanvkxgohkbv

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-12 Thread duck_tape via Digitalmars-d-learn
On Friday, 12 June 2020 at 07:25:09 UTC, Jon Degenhardt wrote: tsv-utils has the advantage of only needing to support utf-8 files with Unix newlines, so the code is simpler. (Windows newlines are detected, this occurs separately from bufferedByLine.) But as you describe, support for a wider va

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-12 Thread Jon Degenhardt via Digitalmars-d-learn
On Friday, 12 June 2020 at 06:20:59 UTC, H. S. Teoh wrote: I glanced over the implementation of byLine. It appears to be the unhappy compromise of trying to be 100% correct, cover all possible UTF encodings, and all possible types of input streams (on-disk file vs. interactive console). It do
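The trade-off described above (fewer supported cases in exchange for simpler, faster line reading) can be sketched roughly as follows. This is not the tsv-utils `bufferedByLine` implementation; `eachLine` is a hypothetical helper that assumes UTF-8 input with Unix newlines, reads large blocks via `File.byChunk`, and splits on `'\n'` itself.

```d
import std.stdio : stdin, writeln;

// Hypothetical helper (not from tsv-utils): calls `onLine` once per line,
// with the '\n' terminator stripped. Assumes UTF-8 text and Unix newlines.
void eachLine(R)(R chunks, scope void delegate(const(char)[]) onLine)
{
    char[] carry; // partial line left over from the previous chunk
    foreach (chunk; chunks)
    {
        auto data = cast(const(char)[]) chunk;
        size_t start = 0;
        foreach (i, c; data)
        {
            if (c != '\n')
                continue;
            if (carry.length)
            {
                // The line began in an earlier chunk: stitch the pieces together.
                carry ~= data[start .. i];
                onLine(carry);
                carry.length = 0;
            }
            else
            {
                // The whole line lies inside this chunk: pass a slice, no copy.
                onLine(data[start .. i]);
            }
            start = i + 1;
        }
        // Keep whatever follows the last newline for the next chunk.
        carry ~= data[start .. $];
    }
    if (carry.length)
        onLine(carry); // final line without a trailing newline
}

void main()
{
    size_t n;
    // 64 KiB is an arbitrary block size chosen for this sketch.
    eachLine(stdin.byChunk(64 * 1024), (const(char)[] line) { ++n; });
    writeln(n, " lines");
}
```

The slice handed to `onLine` is only valid during the call, since `byChunk` reuses its buffer; a caller that wants to keep a line has to copy it.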

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 12, 2020 at 03:32:48AM +0000, Jon Degenhardt via Digitalmars-d-learn wrote: [...] > I haven't spent much time on results presentation, I know it's not > that easy to read and interpret the results. Brief summary - On files > with short lines buffering will result in dramatic throughput

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread Jon Degenhardt via Digitalmars-d-learn
On Friday, 12 June 2020 at 00:58:34 UTC, duck_tape wrote: On Thursday, 11 June 2020 at 23:45:31 UTC, H. S. Teoh wrote: Hmm, looks like it's not so much input that's slow, but *output*. In fact, it looks pretty bad, taking almost as much time as overlap() does in total! [snip...] I'll play

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 23:45:31 UTC, H. S. Teoh wrote: Hmm, looks like it's not so much input that's slow, but *output*. In fact, it looks pretty bad, taking almost as much time as overlap() does in total! This makes me think that writing your own output buffer could be worthwhile. H
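A minimal sketch of what "writing your own output buffer" might look like, assuming records are formatted into memory and written out in large blocks. `BufferedOutput`, `emit`, and the 64 KiB threshold are made up for illustration and are not taken from the code under review.

```d
import std.array : Appender;
import std.format : formattedWrite;
import std.stdio : File, stdout;

// Hypothetical helper: collects formatted records in memory and hands them
// to the underlying File in large blocks instead of one call per line.
struct BufferedOutput
{
    File sink;
    size_t limit = 64 * 1024; // arbitrary flush threshold for this sketch
    Appender!(char[]) buf;

    // Format one record into the in-memory buffer; flush once it is large.
    void emit(Args...)(const(char)[] fmt, Args args)
    {
        buf.formattedWrite(fmt, args);
        if (buf.data.length >= limit)
            flush();
    }

    // Write the buffered bytes in a single call and reuse the memory.
    void flush()
    {
        sink.rawWrite(buf.data);
        buf.clear();
    }
}

void main()
{
    auto o = BufferedOutput(stdout);
    foreach (i; 0 .. 3)
        o.emit("chr1\t%d\t%d\n", i * 10, i * 10 + 5);
    o.flush(); // whatever is still buffered must be flushed before exit
}
```

Whether this beats plain `writef` would need to be measured against the actual workload; the sketch only shows the shape of the approach.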

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, Jun 11, 2020 at 11:02:21PM +0000, duck_tape via Digitalmars-d-learn wrote: [...] > I will give that a shot! Also of interest, the profiler results on a > full runthrough do show file writing and int parsing as the 2nd and > 3rd most time consuming activities: > > ``` > Num Tree

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 22:57:55 UTC, H. S. Teoh wrote: But one simple thing to try is to add 'scope' to the callback parameter, which could potentially save you a GC allocation. I'm not 100% certain this will make a difference, but since it's such an easy change it's worth a shot. I wi
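To illustrate the `scope` suggestion: the sketch below uses made-up names (`forEachHit`, `sumHits`) rather than the reviewed code. Marking both functions `@nogc` makes the effect visible at compile time, because a capturing lambda passed to a non-`scope` delegate parameter would need a heap-allocated closure, which `@nogc` code is not allowed to create.

```d
import std.stdio : writeln;

// Hypothetical visitor: `scope` promises the delegate is not stored
// anywhere that outlives this call.
void forEachHit(const(int)[] hits,
                scope void delegate(int) @nogc nothrow cb) @nogc nothrow
{
    foreach (h; hits)
        cb(h);
}

// Compiles as @nogc only because the callback parameter above is `scope`:
// the lambda captures `total`, but no GC closure has to be allocated.
int sumHits(const(int)[] hits) @nogc nothrow
{
    int total;
    forEachHit(hits, (int h) { total += h; });
    return total;
}

void main()
{
    int[3] data = [1, 2, 3];
    writeln(sumHits(data[]));
}
```

As the post says, this may or may not make a measurable difference in the real program; it only removes one possible allocation per call site.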

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 22:53:52 UTC, tastyminerals wrote: Mir is fine-tuned for LLVM, pointer magic and SIMD optimizations. I'll have to give that a shot for the biofast version of this. There are other ways of doing this same thing that could very well benefit from Mir.

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, Jun 11, 2020 at 10:41:12PM +0000, duck_tape via Digitalmars-d-learn wrote: > On Thursday, 11 June 2020 at 22:19:27 UTC, H. S. Teoh wrote: > > To encourage inlining, you could make it an alias parameter instead > > of a delegate, something like this: > > > > void overlap(alias cb)(STyp

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread tastyminerals via Digitalmars-d-learn
On Thursday, 11 June 2020 at 21:54:31 UTC, duck_tape wrote: On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote: Mir Slices instead of standard D arrays are faster. Although looking at your code I don't see where you can plug them in. Just keep in mind. Thanks for taking a look! Wha

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 22:19:27 UTC, H. S. Teoh wrote: To encourage inlining, you could make it an alias parameter instead of a delegate, something like this: void overlap(alias cb)(SType start, SType stop) { ... } ... bed[chr].overlap!callback(st0, en0); I don
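A self-contained sketch of the alias-parameter idea, with stand-in types: `Interval`, `Intervals`, and `countOverlaps` are hypothetical, and the linear scan merely replaces whatever query structure the real code uses. Here `overlap` is written as a module-level function called through UFCS, which is an assumption of this sketch and lets the alias bind to a local function without any further arrangement.

```d
import std.stdio : writeln;

// Hypothetical stand-ins for the types discussed in the thread.
struct Interval { int start, stop; } // half-open: [start, stop)
struct Intervals { Interval[] items; }

// Delegate version: the callback is an opaque runtime value.
void overlapDg(ref Intervals self, int start, int stop,
               scope void delegate(Interval) cb)
{
    foreach (iv; self.items)
        if (iv.start < stop && start < iv.stop)
            cb(iv);
}

// Alias version: the callback is a compile-time parameter, so the compiler
// sees the concrete function and has a chance to inline the call.
void overlap(alias cb)(ref Intervals self, int start, int stop)
{
    foreach (iv; self.items)
        if (iv.start < stop && start < iv.stop)
            cb(iv);
}

int countOverlaps(ref Intervals bed, int start, int stop)
{
    int hits;
    void callback(Interval iv) { ++hits; }
    bed.overlap!callback(start, stop); // same call shape as in the post
    return hits;
}

void main()
{
    auto bed = Intervals([Interval(1, 5), Interval(4, 9), Interval(10, 12)]);
    writeln(countOverlaps(bed, 3, 6));
}
```

Inlining is still up to the optimizer; the alias form only gives it the information it needs.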

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, Jun 11, 2020 at 04:13:34PM +0000, duck_tape via Digitalmars-d-learn wrote: [...] > Currently my D version is a few seconds slower than the Crystal > version, putting it solidly in third place overall. I'm not really > sure where it's falling behind Crystal since `-release` removes boun

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote: Mir Slices instead of standard D arrays are faster. Although looking at your code I don't see where you can plug them in. Just keep in mind. I just started following links, sweet blog! Your reason for getting into D is exactly the

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote: Mir Slices instead of standard D arrays are faster. Although looking at your code I don't see where you can plug them in. Just keep in mind. Thanks for taking a look! What is it about Mir Slices that makes them faster? I hadn't se

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread tastyminerals via Digitalmars-d-learn
On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote: Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast If anyone is willing

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread tastyminerals via Digitalmars-d-learn
On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote: Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast If anyone is willing

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
On Thursday, 11 June 2020 at 17:25:13 UTC, CraigDillabaugh wrote: Are you building with DMD or with LDC/GDC? I'm building with LDC. I haven't pulled up a linux box to test drive gdc yet. `ldc2 -O -release`

Re: Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread CraigDillabaugh via Digitalmars-d-learn
On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote: Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast I also think there is

Looking for a Code Review of a Bioinformatics POC

2020-06-11 Thread duck_tape via Digitalmars-d-learn
Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast If anyone is willing to take a look and give some feedback I'd be very appreciati