Interesting. Do you have an error report filed anywhere to peruse?
M
On Thu, 4 May 2017 at 5:05 PM, Ketil Malde <ke...@malde.org> wrote:

>
> > I know it may be a long shot, but did you consider using a columnar data
> > store like Apache Arrow?
>
> Arrow might be an option, but is there a Haskell interface?  (Googling
> gives the obvious hits regarding arrows, and Google doesn't seem to care
> about me adding +apache to the search, it gives me results where
> "+apache" is overstruck.)
>
> > Without knowing more about your application it is a bit difficult to
> > produce more hints.
> > What is your application?
>
> The short story is that I extract a number of 64-bit values from my
> data, and want to maintain frequency counts for each unique value.  So
> there'll be on the order of 10^9 (plus/minus an order of magnitude)
> unique values, with counts ranging from one to a few million (and large
> values being rare).
>
> The long explanation is that I'm doing k-mer counts for molecular sequences,
> breaking DNA sequence data into overlapping words of fixed size (the
> parameter k), and counting their occurrences.  I encode them as Word64,
> using two bits per nucleotide (the alphabet is A, C, G, and T).  This is
> of course a fairly staple thing to do, and there is no lack of
> alternative programs that do it - but I'd like mine to work anyway, and
> it annoys me to have run into this particular bug, whether the cause is
> my own code, the Judy FFI, the GHC runtime or libraries, the libjudy
> code, GHC compilation issues, or a hardware error.
>
> -k
> --
> If I haven't seen further, it is by standing in the footprints of giants
>
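
For reference, the encoding described in the quoted message (overlapping
words of size k, two bits per nucleotide, packed into a Word64) could be
sketched roughly as below. This is only an illustration, not the code under
discussion: the names encodeBase, kmers and countKmers are hypothetical, the
bit assignment A=0, C=1, G=2, T=3 is an assumption, and Data.Map.Strict is
swapped in for the Judy array purely to keep the sketch self-contained. It
would not be expected to cope with on the order of 10^9 distinct keys.

```haskell
import Data.Bits (shiftL, (.|.), (.&.))
import Data.List (foldl')
import qualified Data.Map.Strict as M
import Data.Word (Word64)

-- Two-bit code for one nucleotide (assumed mapping); Nothing for
-- anything outside the alphabet A, C, G, T.
encodeBase :: Char -> Maybe Word64
encodeBase 'A' = Just 0
encodeBase 'C' = Just 1
encodeBase 'G' = Just 2
encodeBase 'T' = Just 3
encodeBase _   = Nothing

-- All overlapping k-mers of a sequence, packed two bits per base.
-- Requires 1 <= k <= 32 so that a k-mer fits in a Word64.
-- A character outside the alphabet resets the window, so no k-mer spans it.
kmers :: Int -> String -> [Word64]
kmers k = go 0 0
  where
    -- Keeps only the low 2*k bits of the rolling window.
    mask | k >= 32   = maxBound
         | otherwise = (1 `shiftL` (2 * k)) - 1
    go _ _ [] = []
    go n w (c:cs) = case encodeBase c of
      Nothing -> go 0 0 cs
      Just b  ->
        let w' = ((w `shiftL` 2) .|. b) .&. mask
            n' = min k (n + 1)
        in if n' == k then w' : go n' w' cs else go n' w' cs

-- Frequency count per distinct k-mer (Data.Map standing in for Judy).
countKmers :: Int -> String -> M.Map Word64 Int
countKmers k = foldl' (\m w -> M.insertWith (+) w 1 m) M.empty . kmers k
```

With this mapping, kmers 2 "ACGT" yields the three words AC, CG and GT.
The sketch ignores reverse complements; whether the two strands should be
counted together depends on the application.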
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://biohaskell.org/cgi-bin/mailman/listinfo/biohaskell
