> I know it may be a long shot, but did you consider using columnar data store
> like Apache Arrow?
Arrow might be an option, but is there a Haskell interface? (Googling gives the obvious hits regarding arrows, and Google doesn't seem to care about me adding +apache to the search; it gives me results where "+apache" is struck through.)

> Without knowing more about your application it is a bit difficult to produce
> more hints.
> What is your application?

The short story is that I extract a number of 64-bit values from my data, and want to maintain frequency counts for each unique value. So there'll be on the order of 10^9 (plus/minus an order of magnitude) unique values, with counts ranging from one to a few million (large counts being rare).

The long explanation is that I'm doing k-mer counts for molecular sequences: breaking DNA sequence data into overlapping words of fixed size (the parameter k), and counting their occurrences. I encode them as Word64, using two bits per nucleotide (the alphabet is A, C, G, and T), so k can be at most 32.

This is of course a fairly staple thing to do, and there is no lack of alternative programs that do it, but I'd like mine to work anyway, and it annoys me to have run into this particular bug, whether the cause is my own code, the Judy FFI, the GHC runtime or libraries, the libjudy code, GHC compilation issues, or a hardware error.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://biohaskell.org/cgi-bin/mailman/listinfo/biohaskell
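
For reference, the two-bit encoding and counting described in the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in question: the names `encodeBase`, `packWord`, `kmers`, and `countKmers` are hypothetical, the mapping A=0, C=1, G=2, T=3 is one arbitrary choice, and `Data.Map.Strict` stands in for the Judy array purely to keep the example self-contained (it would not be expected to cope with 10^9 keys).

```haskell
import Data.Bits (shiftL, (.|.))
import Data.List (foldl')
import qualified Data.Map.Strict as M
import Data.Word (Word64)

-- | Two-bit code for one nucleotide (assumed mapping: A=0, C=1, G=2, T=3).
-- Anything outside the alphabet yields Nothing.
encodeBase :: Char -> Maybe Word64
encodeBase 'A' = Just 0
encodeBase 'C' = Just 1
encodeBase 'G' = Just 2
encodeBase 'T' = Just 3
encodeBase _   = Nothing

-- | Pack a word of at most 32 nucleotides into a Word64, first base in the
-- most significant position. Fails if any character is not A, C, G, or T.
packWord :: String -> Maybe Word64
packWord = fmap (foldl' (\acc b -> (acc `shiftL` 2) .|. b) 0) . traverse encodeBase

-- | All overlapping k-mers of a sequence, encoded as Word64.
-- Windows containing a character outside the alphabet are skipped.
-- Re-packing every window costs O(n*k); a rolling update would be faster.
kmers :: Int -> String -> [Word64]
kmers k s
  | k < 1 || k > 32 = []
  | otherwise =
      [ w
      | i <- [0 .. length s - k]
      , Just w <- [packWord (take k (drop i s))]
      ]

-- | Frequency count for each distinct k-mer.
countKmers :: Int -> String -> M.Map Word64 Int
countKmers k = foldl' (\m w -> M.insertWith (+) w 1 m) M.empty . kmers k
```

With this mapping, `countKmers 2 "ACGAC"` sees the words AC, CG, GA, AC and so counts AC twice and the other two once each.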