On 8/8/17 11:28 AM, Guillaume Chatelet wrote:
Let's say I'm processing megabytes of data: I'm lazily iterating over the incoming lines and storing data in an associative array. I don't want to copy unless I have to.

Contrived example follows:

input file
----------
a,b,15
c,d,12
....

Efficient ingestion
-------------------
import std.format : formattedRead;
import std.stdio;

void main() {

   size_t[string][string] indexed_map;

   foreach(char[] line ; stdin.byLine) {
     char[] a;
     char[] b;
     size_t value;
     line.formattedRead!"%s,%s,%d"(a,b,value);

     auto pA = a in indexed_map;
     if(pA is null) {
       pA = &(indexed_map[a.idup] = (size_t[string]).init);
     }

     auto pB = b in (*pA);
     if(pB is null) {
       pB = &((*pA)[b.idup] = size_t.init);
     }

     // Technically unneeded but let's say we have more than 2 dimensions.
     (*pB) = value;
   }

   indexed_map.writeln;
}


I'd describe this code as ugly but fast. Any ideas on how to make it less ugly? Is there something in Phobos to help?

I wouldn't use formattedRead, as I think this is going to allocate temporaries for a and b.
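
A minimal sketch of one way to avoid formattedRead, assuming the fields never contain a comma and the third field is a plain unsigned integer. It uses std.algorithm's splitter, which yields slices of the line buffer rather than copies. The helper name ingest is hypothetical and only there to keep the loop reusable; the lookup-then-idup pattern is the same as in the original code.

```d
import std.algorithm.iteration : splitter;
import std.conv : to;
import std.stdio : stdin, writeln;

// Hypothetical helper (not from the original post): builds the nested map
// from any range of mutable line buffers, such as stdin.byLine.
size_t[string][string] ingest(Lines)(Lines lines)
{
    size_t[string][string] indexedMap;

    foreach (char[] line; lines)
    {
        // splitter is lazy and yields slices of `line`; no field is copied.
        auto fields = line.splitter(',');

        // Skip lines with fewer than three fields (an assumption; the
        // original code does not say how to treat malformed input).
        if (fields.empty) continue;
        char[] a = fields.front;
        fields.popFront();

        if (fields.empty) continue;
        char[] b = fields.front;
        fields.popFront();

        if (fields.empty) continue;
        // Throws ConvException if the field is not a valid number.
        immutable value = fields.front.to!size_t;

        // Look up with the borrowed slice; idup only when the key is new.
        auto pA = a in indexedMap;
        if (pA is null)
            pA = &(indexedMap[a.idup] = (size_t[string]).init);

        auto pB = b in *pA;
        if (pB is null)
            pB = &((*pA)[b.idup] = size_t.init);

        *pB = value;
    }

    return indexedMap;
}

void main()
{
    stdin.byLine.ingest.writeln;
}
```

The slices a and b are only valid until byLine reuses its buffer on the next iteration, which is why the keys are still copied with idup the first time they are stored.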

Note, this is very close to Jon Degenhardt's blog post in May: https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/

-Steve
