I have changed the code to parallelize over files rather than lines. The code is available here <https://gist.github.com/innerlee/9b176a52b3330a1340ec94da0cdc721b> if anyone is interested. However, the speed is still not satisfactory (total processing speed is approx. 10M/s; ideally it should be 100M/s, the network speed). Neither CPU nor IO is saturated, and I cannot find the bottleneck...
@Jeremy, thanks for the reply. The bottleneck is IO. You need days just to stream all files at full speed, so waiting for a whole file to load before processing it wastes a lot of time. Ideally, the processing would already be finished by the time the data has been streamed through once, with no extra time needed. @Páll, do you mean that pmap will first do a ``collect`` operation and only then start processing? So even if you give pmap an iterator, it will not benefit from it? That would be sad.
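To make the goal concrete, here is a minimal sketch of the single-pass pattern I have in mind: one task streams lines into a bounded `Channel` while worker tasks consume them as they arrive. This is not the code from the gist. `process_line` and `stream_process` are hypothetical names, the per-line work is a placeholder, and I am assuming Julia 1.3 or later started with several threads.

```julia
# Hypothetical placeholder for the real per-line work.
process_line(line::AbstractString) = length(line)

# Stream `path` once; workers process lines while the reader is still reading.
function stream_process(path::AbstractString; nworkers::Int = 4, buffer::Int = 1024)
    # Bounded channel: the reader blocks when the buffer is full,
    # so the whole file is never held in memory.
    # The channel is closed automatically when the reader task finishes.
    lines = Channel{String}(buffer) do ch
        for line in eachline(path)
            put!(ch, line)
        end
    end

    total = Threads.Atomic{Int}(0)

    # Each worker takes lines from the shared channel until it is closed and drained.
    @sync for _ in 1:nworkers
        Threads.@spawn for line in lines
            Threads.atomic_add!(total, process_line(line))
        end
    end

    return total[]
end
```

Calling `stream_process("some_file.txt"; nworkers = Threads.nthreads())` would then overlap reading and processing, instead of collecting everything first. Whether this actually reaches the network speed depends on how heavy the per-line work is compared with the channel overhead.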
