On 8/12/20 6:44 PM, methonash wrote:
Hi,

Relative beginner to D-lang here, and I'm very confused by the apparent performance disparity I've noticed between programs that do the following:

1) cat some-large-file | D-program-reading-stdin-byLine()

2) D-program-directly-reading-file-byLine() using File() struct

The difference I've noticed in D between options (1) and (2) is roughly 80% more wall time for the piped version (7.5s vs 4.1s), which seems pretty extreme.

For comparison, I attempted the same using Perl with the same large file, and I only noticed a 25% difference (10s vs 8s) in performance, which I imagine to be partially attributable to the overhead incurred by using a pipe and its buffer.

So, is this difference in D-lang performance typical? Is this expected behavior?

Was wondering if this may have anything to do with the library definition for std.stdio.stdin (https://dlang.org/library/std/stdio/stdin.html)? Does global file-locking significantly affect read-performance?

For reference: I'm trying to build a single-threaded application; my present use-case cannot benefit from parallelism, because its ultimate purpose is to serve as a single-threaded downstream filter from an upstream application consuming (n-1) system threads.

Are we missing the obvious here? cat needs to read from disk, write the results into a pipe buffer, then context-switch into your D program, then the D program reads from the pipe buffer.

Whereas, reading from a file just needs to read from the file.

The difference does seem a bit extreme, so maybe there is another more complex explanation.

But for sure, reading from stdin doesn't do anything different than reading from a file if you are using the File struct.

A more appropriate test might be using the shell to feed the file into the D program:

dprogram < FILE

Which means the same code runs for both tests.
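
To make that concrete, here is a minimal sketch of a test program, assuming the work per line doesn't matter and we only want to measure reading. The name countLines and the optional path argument are my own inventions for illustration, not anything from the original program. The point is that stdin is just a File, so the same byLine() loop runs either way:

```d
import std.stdio;

// Hypothetical helper: runs the same byLine() loop no matter
// where the File came from (named file, redirected stdin, or pipe).
size_t countLines(File input)
{
    size_t n;
    foreach (line; input.byLine())
        ++n;
    return n;
}

void main(string[] args)
{
    // With a path argument, open the file directly; otherwise fall
    // back to stdin, which is itself a File.
    File input = args.length > 1 ? File(args[1], "r") : stdin;
    writeln(countLines(input));
}
```

Built as dprogram, that gives three cases to time against each other: "dprogram FILE", "dprogram < FILE", and "cat FILE | dprogram". In the first two the program reads the file itself; only the third goes through cat and a pipe, so any gap between the second and third is the cost of the pipe rather than of stdin.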

-Steve
