It seems that when I read gzipped data, there is "something more" in the 
stream than when the input is first decompressed outside of Nim. I noticed 
this because my output contained data corresponding to an extra empty `Fastq` 
when dealing with gzipped input. I tried to confirm this by inserting assertions:
    
    
    proc fastqParser(stream: Stream): iterator(): Fastq =
      result = iterator(): Fastq =
        var
          nameLine: string
          nucLine: string
          quaLine: string
        while not stream.atEnd():
        # while not input.endOfFile:
          nameLine = stream.readLine()
          #TODO: Why is there an extra empty Fastq when reading gzipped input?
          doAssert(not stream.atEnd(),
                   "stream ended after nameLine: " & nameLine)
          nucLine = stream.readLine()
          doAssert(not stream.atEnd(), "stream ended after nucLine: " & nucLine)
          discard stream.readLine()  # separator line ("+")
          doAssert(not stream.atEnd(), "stream ended after separator line")
          quaLine = stream.readLine()
          yield makeFastq(nameLine, nucLine, quaLine)
    

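This results in `Error: unhandled exception: not atEnd(stream) stream ended 
after nameLine: [AssertionError]`, both when reading from stdin and when 
reading from a file given on the command line. The `nameLine` in the message 
is empty, so the failing iteration read an empty line and only then saw the 
end of the stream.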

The Python version doesn't generate an extra empty `Fastq`, but the gzipped 
data may still be responsible for the problem, with Python's gzip reading 
simply being more robust.
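
One possible workaround is sketched below. It rests on an unverified assumption: that the gzip stream's `atEnd()` only reports the end after a read has been attempted (zlib's `gzeof` behaves this way), so the loop is entered one extra time. The sketch drives the loop with the `bool`-returning `readLine(stream, line)` overload from `std/streams` instead of `atEnd()`, and skips blank name lines. The `Fastq` type and `makeFastq` here are hypothetical stand-ins for the real ones, and a `StringStream` replaces the gzipped input so the example is self-contained.

```nim
import std/streams

type
  Fastq = object  # hypothetical stand-in for the real record type
    name, nucs, quals: string

proc makeFastq(name, nucs, quals: string): Fastq =
  # hypothetical stand-in for the makeFastq used in the parser above
  Fastq(name: name, nucs: nucs, quals: quals)

proc fastqParser(stream: Stream): iterator(): Fastq =
  result = iterator(): Fastq =
    var nameLine, nucLine, sepLine, quaLine: string
    # readLine(stream, line) returns false once nothing more could be read,
    # so the loop does not rely on atEnd() being accurate before a read.
    while stream.readLine(nameLine):
      if nameLine.len == 0:
        continue  # skip blank lines, e.g. a trailing empty line
      if not (stream.readLine(nucLine) and
              stream.readLine(sepLine) and
              stream.readLine(quaLine)):
        raise newException(IOError, "truncated record after: " & nameLine)
      yield makeFastq(nameLine, nucLine, quaLine)

# Usage: one complete record followed by a stray empty line.
let records = fastqParser(newStringStream("@r1\nACGT\n+\nIIII\n\n"))
for rec in records():
  echo rec.name, " ", rec.nucs, " ", rec.quals
```

If the extra record disappears with this loop, that would point at the end-of-stream detection rather than at the gzipped data itself.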
