On slightly closer inspection of that spec, it seems that backslash quoting only happens in the `##` comment sections ignored above. So, maybe they aren't buggy after all if that data is not important to the calculation in question.
A further point along the lines of "if we're going to provide `split`, maybe we should provide faster variants": the cases where it matters most are huge inputs whose columns are often regular (numbers, say) and whose format specifically forbids more complex lexical structure. There, A) a naive split may well be correct, B) performance may matter a lot in human terms, and C) the programmer may be unsophisticated with regard to even terminology like "lexing" or "vectorized memchr". This all seems to be the case with this VCF thing, but I feel like it's come up quite a few times over the years. It may not **always** be "bugs running faster". :-)

Anyway, I don't think "to force people to learn new terminology/techniques" is a very welcoming answer. So, I tried to provide something more welcoming. Even if their parsing is sloppy & error prone, I think it is better optics for Nim if naive programmers run into the consequent errors on their own data sets than if they complain about Nim library performance.

All that said, I think we agree 100% that we probably need more information from @markebbert to help him any more with his actual problem. Maybe it is IO. Maybe he didn't even compile with `-d:danger`. If he's on Linux, I would suggest he decompress first and try my `mmap` versions. 90 GB/(100 MB/s) =~ 900 seconds =~ 15 minutes. Heck, some people even have 90 GB of RAM. :-)
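For anyone who wants to redo that back-of-envelope IO estimate with their own numbers, here is a tiny sketch. It is Python only because it is quick arithmetic; `scan_seconds` is a made-up helper, decimal units (1 GB = 1000 MB) are assumed, and 100 MB/s is just the rough sequential read rate guessed above, not a measurement of anyone's disk.

```python
def scan_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Seconds to read size_gb once at rate_mb_per_s (assumes 1 GB = 1000 MB)."""
    return size_gb * 1000.0 / rate_mb_per_s


# The numbers from the estimate above: a 90 GB file at roughly 100 MB/s.
secs = scan_seconds(90, 100)
print(f"{secs:.0f} s = {secs / 60:.0f} min")  # 900 s = 15 min
```

If the measured run time is far above what this gives for the actual disk, IO alone probably does not explain it.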
