On Thu, 7 Aug 2025 at 14:15, Nazir Bilal Yavuz <byavu...@gmail.com> wrote:
> I have a couple of ideas that I was working on:
> ---
>
> + * However, SIMD optimization cannot be applied in the following cases:
> + * - Inside quoted fields, where escape sequences and closing quotes
> + *   require sequential processing to handle correctly.
>
> I think you can continue SIMD inside quoted fields. Only important
> thing is you need to set last_was_esc to false when SIMD skipped the
> chunk.

There is a trick with doing carryless multiplication with -1 that can
be used to SIMD process transitions between quoted/not-quoted. [1]
This is able to convert a bitmask of unescaped quote character
positions to a quote mask in a single operation. I last looked at it 5
years ago, but I remember coming to the conclusion that it would work
for implementing PostgreSQL's interpretation of CSV.

[1] https://github.com/geofflangdale/simdcsv/blob/master/src/main.cpp#L76

--
Ants
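
For illustration, here is a minimal portable sketch of the quote mask
computation described above. It is not taken from the patch or from
simdcsv. The low 64 bits of a carryless multiplication by -1 (all
ones) are a prefix XOR of the input bits, so the sketch uses a
shift-and-XOR ladder instead of a PCLMULQDQ instruction. The names
prefix_xor, quote_mask and prev_in_quote are hypothetical, and the
code assumes the caller has already built a 64-bit bitmask of
unescaped quote positions for a 64-byte chunk (bit i = byte i).

```c
#include <stdint.h>

/*
 * Prefix XOR: bit i of the result is the XOR of bits 0..i of x.
 * This equals the low 64 bits of a carryless multiply of x by
 * all-ones, which is what a single PCLMULQDQ would produce.
 */
static uint64_t
prefix_xor(uint64_t x)
{
	x ^= x << 1;
	x ^= x << 2;
	x ^= x << 4;
	x ^= x << 8;
	x ^= x << 16;
	x ^= x << 32;
	return x;
}

/*
 * Hypothetical helper: turn a bitmask of unescaped quote positions
 * into a mask of bytes that are inside a quoted field.  A set bit
 * runs from the opening quote (inclusive) to the closing quote
 * (exclusive).  *prev_in_quote carries the state across chunks: it
 * is 0 if the previous chunk ended outside quotes, all-ones if it
 * ended inside.
 */
static uint64_t
quote_mask(uint64_t quote_bits, uint64_t *prev_in_quote)
{
	uint64_t	mask = prefix_xor(quote_bits) ^ *prev_in_quote;

	/* Broadcast the top bit so the next chunk starts in the right state. */
	*prev_in_quote = (uint64_t) 0 - (mask >> 63);
	return mask;
}
```

For the input a,"b,c",d the quotes sit at bytes 2 and 6, so quote_bits
is 0x44 and the resulting mask is 0x3C (bytes 2 through 5). Delimiter
and newline bitmasks can then be ANDed with the inverted mask to drop
matches inside quoted fields. Building the unescaped-quote bitmask
itself, in particular for a non-default ESCAPE character, is not shown
here.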