Re: Speed up COPY FROM text/CSV parsing using SIMD

Andrew Dunstan Thu, 21 Aug 2025 08:47:52 -0700


On 2025-08-19 Tu 10:14 AM, Nazir Bilal Yavuz wrote:

Hi,

On Tue, 19 Aug 2025 at 15:33, Nazir Bilal Yavuz <[email protected]> wrote:

I am able to reproduce the regressions you mentioned, but both are
20% on my end. By experimenting, I found that SIMD causes a
regression if it advances fewer than 5 characters.

So, I implemented a small heuristic. It works like this:

- If advance < 5 -> insert a sleep penalty (n cycles).

'sleep' might be a poor word choice here. I meant skipping SIMD for
the next n attempts.
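
A minimal sketch of the skip heuristic described above, for illustration only. It is not taken from the patch: the names SimdGate, simd_gate_should_try and simd_gate_report are hypothetical, and SKIP_PENALTY is an arbitrary stand-in for "n". Only the threshold of 5 characters comes from the message.

```c
#include <stdbool.h>
#include <stddef.h>

#define MIN_USEFUL_ADVANCE 5	/* from the thread: advancing less than this regresses */
#define SKIP_PENALTY 8			/* hypothetical value for "n"; would need tuning */

/* Hypothetical per-COPY state tracking whether SIMD is temporarily disabled. */
typedef struct SimdGate
{
	int			skip_remaining; /* upcoming attempts that bypass SIMD */
} SimdGate;

/* Returns true if the caller should attempt the SIMD path now. */
static bool
simd_gate_should_try(SimdGate *gate)
{
	if (gate->skip_remaining > 0)
	{
		gate->skip_remaining--;
		return false;
	}
	return true;
}

/* Report how many characters the last SIMD attempt advanced. */
static void
simd_gate_report(SimdGate *gate, size_t advance)
{
	/* A short advance means SIMD did not pay off; skip it for a while. */
	if (advance < MIN_USEFUL_ADVANCE)
		gate->skip_remaining = SKIP_PENALTY;
}
```

The caller would check simd_gate_should_try() before each chunk, fall back to the scalar loop when it returns false, and call simd_gate_report() after every SIMD attempt.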

I was thinking a bit about that this morning. I wonder if, instead of
having a constantly applied heuristic like this, it might be better to
do a little extra accounting in the first, say, 1000 lines of an input
file, and if less than some portion of the input is found to be special
characters, then switch to the SIMD code. What that portion should be
would need to be determined by some experimentation with a variety of
typical workloads, but given your findings 20% seems like a good
starting point.
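
To make the idea concrete, here is a rough sketch of the accounting, under stated assumptions. The names CopySample, is_special and copy_sample_line are hypothetical, as is the choice of which characters count as special (delimiter, quote, escape, CR, LF). The 1000-line window and the 20% limit are the starting points mentioned above, not measured values.

```c
#include <stdbool.h>
#include <stddef.h>

#define SAMPLE_LINES 1000			/* "say, 1000 lines" */
#define SPECIAL_PERCENT_LIMIT 20	/* suggested starting point; needs tuning */

/* Hypothetical accounting state for one COPY FROM input. */
typedef struct CopySample
{
	size_t		lines_seen;
	size_t		total_chars;
	size_t		special_chars;
	bool		decided;		/* true once the sample window is complete */
	bool		use_simd;		/* only meaningful when decided is true */
} CopySample;

/* Assumption: these are the characters the parser must stop on. */
static bool
is_special(char c, char delim, char quote, char escape)
{
	return c == delim || c == quote || c == escape || c == '\n' || c == '\r';
}

/* Account for one input line; decides once after SAMPLE_LINES lines. */
static void
copy_sample_line(CopySample *s, const char *line, size_t len,
				 char delim, char quote, char escape)
{
	if (s->decided)
		return;

	for (size_t i = 0; i < len; i++)
	{
		if (is_special(line[i], delim, quote, escape))
			s->special_chars++;
	}
	s->total_chars += len;
	s->lines_seen++;

	if (s->lines_seen >= SAMPLE_LINES)
	{
		/* Integer form of special/total < limit%, avoiding division by zero. */
		s->use_simd = s->special_chars * 100 <
			s->total_chars * SPECIAL_PERCENT_LIMIT;
		s->decided = true;
	}
}
```

In this sketch an input shorter than the sample window never switches, so it stays on the scalar path throughout.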



cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com
