Hi everyone, On Tue, Dec 09, 2025 at 04:40:19PM +0300, Bilal Yavuz wrote: > Hi, > > On Sat, 6 Dec 2025 at 10:55, Bilal Yavuz <[email protected]> wrote: > > > > Hi, > > > > On Sat, 6 Dec 2025 at 04:40, Manni Wood <[email protected]> wrote: > > > Hello, all. > > > > > > Andrew, I tried your suggestion of just reading the first chunk of the > > > copy file to determine if SIMD is worth using. Attached are v4 versions > > > of the patches showing a first attempt at doing that. > > > > Thank you for doing this! > > > > > I attached test.sh.txt to show how I've been testing, with 5 million > > > lines of the various copy file variations introduced by Ayub Kazar. > > > > > > The text copy with no special chars is 30% faster. The CSV copy with no > > > special chars is 48% faster. The text with 1/3rd escapes is 3% slower. > > > The CSV with 1/3rd quotes is 0.27% slower. > > > > > > This set of patches follows the simplest suggestion of just testing the > > > first N lines (actually first N bytes) of the file and then deciding > > > whether or not to enable SIMD. This set of patches does not follow > > > Andrew's later suggestion of maybe checking again every million lines or > > > so. > > > > My input-generation script is not ready to share yet, but the inputs > > follow this format: text_${n}.input, where n represents the number of > > normal characters before the delimiter. For example: > > > > n = 0 -> "\n\n\n\n\n..." (no normal characters) > > n = 1 -> "a\n..." (1 normal character before the delimiter) > > ... > > n = 5 -> "aaaaa\n..." > > … continuing up to n = 32. > > > > Each line has 4096 chars and there are a total of 100000 lines in each > > input file. > > > > I only benchmarked the text format. I compared the latest heuristic I > > shared [1] with the current method. The benchmarks show roughly a ~16% > > regression at the worst case (n = 2), with regressions up to n = 5. > > For the remaining values, performance was similar. > > I tried to improve the v4 patchset. My changes are: > > 1 - I changed CopyReadLineText() to an inline function and sent the > use_simd variable as an argument to get help from inlining. > > 2 - A main for loop in the CopyReadLineText() function is called many > times, so I moved the use_simd check to the CopyReadLine() function. > > 3 - Instead of 'bytes_processed', I used 'chars_processed' because > cstate->bytes_processed is increased before we process them and this > can cause wrong results. > > 4 - Because of #2 and #3, instead of having > 'SPECIAL_CHAR_SIMD_THRESHOLD', I used the ratio of 'chars_processed / > special_chars_encountered' to determine whether we want to use SIMD. > > 5 - cstate->special_chars_encountered is incremented wrongly for the > CSV case. It is not incremented for the quote and escape delimiters. I > moved all increments of cstate->special_chars_encountered to the > central place and tried to optimize it but it still causes a > regression as it creates one more branching. > > With these changes, I am able to decrease the regression to %10 from > %16. Regression decreases to %7 if I modify #5 for the only text input > but I did not do that. > > My changes are in the 0003.
I was helping collect some data, but I'm a little behind sharing what I ran against the v4.1 patches (on commit 07961ef8) with the v4.2 version out there... I hope it's still helpfule that I share what I collected even though they are not quite as nice, but maybe it's more about how/where I ran them. My laptop has a Intel(R) Core(TM) Ultra 7 165H, where most of these tests were using up 95%+ of one of the cores (I have hyperthreading disabled), and using about 10% the ssd's capacity. Summarizing my results from the same script Manni ran, I didn't see as much as an improvement in the positive tests, and then saw more negative results in the other tests. text copy with no special chars: 18% improvement of 15s from 80s before the patch CSV copy with no special chars: 23% improvement of 23s from 96s before the patch text with 1/3rd escapes: 6% slower, an additional 5s to 85 seconds before the patch CSV with 1/3rd quotes: 7% slower, an additional 10 seconds to 129 seconds before the patch I'm wondering if my laptop/processor isn't the best test bed for this... Regards, Mark -- Mark Wong <[email protected]> EDB https://enterprisedb.com
