Hello,

On Wed, Feb 4, 2026, 6:38 AM Manni Wood <[email protected]> wrote:
> The 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch seems nice!
> On my x86 PC, it had the usual performance improvement of earlier patches,
> but the regression seemed more similar for both text and CSV inputs.
> Unfortunately, the regression is about 2.5%, but maybe that is an
> acceptable worst-case for an improvement of 22% for text inputs and 33% for
> CSV inputs?
>
> The 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch looks even
> better on my Raspberry Pi's ARM processor: not only do we see a 22%
> improvement for text and an almost 34% improvement for CSV, even the
> worst-case scenarios show an almost 4% improvement for text and an 11.7%
> improvement for CSV.
>
> By comparison,
> the v5.1-0001-Simple-heuristic-for-SIMD-COPY-FROM.patch.patch's worst-case
> performance is poorer on both architectures.
>
> I'd be curious to know if anyone else can reproduce these
> numbers. 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch seems
> like a real winner.
>

Thanks for the benchmark, Manni. I suppose this uses the same threshold as
the patch (4096 bytes); have you tried any larger values for the threshold?
I still expect fewer L1d cache misses and shorter execution times the more
we increase the threshold (relative to the L1d cache size per core). As per
my previous not-so-stable numbers, 28KB wasn't too bad.

Regards,
Ayoub
