On Wed, Nov 26, 2025 at 5:51 AM KAZAR Ayoub <[email protected]> wrote:
> Hello,
>
> On Wed, Nov 19, 2025 at 10:01 PM Nathan Bossart <[email protected]>
> wrote:
>
>> On Tue, Nov 18, 2025 at 05:20:05PM +0300, Nazir Bilal Yavuz wrote:
>> > Thanks, done.
>>
>> I took a look at the v3 patches.  Here are my high-level thoughts:
>>
>> +    /*
>> +     * Parse data and transfer into line_buf. To get benefit from inlining,
>> +     * call CopyReadLineText() with the constant boolean variables.
>> +     */
>> +    if (cstate->simd_continue)
>> +        result = CopyReadLineText(cstate, is_csv, true);
>> +    else
>> +        result = CopyReadLineText(cstate, is_csv, false);
>>
>> I'm curious whether this actually generates different code, and if it
>> does, if it's actually faster.  We're already branching on
>> cstate->simd_continue here.
>
> I've compiled both versions with -O2 and confirmed they generate different
> code. When simd_continue is passed as a constant to CopyReadLineText, the
> compiler optimizes out the condition checks from the SIMD path.
> A small benchmark on a 1GB+ file shows the expected benefit which is
> around 6% performance improvement.
> I've attached the assembly outputs in case someone wants to check
> something else.
>
> Regards,
> Ayoub Kazar

Correction to my last post: I also tried files that alternated lines with
no special characters and lines with 1/3rd special characters, thinking I
could force the algorithm to continually check whether or not it should use
simd and therefore force more overhead in the try-simd/don't-try-simd
housekeeping code. The text file was still 20% faster (not 50% faster as I
originally stated --- that was a typo). The CSV file was still 13% faster.

Also, apologies for posting at the top in my last e-mail.

--
-- Manni Wood EDB: https://www.enterprisedb.com
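
For anyone following along, the constant-argument dispatch discussed in the quoted text can be sketched in standalone C roughly as below. This is only an illustration under assumptions, not the patch's code: scan_line(), scan_line_dispatch(), is_special() and the 8-byte chunk loop are hypothetical stand-ins for CopyReadLineText() and its SIMD path. Whether the compiler really emits separate specialized copies depends on the compiler and optimization flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the characters the parser must stop at. */
static inline bool
is_special(char c, bool is_csv)
{
    if (c == '\n' || c == '\r')
        return true;
    /* Assumed here: CSV mode stops at quotes, text mode at backslashes. */
    return is_csv ? (c == '"') : (c == '\\');
}

/*
 * Hypothetical scanner: returns the index of the first special character,
 * or len if there is none.  When this is inlined and the caller passes
 * literal booleans, the compiler can fold the "if (use_fast_path)" and
 * is_csv tests away in each specialized copy.
 */
static inline size_t
scan_line(const char *buf, size_t len, bool is_csv, bool use_fast_path)
{
    size_t i = 0;

    assert(buf != NULL);

    if (use_fast_path)
    {
        /* Stand-in for the SIMD path: skip 8-byte chunks with no specials. */
        while (i + 8 <= len)
        {
            bool found = false;

            for (size_t j = 0; j < 8; j++)
                found |= is_special(buf[i + j], is_csv);
            if (found)
                break;
            i += 8;
        }
    }

    /* Scalar tail, or the whole line when the fast path is disabled. */
    while (i < len && !is_special(buf[i], is_csv))
        i++;
    return i;
}

/* The run-time flags are tested once here; each call passes constants. */
size_t
scan_line_dispatch(const char *buf, size_t len, bool is_csv,
                   bool fast_path_enabled)
{
    if (fast_path_enabled)
        return is_csv ? scan_line(buf, len, true, true)
                      : scan_line(buf, len, false, true);
    else
        return is_csv ? scan_line(buf, len, true, false)
                      : scan_line(buf, len, false, false);
}
```

Both variants must return the same answer for the same input; only the generated code differs. Comparing the output of something like "gcc -O2 -S" with and without the constant arguments is one way to see whether the branch was actually folded.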
