Hi,

On Sat, 14 Feb 2026 at 02:09, Nathan Bossart <[email protected]> wrote:
>
> On Fri, Feb 13, 2026 at 02:45:30PM +0300, Nazir Bilal Yavuz wrote:
> > Also, if I change this code to:
> >
> >     if (cstate->simd_enabled)
> >     {
> >         if (is_csv)
> >             result = CopyReadLineText(cstate, true, true);
> >         else
> >             result = CopyReadLineText(cstate, false, true);
> >     }
> >     else
> >     {
> >         if (is_csv)
> >             result = CopyReadLineText(cstate, true, false);
> >         else
> >             result = CopyReadLineText(cstate, false, false);
> >     }
> >
> > then I see ~%5 performance improvement in scalar path compared to master.
>
> Hm.  What difference do you see if you just do
>
>         if (is_csv)
>                 result = CopyReadLineText(cstate, true);
>         else
>                 result = CopyReadLineText(cstate, false);
>
> both with and without the SIMD stuff?  IIUC this is allowing the compiler
> to remove several branches in CopyReadLineText(), which might be a nice
> improvement on its own.  That being said, I'm less convinced that adding a
> simd_enabled parameter to CopyReadLineText() helps, because 1) it's
> involved in fewer branches and 2) we change it within the function, so the
> compiler can't remove the branches, anyway.  But perhaps I'm missing
> something.

I did couple of benchmarks, some info about them:

- Benchmarks show percentage comparisons of timings against the master
branch. Positive values mean a regression, while negative values mean
an improvement.

- There are a total 200000 lines in each input and each line is 8192 bytes.

- For the columns, none means there is no special character. The other
numbers represent the ratio of normal characters to special
characters. For example, 0 means all the data is special characters, 4
means %25 of the data is special characters, 16 means 1/16 of the data
is special characters and such.

--------------------

This is the benchmark without the SIMD stuff:

- only_inline: Only change is CopyReadLineText() being inlined.

- is_csv_verbose-wo-inline: is_csv is sent as a constant boolean like
you suggested but CopyReadLineText() isn't inlined.

- is_csv_verbose-w-inline: is_csv is sent as a constant boolean and
CopyReadLineText() is inlined.

+-------------------------------+-------+------+------+-------+-------+------+
| TEXT                          |  none |   0  |   4  |   8   |   16  |  32  |
+-------------------------------+-------+------+------+-------+-------+------+
| only_inline                   |   0   | +1.6 |   0  |   0   |   -1  |   0  |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-wo-inline      |   0   |   0  |   0  |   0   |   0   |   0  |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-w-inline       |  -3.5 |   0  | -7.7 |  -2.3 |  -4.1 | -3.4 |
+-------------------------------+-------+------+------+-------+-------+------+
|                               |       |      |      |       |       |      |
+-------------------------------+-------+------+------+-------+-------+------+
|                               |       |      |      |       |       |      |
+-------------------------------+-------+------+------+-------+-------+------+
| CSV                           |  none |   0  |   4  |   8   |   16  |  32  |
+-------------------------------+-------+------+------+-------+-------+------+
| only_inline                   |  -1.1 |   0  |   0  |   0   |   0   |  -1  |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-wo-inline      |   0   |   0  |   0  |   0   |  -0.3 |   0  |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-w-inline       | -4    | -2.3 | -1.2 | -1.8  | -2.8  | -2.7 |
+-------------------------------+-------+------+------+-------+-------+------+

By looking the benchmark results,

        if (is_csv)
                result = CopyReadLineText(cstate, true);
        else
                result = CopyReadLineText(cstate, false);

with inline CopyReadLineText() function helps the performance even without SIMD.

----------------------------------------

This is the same benchmark with SIMD stuff:

only_inline: Only change is CopyReadLineText() being inlined, there is
no simd_enabled argument in the CopyReadLineText().

is_csv_verbose-w-inline: is_csv is sent as a constant boolean and
CopyReadLineText() is inlined. There is no simd_enabled argument in
the CopyReadLineText().

simd_enabled_verbose-w-inline: simd_enabled is sent as a constant
boolean and CopyReadLineText() is inlined.

both_verbose_w_inline: both is_csv and simd_enabled are sent as a
constant boolean and CopyReadLineText() is inlined.

+-------------------------------+-------+------+------+-------+-------+------+
| TEXT                          |  none |   0  |   4  |   8   |   16  |  32  |
+-------------------------------+-------+------+------+-------+-------+------+
| only_inline                   |  -11  | +9.7 | +9.1 | +11.4 | +14.8 |  +8  |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-wo-inline      | -11.9 | +4.5 | +2.4 |  +3.5 |  +2.2 | +1.8 |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-w-inline       | -12.6 |   0  | -2.4 |  +2.8 |  +1.6 |   0  |
+-------------------------------+-------+------+------+-------+-------+------+
| both_verbose_w_inline         | -12.1 |   0  |  -5  |   0   |  +2.5 | -1.8 |
+-------------------------------+-------+------+------+-------+-------+------+
|                               |       |      |      |       |       |      |
+-------------------------------+-------+------+------+-------+-------+------+
| CSV                           |  none |   0  |   4  |   8   |   16  |  32  |
+-------------------------------+-------+------+------+-------+-------+------+
| only_inline                   | -22.6 | +4.2 | +2.1 |  +2.6 |   0   | +2.2 |
+-------------------------------+-------+------+------+-------+-------+------+
| is_csv_verbose-w-inline       | -22.5 | -2.1 | -3.4 |  -3.9 |  -6.4 | -3.4 |
+-------------------------------+-------+------+------+-------+-------+------+
| simd_enabled_verbose-w-inline | -23   | 0    | -1.9 | -2.2  | -4.8  | -1.6 |
+-------------------------------+-------+------+------+-------+-------+------+
| both_verbose_w_inline         | -23.3 | -2.9 | -5   | -4.5  | -7.1  | -4.3 |
+-------------------------------+-------+------+------+-------+-------+------+

By looking at these results having both is_csv and simd_enabled as an
argument and sending them as constant boolean arguments help most.

>
> Some other random thoughts:

I am sending the benchmark results for now. I haven't tested other
suggestions yet. I will follow up with another email once I have
tested them.

--
Regards,
Nazir Bilal Yavuz
Microsoft


Reply via email to