Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
> > Attached is a patch that modifies CopyReadLineText so that it uses
> > memchr to speed up the scan. The nice thing about memchr is that we can
> > take advantage of any clever optimizations that might be in libc or
> > compiler.
>
> Here's an updated version of the patch. The principle is the same, but
> the same optimization is now used for CSV input as well, and there's
> more comments.
>
> I still need to do more benchmarking. I mentioned a ~5% speedup on the
> test I ran earlier, which was a load of the lineitem table from TPC-H.
> It looks like with cheaper data types the gain can be much bigger;
> here's an oprofile from loading the TPC-H partsupp table,
>
> Before:
>
> samples  %        image name    symbol name
> 5146     25.7635  postgres      CopyReadLine
> 4089     20.4716  postgres      DoCopy
> 1449      7.2544  reiserfs      (no symbols)
> 1369      6.8539  postgres      pg_verify_mbstr_len
> 1013      5.0716  libc-2.7.so   memcpy
> 749       3.7499  libc-2.7.so   ____strtod_l_internal
> 598       2.9939  postgres      heap_formtuple
> 548       2.7436  libc-2.7.so   ____strtol_l_internal
> 403       2.0176  libc-2.7.so   memset
> 309       1.5470  libc-2.7.so   strlen
> 208       1.0414  postgres      AllocSetAlloc
> ...
>
> After:
>
> samples  %        image name    symbol name
> 4165     25.7879  postgres      DoCopy
> 1574      9.7455  postgres      pg_verify_mbstr_len
> 1520      9.4112  reiserfs      (no symbols)
> 1005      6.2225  libc-2.7.so   memchr
> 986       6.1049  libc-2.7.so   memcpy
> 632       3.9131  libc-2.7.so   ____strtod_l_internal
> 589       3.6468  postgres      heap_formtuple
> 546       3.3806  libc-2.7.so   ____strtol_l_internal
> 386       2.3899  libc-2.7.so   memset
> 366       2.2661  postgres      CopyReadLine
> 287       1.7770  libc-2.7.so   strlen
> 215       1.3312  postgres      LWLockAcquire
> 208       1.2878  postgres      hash_any
> 176       1.0897  postgres      LWLockRelease
> 161       0.9968  postgres      InputFunctionCall
> 157       0.9721  postgres      AllocSetAlloc
> ...
>
> Profile shows that with the patch, ~8.5% of the CPU time is spent in
> CopyReadLine+memchr, vs. 25.5% before. That's a quite significant speedup.
>
> I still need to test the worst-case performance, with input that has a
> lot of escapes. It would be interesting to hear reports with this patch
> from people on different platforms. These results are from my laptop
> with 32-bit Intel CPU, running Linux. There could be big differences in
> the memchr implementations.
>
> --
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
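
For readers who have not seen the technique the quoted message describes, here is a minimal sketch of the general idea: replacing a byte-at-a-time scan for the end of a line with a single memchr call, so that any optimized libc implementation does the searching. This is not the code from the patch. The function names find_newline_bytewise and find_newline_memchr are hypothetical, and the sketch ignores the escape and CSV quoting handling that the real CopyReadLineText has to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Reference version: look at every byte in a plain loop.
 * Returns the offset of the first '\n' in buf[0..len), or -1 if none.
 * (Hypothetical helper, for comparison only.)
 */
static ptrdiff_t
find_newline_bytewise(const char *buf, size_t len)
{
    size_t      i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
    {
        if (buf[i] == '\n')
            return (ptrdiff_t) i;
    }
    return -1;
}

/*
 * memchr version: same contract, but the search itself is delegated
 * to libc, which may use word-at-a-time or SIMD tricks.
 * (Hypothetical helper; real COPY input also needs escape handling.)
 */
static ptrdiff_t
find_newline_memchr(const char *buf, size_t len)
{
    const char *p;

    assert(buf != NULL);
    p = (const char *) memchr(buf, '\n', len);
    return (p != NULL) ? (ptrdiff_t) (p - buf) : -1;
}
```

Both functions return the same offsets for the same input; only the cost of the scan differs, and as the quoted message notes, how much it differs likely depends on the platform's memchr implementation.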