Re: [PATCHES] CopyReadAttributesCSV optimization

2008-03-08 Thread Heikki Linnakangas

Andrew Dunstan wrote:
> Heikki Linnakangas wrote:
>> Here's a patch to speed up CopyReadAttributesCSV. On the test case
>> I've been playing with, loading the TPC-H partsupp table, about 20%
>> of the time is spent in CopyReadAttributesCSV (inlined into DoCopy;
>> DoCopy itself is insignificant):
>>
>> [snip]
>>
>> The trick is to split the loop in CopyReadAttributesCSV into two
>> parts, one for inside quotes and one for outside quotes, saving some
>> instructions in both parts.
>>
>> Your mileage may vary, but I'm quite happy with this. I haven't tested
>> it much yet, but I wouldn't expect it to be a loss in any interesting
>> scenario. The code also doesn't look much worse after the patch,
>> perhaps even better.
>
> This looks sane enough, and worked for me in testing, so I'm going to
> apply it shortly. I'll probably add a comment or two about how the loops
> interact.


Thanks.

FWIW, I did some more performance testing, with input consisting of a 
lot of quotes, and it seems the performance gain holds even then. I was 
not able to find an input where the new version performs worse.
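
For anyone reading along without the patch open, the two-loop idea can be sketched roughly as below. This is only an illustration under simplifying assumptions, not the patch itself: split_csv_line and its parameters are hypothetical names, the escape character is assumed to be the same as the quote character (so a doubled quote is a literal quote), and NULL-marker handling is left out. The caller is assumed to supply an output buffer of at least strlen(line) + 1 bytes.

```c
/*
 * Hypothetical helper (not PostgreSQL code): split one CSV line into fields.
 * Unquoted field text is written back to back into outbuf, each field
 * NUL-terminated, and fields[] receives a pointer to each one.
 * Returns the number of fields, or -1 on an unterminated quote or when
 * more than maxfields fields are found.
 */
int
split_csv_line(const char *line, char delim, char quote,
               char *outbuf, char **fields, int maxfields)
{
    const char *cur = line;
    char       *out = outbuf;
    int         nfields = 0;

    for (;;)
    {
        if (nfields >= maxfields)
            return -1;
        fields[nfields++] = out;

        for (;;)
        {
            char        c;

            /* Loop 1: outside quotes. Test for end, delimiter and quote. */
            for (;;)
            {
                c = *cur;
                if (c == '\0' || c == delim)
                    goto endfield;
                cur++;
                if (c == quote)
                    break;          /* opening quote: switch to loop 2 */
                *out++ = c;
            }

            /* Loop 2: inside quotes. The delimiter is ordinary data here. */
            for (;;)
            {
                c = *cur;
                if (c == '\0')
                    return -1;      /* unterminated quoted section */
                cur++;
                if (c == quote)
                {
                    if (*cur == quote)
                    {
                        *out++ = quote; /* doubled quote: literal quote */
                        cur++;
                        continue;
                    }
                    break;          /* closing quote: back to loop 1 */
                }
                *out++ = c;
            }
        }

endfield:
        *out++ = '\0';
        if (*cur == '\0')
            return nfields;
        cur++;                      /* step over the delimiter */
    }
}
```

The point of the split is that neither inner loop has to consult an "in quote" flag on every character: the outside-quotes loop never checks for a doubled quote, and the inside-quotes loop never checks for the delimiter.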


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] CopyReadAttributesCSV optimization

2008-03-07 Thread Andrew Dunstan



Heikki Linnakangas wrote:
> Here's a patch to speed up CopyReadAttributesCSV. On the test case
> I've been playing with, loading the TPC-H partsupp table, about 20%
> of the time is spent in CopyReadAttributesCSV (inlined into DoCopy;
> DoCopy itself is insignificant):
>
> [snip]
>
> The trick is to split the loop in CopyReadAttributesCSV into two
> parts, one for inside quotes and one for outside quotes, saving some
> instructions in both parts.
>
> Your mileage may vary, but I'm quite happy with this. I haven't tested
> it much yet, but I wouldn't expect it to be a loss in any interesting
> scenario. The code also doesn't look much worse after the patch,
> perhaps even better.

This looks sane enough, and worked for me in testing, so I'm going to
apply it shortly. I'll probably add a comment or two about how the loops
interact.


cheers

andrew



Re: [PATCHES] CopyReadAttributesCSV optimization

2008-03-03 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Heikki Linnakangas wrote:
> Here's a patch to speed up CopyReadAttributesCSV. On the test case I've 
> been playing with, loading the TPC-H partsupp table, about 20% of the 
> time is spent in CopyReadAttributesCSV (inlined into DoCopy; DoCopy 
> itself is insignificant):
> 
> samples  %        image name   symbol name
> 8136     25.8360  postgres     CopyReadLine
> 6350     20.1645  postgres     DoCopy
> 2181      6.9258  postgres     pg_verify_mbstr_len
> 2157      6.8496  reiserfs     (no symbols)
> 1668      5.2968  libc-2.7.so  memcpy
> 1142      3.6264  libc-2.7.so  strtod_l_internal
> 951       3.0199  postgres     heap_formtuple
> 904       2.8707  libc-2.7.so  strtol_l_internal
> 619       1.9656  libc-2.7.so  memset
> 442       1.4036  libc-2.7.so  strlen
> 341       1.0828  postgres     hash_any
> 329       1.0447  postgres     pg_atoi
> 300       0.9527  postgres     AllocSetAlloc
> 
> With this patch, the share of that function goes down to ~13%:
> 
> samples  %        image name   symbol name
> 7191     28.7778  postgres     CopyReadLine
> 3257     13.0343  postgres     DoCopy
> 2127      8.5121  reiserfs     (no symbols)
> 1914      7.6597  postgres     pg_verify_mbstr_len
> 1413      5.6547  libc-2.7.so  memcpy
> 920       3.6818  libc-2.7.so  strtod_l_internal
> 784       3.1375  libc-2.7.so  strtol_l_internal
> 745       2.9814  postgres     heap_formtuple
> 508       2.0330  libc-2.7.so  memset
> 398       1.5928  libc-2.7.so  strlen
> 315       1.2606  postgres     hash_any
> 255       1.0205  postgres     AllocSetAlloc
> 
> The trick is to split the loop in CopyReadAttributesCSV into two parts, 
> one for inside quotes and one for outside quotes, saving some 
> instructions in both parts.
> 
> Your mileage may vary, but I'm quite happy with this. I haven't tested 
> it much yet, but I wouldn't expect it to be a loss in any interesting 
> scenario. The code also doesn't look much worse after the patch, perhaps 
> even better.
> 
> -- 
>   Heikki Linnakangas
>   EnterpriseDB   http://www.enterprisedb.com



-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
