subject:"\[PATCHES\] CopyReadLineText optimization"

Re: [PATCHES] CopyReadLineText optimization

2008-03-11 Thread Heikki Linnakangas

Andrew Dunstan wrote: Heikki Linnakangas wrote: It looks like strpbrk() performs poorly: Yes, not surprising. I just looked at the implementation in glibc, which I assume you are using, and it seemed rather basic. The one in NetBSD's libc looks much more efficient. See http://sources.redh

Re: [PATCHES] CopyReadLineText optimization

2008-03-10 Thread Andrew Dunstan

Heikki Linnakangas wrote: Andrew Dunstan wrote: Another question that occurred to me - did you try using strpbrk() to look for the next interesting character rather than your homegrown searcher gadget? If so, how did that perform? It looks like strpbrk() performs poorly: Yes, not surpris

Re: [PATCHES] CopyReadLineText optimization

2008-03-10 Thread Heikki Linnakangas

Andrew Dunstan wrote: Another question that occurred to me - did you try using strpbrk() to look for the next interesting character rather than your homegrown searcher gadget? If so, how did that perform? It looks like strpbrk() performs poorly: unpatched: testname | min duration --
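
For readers unfamiliar with the alternative being discussed, here is a minimal sketch of locating the next "interesting" character with strpbrk(). The helper name next_interesting is hypothetical and does not come from the patch. The sketch also assumes a NUL-terminated buffer, which strpbrk() requires. It only illustrates the call; it says nothing about the performance reported in this message.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: return the offset of the first
 * LF, CR, or backslash in the NUL-terminated string s, or -1 if none.
 */
static long next_interesting(const char *s)
{
    /* strpbrk() returns a pointer to the first byte of s that is in the set */
    const char *p = strpbrk(s, "\n\r\\");

    return p ? (long) (p - s) : -1;
}
```

Because strpbrk() takes an arbitrary set of characters, a libc implementation may spend time building a lookup structure on every call, which is one plausible reason it can lose to a hand-written loop on short spans.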

Re: [PATCHES] CopyReadLineText optimization

2008-03-08 Thread Heikki Linnakangas

Andrew Dunstan wrote: Heikki Linnakangas wrote: Andrew Dunstan wrote: I'm still a bit worried about applying it unless it gets some adaptive behaviour or something so that we don't cause any serious performance regressions in some cases. I'll try to come up with something. At the most cons

Re: [PATCHES] CopyReadLineText optimization

2008-03-07 Thread Andrew Dunstan

Heikki Linnakangas wrote: Andrew Dunstan wrote: I'm still a bit worried about applying it unless it gets some adaptive behaviour or something so that we don't cause any serious performance regressions in some cases. I'll try to come up with something. At the most conservative end, we could

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Andrew Dunstan

Greg Smith wrote: On Thu, 6 Mar 2008, Heikki Linnakangas wrote: At the most conservative end, we could fall back to the current method on the first escape, quote or backslash character. I would just count the number of escaped/quote characters on each line, and then at the end of the line

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Greg Smith

On Thu, 6 Mar 2008, Heikki Linnakangas wrote: At the most conservative end, we could fall back to the current method on the first escape, quote or backslash character. I would just count the number of escaped/quote characters on each line, and then at the end of the line switch modes between
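
The per-line counting idea can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread: the function names are hypothetical, and the 1-in-8 threshold is borrowed from the breakeven figure Heikki mentions elsewhere in the thread rather than from this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold: roughly the "1 in 8 characters" breakeven from the thread. */
#define SPECIAL_RATIO 8

/* Count the bytes in a line that would interrupt a fast scan. */
static size_t count_special(const char *line, size_t len, char quote, char escape)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
    {
        if (line[i] == quote || line[i] == escape || line[i] == '\\')
            n++;
    }
    return n;
}

/*
 * Hypothetical mode switch: after finishing a line, decide whether the next
 * line should use the fast scan (true) or the byte-at-a-time scan (false).
 */
static bool use_fast_scan_next(const char *line, size_t len, char quote, char escape)
{
    return count_special(line, len, quote, escape) * SPECIAL_RATIO < len;
}
```

Deciding once per line keeps the bookkeeping out of the inner scan loop, at the cost of reacting one line late when the input changes character.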

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Heikki Linnakangas

Andrew Dunstan wrote: Heikki Linnakangas wrote: Andrew Dunstan wrote: I'm still a bit worried about applying it unless it gets some adaptive behaviour or something so that we don't cause any serious performance regressions in some cases. I'll try to come up with something. At the most conser

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Andrew Dunstan

Heikki Linnakangas wrote: Andrew Dunstan wrote: I'm still a bit worried about applying it unless it gets some adaptive behaviour or something so that we don't cause any serious performance regressions in some cases. I'll try to come up with something. At the most conservative end, we could

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Heikki Linnakangas

Andrew Dunstan wrote: I'm still a bit worried about applying it unless it gets some adaptive behaviour or something so that we don't cause any serious performance regressions in some cases. I'll try to come up with something. At the most conservative end, we could fall back to the current met

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Heikki Linnakangas

Tom Lane wrote: BTW, I notice that the code allows CSV escape and quote characters that have the high bit set (in single-byte server encodings that is). Is this a good idea? It seems like such are extremely unlikely to be the same in two different encodings. Maybe we should restrict to the ASC

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Andrew Dunstan

Heikki Linnakangas wrote: Andrew Dunstan wrote: Heikki Linnakangas wrote: Another update attached: It occurred to me that the memchr approach is only safe for server encodings, where the non-first bytes of a multi-byte character always have the hi-bit set. We currently make the following

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Andrew Dunstan

Tom Lane wrote: BTW, I notice that the code allows CSV escape and quote characters that have the high bit set (in single-byte server encodings that is). Is this a good idea? It seems like such are extremely unlikely to be the same in two different encodings. Maybe we should restrict to the A

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Tom Lane

"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Andrew Dunstan wrote:
>> We currently make the following assumption in the code:
>>
>> * These four characters, and the CSV escape and quote characters, are
>> * assumed the same in frontend and backend encodings.
>>
>> The four characters are th

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Heikki Linnakangas

Andrew Dunstan wrote: Heikki Linnakangas wrote: Another update attached: It occurred to me that the memchr approach is only safe for server encodings, where the non-first bytes of a multi-byte character always have the hi-bit set. We currently make the following assumption in the code:

Re: [PATCHES] CopyReadLineText optimization

2008-03-06 Thread Andrew Dunstan

Heikki Linnakangas wrote: Heikki Linnakangas wrote: Heikki Linnakangas wrote: Attached is a patch that modifies CopyReadLineText so that it uses memchr to speed up the scan. The nice thing about memchr is that we can take advantage of any clever optimizations that might be in libc or compil

Re: [PATCHES] CopyReadLineText optimization

2008-03-05 Thread Andrew Dunstan

Heikki Linnakangas wrote: So the overhead of using memchr slows us down if there's a lot of escape or quote characters. The breakeven point seems to be about 1 in 8 characters. I'm not sure if that's a good tradeoff or not... How about we test the first buffer read in from the file and

Re: [PATCHES] CopyReadLineText optimization

2008-03-05 Thread Heikki Linnakangas

Heikki Linnakangas wrote: Heikki Linnakangas wrote: Attached is a patch that modifies CopyReadLineText so that it uses memchr to speed up the scan. The nice thing about memchr is that we can take advantage of any clever optimizations that might be in libc or compiler. Here's an updated versi

Re: [PATCHES] CopyReadLineText optimization

2008-03-05 Thread Heikki Linnakangas

Heikki Linnakangas wrote: I still need to test the worst-case performance, with input that has a lot of escapes. Ok, I've done some more performance testing with this. I tested COPY FROM with a table with a single "text" column. There was a million rows in the table, with a 1000 character lon

Re: [PATCHES] CopyReadLineText optimization

2008-03-03 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- He

Re: [PATCHES] CopyReadLineText optimization

2008-02-29 Thread Heikki Linnakangas

Heikki Linnakangas wrote: Attached is a patch that modifies CopyReadLineText so that it uses memchr to speed up the scan. The nice thing about memchr is that we can take advantage of any clever optimizations that might be in libc or compiler. Here's an updated version of the patch. The princi

Re: [PATCHES] CopyReadLineText optimization

2008-02-23 Thread Luke Lonergan

esql.org
> Subject: [PATCHES] CopyReadLineText optimization
>
> The purpose of CopyReadLineText is to scan the input buffer,
> and find the next newline, taking into account any escape
> characters. It currently operates in a loop, one byte at a
> time, searching for LF, CR, o

[PATCHES] CopyReadLineText optimization

2008-02-23 Thread Heikki Linnakangas

The purpose of CopyReadLineText is to scan the input buffer, and find the next newline, taking into account any escape characters. It currently operates in a loop, one byte at a time, searching for LF, CR, or a backslash. That's a bit slow: I've been running oprofile on COPY, and I've seen Copy
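
To make the contrast concrete, here is a simplified sketch of the two scanning styles. It is not the patch itself. It assumes only LF terminates a line (CR handling and CSV quoting are left out), that a backslash always escapes exactly the next byte, and the function names scan_bytewise and scan_memchr are invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Byte-at-a-time scan: offset of the first unescaped LF, or -1 if none. */
static long scan_bytewise(const char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        if (buf[i] == '\\')
        {
            i += 2;             /* skip the backslash and the escaped byte */
            continue;
        }
        if (buf[i] == '\n')
            return (long) i;
        i++;
    }
    return -1;
}

/* memchr-based scan: same result, but libc does the searching. */
static long scan_memchr(const char *buf, size_t len)
{
    size_t pos = 0;

    while (pos < len)
    {
        /* Find the next LF candidate, then look for a backslash before it. */
        const char *nl = memchr(buf + pos, '\n', len - pos);
        size_t      limit = nl ? (size_t) (nl - buf) : len;
        const char *bs = memchr(buf + pos, '\\', limit - pos);

        if (bs == NULL)
            return nl ? (long) limit : -1;  /* nothing escaped before the LF */

        /* Skip the backslash and the byte it escapes, then search again. */
        pos = (size_t) (bs - buf) + 2;
    }
    return -1;
}
```

With few escapes, scan_memchr makes only a couple of libc calls per line. With many escapes it restarts after each one, which matches the overhead concern raised later in the thread.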

23 matches