Simon,
That part of the code was specifically written to take advantage of
processing pipelines in the hardware, not because the actual theoretical
algorithm for that approach was itself faster.
Yup, good point.
Nobody's said what compiler/hardware they have been using, so since both
Simon Riggs [EMAIL PROTECTED] writes:
Nobody's said what compiler/hardware they have been using, so since both
Alon and Tom say their character finding logic is faster, it is likely
to be down to that? Name your platforms gentlemen, please.
I tested on HPPA with gcc 2.95.3 and on a Pentium 4
Luke Lonergan [EMAIL PROTECTED] writes:
Yes, I think one thing we've learned is that there are important parts
of the code, those that are in the data path (COPY, sort, spill to
disk, etc) that are in dire need of optimization. For instance, the
fgetc() pattern should be banned everywhere in
On Wed, Aug 10, 2005 at 09:16:08AM -0700, Luke Lonergan wrote:
On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote:
Luke, I dislike whacking people upside the head, but this discussion
seems to presume that raw speed on Intel platforms is the only thing
that matters. We have a few
Luke Lonergan wrote:
Tom,
On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote:
Luke, I dislike whacking people upside the head, but this discussion
seems to presume that raw speed on Intel platforms is the only thing
that matters. We have a few other concerns. Portability,
Also, as we proved the last time the correctness argument was thrown in, we
can fix the bugs and still make it a lot faster - and I would stick to that
whether it's a PA-RISC, DEC Alpha, Intel or AMD, or even UltraSPARC.
Luke this comment doesn't work. Do you have a test case that shows that
Alvaro Herrera wrote:
Another question that comes to mind is: have you tried another compiler?
I see you are all using GCC at most 3.4; maybe the new optimizing
infrastructure in GCC 4.1 means you can have most of the speedup without
uglifying the code. What about Intel's compiler?
Alvaro,
On 8/10/05 9:46 AM, Alvaro Herrera [EMAIL PROTECTED] wrote:
AFAIR he never claimed otherwise ... his point was that to gain that
additional speedup, the code has to be made considerably worse (in
maintainability terms.) Have you (or Alon) tried to port the rest of the
speed
On Wed, Aug 10, 2005 at 12:57:18PM -0400, Bruce Momjian wrote:
Alvaro Herrera wrote:
Another question that comes to mind is: have you tried another compiler?
I see you are all using GCC at most 3.4; maybe the new optimizing
infrastructure in GCC 4.1 means you can have most of the speedup
I did some performance checks after the recent code commit.
The good news is that the parsing speed of COPY is now MUCH faster, which is
great. It is about 5 times faster - about 100MB/sec on my machine
(previously 20MB/sec at best, usually less).
The better news is that my original patch
Alon Goldshuv wrote:
I performed those measurement by executing *only the parsing logic* of the
COPY pipeline. All data conversion (FunctionCall3(string...)) and tuple
handling (form_heaptuple etc...) and insertion were manually disabled. So
the only code measured is reading from disk and
Alon Goldshuv [EMAIL PROTECTED] writes:
New patch attached. It includes very minor changes. These are changes that
were committed to CVS 3 weeks ago (copy.c 1.247) which I missed in the
previous patch.
I've applied this with (rather extensive) revisions. I didn't like what
you had done with
Tom,
Thanks for finding the bugs and reworking things.
I had some difficulty in generating test cases that weren't largely
I/O-bound, but AFAICT the patch as applied is about the same speed
as what you submitted.
You achieve the important objective of knocking the parsing stage down a
lot,
Tom,
The previous timings were for a table with 15 columns of mixed type. We
also test with 1 column to make the parsing overhead more apparent. In the
case of 1 text column with 145MB of input data:
Your patch:
Time: 6612.599 ms
Alon's patch:
Time: 6119.244 ms
Alon's patch is 7.5% faster.
Tom,
My direct e-mails to you are apparently blocked, so I'll send this to the
list.
I've attached the case we use for load performance testing, with the data
generator modified to produce a single row version of the dataset.
I do believe that you/we will need to invert the processing loop to
Luke Lonergan [EMAIL PROTECTED] writes:
I had some difficulty in generating test cases that weren't largely
I/O-bound, but AFAICT the patch as applied is about the same speed
as what you submitted.
You achieve the important objective of knocking the parsing stage down a
lot, but your parsing
Tom,
On 8/6/05 9:08 PM, Tom Lane [EMAIL PROTECTED] wrote:
Luke Lonergan [EMAIL PROTECTED] writes:
I had some difficulty in generating test cases that weren't largely
I/O-bound, but AFAICT the patch as applied is about the same speed
as what you submitted.
You achieve the important
Tom,
Thanks for pointing it out. I made the small required modifications to match
copy.c version 1.247 and sent it to -patches list. New patch is V16.
Alon.
On 8/1/05 7:51 PM, Tom Lane [EMAIL PROTECTED] wrote:
Alon Goldshuv [EMAIL PROTECTED] writes:
This patch appears to reverse out the
Alon Goldshuv [EMAIL PROTECTED] writes:
This patch appears to reverse out the most recent committed changes in
copy.c.
Which changes do you refer to? I thought I accommodated all the recent
changes (I recall some changes to the tupletable/tupleslot interface, HEADER
in cvs, and hex escapes
Luke Lonergan wrote:
Joshua,
On 7/21/05 7:53 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
Well I know that isn't true at least not with ANY of the Dells my
customers have purchased in the last 18 months. They are still really,
really slow.
That's too bad, can you cite some model numbers?
On Thu, Jul 21, 2005 at 09:19:04PM -0700, Luke Lonergan wrote:
Joshua,
On 7/21/05 7:53 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
Well I know that isn't true at least not with ANY of the Dells my
customers have purchased in the last 18 months. They are still really,
really slow.
I just ran through a few tests with the v14 patch against 100GB of data
from dbt3 and found a 30% improvement; 3.6 hours vs 5.3 hours. Just to
give a few details, I only loaded data and started a COPY in parallel
for each the data files:
Cool!
At what rate does your disk setup write sequential data, e.g.:
time dd if=/dev/zero of=bigfile bs=8k count=500000
(sized for 2x RAM on a system with 2GB)
BTW - the Compaq smartarray controllers are pretty broken on Linux from a
performance standpoint in our experience. We've had
Luke Lonergan wrote:
Cool!
At what rate does your disk setup write sequential data, e.g.:
time dd if=/dev/zero of=bigfile bs=8k count=500000
(sized for 2x RAM on a system with 2GB)
BTW - the Compaq smartarray controllers are pretty broken on Linux from a
performance standpoint in our
Joshua,
On 7/21/05 5:08 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
O.k. this strikes me as interesting, now we know that Compaq and Dell
are borked for Linux. Is there a name brand server (read Enterprise)
that actually does provide reasonable performance?
I think late model Dell (post the
Joshua,
On 7/21/05 7:53 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
Well I know that isn't true at least not with ANY of the Dells my
customers have purchased in the last 18 months. They are still really,
really slow.
That's too bad, can you cite some model numbers? SCSI?
I have great
Alon Goldshuv wrote:
I revisited my patch and removed the code duplications that were there, and
added support for CSV with buffered input, so CSV now runs faster too
(although it is not as optimized as the TEXT format parsing). So now
TEXT, CSV and BINARY are all parsed in CopyFrom(), like in
On Thu, 14 Jul 2005 17:22:18 -0700
Alon Goldshuv [EMAIL PROTECTED] wrote:
I revisited my patch and removed the code duplications that were there, and
added support for CSV with buffered input, so CSV now runs faster too
(although it is not as optimized as the TEXT format parsing). So now
Hi Mark,
I improved the data *parsing* capabilities of COPY, and didn't touch the
data conversion or data insertion parts of the code. The parsing improvement
will vary largely depending on the ratio of parsing -to- converting and
inserting.
Therefore, the speed increase really depends on the
Hi Alon,
Yeah, that helps. I just need to break up my scripts a little to just
load the data and not build indexes.
Is the following information good enough to give a guess about the data
I'm loading, if you don't mind? ;) Here's a link to my script to create
tables:
Mark,
Thanks for the info.
Yes, isolating indexes out of the picture is a good idea for this purpose.
I can't really give a guess to how fast the load rate should be. I don't
know how your system is configured, and all the hardware characteristics
(and even if I knew that info I may not be able
Mark,
You should definitely not be doing this sort of thing, I believe:
CREATE TABLE orders (
o_orderkey INTEGER,
o_custkey INTEGER,
o_orderstatus CHAR(1),
o_totalprice REAL,
o_orderdate DATE,
o_orderpriority CHAR(15),
o_clerk CHAR(15),
Whoopsies, yeah good point about the PRIMARY KEY. I'll fix that.
Mark
On Tue, 19 Jul 2005 18:17:52 -0400
Andrew Dunstan [EMAIL PROTECTED] wrote:
Mark,
You should definitely not be doing this sort of thing, I believe:
CREATE TABLE orders (
o_orderkey INTEGER,
o_custkey
Good points on all, another element in the performance expectations is the
ratio of CPU speed to I/O subsystem speed, as Alon had hinted earlier.
This patch substantially (500%) improves the efficiency of parsing in the
COPY path, which, on a 3GHz P4 desktop with a commodity disk drive
represents
Luke, Alon
OK, I'm going to apply the patch to my copy and try to get my head
around it. meanwhile:
. we should not be describing things as old or new. The person
reading the code might have no knowledge of the history, and should not
need to.
. we should not have slow and fast either. We
Luke Lonergan wrote:
Patch to update pgindent with new symbols and fix a bug in an awk section
(extra \\ in front of a ')').
Yea, that '\' wasn't needed. I applied the following patch to use //
instead of "" for patterns, and removed the unneeded backslash.
I will update the typedefs in a
Luke Lonergan wrote:
Yah - I think I fixed several mis-indented comments. I'm using vim with
tabstop=4. I personally don't like tabs in text and would prefer them
expanded using spaces, but that's a nice way to make small formatting
changes look huge in a cvs diff.
You might like to
Please change 'if(' to 'if (', and remove parentheses like this:
for(start = s; (*s != c) && (s < (start + len)) ; s++)
My only other comment is, Yow, that is a massive patch.
---
Luke Lonergan wrote:
Tom,
Is it
Luke Lonergan wrote:
Attached has spaces between if, for, and foreach and (, e.g., if( is now
if (. It definitely looks better to me :-)
Massive patch - agreed. Less bloated than it was yesterday though.
Good, thanks.
What about the Protocol version 2? Looks like it could be added back
Luke Lonergan wrote:
Bruce,
Well, there has been no discussion about removing version 2 support, so
it seems it is required.
This should do it - see attached.
Those parentheses are still there:
for (start = s; (*s != c) && (s < (start + len)) ; s++)
It should be:
for (start