Re: [PATCHES] Auto Partitioning Patch - WIP version 1

2008-03-21 Thread Luke Lonergan
Hi all,

I think the intent of the syntax / parser patch from Gavin and Jeff was to get 
consensus from PG on the syntax prior to proceeding with the next chunk of work.

The next chunk of work is now well underway - with support for ALTER TABLE 
and partitioning, along with fast inserts into the parent table.  This 
involves changes to the catalog, so we'll also need to discuss this as part of 
a submission.

GP is in the middle of merging 8.3 into our product, so it will be a few weeks 
at least before we can push any more info to the list.

Was there consensus on the syntax?  IIRC, there was a cessation of contrary 
comments on the matter.  If so, the parser patch was provided earlier - we 
could possibly refresh it.  The way it works in our dev branch now is that the 
partition syntax is turned off by default using a GUC, but is fully functional 
wrt creating rules, etc.  This allows for experimentation.

- Original Message -
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: Tom Lane [EMAIL PROTECTED]
Cc: Bruce Momjian [EMAIL PROTECTED]; Simon Riggs [EMAIL PROTECTED]; 
pgsql-patches@postgresql.org pgsql-patches@postgresql.org
Sent: Sat Mar 22 01:19:01 2008
Subject: Re: [PATCHES] Auto Partitioning Patch - WIP version 1

Hi, 


On Fri, Mar 21, 2008 at 9:23 PM, Tom Lane [EMAIL PROTECTED] wrote:


Bruce Momjian [EMAIL PROTECTED] writes:

 NikhilS wrote:
 Thanks for taking a look. But if I am not mistaken Gavin and co. are 
working
 on a much more exhaustive proposal. In light of that maybe this patch 
might not
 be needed in the first place?

 I will wait for discussion and a subsequent collective consensus 
here,
 before deciding the further course of action.

 I think it is unwise to wait on Gavin for a more complex implementation
 ---  we might end up with nothing for 8.4.  As long as your syntax is
 compatible with whatever Gavin proposed Gavin can add on to your patch
 once it is applied.


It would be equally unwise to apply a stopgap patch if we're not certain
it will be upward compatible with what we want to do later.

I haven't been through the partitioning threads at all yet, but I think
what we probably want to have when we emerge from commit fest is some
consensus on what the road map is for partitioning.



+2

Regards,
Nikhils

-- 
EnterpriseDB http://www.enterprisedb.com 


Re: [PATCHES] CopyReadLineText optimization

2008-02-23 Thread Luke Lonergan
Cool!  It's been a while since we've done the same kind of thing :-)

- Luke 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Heikki Linnakangas
 Sent: Saturday, February 23, 2008 5:30 PM
 To: pgsql-patches@postgresql.org
 Subject: [PATCHES] CopyReadLineText optimization
 
 The purpose of CopyReadLineText is to scan the input buffer, 
 and find the next newline, taking into account any escape 
 characters. It currently operates in a loop, one byte at a 
 time, searching for LF, CR, or a backslash. That's a bit 
 slow: I've been running oprofile on COPY, and I've seen 
 CopyReadLine to take around ~10% of the CPU time, and Joshua 
 Drake just posted a very similar profile to hackers.
 
 Attached is a patch that modifies CopyReadLineText so that it 
 uses memchr to speed up the scan. The nice thing about memchr 
 is that we can take advantage of any clever optimizations 
 that might be in libc or compiler.
 
 In the tests I've been running, it roughly halves the time 
 spent in CopyReadLine (including the new memchr calls), thus 
 reducing the total CPU overhead by ~5%. I'm planning to run 
 more tests with data that has backslashes and with different 
 width tables to see what the worst-case and best-case 
 performance is like. Also, it doesn't work for CSV format at 
 the moment; that needs to be fixed.
 
 5% isn't exactly breathtaking, but it's a start. I tried the 
 same trick to CopyReadAttributesText, but unfortunately it 
 doesn't seem to help there because you need to stop the 
 efficient word-at-a-time scan that memchr does (at least with 
 glibc, YMMV) whenever there's a column separator, while in 
 CopyReadLineText you get to process the whole line in one 
 call, assuming there's no backslashes.
 
 -- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
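
A minimal sketch of the memchr-based scan described above (illustrative only;
next_special is a hypothetical helper, not a function from the patch):

#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first newline, carriage return or backslash in
 * buf, or NULL if none is present.  Each memchr() call lets libc scan a
 * word at a time instead of looping byte by byte.
 */
static const char *
next_special(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    const char *cr = memchr(buf, '\r', len);
    const char *bs = memchr(buf, '\\', len);
    const char *best = nl;

    if (cr != NULL && (best == NULL || cr < best))
        best = cr;
    if (bs != NULL && (best == NULL || bs < best))
        best = bs;
    return best;
}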
 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Bitmap index scan preread using posix_fadvise (Was: There's random access and then there's random access)

2008-02-02 Thread Luke Lonergan
Nice!

- Luke


On 1/30/08 9:22 AM, Gregory Stark [EMAIL PROTECTED] wrote:

 
 Here's the WIP patch for doing prereading when doing bitmap index scans.
 
 I was performance testing it as I was developing it here:
 
 http://archives.postgresql.org/pgsql-hackers/2007-12/msg00395.php
 
 Note that this only kicks in for bitmap index scans which are kind of tricky
 to generate. I used the attached function to generate them in the post above.
 
 Also note I wouldn't expect to see much benefit unless you're on a raid array,
 even a small one. But if you are on a raid array then the benefit should be
 immediately obvious or else posix_fadvise just isn't working for you. I would
 be interested in hearing on which OSes it does or doesn't work.
 
 *If* this is the approach we want to take rather than restructure the buffer
 manager to avoid taking two trips by marking the buffer i/o-in-progress and
 saving the pinned buffer in the bitmap heap scan then this is more or less in
 final form. Aside from some autoconf tests and the documentation for the GUC I
 think it's all in there.
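
A minimal sketch of the prefetch idea described above (illustrative only;
prefetch_heap_blocks and the 8 kB block size are assumptions, not the patch's
actual interface):

#include <fcntl.h>

#define BLOCK_SIZE 8192

/*
 * Hint the kernel that the next few heap blocks the bitmap scan will visit
 * are wanted soon, so a RAID array can service several reads concurrently
 * instead of one at a time.
 */
static void
prefetch_heap_blocks(int fd, const unsigned int *blocknums, int nblocks)
{
    int     i;

    for (i = 0; i < nblocks; i++)
        (void) posix_fadvise(fd,
                             (off_t) blocknums[i] * BLOCK_SIZE,
                             BLOCK_SIZE,
                             POSIX_FADV_WILLNEED);
}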
 
 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] updated hash functions for postgresql v1

2007-10-28 Thread Luke Lonergan
We just applied this and saw a 5 percent speedup on a hash aggregation query 
with four columns in a 'group by' clause run against a single TPC-H table 
(lineitem).

CK - can you post the query?

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Simon Riggs [mailto:[EMAIL PROTECTED]
Sent:   Sunday, October 28, 2007 04:11 PM Eastern Standard Time
To: Kenneth Marshall
Cc: pgsql-patches@postgresql.org; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject:Re: [PATCHES] updated hash functions for postgresql v1

On Sun, 2007-10-28 at 13:05 -0500, Kenneth Marshall wrote:
 On Sun, Oct 28, 2007 at 05:27:38PM +, Simon Riggs wrote:
  On Sat, 2007-10-27 at 15:15 -0500, Kenneth Marshall wrote:
   Its features include a better and faster hash function.
  
  Looks very promising. Do you have any performance test results to show
  it really is faster, when compiled into Postgres? Better probably needs
  some definition also; in what way are the hash functions better?
   
  -- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
  
 The new hash function is roughly twice as fast as the old function in
 terms of straight CPU time. It uses the same design as the current
 hash but provides code paths for aligned and unaligned access as well
 as separate mixing functions for different blocks in the hash run
 instead of having one general purpose block. I think the speed will
 not be an obvious win with smaller items, but will be very important
 when hashing larger items (up to 32kb).
 
 Better in this case means that the new hash mixes more thoroughly
 which results in less collisions and more even bucket distribution.
 There is also a 64-bit variant which is still faster since it can
 take advantage of the 64-bit processor instruction set.

Ken, I was really looking for some tests that show both of the above
were true. We've had some trouble proving the claims of other algorithms
before, so I'm less inclined to take those things at face value.

I'd suggest tests with Integers, BigInts, UUID, CHAR(20) and CHAR(100).
Others may have different concerns.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
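
To make the aligned/unaligned distinction Kenneth describes concrete, here is
a toy sketch (this is not the proposed hash function, just an illustration of
dispatching between word-at-a-time and byte-at-a-time code paths):

#include <stddef.h>
#include <stdint.h>

/*
 * Toy hash -- NOT the proposed function.  It only shows the split between
 * an aligned path that reads the key 32 bits at a time and an unaligned
 * (and tail) path that reads it byte by byte.
 */
static uint32_t
toy_hash(const void *key, size_t len)
{
    uint32_t    h = 0x9e3779b9;
    const unsigned char *p = key;

    if (((uintptr_t) key & 3) == 0)
    {
        const uint32_t *k = key;    /* aligned: consume whole words */

        for (; len >= 4; len -= 4, k++)
            h = (h ^ *k) * 0x85ebca6b;
        p = (const unsigned char *) k;
    }
    for (; len > 0; len--, p++)     /* unaligned key, or the leftover tail */
        h = (h ^ *p) * 0x85ebca6b;

    return h;
}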


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] updated hash functions for postgresql v1

2007-10-28 Thread Luke Lonergan
That's on Greenplum latest.

We used this query to expose CPU heavy aggregation.

The 1GB overall TPCH size is chosen to fit into the RAM of a typical 
workstation/laptop with 2GB of RAM.  That ensures the time is spent in the CPU 
processing of the hashagg, which is what we'd like to measure here.

The PG performance will be different, but the measurement approach should be 
the same IMO.  The only suggestion to make it easier is to use 250MB scale 
factor, as we use four cores against 1GB.  The principle is the same.

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Simon Riggs [mailto:[EMAIL PROTECTED]
Sent:   Sunday, October 28, 2007 04:48 PM Eastern Standard Time
To: CK.Tan
Cc: Luke Lonergan; Kenneth Marshall; pgsql-patches@postgresql.org; [EMAIL 
PROTECTED]; [EMAIL PROTECTED]
Subject:Re: [PATCHES] updated hash functions for postgresql v1

On Sun, 2007-10-28 at 13:19 -0700, CK Tan wrote:
 Hi, this query on TPCH 1G data gets about 5% improvement.

 select count (*) from (select l_orderkey, l_partkey, l_comment,
 count(l_tax) from lineitem group by 1, 2, 3) tmpt;

 On Oct 28, 2007, at 1:17 PM, Luke Lonergan wrote:
 
  We just applied this and saw a 5 percent speedup on a hash
  aggregation query with four columns in a 'group by' clause run
  against a single TPC-H table (lineitem).
  
  CK - can you post the query? 

Is this on Postgres or Greenplum?


That looks like quite a wide set of columns.

Sounds good though. Can we get any more measurements in?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com



Re: [PATCHES] Seq scans status update

2007-06-02 Thread Luke Lonergan
Hi All,

On 5/31/07 12:40 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 BTW, we've been talking about the L2 cache effect but we don't really
 know for sure if the effect has anything to do with the L2 cache. But
 whatever it is, it's real.

The mailing list archives contain ample evidence that:
- it's definitely an L2 cache effect
- on fast I/O hardware, tests show large benefits from keeping the ring in L2

I see no reason to re-open the discussion about these points; can we accept them
as fact and continue?

- Luke
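
For readers new to the thread, a toy sketch of what keeping the ring in L2
means (illustrative only; the buffer count, size and names are hypothetical,
not the patch's):

#define RING_BUFFERS    32      /* 32 x 8 kB = 256 kB, small enough for most L2 caches */
#define RING_BUF_SIZE   8192

static char ring[RING_BUFFERS][RING_BUF_SIZE];
static int  next_slot = 0;

/*
 * Hand out scan buffers from a small fixed ring, so a large sequential
 * scan keeps touching the same 256 kB of memory instead of sweeping the
 * whole buffer cache through the CPU caches.
 */
static char *
next_ring_buffer(void)
{
    char   *buf = ring[next_slot];

    next_slot = (next_slot + 1) % RING_BUFFERS;
    return buf;
}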


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Seq scans status update

2007-05-17 Thread Luke Lonergan
Hi Heikki,

On 5/17/07 10:28 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 is also visible on larger scans that don't fit in cache with bigger I/O
 hardware, and this patch would increase the max. I/O throughput that we
 can handle on such hardware. I don't have such hardware available, I
 hope someone else will try that.

Yes, this is absolutely the case, in addition to the benefits of not
polluting the bufcache with seq scans (as discussed in detail previously).
We've adopted this (see CK's patch) with excellent results.

We can try your version on a machine with fast I/O and get back to you with
a comparison of this and CK's version.

- Luke



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Avg performance for int8/numeric

2006-11-24 Thread Luke Lonergan
So, if I understand this correctly, we're calling Alloc and ContextAlloc 10
times for every row being summed?

There are approx 10M rows and the profile snippet below shows 100M calls to
each of those.

- Luke


On 11/24/06 4:46 PM, Mark Kirkwood [EMAIL PROTECTED] wrote:

  time   seconds   seconds    calls   s/call   s/call  name
  14.42  2.16 2.16 12977 0.00 0.00  AllocSetAlloc
   9.08  3.52 1.36 2000 0.00 0.00  add_abs
   5.54  4.35 0.83 1000 0.00 0.00  slot_deform_tuple
   5.41  5.16 0.81 60001673 0.00 0.00  AllocSetFree
   4.34  5.81 0.65 1000 0.00 0.00  construct_md_array
   4.21  6.44 0.63 2003 0.00 0.00  make_result
   3.54  6.97 0.53 1000 0.00 0.00  numeric_add
   3.27  7.46 0.49 3003 0.00 0.00  set_var_from_num
   3.00  7.91 0.45 12652 0.00 0.00  MemoryContextAlloc



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Bitmap index AM

2006-06-12 Thread Luke Lonergan
Bruce,

We have a bitmap index AM in Bizgres (on PG 8.1.3) that is functional and
achieves very worthwhile (10x+) performance and space objectives.

It is a large patch, and introduces the access method along with modifying
the appropriate executor nodes.  The latter part was necessary because of
the need to bypass the in-memory bitmap index when an on-disk bitmap is
available.

Because this patch is large, how do you suggest we go through review?  Also,
there is some further work that Jie is doing to support efficient
multi-column indexes that will simplify the code, so we're not quite ready
for patch submission.

- Luke


On 6/12/06 9:13 AM, Bruce Momjian pgman@candle.pha.pa.us wrote:

 Victor Yegorov wrote:
 Hi again.
 
 Here's an updated patch, that fixes several bugs and is synced with HEAD.
 
 Are you closer to submitting this patch for application?



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by

2006-05-11 Thread Luke Lonergan
Martijn,

On 5/11/06 12:17 AM, Martijn van Oosterhout kleptog@svana.org wrote:

 Did you test it? There are some cases where this might still leave some
 noticeable overhead (high loop count). I'm just not sure if they occur
 all that often in practice...

I've sent it to our team for testing, let's see if we get some info to
forward.

We're running the 10TB TPC-H case and I'm asking for EXPLAIN ANALYZE that
might take days to complete, so we certainly have some test cases for this
;-)

- Luke



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by

2006-05-10 Thread Luke Lonergan
Nice one Martijn - we have immediate need for this, as one of our sizeable
queries under experimentation took 3 hours without EXPLAIN ANALYZE, then
over 20 hours with it...

- Luke 


On 5/9/06 2:38 PM, Martijn van Oosterhout kleptog@svana.org wrote:

 On Tue, May 09, 2006 at 05:16:57PM -0400, Rocco Altier wrote:
 - To get this close it needs to get an estimate of the sampling
 overhead. It does this by a little calibration loop that is run
 once per backend. If you don't do this, you end up assuming all
 tuples take the same time as tuples with the overhead, resulting in
 nodes apparently taking longer than their parent nodes. Incidentally,
 I measured the overhead to be about 3.6us per tuple per node on my
 (admittedly slightly old) machine.
 
 Could this be deferred until the first explain analyze?  So that we
 aren't paying the overhead of the calibration in all backends, even the
 ones that won't be explaining?
 
 If you look it's only done on the first call to InstrAlloc() which
 should be when you run EXPLAIN ANALYZE for the first time.
 
 In any case, the calibration is limited to half a millisecond (that's
 500 microseconds), and it'll be a less on fast machines.
 
 Have a nice day,
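
A minimal sketch of the calibration loop described above (illustrative only;
the function name is hypothetical, and the 500-microsecond cap is taken from
the description, not from the patch source):

#include <sys/time.h>

/*
 * Spin on gettimeofday() for at most ~500 microseconds and divide the
 * elapsed time by the number of calls made, giving an estimate of the
 * per-sample timing overhead.  Meant to run once per backend.
 */
static double
sampling_overhead(void)
{
    struct timeval start, now;
    long        elapsed_us;
    long        calls = 0;

    gettimeofday(&start, NULL);
    do
    {
        gettimeofday(&now, NULL);
        calls++;
        elapsed_us = (now.tv_sec - start.tv_sec) * 1000000L +
                     (now.tv_usec - start.tv_usec);
    } while (elapsed_us < 500);

    return (double) elapsed_us / 1000000.0 / (double) calls;   /* seconds per call */
}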



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Page at a time index scan

2006-05-08 Thread Luke Lonergan
Tom,

On 5/8/06 11:46 AM, Tom Lane [EMAIL PROTECTED] wrote:

 I made a table of 16M rows with an
 index over a random-data integer column.  With a thoroughly disordered
 index (built on-the-fly as the random data was inserted), the time to
 VACUUM after deleting a small number of rows was 615 seconds with
 yesterday's code, 31 seconds today.  With a perfectly-ordered index
 (identical table, but CREATE INDEX after all the data is in place), the
 times were about 28 and 26 seconds respectively.

Very impressive!  This corroborates findings we've had with index
maintenance in the field - thanks for finding/fixing this.

- Luke



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] WIP: further sorting speedup

2006-02-19 Thread Luke Lonergan



Cool!

We'll test this sometime soon and get back to you.  We're kind of jammed this week, hopefully we'll get some time.

So you know, we've done some more work on the external sort to remove the "tape" abstraction from the code, which makes a significant improvement.  This involved removing both the Knuth tapes, and the logtape.c codepath.  The result is a reasonable improvement in performance (tens of percent), and a dramatic reduction in the amount of code.

Since we're looking for a 4-fold improvement based on comparisons to other commercial databases, we feel we're not done yet.  Our next step (before we got jammed getting our latest MPP release out) was to implement these:
- Locate the cause for the excessive time in heap_getattr (you just did it)
- Implement something other than replacement selection for creating runs to optimize cache use

- Luke

On 2/19/06 6:40 PM, Tom Lane [EMAIL PROTECTED] wrote:

After applying Simon's recent sort patch, I was doing some profiling and
noticed that sorting spends an unreasonably large fraction of its time
extracting datums from tuples (heap_getattr or index_getattr). The
attached patch does something about this by pulling out the leading sort
column of a tuple when it is received by the sort code or re-read from a
tape. This increases the space needed by 8 or 12 bytes (depending on
sizeof(Datum)) per in-memory tuple, but doesn't cost anything as far as
the on-disk representation goes. The effort needed to extract the datum
at this point is well repaid because the tuple will normally undergo
multiple comparisons while it remains in memory. In some quick tests
the patch seemed to make for a significant speedup, on the order of 30%,
despite increasing the number of runs emitted because of the smaller
available memory.

The choice to pull out just the leading column, rather than all columns,
is driven by concerns of (a) code complexity and (b) memory space.
Having the extra columns pre-extracted wouldn't buy anything anyway
in the common case where the leading key determines the result of
a comparison.

This is still WIP because it leaks memory intra-query (I need to fix it
to clean up palloc'd space better). I thought I'd post it now in case
anyone wants to try some measurements for their own favorite test cases.
In particular it would be interesting to see what happens for a
multi-column sort with lots of duplicated keys in the first column,
which is the case where the least advantage would be gained.

Comments?

regards, tom lane
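
A minimal sketch of the leading-column idea in Tom's note (illustrative only;
the struct and function names are hypothetical, not the tuplesort.c
definitions, and the comparison shown assumes an integer leading key):

/*
 * Keep the leading sort key next to the tuple pointer, so the common case
 * -- leading keys differ -- never has to re-extract the datum.
 */
typedef struct PreExtractedTuple
{
    void   *tuple;      /* full tuple, consulted only to break ties */
    long    datum1;     /* first sort column, pulled out once */
    int     isnull1;
} PreExtractedTuple;

static int
compare_pre_extracted(const PreExtractedTuple *a,
                      const PreExtractedTuple *b,
                      int (*tiebreak) (const void *, const void *))
{
    if (a->isnull1 || b->isnull1)
        return tiebreak(a->tuple, b->tuple);    /* let the slow path handle NULLs */
    if (a->datum1 < b->datum1)
        return -1;
    if (a->datum1 > b->datum1)
        return 1;
    return tiebreak(a->tuple, b->tuple);        /* equal leading keys: compare full tuples */
}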









Re: [PATCHES] WIP: further sorting speedup

2006-02-19 Thread Luke Lonergan



The improvement was pre-Simon's patch, and it came from implementing a single-pass merge instead of a variable number of passes based on the number of tapes, as in Knuth's tape algorithm.  Also, the additional tricks in logtape.c were higher in the profile than what I see here.

Simon's patch had the effect of reducing the number of passes by increasing the number of tapes depending on the memory available, but that's a long-tail effect, as seen in figure (70?) in Knuth.

Where I'd like this to go is the implementation of a two-pass "create runs, then merge" approach, where the second merge can be avoided unless random access is needed (as discussed previously on list).

In the run creation phase, the idea would be to implement something like quicksort or another L2-cache-friendly algorithm (ideas?).

- Luke
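
A minimal sketch of the run-formation alternative mentioned above (illustrative
only; the types and the dump_run callback are hypothetical):

#include <stdlib.h>

typedef struct
{
    long    key;        /* leading sort key */
    void   *tuple;
} RunItem;

static int
run_item_cmp(const void *a, const void *b)
{
    long    ka = ((const RunItem *) a)->key;
    long    kb = ((const RunItem *) b)->key;

    return (ka > kb) - (ka < kb);
}

/*
 * Instead of replacement selection, fill the available memory with tuples,
 * sort the whole batch with a cache-friendly in-memory sort, and emit it
 * as a single run.
 */
static void
form_run(RunItem *items, size_t nitems,
         void (*dump_run) (const RunItem *, size_t))
{
    qsort(items, nitems, sizeof(RunItem), run_item_cmp);
    dump_run(items, nitems);
}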


On 2/19/06 8:19 PM, Tom Lane [EMAIL PROTECTED] wrote:

Luke Lonergan [EMAIL PROTECTED] writes:
 So you know, we've done some more work on the external sort to remove the
 "tape" abstraction from the code, which makes a significant improvement.

Improvement where? That code's down in the noise so far as I can tell.
I see results like this (with the patched code):

CPU: P4 / Xeon with 2 hyper-threads, speed 2793.08 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped)
with a unit mask of 0x01 (mandatory) count 24
samples % symbol name
147310 31.9110 tuplesort_heap_siftup
68381 14.8130 comparetup_index
34063 7.3789 btint4cmp
22573 4.8899 AllocSetAlloc
19317 4.1845 writetup_index
18953 4.1057 tuplesort_gettuple_common
18100 3.9209 mergepreread
17083 3.7006 GetMemoryChunkSpace
12527 2.7137 LWLockAcquire
11686 2.5315 LWLockRelease
6172 1.3370 tuplesort_heap_insert
5392 1.1680 index_form_tuple
5323 1.1531 PageAddItem
4943 1.0708 LogicalTapeWrite
4525 0.9802 LogicalTapeRead
4487 0.9720 LockBuffer
4217 0.9135 heapgettup
3891 0.8429 IndexBuildHeapScan
3862 0.8366 ltsReleaseBlock

It appears that a lot of the cycles blamed on tuplesort_heap_siftup are
due to cache misses associated with referencing memtuples[] entries
that have fallen out of L2 cache. Not sure how to improve that though.

regards, tom lane









Re: [PATCHES] OS X shared memory documentation

2006-02-11 Thread Luke Lonergan
Thanks for this Jim!  We'll test it and report results.
 
- Luke



From: [EMAIL PROTECTED] on behalf of Jim C. Nasby
Sent: Sat 2/11/2006 9:26 PM
To: pgsql-patches@postgresql.org
Subject: [PATCHES] OS X shared memory documentation



I finally figured out what the issue was with OS X blowing away shared
memory settings  from /etc/sysctl.conf. I added a note to the 8.1 docs
already, and here's a patch against -HEAD.
--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [BUGS] BUG #2114: (patch) COPY FROM ... end of

2005-12-27 Thread Luke Lonergan
Bruce,

On 12/27/05 10:20 AM, Bruce Momjian pgman@candle.pha.pa.us wrote:

 OK, original patch applied to HEAD and smaller version to 8.1.X, and
 regression test added, now attached.

Great, good catch.

Have you tested performance, before and after?

The only good way to test performance is using a fast enough I/O subsystem
that you are CPU-bound, which means 60MB/s of write speed.

I'd be happy to get you an account on one.

- Luke



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Luke Lonergan
Simon,
 
 That part of the code was specifically written to take advantage of
 processing pipelines in the hardware, not because the actual theoretical
 algorithm for that approach was itself faster.

Yup, good point.
 
 Nobody's said what compiler/hardware they have been using, so since both
 Alon and Tom say their character finding logic is faster, it is likely
 to be down to that? Name your platforms gentlemen, please.

In this case, we've been using gcc (3.2.3 RHEL3 Linux, 3.4.3 Solaris 10) on 
Opteron and Intel Xeon and Pentium 4.  Alon's performance comparisons for the 
parse only were done on a HT enabled P4 3.0GHz on RHEL3 with gcc 3.2.3, 
probably with optimization -O2, but possibly -O3.
 
Note that the level of microparallelism on upcoming CPUs is increasing with 
increasing pipeline depth.  Though there will be a step back on the Intel line 
with the introduction of the Centrino-based Xeon cores in 2006/7, other CPUs 
continue the trend, and I expect the next generation of multi-core CPUs to 
possibly introduce threaded micro-architectures which can also be scheduled as 
pipelines.  The gcc 4 compiler introduces auto vectorization, which may enhance 
the optimality of some loops.
 
I think the key thing is to make as much parallelism apparent to the compiler 
as possible, which will generally mean loops.  This means faster code on all 
modern CPUs and it won't hurt older CPU speeds.

 My feeling is that we may learn something here that applies more widely
 across many parts of the code.

Yes, I think one thing we've learned is that there are important parts of the 
code, those that are in the data path (COPY, sort, spill to disk, etc) that are 
in dire need of optimization.  For instance, the fgetc() pattern should be 
banned everywhere in the data path.
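
To illustrate the point about fgetc(), a standalone example (not code from the
COPY path): reading a block at a time and scanning it in a tight loop gives the
compiler and the CPU pipeline a loop they can actually optimize.

#include <stdio.h>

/*
 * Count newlines by reading 64 kB at a time and scanning the block in a
 * tight loop: one fread() per block instead of one fgetc() per byte.
 */
static long
count_newlines(FILE *fp)
{
    char        buf[65536];
    size_t      nread;
    size_t      i;
    long        lines = 0;

    while ((nread = fread(buf, 1, sizeof(buf), fp)) > 0)
    {
        for (i = 0; i < nread; i++)
            if (buf[i] == '\n')
                lines++;
    }
    return lines;
}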
 
BTW - we are tracking down (in our spare time :-() the extremely slow sort 
performance.  We're seeing sort times of 1.7MB/s on our fastest machines, even 
when the work_mem is equal to the square root of the sort set.  This is a 
*serious* problem for us and we aren't getting to it - ideas are welcome.
 
Optimization here means both the use of good fundamental algorithms and 
micro-optimization (minimize memory copies, expose long runs of operations to 
the compiler, maximize computational intensity by working in cache-resident 
blocks, etc).
 
- Luke




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Luke Lonergan
Alvaro,

On 8/10/05 9:46 AM, Alvaro Herrera [EMAIL PROTECTED] wrote:

 AFAIR he never claimed otherwise ... his point was that to gain that
 additional speedup, the code has to be made considerably worse (in
 maintainability terms.)  Have you (or Alon) tried to port the rest of the
 speed improvement to the new code?  Maybe it's possible to have at least
 some of it without worsening the maintainability too badly.

As I suggested previously, there is another, more maintainable way to get
more performance from the parsing logic.

It involves replacing something like this:


char c = input_routine();           /* one call per character */

if (c == '\n')
{
    /* ... handle end of line ... */
}
else if (c == '\\')                 /* and so on for the other special characters */
{
    /* ... */
}


With something like this:


char carr[32];
int  nread, i;

nread = Input_routine_new(carr, 32);    /* one call per 32-character block */

for (i = 0; i < nread; i++)
{
    if (carr[i] == '\n')
    {
        /* ... handle end of line ... */
    }
}


And this section would run much faster (3x?).

This is what I think could make the overall patch 50% faster than it is now
(on the parsing part).

The issue that I expect we'll hear about is that since the parsing is
already 500% faster, it has vanished in the profile.  That's why Tom's
testing is not showing much difference between his and Alon's code; we
actually have to drop the other sections to bring parsing forward to where
we see the bigger difference.

However, what I'm arguing here and elsewhere is that there's still a lot
more of this kind of optimization to be done.  12 MB/s COPY speed is not
enough.  There's 40% of the time in processing left to smack down.
  
 Another question that comes to mind is: have you tried another compiler?
 I see you are all using GCC at most 3.4; maybe the new optimizing
 infrastructure in GCC 4.1 means you can have most of the speedup without
 uglifying the code.  What about Intel's compiler?

We have routinely distributed PostgreSQL with the Intel compiler, up until
recently.  Interestingly, GCC now beats it handily in our tests on Opteron
and matches it on Xeon, which is too bad - it's my fav compiler.

The problem with this code is that it doesn't have enough micro-parallelism
without loops on the character parsing core.  The compiler can only do
register optimizations and branch prediction (poorly) unless it is given
more to work with.

 PostgreSQL needs major improvement to compete with Oracle and even MySQL on
 speed.  No whacking on the head is going to change that.
 
 Certainly.  I think the point is what cost do we want to pay for the
 speedup.  I think we all agree that even if we gain a 200% speedup by
 rewriting COPY in assembly, it's simply not acceptable.

Understood, and I totally agree.

 Another point may be that Bizgres can have a custom patch for the extra
 speedup, without inflicting the maintenance cost on the community.

We are committed to making Postgres the best DBMS for Business Intelligence.
Bizgres makes it safe for businesses to rely on open source for their
production uses.  As far as features go, I think the best way for our
customers is to make sure that Bizgres features are supporting the
PostgreSQL core and vice versa.

- Luke 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY FROM performance improvements

2005-08-06 Thread Luke Lonergan
Tom,

Thanks for finding the bugs and reworking things.

 I had some difficulty in generating test cases that weren't largely
 I/O-bound, but AFAICT the patch as applied is about the same speed
 as what you submitted.

You achieve the important objective of knocking the parsing stage down a
lot, but your parsing code is actually about 20% slower than Alon's.

Before your patch:
  Time: 14205.606 ms

With your patch:
  Time: 10565.374 ms

With Alon's patch:
  Time: 10289.845 ms

The parsing part of the code in your version is slower, but as a percentage
of the total it's hidden. The loss of 0.3 seconds on 143MB means:

- If parsing takes a total of 0.9 seconds, the parsing rate is 160MB/s
(143/0.9)

- If we add another 0.3 seconds to parsing to bring it to 1.2, then the
parsing rate becomes 120MB/s

When we improve the next stages of the processing (attribute conversion,
write to disk), this will stand out a lot more.  Our objective is to get the
COPY rate *much* faster than the current poky rate of 14MB/s (after this
patch).

- Luke

On 8/6/05 2:04 PM, Tom Lane [EMAIL PROTECTED] wrote:

 Alon Goldshuv [EMAIL PROTECTED] writes:
 New patch attached. It includes very minor changes. These are changes that
 were committed to CVS 3 weeks ago (copy.c 1.247) which I missed in the
 previous patch.
 
 I've applied this with (rather extensive) revisions.  I didn't like what
 you had done with the control structure --- loading the input buffer
 only at the outermost loop level was a bad design choice IMHO.  You had
 sprinkled the code with an unreasonable number of special cases in order
 to try to cope with the effects of that mistake, but there were lots
 of problems still left.  Some of the bugs I noticed:
 
 * Broke old-protocol COPY, since that has no provision for stopping at
 the EOF marker except by parsing the data carefully to start with.  The
 backend would just hang up unless the total data size chanced to be a
 multiple of 64K.
 
 * Subtle change in interpretation of \. EOF marker (the existing code
 will recognize it even when not at start of line).
 
 * Seems to have thrown away detection of newline format discrepancies.
 
 * Fails for zero-column tables.
 
 * Broke display of column values during error context callback (would
 always show the first column contents no matter which one is being
 complained of).
 
 * DetectLineEnd mistakenly assumes CR mode if very last character of first
 bufferload is CR; need to reserve judgment until next char is available.
 
 * DetectLineEnd fails to account for backslashed control characters,
 so it will e.g. accept \ followed by \n as determining the newline
 style.
 
 * Fails to apply encoding conversion if first line exceeds copy buf
 size, because when DetectLineEnd fails the quick-exit path doesn't do
 it.
 
 * There seem to be several bugs associated with the fact that input_buf[]
 always has 64K of data in it even when EOF has been reached on the
 input.  One example:
 echo -n 123 > zzz1
 psql create temp table t1(f1 text);
 psql copy t1 from '/home/tgl/zzz1';
 psql select * from t1;
 hmm ... where'd that 64K of whitespace come from?
 
 I rewrote the patch in a way that retained most of the speedups without
 changing the basic control structure (except for replacing multiple
 CopyReadAttribute calls with one CopyReadAttributes call per line).
 
 I had some difficulty in generating test cases that weren't largely
 I/O-bound, but AFAICT the patch as applied is about the same speed
 as what you submitted.
 
 regards, tom lane
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] COPY FROM performance improvements

2005-08-06 Thread Luke Lonergan
Tom,

The previous timings were for a table with 15 columns of mixed type.  We
also test with 1 column to make the parsing overhead more apparent.  In the
case of 1 text column with 145MB of input data:

Your patch:
  Time: 6612.599 ms

Alon's patch:
  Time: 6119.244 ms


Alon's patch is 7.5% faster here, where it was only 3% faster on the 15
column case.  This is consistent with a large difference in parsing speed
between your approach and Alon's.

I'm pretty sure that the mistake you refer to is responsible for the speed
improvement, and was deliberately chosen to minimize memory copies, etc.
Given that we're looking ahead to getting much higher speeds, approaching
current high performance disk speeds, we've been looking more closely at the
parsing speed.  It comes down to a tradeoff between elegant code and speed.

We'll prove it in lab tests soon, where we measure the parsing rate
directly, but these experiments show it clearly, though indirectly.

- Luke



On 8/6/05 2:04 PM, Tom Lane [EMAIL PROTECTED] wrote:

 Alon Goldshuv [EMAIL PROTECTED] writes:
 New patch attached. It includes very minor changes. These are changes that
 were committed to CVS 3 weeks ago (copy.c 1.247) which I missed in the
 previous patch.
 
 I've applied this with (rather extensive) revisions.  I didn't like what
 you had done with the control structure --- loading the input buffer
 only at the outermost loop level was a bad design choice IMHO.  You had
 sprinkled the code with an unreasonable number of special cases in order
 to try to cope with the effects of that mistake, but there were lots
 of problems still left.  Some of the bugs I noticed:
 
 * Broke old-protocol COPY, since that has no provision for stopping at
 the EOF marker except by parsing the data carefully to start with.  The
 backend would just hang up unless the total data size chanced to be a
 multiple of 64K.
 
 * Subtle change in interpretation of \. EOF marker (the existing code
 will recognize it even when not at start of line).
 
 * Seems to have thrown away detection of newline format discrepancies.
 
 * Fails for zero-column tables.
 
 * Broke display of column values during error context callback (would
 always show the first column contents no matter which one is being
 complained of).
 
 * DetectLineEnd mistakenly assumes CR mode if very last character of first
 bufferload is CR; need to reserve judgment until next char is available.
 
 * DetectLineEnd fails to account for backslashed control characters,
 so it will e.g. accept \ followed by \n as determining the newline
 style.
 
 * Fails to apply encoding conversion if first line exceeds copy buf
 size, because when DetectLineEnd fails the quick-exit path doesn't do
 it.
 
 * There seem to be several bugs associated with the fact that input_buf[]
 always has 64K of data in it even when EOF has been reached on the
 input.  One example:
 echo -n 123 > zzz1
 psql create temp table t1(f1 text);
 psql copy t1 from '/home/tgl/zzz1';
 psql select * from t1;
 hmm ... where'd that 64K of whitespace come from?
 
 I rewrote the patch in a way that retained most of the speedups without
 changing the basic control structure (except for replacing multiple
 CopyReadAttribute calls with one CopyReadAttributes call per line).
 
 I had some difficulty in generating test cases that weren't largely
 I/O-bound, but AFAICT the patch as applied is about the same speed
 as what you submitted.
 
 regards, tom lane
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY FROM performance improvements

2005-08-06 Thread Luke Lonergan
Tom,

My direct e-mails to you are apparently blocked, so I'll send this to the
list.

I've attached the case we use for load performance testing, with the data
generator modified to produce a single row version of the dataset.

I do believe that you/we will need to invert the processing loop to get the
maximum parsing speed.  We will be implementing much higher loading speeds,
which will require it in order to compete with Oracle, Netezza and Teradata,
so we'll have to work this out in the best interests of our users.

- Luke



IVP.tgz
Description: Binary data

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] COPY FROM performance improvements

2005-08-06 Thread Luke Lonergan
Tom,

On 8/6/05 9:08 PM, Tom Lane [EMAIL PROTECTED] wrote:

 Luke Lonergan [EMAIL PROTECTED] writes:
 I had some difficulty in generating test cases that weren't largely
 I/O-bound, but AFAICT the patch as applied is about the same speed
 as what you submitted.
 
 You achieve the important objective of knocking the parsing stage down a
 lot, but your parsing code is actually about 20% slower than Alon's.
 
 I would like to see the exact test case you are using to make this
 claim; the tests I did suggested my code is the same speed or faster.

I showed mine - you show yours :-)  Apparently our e-mail crossed.
  
 As best I can tell, my version of CopyReadAttributes is significantly
 quicker than Alon's, approximately balancing out the fact that my
 version of CopyReadLine is slower.  I did the latter first, and would
 now be tempted to rewrite it in the same style as CopyReadAttributes,
 ie one pass of memory-to-memory copy using pointers rather than buffer
 indexes.

See previous timings - looks like Alon's parsing is substantially faster.
However, I'd like him to confirm by running with the shunt placed at
different stages, in this case between parse and attribute conversion (not
attribute parse).
 
 BTW, late today I figured out a way to get fairly reproducible
 non-I/O-bound numbers about COPY FROM: use a trigger that suppresses
 the actual inserts, thus:
 
 create table foo ...
 create function noway() returns trigger as
 'begin return null; end' language plpgsql;
 create trigger noway before insert on foo
   for each row execute procedure noway();
 then repeat:
 copy foo from '/tmp/foo.data';

Cool!  That's a better way than hacking code and inserting shunts.
 
Alon will likely hit this tomorrow.

- Luke



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] COPY FROM performance improvements

2005-07-21 Thread Luke Lonergan
Cool!

At what rate does your disk setup write sequential data, e.g.:
  time dd if=/dev/zero of=bigfile bs=8k count=500000

(sized for 2x RAM on a system with 2GB)

BTW - the Compaq smartarray controllers are pretty broken on Linux from a
performance standpoint in our experience.  We've had disastrously bad
results from the SmartArray 5i and 6 controllers on kernels from 2.4 -
2.6.10, on the order of 20MB/s.

For comparison, the results on our dual opteron with a single LSI SCSI
controller with software RAID0 on a 2.6.10 kernel:

[EMAIL PROTECTED] dbfast]$ time dd if=/dev/zero of=bigfile bs=8k
count=500000
500000+0 records in
500000+0 records out

real0m24.702s
user0m0.077s
sys 0m8.794s

Which calculates out to about 161MB/s.

- Luke


On 7/21/05 2:55 PM, Mark Wong [EMAIL PROTECTED] wrote:

 I just ran through a few tests with the v14 patch against 100GB of data
 from dbt3 and found a 30% improvement; 3.6 hours vs 5.3 hours.  Just to
 give a few details, I only loaded data and started a COPY in parallel
 for each the data files:
 http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/
 
 Here's a visual of my disk layout, for those familiar with the database
 schema:
 http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/layout-dev4
 -010-dbt3.html
 
 I have 6 arrays of fourteen 15k rpm drives in a split-bus configuration
 attached to a 4-way itanium2 via 6 compaq smartarray pci-x controllers.
 
 Let me know if you have any questions.
 
 Mark
 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] COPY FROM performance improvements

2005-07-21 Thread Luke Lonergan
Joshua,

On 7/21/05 5:08 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:

 O.k. this strikes me as interesting, now we know that Compaq and Dell
 are borked for Linux. Is there a name brand server (read Enterprise)
 that actually does provide reasonable performance?

I think late model Dell (post the bad chipset problem, circa 2001-2?) and
IBM and Sun servers are fine because they all use simple SCSI adapters from
LSI or Adaptec.

The HP Smartarray is an aberration, they don't have good driver support for
Linux and as a consequence have some pretty bad problems with both
performance and stability.  On Windows they perform quite well.

Also - there are very big issues with some SATA controllers and Linux we've
seen, particularly the Silicon Image, Highpoint, and other non-Intel controllers.
Not sure about Nvidia, but the only ones I trust now are 3Ware and the
others mentioned in earlier posts.

- Luke

 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY FROM performance improvements

2005-07-21 Thread Luke Lonergan
Joshua,

On 7/21/05 7:53 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
 Well I know that isn't true at least not with ANY of the Dells my
 customers have purchased in the last 18 months. They are still really,
 really slow.

That's too bad, can you cite some model numbers?  SCSI?

 I have great success with Silicon Image as long as I am running them
 with Linux software RAID. The LSI controllers are also really nice.

That's good to hear - I gave up on Silicon Image controllers on Linux about a
year ago.  Which kernel are you using with success?  Silicon Image
controllers are the most popular, so it's important to see them supported
well, though I'd rather see more SATA headers than 2 off of the built-in
chipsets. 

- Luke



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] COPY FROM performance improvements

2005-07-19 Thread Luke Lonergan
Good points on all, another element in the performance expectations is the
ratio of CPU speed to I/O subsystem speed, as Alon had hinted earlier.

This patch substantially (500%) improves the efficiency of parsing in the
COPY path, which, on a 3GHz P4 desktop with a commodity disk drive
represents 8 of a total of 30 seconds of processing time.  So, by reducing
the parsing time from 8 seconds to 1.5 seconds, the overall COPY time is
reduced from 30 seconds to 23.5 seconds, or a speedup of about 20%.

On a dual 2.2GHz Opteron machine with a 6-disk SCSI RAID subsystem capable
of 240MB/s sequential read and writes, the ratios change and we see between
35% and 95% increase in COPY performance, with the bottleneck being CPU.
The disk is only running at about 90MB/s during this period.

I'd expect that as your CPUs slow down relative to your I/O speed, and
Itaniums or IT2s are quite slow, you should see an increased effect of the
parsing improvements.

One good way to validate the effect is to watch the I/O bandwidth using
vmstat 1 (on Linux) while the load is progressing.  When you watch that with
the unpatched source and with the patched source, if they are the same, you
should see no benefit from the patch (you are I/O limited).

If you check your underlying sequential write speed, you will be
bottlenecked at roughly half that rate when performing COPY, because of the
write-through to the WAL.

- Luke

On 7/19/05 3:51 PM, Mark Wong [EMAIL PROTECTED] wrote:

 Whoopsies, yeah good point about the PRIMARY KEY.  I'll fix that.
 
 Mark
 
 On Tue, 19 Jul 2005 18:17:52 -0400
 Andrew Dunstan [EMAIL PROTECTED] wrote:
 
 Mark,
 
 You should definitely not be doing this sort of thing, I believe:
 
 CREATE TABLE orders (
 o_orderkey INTEGER,
 o_custkey INTEGER,
 o_orderstatus CHAR(1),
 o_totalprice REAL,
 o_orderDATE DATE,
 o_orderpriority CHAR(15),
 o_clerk CHAR(15),
 o_shippriority INTEGER,
 o_comment VARCHAR(79),
 PRIMARY KEY (o_orderkey))
 
 Create the table with no constraints, load the data, then set up primary keys
 and whatever other constraints you want using ALTER TABLE. Last time I did a
 load like this (albeit 2 orders of magnitude smaller) I saw a 50% speedup
 from deferring constraint creation.
 
 
 cheers
 
 andrew
 
 
 
 Mark Wong wrote:
 
 Hi Alon,
 
 Yeah, that helps.  I just need to break up my scripts a little to just
 load the data and not build indexes.
 
 Is the following information good enough to give a guess about the data
 I'm loading, if you don't mind? ;)  Here's a link to my script to create
 tables:
 http://developer.osdl.org/markw/mt/getfile.py?id=eaf16b7831588729780645b2bb4
 4f7f23437e432&path=scripts/pgsql/create_tables.sh.in
 
 File sizes:
 -rw-r--r--  1 markw 50 2.3G Jul  8 15:03 customer.tbl
 -rw-r--r--  1 markw 50  74G Jul  8 15:03 lineitem.tbl
 -rw-r--r--  1 markw 50 2.1K Jul  8 15:03 nation.tbl
 -rw-r--r--  1 markw 50  17G Jul  8 15:03 orders.tbl
 -rw-r--r--  1 markw 50 2.3G Jul  8 15:03 part.tbl
 -rw-r--r--  1 markw 50  12G Jul  8 15:03 partsupp.tbl
 -rw-r--r--  1 markw 50  391 Jul  8 15:03 region.tbl
 -rw-r--r--  1 markw 50 136M Jul  8 15:03 supplier.tbl
 
 Number of rows:
 # wc -l *.tbl
    15000000 customer.tbl
   600037902 lineitem.tbl
          25 nation.tbl
   150000000 orders.tbl
    20000000 part.tbl
    80000000 partsupp.tbl
           5 region.tbl
     1000000 supplier.tbl
 
 Thanks,
 Mark
 
 On Tue, 19 Jul 2005 14:05:56 -0700
 Alon Goldshuv [EMAIL PROTECTED] wrote:
 
  
 
 Hi Mark,
 
 I improved the data *parsing* capabilities of COPY, and didn't touch the
 data conversion or data insertion parts of the code. The parsing
 improvement
 will vary largely depending on the ratio of parsing -to- converting and
 inserting. 
 
 Therefore, the speed increase really depends on the nature of your data:
 
 100GB file with
 long data rows (lots of parsing)
 Small number of columns (small number of attr conversions per row)
 less rows (less tuple insertions)
 
 Will show the best performance improvements.
 
 However, same file size 100GB with
 Short data rows (minimal parsing)
 large number of columns (large number of attr conversions per row)
 AND/OR
 more rows (more tuple insertions)
 
 Will show improvements but not as significant.
 In general I'll estimate 40%-95% improvement in load speed for the 1st case
 and 10%-40% for the 2nd. But that also depends on the hardware, disk speed
 etc... This is for TEXT format. As for CSV, it may be faster but not as
 much
 as I specified here. BINARY will stay the same as before.
 
 HTH
 Alon.
 
 
 
 
 
 
 On 7/19/05 12:54 PM, Mark Wong [EMAIL PROTECTED] wrote:
 

 
 On Thu, 14 Jul 2005 17:22:18 -0700
 Alon Goldshuv [EMAIL PROTECTED] wrote:
 
  
 
 I revisited my patch and removed the code duplications that were there,
 and
 added support for CSV with buffered input, so CSV now runs faster too
 (although it is not as optimized as the TEXT format parsing). So now
 TEXT,CSV and BINARY are all parsed in CopyFrom(), like in the original
 file.
   

Re: [PATCHES] backslashes in pgindent

2005-07-15 Thread Luke Lonergan
Bruce,
 I found that parentheses in gawk regular expressions require backslashes
 so they are not treated as regex groupings:
 
 $ echo '('|awk '$0 ~ /(/ {print $0}'
 awk: cmd. line:1: fatal: Unmatched ( or \(: /(/
 $ echo '('|awk '$0 ~ /\(/ {print $0}'
 (

 
 Now, it seems closing parentheses are OK because there is no open group,
 but I think I should use backslashes there too:
 
 $ echo ')'|awk '$0 ~ /)/ {print $0}'
 )
 $ echo ')'|awk '$0 ~ /\)/ {print $0}'
 
 Does your awk produce different results?  What version is it?  Mine is GNU Awk
 3.0.6.

Yes - on the last test, mine emits the ) and yours apparently does not.
The version I ran with is 3.1.4.

The escaped parenthesis in the unpatched pgindent causes the following
warning:

 $ pgindent test.c
 Hope you installed /src/tools/pgindent/indent.bsd.patch.
 awk: cmd. line:12: warning: escape sequence `\)' treated as plain `)'
 
Which implies an unnecessary escaping, which appears to function correctly
without the escape.

Cheers,

- Luke



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] backslashes in pgindent

2005-07-15 Thread Luke Lonergan
Bruce,

On 7/15/05 9:59 PM, Bruce Momjian pgman@candle.pha.pa.us wrote:

 Actually, mine returns ')' too for the last command.   I didn't copy
 that into the email.  How about the top tests?  Notice I get an error on
 the first one without the backslash.  Are you OK escaping '(' but not
 ')'?  That might be a solution.

You know, I'm not sure - I don't know the intended meaning of this line:

awk '   BEGIN   {line1 = ""; line2 = ""}
        {
                line2 = $0;
                if (NR >= 2)
                        print line1;
                if (NR >= 2 &&
                    line2 ~ "^{[ \t]*$" &&
                    line1 !~ "^struct" &&
                    line1 !~ "^enum" &&
                    line1 !~ "^typedef" &&
                    line1 !~ "^extern[ \t][ \t]*\"C\"" &&
                    line1 !~ "=" &&
                    line1 ~ "\)")
                        print "int      pgindent_func_no_var_fix;";
                line1 = line2;
        }
        END

Is the escaped paren within the double quotes meant to be a literal?

- Luke



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY fast parse patch

2005-06-02 Thread Luke Lonergan
Andrew,

 OK ... that seems fair enough. The next question is where the data being
 loaded comes from? pg_dump? How does load speed compare with using COPY's
 binary mode?

Oddly, our tests in the past have shown that binary is actually slower.

Luke



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY fast parse patch

2005-06-01 Thread Luke Lonergan
Andrew,

 I will be the first to admit that there are probably some very good
 possibilities for optimisation of this code. My impression though has been
 that in almost all cases it's fast enough anyway. I know that on some very
 modest hardware I have managed to load a 6m row TPC line-items table in just
 a few minutes. Before we start getting too hung up, I'd be interested to
 know just how much data people want to load and how fast they want it to be.
 If people have massive data loads that take hours, days or weeks then it's
 obviously worth improving if we can. I'm curious to know what size datasets
 people are really handling this way.

Files of tens of GB and up are common in data warehousing.  The issue is often:
can we load our data within the time allotted for the batch window, usually a
matter of an hour or two?

Assuming that TPC lineitem is 140Bytes/row, 6M rows in 3 minutes is 4.7
MB/s.  To load a 10GB file at that rate takes about 2/3 hour.  If one were
to restore a 300GB database, it would take 18 hours.  Maintenance operations
are impractical after a few hours; 18 is a non-starter.

In practice, we're usually replacing an Oracle system with PostgreSQL, and
the load speed difference between the two is currently embarrassing and
makes the work impractical.

- Luke



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org