Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Tom Lane
Kevin Brown <[EMAIL PROTECTED]> writes:
> In the case of pure random reads, you'll end up having to wait an
> average of half of a rotation before beginning the read.

You're assuming the conclusion.  The above is true if the disk is handed
one request at a time by a kernel that doesn't have any low-level timing
information.  If there are multiple random requests on the same track,
the drive has an opportunity to do better than that --- if it's got all
the requests in hand.
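The size of that opportunity is easy to illustrate with a small simulation (hypothetical numbers, assuming the queued requests sit at uniformly distributed angular positions on the same track): with k requests in hand, the drive starts with whichever sector comes around first, so the expected rotational wait falls from half a rotation toward 1/(k+1) of one.

```python
# Monte-Carlo illustration: expected rotational wait before the first of
# k uniformly-placed same-track requests can be serviced.  With k requests
# in hand, the drive starts at whichever comes around first, so the mean
# wait falls from 1/2 a rotation (k=1) toward 1/(k+1) of a rotation.

import random

def mean_wait(k, trials=200_000, rng=random.Random(42)):
    total = 0.0
    for _ in range(trials):
        # Head is at angle 0; requests are at uniform random angles
        # (in fractions of a rotation).  The drive services the nearest.
        total += min(rng.random() for _ in range(k))
    return total / trials

for k in (1, 2, 4, 8):
    print(f"{k} request(s): mean wait = {mean_wait(k):.3f} rotations "
          f"(theory: {1 / (k + 1):.3f})")
```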

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Kevin Brown
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> The reason this is so much more of a win than it was when ATA was
> >> designed is that in modern drives the kernel has very little clue about
> >> the physical geometry of the disk.  Variable-size tracks, bad-block
> >> sparing, and stuff like that make for a very hard-to-predict mapping
> >> from linear sector addresses to actual disk locations.  
> 
> > What I mean is that when it comes to scheduling disk activity,
> > knowledge of the specific physical geometry of the disk isn't really
> > important.
> 
> Oh?
> 
> Yes, you can probably assume that blocks with far-apart numbers are
> going to require a big seek, and you might even be right in supposing
> that a block with an intermediate number should be read on the way.
> But you have no hope at all of making the right decisions at a more
> local level --- say, reading various sectors within the same cylinder
> in an optimal fashion.  You don't know where the track boundaries are,
> so you can't schedule in a way that minimizes rotational latency.

This is true, but has to be examined in the context of the workload.

If the workload is a sequential read, for instance, then the question
becomes whether or not giving the controller a set of sequential
blocks (in block ID order) will get you maximum read throughput.
Given that the manufacturers all attempt to generate the biggest read
throughput numbers, I think it's reasonable to assume that (a) the
sectors are ordered within a cylinder such that reading block x + 1
immediately after block x will incur the smallest possible amount of
delay if requested quickly enough, and (b) the same holds true when
block x + 1 is on the next cylinder.

In the case of pure random reads, you'll end up having to wait an
average of half of a rotation before beginning the read.  Where SCSI
buys you something here is when you have sequential chunks of reads
that are randomly distributed.  The SCSI drive can determine which
block in the set to start with.  But for that to really be a big
win, the chunks themselves would have to span more than half a track
at least, else you'd have a greater than half a track gap in the
middle of your two sorted sector lists for that track (a really
well-engineered SCSI disk could take advantage of the fact that there
are multiple platters and fill the "gap" with reads from a different
platter).


Admittedly, this can be quite a big win.  With an average rotational
latency of 4 milliseconds on a 7200 RPM disk, being able to begin the
read at the earliest possible moment will shave at most 25% off the
total average random-access latency, if the average seek time is 12
milliseconds.
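A quick back-of-the-envelope check of those numbers (taking the 12 ms average seek stated above as given):

```python
# Sanity check of the rotational-latency figures above.
# Assumed inputs: 7200 RPM spindle, 12 ms average seek time.

rpm = 7200
avg_seek_ms = 12.0

rotation_ms = 60_000 / rpm            # one full revolution: ~8.33 ms
avg_rot_latency_ms = rotation_ms / 2  # a random read waits half a turn on average

total_ms = avg_seek_ms + avg_rot_latency_ms

# Best case: reordering lets the read begin as soon as the head settles,
# eliminating essentially all rotational latency.
max_savings = avg_rot_latency_ms / total_ms

print(f"rotation: {rotation_ms:.2f} ms")
print(f"avg rotational latency: {avg_rot_latency_ms:.2f} ms")
print(f"total random-access latency: {total_ms:.2f} ms")
print(f"max fraction saved: {max_savings:.0%}")
```

This comes out to roughly a quarter of the total, in line with the "at most 25%" figure above; the average saving is necessarily smaller.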

> That might be the case with respect to decisions about long seeks,
> but not with respect to rotational latency.  The kernel simply hasn't
> got the information.

True, but that should reduce the total latency by something like 17%
(on average).  Not trivial, to be sure, but not an order of magnitude,
either.


-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Tom Lane
Kevin Brown <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The reason this is so much more of a win than it was when ATA was
>> designed is that in modern drives the kernel has very little clue about
>> the physical geometry of the disk.  Variable-size tracks, bad-block
>> sparing, and stuff like that make for a very hard-to-predict mapping
>> from linear sector addresses to actual disk locations.  

> What I mean is that when it comes to scheduling disk activity,
> knowledge of the specific physical geometry of the disk isn't really
> important.

Oh?

Yes, you can probably assume that blocks with far-apart numbers are
going to require a big seek, and you might even be right in supposing
that a block with an intermediate number should be read on the way.
But you have no hope at all of making the right decisions at a more
local level --- say, reading various sectors within the same cylinder
in an optimal fashion.  You don't know where the track boundaries are,
so you can't schedule in a way that minimizes rotational latency.
You're best off to throw all the requests at the drive together and
let the drive sort it out.

This is not to say that there's not a place for a kernel-side scheduler
too.  The drive will probably have a fairly limited number of slots in
its command queue.  The optimal thing is for those slots to be filled
with requests that are in the same area of the disk.  So you can still
get some mileage out of an elevator algorithm that works on logical
block numbers to give the drive requests for nearby block numbers at the
same time.  But there's also a lot of use in letting the drive do its
own low-level scheduling.
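The division of labor described here (a kernel-side elevator feeding nearby logical block numbers into the drive's limited queue slots) can be sketched roughly as below; the queue depth and block numbers are made up for illustration:

```python
# Rough sketch of a one-directional elevator (C-SCAN-style) pass over
# logical block numbers, filling a drive's limited command-queue slots
# with requests for nearby blocks.  Depth and block numbers are
# illustrative, not taken from any real driver.

QUEUE_DEPTH = 4  # e.g. the tagged-queue slots the drive exposes

def elevator_batches(pending_blocks, head_pos, depth=QUEUE_DEPTH):
    """Yield batches of at most `depth` requests, sweeping upward in
    block-number order from the current head position, then wrapping."""
    upward = sorted(b for b in pending_blocks if b >= head_pos)
    wrapped = sorted(b for b in pending_blocks if b < head_pos)
    ordered = upward + wrapped
    for i in range(0, len(ordered), depth):
        # Each batch is handed to the drive, which may reorder it
        # internally to minimize rotational latency.
        yield ordered[i:i + depth]

requests = [730, 12, 455, 448, 91, 774, 8, 460]
for batch in elevator_batches(requests, head_pos=400):
    print(batch)
```

The kernel only groups requests by logical proximity; the final ordering within each batch is left to the drive, which is exactly the split being argued for.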

> My argument is that a sufficiently smart kernel scheduler *should*
> yield performance results that are reasonably close to what you can
> get with that feature.  Perhaps not quite as good, but reasonably
> close.  It shouldn't be an orders-of-magnitude type difference.

That might be the case with respect to decisions about long seeks,
but not with respect to rotational latency.  The kernel simply hasn't
got the information.

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Alex Turner
3ware claims that their 'software'-implemented command queueing
performs at 95% effectiveness compared to the hardware queueing on a
SCSI drive, so I would say that they agree with you.

I'm still learning, but as I read it, the bits are split across the
platters and there is only 'one' head, but it happens to be reading
from multiple platters.  The 'further' in linear distance the data is
from the current position, the longer it's going to take to get there.
This seems to be true based on a document that was circulated.  A hard
drive takes a considerable amount of time to 'find' a track on the
platter compared to the rotational speed, which would agree with the
fact that you can read 70MB/sec, but it takes up to 13ms to seek.

The ATA protocol is just how the HBA communicates with the drive;
there is no reason why the HBA can't reschedule reads and writes just
like the SCSI drive would do natively, and this is in fact what 3ware
claims.  I get the feeling, based on my own historical experience,
that generally drives don't just have a bunch of bad blocks.  This all
leads me to believe that you can predict with pretty good accuracy how
expensive it is to retrieve a given block knowing its linear
increment.
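Using the figures above (70MB/sec sustained transfer, up to 13ms to seek), the imbalance between seeking and transferring is easy to quantify:

```python
# How much streaming the 70 MB/sec transfer rate forfeits during one
# 13 ms seek, and how seek time dominates a small random read
# (8 KB, the PostgreSQL default block size).

transfer_mb_per_s = 70.0
seek_ms = 13.0
block_kb = 8.0

# Data that could have streamed past during one worst-case seek:
missed_mb = transfer_mb_per_s * (seek_ms / 1000.0)           # ~0.91 MB

# Time to actually transfer one 8 KB block once positioned:
xfer_ms = (block_kb / 1024.0) / transfer_mb_per_s * 1000.0   # ~0.11 ms

print(f"data skipped during one seek: {missed_mb:.2f} MB")
print(f"transfer time for an 8 KB block: {xfer_ms:.3f} ms")
print(f"seek / transfer ratio: {seek_ms / xfer_ms:.0f}x")
```

So for small random reads, a worst-case seek costs over a hundred times the transfer itself, which is why request ordering matters so much more than raw bandwidth here.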

Alex Turner
netEconomist

On 4/14/05, Kevin Brown <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > Kevin Brown <[EMAIL PROTECTED]> writes:
> > > I really don't see how this is any different between a system that has
> > > tagged queueing to the disks and one that doesn't.  The only
> > > difference is where the queueing happens.  In the case of SCSI, the
> > > queueing happens on the disks (or at least on the controller).  In the
> > > case of SATA, the queueing happens in the kernel.
> >
> > That's basically what it comes down to: SCSI lets the disk drive itself
> > do the low-level I/O scheduling whereas the ATA spec prevents the drive
> > from doing so (unless it cheats, ie, caches writes).  Also, in SCSI it's
> > possible for the drive to rearrange reads as well as writes --- which
> > AFAICS is just not possible in ATA.  (Maybe in the newest spec...)
> >
> > The reason this is so much more of a win than it was when ATA was
> > designed is that in modern drives the kernel has very little clue about
> > the physical geometry of the disk.  Variable-size tracks, bad-block
> > sparing, and stuff like that make for a very hard-to-predict mapping
> > from linear sector addresses to actual disk locations.
> 
> Yeah, but it's not clear to me, at least, that this is a first-order
> consideration.  A second-order consideration, sure, I'll grant that.
> 
> What I mean is that when it comes to scheduling disk activity,
> knowledge of the specific physical geometry of the disk isn't really
> important.  What's important is whether or not the disk conforms to a
> certain set of expectations.  Namely, that the general organization is
> such that addressing the blocks in block number order guarantees
> maximum throughput.
> 
> Now, bad block remapping destroys that guarantee, but unless you've
> got a LOT of bad blocks, it shouldn't destroy your performance, right?
> 
> > Combine that with the fact that the drive controller can be much
> > smarter than it was twenty years ago, and you can see that the case
> > for doing I/O scheduling in the kernel and not in the drive is
> > pretty weak.
> 
> Well, I certainly grant that allowing the controller to do the I/O
> scheduling is faster than having the kernel do it, as long as it can
> handle insertion of new requests into the list while it's in the
> middle of executing a request.  The most obvious case is when the head
> is in motion and the new request can be satisfied by reading from the
> media between where the head is at the time of the new request and
> where the head is being moved to.
> 
> My argument is that a sufficiently smart kernel scheduler *should*
> yield performance results that are reasonably close to what you can
> get with that feature.  Perhaps not quite as good, but reasonably
> close.  It shouldn't be an orders-of-magnitude type difference.
> 
> --
> Kevin Brown   [EMAIL PROTECTED]
> 


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Kevin Brown
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > I really don't see how this is any different between a system that has
> > tagged queueing to the disks and one that doesn't.  The only
> > difference is where the queueing happens.  In the case of SCSI, the
> > queueing happens on the disks (or at least on the controller).  In the
> > case of SATA, the queueing happens in the kernel.
> 
> That's basically what it comes down to: SCSI lets the disk drive itself
> do the low-level I/O scheduling whereas the ATA spec prevents the drive
> from doing so (unless it cheats, ie, caches writes).  Also, in SCSI it's
> possible for the drive to rearrange reads as well as writes --- which
> AFAICS is just not possible in ATA.  (Maybe in the newest spec...)
> 
> The reason this is so much more of a win than it was when ATA was
> designed is that in modern drives the kernel has very little clue about
> the physical geometry of the disk.  Variable-size tracks, bad-block
> sparing, and stuff like that make for a very hard-to-predict mapping
> from linear sector addresses to actual disk locations.  

Yeah, but it's not clear to me, at least, that this is a first-order
consideration.  A second-order consideration, sure, I'll grant that.

What I mean is that when it comes to scheduling disk activity,
knowledge of the specific physical geometry of the disk isn't really
important.  What's important is whether or not the disk conforms to a
certain set of expectations.  Namely, that the general organization is
such that addressing the blocks in block number order guarantees
maximum throughput.

Now, bad block remapping destroys that guarantee, but unless you've
got a LOT of bad blocks, it shouldn't destroy your performance, right?
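That intuition is easy to sanity-check with a toy model (illustrative, not measured): if a fraction f of blocks are remapped, a sequential scan pays roughly two extra long accesses per remapped block (out to the spare area and back), on top of the streaming time.

```python
# Toy model of how bad-block sparing degrades a sequential scan.
# Assumptions (illustrative, not measured): each remapped block costs
# two extra long accesses (seek to the spare area and back) at ~16 ms
# each, while normally-laid-out blocks stream at 70 MB/sec in 8 KB
# blocks.

def scan_time_s(n_blocks, bad_fraction, block_kb=8, xfer_mb_s=70.0,
                remap_penalty_ms=16.0):
    stream_s = n_blocks * (block_kb / 1024.0) / xfer_mb_s
    remap_s = n_blocks * bad_fraction * 2 * remap_penalty_ms / 1000.0
    return stream_s + remap_s

n = 1_000_000  # an ~8 GB scan
for f in (0.0, 0.0001, 0.001, 0.01):
    print(f"bad fraction {f:.4%}: {scan_time_s(n, f):.1f} s")
```

Under these assumptions a 0.01% remap rate is lost in the noise, while a 1% rate roughly quadruples the scan time, which matches the "unless you've got a LOT of bad blocks" caveat.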

> Combine that with the fact that the drive controller can be much
> smarter than it was twenty years ago, and you can see that the case
> for doing I/O scheduling in the kernel and not in the drive is
> pretty weak.

Well, I certainly grant that allowing the controller to do the I/O
scheduling is faster than having the kernel do it, as long as it can
handle insertion of new requests into the list while it's in the
middle of executing a request.  The most obvious case is when the head
is in motion and the new request can be satisfied by reading from the
media between where the head is at the time of the new request and
where the head is being moved to.

My argument is that a sufficiently smart kernel scheduler *should*
yield performance results that are reasonably close to what you can
get with that feature.  Perhaps not quite as good, but reasonably
close.  It shouldn't be an orders-of-magnitude type difference.



-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Geoffrey
Alex Turner wrote:
> Looking at the numbers, the raptor with TCQ enabled was close or beat
> the Atlas III 10k drive on most benchmarks.
>
> Naturally a 15k drive is going to be faster in many areas, but it is
> also much more expensive.  It was only 44% better on the server tests
> than the raptor with TCQ, but it costs nearly 300% more ($538 cdw.com,
> $180 newegg.com).

True, but that's a one-time expense (300%) for a 44% gain ALL the time.
"44% better" is nothing to sneeze at.  I'd easily pay the price for the
gain in a large server environment.

--
Until later, Geoffrey


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Alex Turner
Looking at the numbers, the raptor with TCQ enabled was close or beat
the Atlas III 10k drive on most benchmarks.

Naturally a 15k drive is going to be faster in many areas, but it is
also much more expensive.  It was only 44% better on the server tests
than the raptor with TCQ, but it costs nearly 300% more ($538 cdw.com,
$180 newegg.com).  Note also that the 15k drive was the only drive
that kept up with the raptor on raw transfer speed, which is going to
matter for WAL.
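For what it's worth, the raw price/performance arithmetic behind those figures works out roughly like this (prices are the 2005 quotes above; the 44% figure is from the server-suite benchmarks):

```python
# Rough price/performance comparison from the figures quoted above:
# Atlas 15k at $538, Raptor at $180, Atlas ~44% faster on the server
# tests.  Prices are the 2005 quotes from the post.

atlas_price, raptor_price = 538.0, 180.0
atlas_speedup = 1.44   # 44% better on the server tests

price_ratio = atlas_price / raptor_price       # ~3x the price
perf_per_dollar = atlas_speedup / price_ratio  # relative to the Raptor

print(f"Atlas costs {price_ratio:.2f}x as much")
print(f"Atlas perf-per-dollar vs Raptor: {perf_per_dollar:.2f}")
```

On a strict per-dollar basis the Raptor wins by about 2x, though as noted elsewhere in the thread, whether that matters depends on whether you are buying throughput or buying budget headroom.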

For those of us on a budget, a quality controller card with lots of
RAM is going to be our biggest friend because it can cache writes, and
improve performance.  The 3ware controllers seem to be universally
benchmarked as the best SATA RAID 10 controllers where database
performance is concerned.  Even the crappy tweakers.net review had the
3ware as the fastest controller for a MySQL data partition in RAID 10.

The Raptor drives can be had for as little as $180/ea, which is quite
a good price point considering they can keep up with their SCSI 10k
RPM counterparts on almost all tests with NCQ enabled (Note that 3ware
controllers _don't_ support NCQ, although they claim their HBA based
queueing is 95% as good as NCQ on the drive).

Alex Turner
netEconomist

On 4/14/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 12:14 PM
> > To: [EMAIL PROTECTED]
> > Cc: Greg Stark; pgsql-performance@postgresql.org;
> > [EMAIL PROTECTED]
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> >
> > I have put together a little head to head performance of a 15k SCSI,
> > 10k SCSI 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive
> > comparison at storage review
> >
> > http://www.storagereview.com/php/benchmark/compare_rtg_2001.ph
> > p?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devI
> > D_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5
> > =248&devCnt=6
> >
> > It does illustrate some of the weaknesses of SATA drives, but all in
> > all the Raptor drives put on a good show.
> > [...]
> 
> I think it's a little misleading that your tests show 0ms seek times
> for some of the write tests.  The environmental test also selects a
> missing data point as the winner.  Besides that, it seems to me that
> seek time is one of the most important features for a DB server, which
> means that the SCSI drives are the clear winners and the non-WD SATA
> drives are the embarrassing losers.  Transfer rate is important, but
> perhaps less so because DBs tend to read/write small blocks rather
> than large files.  On the server suite, which seems to me to be the
> most relevant for DBs, the Atlas 15k spanks the other drives by a
> fairly large margin (especially the lesser SATA drives).  When you
> ignore the "consumer app" benchmarks, I wouldn't be so confident in
> saying that the Raptors "put on a good show".
> 
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East,  Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
> 


Re: [PERFORM] 8.0.1 much slower than 7.4.2?

2005-04-14 Thread Bruce Momjian

I would ask this on the jdbc mailling list.  They might know.

---

anon permutation wrote:
> 
> Hi,
> 
> I have just upgraded our db from 7.4.2 to 8.0.1 and we are doing some 
> testing.  For some reason, we have discovered that  our application performs 
> much slower on 8.0.1.
> 
> My initial reaction was to turn on  log_min_duration_statement to see what's 
> happening.  However, log_min_duration_statement does not work for JDBC 
> clients in 8.0.1.
> 
> As a result, I modified log_statement to all.  Without my application doing 
> anything, I see statements below being executed non-stop.  Who is triggering 
> these statemetns?  Is this normal?  What am I doing wrong?
> 
> I am using Fedora Core 1 - Kernel: 2.4.22-1.2174.nptl
> 
> Please help.  Thanks.
> 
> 
> 
> 
> 2005-04-04 18:05:00 CST PARSELOG:  statement: SELECT attnotnull FROM
> pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2
> 2005-04-04 18:05:00 CST PARSELOG:  statement: SELECT def.adsrc FROM
> pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid)
> LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND
> a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc
> LIKE '%nextval(%'
> [the same two statements repeat continuously in the log]
> 2005-04-04 18:05:00 CST PARSELOG:  statement: SELECT attnotnull FROM
> pg_catalog.pg_attribute WHERE attrelid = $1 AND at

Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Alex Turner
Just to clarify, these are tests from http://www.storagereview.com, not
my own.  I guess they couldn't get numbers for those parts.  I think
everyone understands that a 0ms seek time is impossible and indicates a
missing data point.

Thanks,

Alex Turner
netEconomist

On 4/14/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 12:14 PM
> > To: [EMAIL PROTECTED]
> > Cc: Greg Stark; pgsql-performance@postgresql.org;
> > [EMAIL PROTECTED]
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> >
> > I have put together a little head to head performance of a 15k SCSI,
> > 10k SCSI 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive
> > comparison at storage review
> >
> > http://www.storagereview.com/php/benchmark/compare_rtg_2001.ph
> > p?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devI
> > D_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5
> > =248&devCnt=6
> >
> > It does illustrate some of the weaknesses of SATA drives, but all in
> > all the Raptor drives put on a good show.
> > [...]
> 
> I think it's a little misleading that your tests show 0ms seek times
> for some of the write tests.  The environmental test also selects a
> missing data point as the winner.  Besides that, it seems to me that
> seek time is one of the most important features for a DB server, which
> means that the SCSI drives are the clear winners and the non-WD SATA
> drives are the embarrassing losers.  Transfer rate is important, but
> perhaps less so because DBs tend to read/write small blocks rather
> than large files.  On the server suite, which seems to me to be the
> most relevant for DBs, the Atlas 15k spanks the other drives by a
> fairly large margin (especially the lesser SATA drives).  When you
> ignore the "consumer app" benchmarks, I wouldn't be so confident in
> saying that the Raptors "put on a good show".
> 
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East,  Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
> 


Re: [PERFORM] speed of querry?

2005-04-14 Thread Dave Held
> -Original Message-
> From: Joel Fradkin [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 11:39 AM
> To: 'Tom Lane'; 'Dawid Kuroczko'
> Cc: 'PERFORM'
> Subject: Re: [PERFORM] speed of querry?
> 
> 
> I did as described to alter the table and did not see any
> difference in speed. I am trying to undo the symbolic
> link to the data array and set it up on raid 5 disks in
> the machine, just to test if there is an issue with the
> config of the raid 10 array or a problem with the controller.
>
> I am kinda lame at Linux, so not sure I have got it yet; still
> testing. I am still kind of puzzled why it chose two different options,
> but one is running the Windows version of postgres, so maybe that
> has something to do with it.

That sounds like a plausible explanation.  However, it could
simply be that the statistics gathered on each box are
sufficiently different to cause different plans.

> The data bases and configs (as far as page cost) are the same.

Did you do as Dawid suggested?

> [...]
> Then do a query couple of times (EXPLAIN ANALYZE also :)), then
> do:
> SET enable_seqscan = off;
> and rerun the query -- if it was significantly faster, you will
> want to do:
> SET enable_seqscan = on;
> and tweak:
> SET random_page_cost = 2.1;
> ...and play with values.  When you reach the random_page_cost
> which suits your data, you will want to put it into
> postgresql.conf
> [...]

This is above and beyond toying with the column statistics.  You
are basically telling the planner to use an index.  Try this,
and post the EXPLAIN ANALYZE for the seqscan = off case on the
slow box if it doesn't speed things up for you.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
Well, so far I have 1.5 hours with commandprompt.com and 8+ hours with
Dell, and have not seen any changes in the speed of my query.

I did move the database to the raid 5 drives and did see a 1 second
improvement, from 13 secs to 12 secs (keep in mind it runs in 6 on the
Optiplex).

The Dell guy ran Bonnie and found 40MB per sec read/write speed for the
arrays.

He also installed version 8.0.2 (went fine on AS4; he had to uninstall
8.0.1 first).

He is going to get a 6650 in his test lab to see what he can figure out.
I will say both commandprompt.com and Dell have been very professional,
and I am impressed at the level of support available for Redhat from
Dell and postgres. As always, I still feel this list has been my most
useful asset, but I am glad there are folks to call on. I am trying to
go live soon and need to get this resolved.

I told the guy from Dell it makes no sense that a Windows 2.4GHz
single-proc machine with 750 meg of ram can go faster than a 4-proc
(3GHz) 8 gig machine. Both databases were restored from the same file.
Same view, etc.

Config files are set the same except for the amount of cached ram,
although commandprompt.com had me adjust a few items that should help
going into production, but the planning stuff is basically the same.

This view returns in 3 secs on MSSQL server on the Optiplex (750 meg
2.4GHz box), 6 secs using postgres on Windows, and 12-13 secs on the 4
processor box. Needless to say I am very frustrated. Maybe Dell will
turn up something testing in their lab. It took a bit of perseverance
to get to the right guy at Dell (the first guy actually told me to load
it all on a like machine, and if it was very much slower than my
original they would pursue it; otherwise it was not an issue. I was
like, the machine cost 30K, are you going to send me one to test that?).
But seriously, I am open to trying anything (loading AS3, using
postgres 7.4). The fellow at Dell does not think it is a hardware
problem, so if it is Linux (could very well be, but he seemed very
sharp and did not come up with anything yet) or postgres config (again,
Josh at commandprompt.com was very sharp), then what do I do now to
isolate the issue? At least they are loading one in the lab (in theory;
I can't send them my database, so who knows what they will test). Dell
changed the file system to ext2; is that going to bite me in the butt?
It did not seem to change the speed of my explain analyze.

Joel Fradkin
 

Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values should it keep for
> statistics per column.  The config default_statistics_target
> is the default (= used when creating table) and ALTER... is
> a way to change it later.

Not quite.  default_statistics_target is the value used by ANALYZE for
any column that hasn't had an explicit ALTER SET STATISTICS done on it.
So you can change default_statistics_target and that will affect
existing tables.

(It used to work the way you are saying, but that was a few releases
back...)

regards, tom lane




Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Joshua D. Drake
Steve Poe wrote:
> If SATA drives don't have the ability to replace SCSI for multi-user
> Postgres apps, but you needed to save on cost (ALWAYS an issue),
> could/would you implement SATA for your logs (pg_xlog) and keep the
> rest on SCSI?
>
> Steve Poe

I don't think it is a matter of not having the ability.  SATA, all in
all, is fine as long as it is battery backed.  It isn't as high
performing as SCSI, but who says it has to be?

There are plenty of companies running databases on SATA without issue.
Would I put it on a database that is expecting to have 500 connections
at all times?  No.  Then again, if you have an application with that
requirement, you have the money to buy a big fat SCSI array.

Sincerely,

Joshua D. Drake
> Mohan, Ross wrote:
>> I've been doing some reading up on this, trying to keep up here, and
>> have found out that (experts, just yawn and cover your ears)
>>
>> 1) some SATA drives (just type II, I think?) have a "Phase Zero"
>>    implementation of Tagged Command Queueing (the special sauce
>>    for SCSI).
>> 2) This SATA "TCQ" is called NCQ and I believe it basically
>>    allows the disk software itself to do the reordering
>>    (this is called "simple" in TCQ terminology).  It does not
>>    yet allow the TCQ "head of queue" command, allowing the
>>    current tagged request to go to head of queue, which is
>>    a simple way of manifesting a "high priority" request.
>> 3) SATA drives are not yet multi-initiator?
>>
>> Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
>> are likely to whomp SATA II drives for a while yet (read: a
>> year or two) in multiuser PostGres applications.
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Kevin Brown <[EMAIL PROTECTED]> writes:

> Greg Stark wrote:
>
> > I think you're being misled by analyzing the write case.
> >
> > Consider the read case. When a user process requests a block and
> > that read makes its way down to the driver level, the driver can't
> > just put it aside and wait until it's convenient. It has to go
> > ahead and issue the read right away.
>
> Well, strictly speaking it doesn't *have* to.  It could delay for a
> couple of milliseconds to see if other requests come in, and then
> issue the read if none do.  If there are already other requests
> being fulfilled, then it'll schedule the request in question just
> like the rest.

But then the cure is worse than the disease. You're basically
describing exactly what does happen anyways, only you're delaying
more requests than necessary. That intervening time isn't really
idle, it's filled with all the requests that were delayed during the
previous large seek...

> Once the first request has been fulfilled, the driver can now
> schedule the rest of the queued-up requests in disk-layout order.
>
> I really don't see how this is any different between a system that
> has tagged queueing to the disks and one that doesn't.  The only
> difference is where the queueing happens.

And *when* it happens. Instead of being able to issue requests while
a large seek is happening and having some of them satisfied they have
to wait until that seek is finished and get acted on during the next
large seek.

If my theory is correct then I would expect bandwidth to be
essentially equivalent but the latency on SATA drives to be increased
by about 50% of the average seek time. Ie, while a busy SCSI drive
can satisfy most requests in about 10ms a busy SATA drive would
satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM
SCSI drives have even lower seek times and no such IDE/SATA drives
exist...)

In reality higher latency feeds into a system feedback loop causing
your application to run slower causing bandwidth demands to be lower
as well. It's often hard to distinguish root causes from symptoms
when optimizing complex systems.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [Fwd: Re: [PERFORM] Functionscan estimates]

2005-04-14 Thread Alvaro Herrera
On Thu, Apr 14, 2005 at 10:39:03AM -0700, elein wrote:

> All functions could have a cost associated with them, set by the writer of
> the function in order for the planner to reorder function calls.
> The stonebraker airplane level example was:
>   select ... from ... where f(id) = 3 and expensive_image_function(img)
> The idea, of course is to weight the expensive function so it was
> pushed to the end of the execution.

So there was only a constant cost associated with the function?  No
estimator function, for example?

-- 
Alvaro Herrera (<[EMAIL PROTECTED]>)
"If you have nothing to say, maybe you need just the right tool to help you
not say it."   (New York Times, about Microsoft PowerPoint)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Dave Held
> -Original Message-
> From: Mohan, Ross [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 1:30 PM
> To: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> Greg Stark wrote:
> > 
> > Kevin Brown <[EMAIL PROTECTED]> writes:
> > 
> > > Greg Stark wrote:
> > > 
> > > > I think you're being misled by analyzing the write case.
> > > > 
> > > > Consider the read case. When a user process requests a block
> > > > and that read makes its way down to the driver level, the 
> > > > driver can't just put it aside and wait until it's convenient.
> > > > It has to go ahead and issue the read right away.
> > > 
> > > Well, strictly speaking it doesn't *have* to.  It could delay
> > > for a couple of milliseconds to see if other requests come in,
> > > and then issue the read if none do.  If there are already other 
> > > requests being fulfilled, then it'll schedule the request in
> > > question just like the rest.
> >
> > But then the cure is worse than the disease. You're basically 
> > describing exactly what does happen anyways, only you're 
> > delaying more requests than necessary. That intervening time 
> > isn't really idle, it's filled with all the requests that 
> > were delayed during the previous large seek...
> > [...]
> 
> [...]
> 1) some SATA drives (just type II, I think?) have a "Phase Zero"
> implementation of Tagged Command Queueing (the special sauce
> for SCSI).
> [...]
> Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
> are likely to whomp SATA II drives for a while yet (read: a
> year or two) in multiuser PostGres applications. 

I would say it depends on the OS.  What Kevin is describing sounds
just like the Anticipatory I/O Scheduler in Linux 2.6:

http://www.linuxjournal.com/article/6931

For certain application contexts, it looks like a big win.  Not
entirely sure if Postgres is one of them, though.  If SCSI beats
SATA, it sounds like it will be mostly due to better seek times.
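
[Editor's note: the anticipatory idea in the article above can be sketched
in a few lines. This is a toy illustration only, not the Linux 2.6
implementation; the 100-block "nearness" window is an arbitrary assumption.]

```python
# Toy sketch of anticipatory I/O scheduling: after finishing a request,
# briefly wait for a nearby follow-up before seeking away to a distant
# queued request.

def schedule(last_pos, queued, anticipated):
    """Pick the next block to service.

    last_pos    -- block number just serviced
    queued      -- distant requests already waiting
    anticipated -- request (or None) that arrived during the wait window
    """
    # A request close to the head costs almost nothing to service now,
    # compared to a long seek out to the queued work.
    if anticipated is not None and abs(anticipated - last_pos) < 100:
        return anticipated
    # Otherwise fall back to the nearest queued request (elevator-style).
    return min(queued, key=lambda b: abs(b - last_pos))

# A sequential reader at block 1000 vs. distant random requests:
print(schedule(1000, [50000, 90000], 1008))   # nearby arrival wins: 1008
print(schedule(1000, [50000, 90000], None))   # no arrival: seek to 50000
```

The win comes exactly in the sequential-read case discussed in this
thread; for purely random workloads the wait window is wasted time.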

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Steve Poe
If SATA drives don't have the ability to replace SCSI for multi-user
Postgres apps, but you needed to save on cost (ALWAYS an issue),
could/would you implement SATA for your logs (pg_xlog) and keep the rest
on SCSI?

Steve Poe
Mohan, Ross wrote:
I've been doing some reading up on this, trying to keep up here, 
and have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a "Phase Zero"
   implementation of Tagged Command Queueing (the special sauce
   for SCSI).
2) This SATA "TCQ" is called NCQ and I believe it basically
   allows the disk software itself to do the reordering
   (this is called "simple" in TCQ terminology) It does not
   yet allow the TCQ "head of queue" command, allowing the
   current tagged request to go to head of queue, which is
   a simple way of manifesting a "high priority" request.
3) SATA drives are not yet multi-initiator?
Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multiuser PostGres applications. 


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?
Kevin Brown <[EMAIL PROTECTED]> writes:

> Greg Stark wrote:
>
> > I think you're being misled by analyzing the write case.
> >
> > Consider the read case. When a user process requests a block and
> > that read makes its way down to the driver level, the driver can't
> > just put it aside and wait until it's convenient. It has to go
> > ahead and issue the read right away.
>
> Well, strictly speaking it doesn't *have* to.  It could delay for a
> couple of milliseconds to see if other requests come in, and then
> issue the read if none do.  If there are already other requests
> being fulfilled, then it'll schedule the request in question just
> like the rest.

But then the cure is worse than the disease. You're basically
describing exactly what does happen anyways, only you're delaying
more requests than necessary. That intervening time isn't really
idle, it's filled with all the requests that were delayed during the
previous large seek...

> Once the first request has been fulfilled, the driver can now
> schedule the rest of the queued-up requests in disk-layout order.
>
> I really don't see how this is any different between a system that
> has tagged queueing to the disks and one that doesn't.  The only
> difference is where the queueing happens.

And *when* it happens. Instead of being able to issue requests while
a large seek is happening and having some of them satisfied they have
to wait until that seek is finished and get acted on during the next
large seek.

If my theory is correct then I would expect bandwidth to be
essentially equivalent but the latency on SATA drives to be increased
by about 50% of the average seek time. Ie, while a busy SCSI drive
can satisfy most requests in about 10ms a busy SATA drive would
satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM
SCSI drives have even lower seek times and no such IDE/SATA drives
exist...)

In reality higher latency feeds into a system feedback loop causing
your application to run slower causing bandwidth demands to be lower
as well. It's often hard to distinguish root causes from symptoms
when optimizing complex systems.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread PFC

> The real question is whether you choose the single 15kRPM drive or
> additional drives at 10kRPM... Additional spindles would give a much
> bigger bandwidth improvement but questionable latency improvement.

And the bonus question: expensive fast drives as a RAID for everything,
or for the same price many more slower drives (even SATA), so you can
put the transaction log, tables, and indexes all on separate physical
drives? Like put one very frequently used table on its own disk?

For the same amount of money, which one would be more interesting?

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Mohan, Ross
I've been doing some reading up on this, trying to keep up here, 
and have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a "Phase Zero"
implementation of Tagged Command Queueing (the special sauce
for SCSI).
2) This SATA "TCQ" is called NCQ and I believe it basically
allows the disk software itself to do the reordering
(this is called "simple" in TCQ terminology) It does not
yet allow the TCQ "head of queue" command, allowing the
current tagged request to go to head of queue, which is
a simple way of manifesting a "high priority" request.

3) SATA drives are not yet multi-initiator?

Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multiuser PostGres applications. 



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Kevin Brown <[EMAIL PROTECTED]> writes:

> Greg Stark wrote:
> 
> 
> > I think you're being misled by analyzing the write case.
> > 
> > Consider the read case. When a user process requests a block and 
> > that read makes its way down to the driver level, the driver can't 
> > just put it aside and wait until it's convenient. It has to go ahead 
> > and issue the read right away.
> 
> Well, strictly speaking it doesn't *have* to.  It could delay for a 
> couple of milliseconds to see if other requests come in, and then 
> issue the read if none do.  If there are already other requests being 
> fulfilled, then it'll schedule the request in question just like the 
> rest.

But then the cure is worse than the disease. You're basically describing 
exactly what does happen anyways, only you're delaying more requests than 
necessary. That intervening time isn't really idle, it's filled with all the 
requests that were delayed during the previous large seek...

> Once the first request has been fulfilled, the driver can now schedule 
> the rest of the queued-up requests in disk-layout order.
> 
> I really don't see how this is any different between a system that has 
> tagged queueing to the disks and one that doesn't.  The only 
> difference is where the queueing happens.

And *when* it happens. Instead of being able to issue requests while a large 
seek is happening and having some of them satisfied they have to wait until 
that seek is finished and get acted on during the next large seek.

If my theory is correct then I would expect bandwidth to be essentially 
equivalent but the latency on SATA drives to be increased by about 50% of the 
average seek time. Ie, while a busy SCSI drive can satisfy most requests in 
about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that 
that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such 
IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing your 
application to run slower causing bandwidth demands to be lower as well. It's 
often hard to distinguish root causes from symptoms when optimizing complex 
systems.

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings



Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
I did as described to alter the table and did not see any difference in
speed. I am trying to undo the symbolic link to the data array and set it
up on RAID 5 disks in the machine, just to test whether there is an issue
with the config of the RAID 10 array or a problem with the controller.

I am kinda lame at Linux, so I'm not sure I have it right yet; still testing.
Still kind of puzzled why it chose two different options, but one is
running the Windows version of Postgres, so maybe that has something to
do with it.

The databases and configs (as far as page cost) are the same.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 11:21 AM
To: Dawid Kuroczko
Cc: Joel Fradkin; PERFORM
Subject: Re: [PERFORM] speed of querry? 

Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values should it keep for
> statistics per column.  The config default_statistics_target
> is the default (= used when creating table) and ALTER... is
> a way to change it later.

Not quite.  default_statistics_target is the value used by ANALYZE for
any column that hasn't had an explicit ALTER SET STATISTICS done on it.
So you can change default_statistics_target and that will affect
existing tables.

(It used to work the way you are saying, but that was a few releases
back...)

regards, tom lane


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Dave Held
> -Original Message-
> From: Greg Stark [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 12:55 PM
> To: [EMAIL PROTECTED]
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> "Matthew Nuzum" <[EMAIL PROTECTED]> writes:
> 
> > drive A has capacity C and spins at 15K rpms, and
> > drive B has capacity 2 x C and spins at 10K rpms and
> > all other features are the same, the price is the same and 
> > C is enough disk space which would you choose?
> 
> In this case you always choose the 15k RPM drive, at least 
> for Postgres.  The 15kRPM reduces the latency which improves
> performance when fsyncing transaction commits.

I think drive B is clearly the best choice.  Matt said "all
other features are the same", including price.  I take that to
mean that the seek time and throughput are also identical.
However, I think it's fairly clear that there is no such pair
of actual devices.  If Matt really meant that they have the same
cache size, interface, etc, then I would agree with you.  The
15k drive is likely to have the better seek time.

> The real question is whether you choose the single 15kRPM 
> drive or additional drives at 10kRPM... Additional spindles
> would give a much bigger bandwidth improvement but questionable
> latency improvement.

Under the assumption that the seek times and throughput are
realistic rather than contrived as in the stated example, I would
say the 15k drive is the likely winner.  It probably has the
better seek time, and it seems that latency is more important
than bandwidth for DB apps.

> > Would the increased data density of the higher capacity drive
> > be of greater benefit than the faster spindle speed of drive
> > A?
> 
> actually a 2xC capacity drive probably just has twice as many 
> platters which means it would perform identically to the C
> capacity drive.  If it has denser platters that might improve
> performance slightly.

Well, according to the paper referenced by Richard, twice as many
platters means that it probably has slightly worse seek time
(because of the increased mass of the actuator/rw-head).  Yet
another reason why the smaller drive might be preferable.  Of
course, the data density is certainly a factor, as you say.  But
since the drives are within a factor of 2, it seems likely that
real drives would have comparable densities.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Joshua D. Drake


> Our vendor is trying to sell us on an Intel SRCS16 SATA raid
> controller instead of the 3ware one.

Well, I have never even heard of it. 3ware is the de facto authority on
reasonable SATA RAID. If you were to go with a different brand, I would
go with LSI. The LSI 150-6 is a nice card with a battery backup option
as well.

Oh, and 3ware has a BBU for certain models as well.

Sincerely,
Joshua D. Drake
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Tom Lane
"Matthew Nuzum" <[EMAIL PROTECTED]> writes:
> So if you all were going to choose between two hard drives where:
> drive A has capacity C and spins at 15K rpms, and
> drive B has capacity 2 x C and spins at 10K rpms and
> all other features are the same, the price is the same and C is enough
> disk space which would you choose?

> I've noticed that on IDE drives, as the capacity increases the data
> density increases and there is a pereceived (I've not measured it)
> performance increase.

> Would the increased data density of the higher capacity drive be of
> greater benefit than the faster spindle speed of drive A?

Depends how they got the 2x capacity increase.  If they got it by
increased bit density --- same number of tracks, but more sectors
per track --- then drive B actually has a higher transfer rate,
because in one rotation it can transfer twice as much data as drive A.
More tracks per cylinder (ie, more platters) can also be a speed win
since you can touch more data before you have to seek to another
cylinder.  Drive B will lose if the 2x capacity was all from adding
cylinders (unless its seek-time spec is way better than A's ... which
is unlikely but not impossible, considering the cylinders are probably
closer together).

Usually there's some-of-each involved, so it's hard to make any
definite statement without more facts.
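
[Editor's note: the density point above is easy to check with
back-of-the-envelope arithmetic. The drive figures below are purely
illustrative, not from any real datasheet.]

```python
# If the 2x capacity comes from bit density (more sectors per track),
# sequential transfer rate scales with data per rotation: a denser
# 10kRPM drive can out-transfer a sparser 15kRPM drive.

def transfer_rate_mb_s(rpm, sectors_per_track, sector_bytes=512):
    rotations_per_sec = rpm / 60.0
    return sectors_per_track * sector_bytes * rotations_per_sec / 1e6

drive_a = transfer_rate_mb_s(15000, 500)    # capacity C, sparser tracks
drive_b = transfer_rate_mb_s(10000, 1000)   # 2 x C via denser tracks

print(round(drive_a, 1), "MB/s")   # 64.0 MB/s
print(round(drive_b, 1), "MB/s")   # 85.3 MB/s
```

Seek and rotational latency still favor the 15kRPM drive, of course;
this only quantifies the transfer-rate side of the tradeoff.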

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] recovery after long delete

2005-04-14 Thread Greg Stark

Markus Bertheau <[EMAIL PROTECTED]> writes:

> How does oracle do that? Has all this something to do with mvcc? Why
> does it take oracle so long to recover?

Postgres does "pessimistic MVCC" where it keeps the old versions where they
are in the table. Only after it's committed can they be cleaned up and reused.
So aborting is a noop but committing requires additional cleanup (which is put
off until vacuum runs).

Oracle does "optimistic MVCC" where it assumes most transactions will commit
and most transactions will be reading mostly committed data. So it immediately
does all the cleanup for the commit. It stores the old version in separate
storage spaces called the rollback segment and redo logs. Committing is a noop
(almost, there are some details, search for "delayed block cleanout") whereas
rolling back requires copying back all that old data from the redo logs back
to the table.

Engineering is all about tradeoffs.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Greg Stark
Kevin Brown <[EMAIL PROTECTED]> writes:

> Greg Stark wrote:
> 
> 
> > I think you're being misled by analyzing the write case.
> > 
> > Consider the read case. When a user process requests a block and
> > that read makes its way down to the driver level, the driver can't
> > just put it aside and wait until it's convenient. It has to go ahead
> > and issue the read right away.
> 
> Well, strictly speaking it doesn't *have* to.  It could delay for a
> couple of milliseconds to see if other requests come in, and then
> issue the read if none do.  If there are already other requests being
> fulfilled, then it'll schedule the request in question just like the
> rest.

But then the cure is worse than the disease. You're basically describing
exactly what does happen anyways, only you're delaying more requests than
necessary. That intervening time isn't really idle, it's filled with all the
requests that were delayed during the previous large seek...

> Once the first request has been fulfilled, the driver can now schedule
> the rest of the queued-up requests in disk-layout order.
> 
> I really don't see how this is any different between a system that has
> tagged queueing to the disks and one that doesn't.  The only
> difference is where the queueing happens.  

And *when* it happens. Instead of being able to issue requests while a large
seek is happening and having some of them satisfied they have to wait until
that seek is finished and get acted on during the next large seek.

If my theory is correct then I would expect bandwidth to be essentially
equivalent but the latency on SATA drives to be increased by about 50% of the
average seek time. Ie, while a busy SCSI drive can satisfy most requests in
about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that
that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such
IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing your
application to run slower causing bandwidth demands to be lower as well. It's
often hard to distinguish root causes from symptoms when optimizing complex
systems.
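
[Editor's note: the arithmetic behind the 10ms vs 15ms figures is just
this, assuming a ~10ms average seek dominates service time on a busy
drive.]

```python
# Greg's model: bandwidth equivalent, but without in-drive queueing a
# request waits out the seek in progress, adding ~50% of the average
# seek time to its latency. Figures are assumed, not measured.

avg_seek_ms = 10.0
scsi_latency_ms = avg_seek_ms              # drive reorders via TCQ
sata_latency_ms = avg_seek_ms * 1.5        # + ~50% of the average seek

print(f"busy SCSI drive: ~{scsi_latency_ms:.0f} ms per request")
print(f"busy SATA drive: ~{sata_latency_ms:.0f} ms per request")
```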

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Greg Stark
"Matthew Nuzum" <[EMAIL PROTECTED]> writes:

> drive A has capacity C and spins at 15K rpms, and
> drive B has capacity 2 x C and spins at 10K rpms and
> all other features are the same, the price is the same and C is enough
> disk space which would you choose?

In this case you always choose the 15k RPM drive, at least for Postgres.
The 15kRPM reduces the latency which improves performance when fsyncing
transaction commits.

The real question is whether you choose the single 15kRPM drive or additional
drives at 10kRPM... Additional spindles would give a much bigger bandwidth
improvement but questionable latency improvement.

> Would the increased data density of the higher capacity drive be of
> greater benefit than the faster spindle speed of drive A?

actually a 2xC capacity drive probably just has twice as many platters which
means it would perform identically to the C capacity drive. If it has denser
platters that might improve performance slightly.
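
[Editor's note: the rotational-latency advantage of the 15kRPM drive,
which is what matters for the fsync-on-commit point above, is a
half-revolution on average.]

```python
# Average rotational latency (half a revolution) for the spindle speeds
# discussed in this thread.

def avg_rotational_latency_ms(rpm):
    ms_per_rev = 60_000.0 / rpm
    return ms_per_rev / 2

for rpm in (7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
    # 7200 -> 4.17 ms, 10k -> 3.00 ms, 15k -> 2.00 ms
```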


-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Richard_D_Levine
Nice research Alex.

Your data strongly support the information in the paper.  Your SCSI drives
blew away the others in all of the server benchmarks.  They're only
marginally better in desktop use.

I do find it somewhat amazing that a 15K SCSI 320 drive isn't going to help
me play Unreal Tournament much faster.  That's okay.  I suck at it anyway.
My kid has never lost to me.  She enjoys seeing daddy as a bloody smear and
bouncing body parts anyway.  It promotes togetherness.

Here's a quote from the paper:

"[SCSI] interfaces support multiple initiators or hosts. The
drive must keep track of separate sets of information for each
host to which it is attached, e.g., maintaining the processor
pointer sets for multiple initiators and tagged commands.
The capability of SCSI/FC to efficiently process commands
and tasks in parallel has also resulted in a higher overhead
"kernel" structure for the firmware."

Has anyone ever seen a system with multiple hosts or initiators on a SCSI
bus?  Seems like it would be a very cool thing in an SMP architecture, but
I've not seen an example implemented.

Rick

Alex Turner <[EMAIL PROTECTED]> wrote on 04/14/2005 12:13:41 PM:

> I have put together a little head to head performance of a 15k SCSI,
> 10k SCSI 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive
> comparison at storage review
>
> http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6

>
> It does illustrate some of the weaknesses of SATA drives, but all in
> all the Raptor drives put on a good show.
>
> Alex Turner
> netEconomist
>
> On 4/14/05, Alex Turner <[EMAIL PROTECTED]> wrote:
> > I have read a large chunk of this, and I would highly recommend it to
> > anyone who has been participating in the drive discussions.  It is
> > most informative!!
> >
> > Alex Turner
> > netEconomist
> >
> > On 4/14/05, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
> > > Greg,
> > >
> > > I posted this link under a different thread (the $7k server
> thread).  It is
> > > a very good read on why SCSI is better for servers than ATA.  I
> didn't note
> > > bias, though it is from a drive manufacturer.  YMMV.  There is an
> > > interesting, though dated appendix on different manufacturers' drive
> > > characteristics.
> > >
> > > http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf

> > >
> > > Enjoy,
> > >
> > > Rick
> > >
> > > [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> > >
> > > >
> > > > Our vendor is trying to sell us on an Intel SRCS16 SATA raid
> > > > controller instead of the 3ware one.
> > > >
> > > > Poking around it seems this does come with Linux drivers and
> > > > there is a battery backup option. So it doesn't seem to be
> > > > completely insane.
> > > >
> > > > Anyone have any experience with these controllers?
> > > >
> > > > I'm also wondering about whether I'm better off with one of
> > > > these SATA raid controllers or just going with SCSI drives.
> > > >
> > > > --
> > > > greg
> > > >
> > > >
> > > > ---(end of broadcast)---
> > > > TIP 8: explain analyze is your friend
> > >
> > > ---(end of broadcast)---
> > > TIP 9: the planner will ignore your desire to choose an index scan
> > >   if your joining column's datatypes do not match
> > >
> >
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [Fwd: Re: [PERFORM] Functionscan estimates]

2005-04-14 Thread elein
I'm not subscribed to performance at this time. I reviewed the
thread and owe everything I know about this to Wei Hong whose
brilliance exceeds all others :)  All misinterpretations are
mine alone.

I have not reviewed hellerstein's papers posted by neil, but I
will.

My understanding of this issue is at a very high user level.
In Illustra SRF functions were not necessarily special functions.  
All functions could have a cost associated with them, set by the writer of
the function in order for the planner to reorder function calls.
The stonebraker airplane level example was:
select ... from ... where f(id) = 3 and expensive_image_function(img)
The idea, of course is to weight the expensive function so it was
pushed to the end of the execution.

The only difference I see with SRFs in Postgres is that you may want
the cost represented as one row returned and another weighting representing
the number of estimated rows.  I think this conclusion has already
been drawn.

It seems to make sense, if the optimizer can use this information, to
include wild and/or educated guesses for the costs of the SRF.
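
[Editor's note: the Stonebraker-style reordering described above is a
one-liner once each function carries a cost. The function names and
costs below are hypothetical, echoing the example in this message.]

```python
# Evaluate cheap predicates first so the expensive one runs on as few
# rows as possible, as in: where f(id) = 3 and expensive_image_function(img)

def f(row):                         # cheap predicate, cost ~1
    return row["id"] == 3

def expensive_image_function(row):  # pricey predicate, cost ~1000
    return len(row["img"]) > 2

predicates = [(1000, expensive_image_function), (1, f)]

def matches(row):
    # Evaluate in ascending cost order; short-circuit on first failure.
    for _cost, pred in sorted(predicates, key=lambda p: p[0]):
        if not pred(row):
            return False
    return True

rows = [{"id": 3, "img": "abc"}, {"id": 7, "img": "xyz"}]
print([r["id"] for r in rows if matches(r)])   # [3]
```

A real planner would weight cost by selectivity too, which is exactly
where the per-row estimate discussed in this thread would come in.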

I'm sure I haven't contributed here anything new, but perhaps 
phrased it differently.

Copy me on replies and I'll participate as I can.

--elein

On Thu, Apr 14, 2005 at 08:36:38AM +0100, Simon Riggs wrote:
> Elein,
> 
> Any chance you could join this discussion on PERFORM ?
> 
> I understand you did time with Illustra. I thought they had solved the
> optimizer plug-in issue...how did they do it?
> 
> Best Regards, Simon Riggs
> 
> 
>  Forwarded Message 
> From: Tom Lane <[EMAIL PROTECTED]>
> To: Alvaro Herrera <[EMAIL PROTECTED]>
> Cc: Josh Berkus , Michael Fuhr <[EMAIL PROTECTED]>,
> 
> Subject: Re: [PERFORM] Functionscan estimates
> Date: Sat, 09 Apr 2005 00:00:56 -0400
> Not too many releases ago, there were several columns in pg_proc that
> were intended to support estimation of the runtime cost and number of
> result rows of set-returning functions.  I believe in fact that these
> were the remains of Joe Hellerstein's thesis on expensive-function
> evaluation, and are exactly what he was talking about here:
> http://archives.postgresql.org/pgsql-hackers/2002-06/msg00085.php
> 
> But with all due respect to Joe, I think the reason that stuff got
> trimmed is that it didn't work very well.  In most cases it's
> *hard* to write an estimator for a SRF.  Let's see you produce
> one for dblink() for instance ...
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
> 

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Dave Held
> -Original Message-
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 12:14 PM
> To: [EMAIL PROTECTED]
> Cc: Greg Stark; pgsql-performance@postgresql.org;
> [EMAIL PROTECTED]
> Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> 
> 
> I have put together a little head-to-head performance comparison of a 15k
> SCSI, 10k SCSI, 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive at
> StorageReview:
> 
> http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6
> 
> It does illustrate some of the weaknesses of SATA drives, but all in
> all the Raptor drives put on a good show.
> [...]

I think it's a little misleading that your tests show 0ms seek times
for some of the write tests.  The environmental test also selects a
missing data point as the winner.  Besides that, it seems to me that
seek time is one of the most important features for a DB server, which
means that the SCSI drives are the clear winners and the non-WD SATA
drives are the embarrassing losers.  Transfer rate is important, but
perhaps less so because DBs tend to read/write small blocks rather
than large files.  On the server suite, which seems to me to be the
most relevant for DBs, the Atlas 15k spanks the other drives by a
fairly large margin (especially the lesser SATA drives).  When you 
ignore the "consumer app" benchmarks, I wouldn't be so confident in 
saying that the Raptors "put on a good show".

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread PFC

I have a table A with an int column ID that references table B column
ID. Table B has about 150k rows, and has an index on B.ID. When trying
to copy 1 million rows into A, I get the following \timings:
	You're using 7.4.5. It's possible that you have a type mismatch in your  
foreign keys which prevents use of the index on B.
	First of all, be really sure it's THAT foreign key, i.e. do your COPY with  
only ONE foreign key at a time if you have several, and see which one is  
the killer.

Then, supposing it's the column in A which REFERENCEs B(id):
SELECT id FROM A LIMIT 1;
(check type)
SELECT id FROM B LIMIT 1;
(check type)
EXPLAIN ANALYZE the following:
SELECT * FROM B WHERE id = (SELECT id FROM A LIMIT 1);
It should use the index. Does it ?
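A direct way to compare the two column types from the system catalogs (assuming the tables are literally named a and b):

```sql
-- Both queries should report the same type, e.g. integer; if one says
-- bigint and the other integer, 7.4's planner will not use B's index
-- for the FK lookup.
SELECT attname, format_type(atttypid, atttypmod) AS coltype
FROM pg_attribute
WHERE attrelid = 'a'::regclass AND attname = 'id';

SELECT attname, format_type(atttypid, atttypmod) AS coltype
FROM pg_attribute
WHERE attrelid = 'b'::regclass AND attname = 'id';
```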
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Alex Turner
I have put together a little head-to-head performance comparison of a 15k
SCSI, 10k SCSI, 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive at
StorageReview:

http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6

It does illustrate some of the weaknesses of SATA drives, but all in
all the Raptor drives put on a good show.

Alex Turner
netEconomist

On 4/14/05, Alex Turner <[EMAIL PROTECTED]> wrote:
> I have read a large chunk of this, and I would highly recommend it to
> anyone who has been participating in the drive discussions.  It is
> most informative!!
> 
> Alex Turner
> netEconomist
> 
> On 4/14/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Greg,
> >
> > I posted this link under a different thread (the $7k server thread).  It is
> > a very good read on why SCSI is better for servers than ATA.  I didn't note
> > bias, though it is from a drive manufacturer.  YMMV.  There is an
> > interesting, though dated appendix on different manufacturers' drive
> > characteristics.
> >
> > http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf
> >
> > Enjoy,
> >
> > Rick
> >
> > [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> >
> > >
> > > Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> > > instead of the 3ware one.
> > >
> > > Poking around it seems this does come with Linux drivers and there is a
> > > battery backup option. So it doesn't seem to be completely insane.
> > >
> > > Anyone have any experience with these controllers?
> > >
> > > I'm also wondering about whether I'm better off with one of these SATA
> > raid
> > > controllers or just going with SCSI drives.
> > >
> > > --
> > > greg
> > >
> > >
> > > ---(end of broadcast)---
> > > TIP 8: explain analyze is your friend
> >
> > ---(end of broadcast)---
> > TIP 9: the planner will ignore your desire to choose an index scan if your
> >   joining column's datatypes do not match
> >
>

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Alex Turner
I have read a large chunk of this, and I would highly recommend it to
anyone who has been participating in the drive discussions.  It is
most informative!!

Alex Turner
netEconomist

On 4/14/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Greg,
> 
> I posted this link under a different thread (the $7k server thread).  It is
> a very good read on why SCSI is better for servers than ATA.  I didn't note
> bias, though it is from a drive manufacturer.  YMMV.  There is an
> interesting, though dated appendix on different manufacturers' drive
> characteristics.
> 
> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf
> 
> Enjoy,
> 
> Rick
> 
> [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> 
> >
> > Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> > instead of the 3ware one.
> >
> > Poking around it seems this does come with Linux drivers and there is a
> > battery backup option. So it doesn't seem to be completely insane.
> >
> > Anyone have any experience with these controllers?
> >
> > I'm also wondering about whether I'm better off with one of these SATA
> raid
> > controllers or just going with SCSI drives.
> >
> > --
> > greg
> >
> >
> > ---(end of broadcast)---
> > TIP 8: explain analyze is your friend
> 
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
>   joining column's datatypes do not match
>

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] recovery after long delete

2005-04-14 Thread Tom Lane
Markus Bertheau <[EMAIL PROTECTED]> writes:
> Now if you pull the plug after 2, at startup, pg will go through the
> in-progress txns and mark them as aborted. That's all the recovery in
> this case. All rows are still there. O(1).

Right.  (Actually it's O(checkpoint interval), because we have to make
sure that everything we did since the last checkpoint actually got to
disk --- but in principle, there's zero recovery effort.)

> How does Oracle do that? Does all this have something to do with MVCC? Why
> does it take Oracle so long to recover?

Oracle doesn't do MVCC the same way we do.  They update rows in place
and put the previous version of a row into an "undo log".  If the
transaction aborts, they have to go back through the undo log and put
back the previous version of the row.  I'm not real clear on how that
applies to deletions, but I suppose it's the same deal: cost of undoing
a transaction in Oracle is proportional to the number of rows it
changed.  There's also the little problem that the space available for
UNDO logs is limited :-(

As against which, they don't have to VACUUM.  So it's a tradeoff.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Marko Ristola
About the foreign key performance:
Maybe foreign key checks could be delayed until the COMMIT phase.
At that point, you could check whether there are lots of foreign key
checks pending for each foreign key, and do the check for an area or
for the whole table, whichever is faster.

I have heard that the database must be in a consistent state after COMMIT,
but it does not necessarily have to be consistent inside a transaction.
1. COMMIT wanted
2. If there are lots of foreign key checks pending, do either an area
foreign key check (the join result must be 0 rows), or a full table join.
3. If all foreign key checks are okay, complete the COMMIT operation.
4. If a foreign key check fails, go into the ROLLBACK NEEDED state.

Maybe Tom Lane meant the same.
set option delayed_foreign_keys=true;
BEGIN;
insert 1000 rows.
COMMIT;
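The "area or whole table" check in step 2 can be phrased as an anti-join that must come back empty (hypothetical tables a(b_id) referencing b(id)):

```sql
-- Every non-NULL a.b_id must have a matching b.id; any row this
-- query returns is a foreign key violation, so the COMMIT may
-- proceed only if the result is empty.
SELECT a.b_id
FROM a
LEFT JOIN b ON a.b_id = b.id
WHERE a.b_id IS NOT NULL
  AND b.id IS NULL;
```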
Regards,
Marko Ristola
Christopher Kings-Lynne wrote:
My problem with this really is that in my database it is hard to
predict which inserts will be huge (and thus need FKs disabled), so
I would have to code around all inserts. Instead I can code my own
integrity logic and avoid using FKs altogether.

Just drop the fk and re-add it, until postgres gets more smarts.
Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Stephan Szabo
On Thu, 14 Apr 2005, Tom Lane wrote:

> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > ... At some point, if we can work out how to do all the semantics
> > properly, it'd probably be possible to replace the insert type check with
> > a per-statement check which would be somewhere in between.  That requires
> > access to the affected rows inside the trigger which I don't believe is
> > available currently.
>
> Not necessarily.  It occurs to me that maybe what we need is "lossy
> storage" of the trigger events.  If we could detect that the queue of
> pending checks for a particular FK is getting large, we could discard
> the whole queue and replace it with one entry that says "run the
> wholesale check again when we are ready to fire triggers".  I'm not

Yeah, but there's a potentially large middle ground where neither our
current plan nor checking the entire table is particularly good, and that
we might be able to handle better.  It'd be nice to also fall back to
checking the entire table for even larger changes.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Richard_D_Levine
Greg,

I posted this link under a different thread (the $7k server thread).  It is
a very good read on why SCSI is better for servers than ATA.  I didn't note
bias, though it is from a drive manufacturer.  YMMV.  There is an
interesting, though dated appendix on different manufacturers' drive
characteristics.

http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf

Enjoy,

Rick

[EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:

>
> Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> instead of the 3ware one.
>
> Poking around it seems this does come with Linux drivers and there is a
> battery backup option. So it doesn't seem to be completely insane.
>
> Anyone have any experience with these controllers?
>
> I'm also wondering about whether I'm better off with one of these SATA
raid
> controllers or just going with SCSI drives.
>
> --
> greg
>
>
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] speed of querry?

2005-04-14 Thread Tom Lane
Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values it should keep for
> statistics per column.  The config default_statistics_target
> is the default (= used when creating table) and ALTER... is
> a way to change it later.

Not quite.  default_statistics_target is the value used by ANALYZE for
any column that hasn't had an explicit ALTER SET STATISTICS done on it.
So you can change default_statistics_target and that will affect
existing tables.
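In other words, changing the default and re-analyzing is enough for any column that has never been given an explicit per-column setting (hypothetical table name):

```sql
-- Applies to every column of mytable that has no explicit
-- ALTER TABLE ... SET STATISTICS override:
SET default_statistics_target = 100;
ANALYZE mytable;
```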

(It used to work the way you are saying, but that was a few releases
back...)

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] speed of querry?

2005-04-14 Thread Dawid Kuroczko
On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> Josh from commandprompt.com had me alter the config to have
> default_statistics_target = 250
> 
> Is this somehow related to what you're asking me to do?
> I did do an analyze, but have only run the view a few times.

Well, he did suggest the right thing.  However, this parameter
applies to newly created tables, so either recreate the tables
or do the ALTER TABLE I sent earlier.

Basically it tells postgres how many values it should keep for
statistics per column.  The config default_statistics_target
is the default (= used when creating table) and ALTER... is
a way to change it later.

The more statistics PostgreSQL has, the better it can
predict how much data will be returned -- and this directly
drives the choice of how to handle the data (the order in which
tables should be read, whether to use an index or not, which
algorithm to use for a join, etc.).  The more statistics, the
better PostgreSQL is able to predict; but also the slower the
planner's analysis.  So you have to find a value which is
as high as needed to accurately predict the results, but not
higher!  PostgreSQL's default of 10 is a bit conservative,
hence the suggestions to increase it. :)  [ The same goes for
random_page_cost: some people have found that in their cases
it is beneficial to reduce the value, even below 2. ]
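Putting the advice above together as a sketch (table and column names are hypothetical):

```sql
-- Keep more statistics for a frequently filtered column, then
-- refresh the statistics:
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 250;
ANALYZE orders;

-- Experiment with the cost of random I/O at the session level
-- before committing the value to postgresql.conf:
SET random_page_cost = 2.1;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```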

Hope this clarifies things a bit.

   Regards,
  Dawid

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Matthew Nuzum
On 4/14/05, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> That's basically what it comes down to: SCSI lets the disk drive itself
> do the low-level I/O scheduling whereas the ATA spec prevents the drive
> from doing so (unless it cheats, ie, caches writes).  Also, in SCSI it's
> possible for the drive to rearrange reads as well as writes --- which
> AFAICS is just not possible in ATA.  (Maybe in the newest spec...)
>
> The reason this is so much more of a win than it was when ATA was
> designed is that in modern drives the kernel has very little clue about
> the physical geometry of the disk.  Variable-size tracks, bad-block
> sparing, and stuff like that make for a very hard-to-predict mapping
> from linear sector addresses to actual disk locations.  Combine that
> with the fact that the drive controller can be much smarter than it was
> twenty years ago, and you can see that the case for doing I/O scheduling
> in the kernel and not in the drive is pretty weak.
>
>

So if you all were going to choose between two hard drives where:
drive A has capacity C and spins at 15K rpms, and
drive B has capacity 2 x C and spins at 10K rpms and
all other features are the same, the price is the same and C is enough
disk space which would you choose?

I've noticed that on IDE drives, as the capacity increases the data
density increases and there is a perceived (I've not measured it)
performance increase.

Would the increased data density of the higher capacity drive be of
greater benefit than the faster spindle speed of drive A?

-- 
Matthew Nuzum <[EMAIL PROTECTED]>
www.followers.net - Makers of “Elite Content Management System”
View samples of Elite CMS in action by visiting
http://www.followers.net/portfolio/




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] recovery after long delete

2005-04-14 Thread Markus Bertheau
Hi.

Our professor told us the following story: Oracle. A client issued a
selective delete statement on a big table. After two days he lost
patience and pulled the plug. Unfortunately, while starting up, Oracle
had to restore all the deleted rows, which took it another two days. He
reasoned that one had better copy all rows that are not to be deleted into
another table and drop the original table afterwards. (Concurrency, fks,
indexes are not the question here). So I wondered how this works in
PostgreSQL. As I understand it, what's going on is the following:

1. The transaction 45 is started. It is recorded as in-progress.
2. The rows selected in the delete statement are one by one marked as
to-be-deleted by txn 45. Among them row 27.
3. If a concurrently running read committed txn 47 wants to update row
27, it blocks, awaiting whether txn 45 commits or aborts.
4.1 When txn 45 commits, it is marked as such.
5.1 txn 47 can continue, but as row 27 was deleted, it is not affected
by txn 47's update statement.
4.2 When txn 45 aborts, it is marked as such. This means the same as not
being marked at all.
5.2 txn 47 continues and updates row 27.

Now if you pull the plug after 2, at startup, pg will go through the
in-progress txns and mark them as aborted. That's all the recovery in
this case. All rows are still there. O(1).

How does Oracle do that? Does all this have something to do with MVCC? Why
does it take Oracle so long to recover?

Thanks

Markus
-- 
Markus Bertheau <[EMAIL PROTECTED]>




Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> No it certainly won't warn you.  You have _avoided_ the check entirely. 
>   That's why I was warning you...

> If you wanted to be really careful, you could:

Probably the better bet is to drop and re-add the FK constraint.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
My problem with this really is that in my database it is hard to predict
which inserts will be huge (and thus need FKs disabled), so I would
have to code around all inserts. Instead I can code my own integrity
logic and avoid using FKs altogether.
Just drop the fk and re-add it, until postgres gets more smarts.
Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Richard van den Berg
Christopher Kings-Lynne wrote:
No it certainly won't warn you.  You have _avoided_ the check entirely. 
 That's why I was warning you...
I figured as much when I realized it was just a simple table update. I 
was thinking more of a DB2 style "set integrity" command.

If you wanted to be really careful, you could:
So I will be re-checking my own FKs. That's not really what I'd expect 
from a FK.

My problem with this really is that in my database it is hard to predict
which inserts will be huge (and thus need FKs disabled), so I would
have to code around all inserts. Instead I can code my own integrity
logic and avoid using FKs altogether.

Thanks,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---
   Have you visited our new DNA Portal?
---
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
Thanks for the pointer. I got this from the archives:

update pg_class set reltriggers=0 where relname = 'YOUR_TABLE_NAME';
to enable them after you are done, do
update pg_class set reltriggers = (select count(*) from pg_trigger where 
pg_class.oid=tgrelid) where relname='YOUR_TABLE_NAME';


I assume the re-enabling will cause an error when the copy/insert added 
data that does not satisfy the FK. In that case I'll indeed end up with 
invalid data, but at least I will know about it.
No it certainly won't warn you.  You have _avoided_ the check entirely. 
 That's why I was warning you...

If you wanted to be really careful, you could:
begin;
lock tables for writes...
turn off triggers
insert
delete where rows don't match fk constraint
turn on triggers
commit;
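A concrete version of that sketch, with hypothetical table names a (referencing) and b (referenced) and a hypothetical data file path:

```sql
BEGIN;
-- Keep concurrent writers out while triggers are off:
LOCK TABLE a IN EXCLUSIVE MODE;
UPDATE pg_class SET reltriggers = 0 WHERE relname = 'a';

COPY a FROM '/tmp/a.dat';

-- Purge rows that would have violated the FK:
DELETE FROM a
WHERE b_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.b_id);

-- Restore the trigger count:
UPDATE pg_class
SET reltriggers = (SELECT count(*) FROM pg_trigger
                   WHERE tgrelid = pg_class.oid)
WHERE relname = 'a';
COMMIT;
```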
Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Mohan, Ross
Sorry, I don't remember whether it's SCSI or SATA II, but IIRC
the Areca controllers are just stellar for things. 

If you do get SATA for db stuff...especially multiuser...I still
haven't seen anything to indicate an across-the-board primacy
for SATA over SCSI. I'd go w/SCSI, or if SATA for $$$ reasons, I'd
be sure to have many spindles and RAID 10. 

My 0.02. I'm surely not an expert of any kind. 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 10:55 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Intel SRCS16 SATA raid?



Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead 
of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a battery 
backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid 
controllers or just going with SCSI drives.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Richard van den Berg
Christopher Kings-Lynne wrote:
But why then is the speed acceptable if I copy and then manually add 
the FK? Is the check done by the FK so much different from when it is 
done automatically using an active deferred FK?
Yeah I think it uses a different query formulation...  Actually I only 
assume that deferred fk's don't use that - I guess your experiment 
proves that.
In my tests, deferred or not deferred makes no difference in speed. I am 
still quite surprised by how huge the difference is.. this makes FKs 
quite unusable when adding a lot of data to a table.


Actually, you can just "disable" them if you want to be really dirty :) 
Thanks for the pointer. I got this from the archives:

update pg_class set reltriggers=0 where relname = 'YOUR_TABLE_NAME';
to enable them after you are done, do
update pg_class set reltriggers = (select count(*) from pg_trigger where 
pg_class.oid=tgrelid) where relname='YOUR_TABLE_NAME';


I assume the re-enabling will cause an error when the copy/insert added 
data that does not satisfy the FK. In that case I'll indeed end up with 
invalid data, but at least I will know about it.

Thanks,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Mohan, Ross
Imagine a system in "furious activity" with two (2) processes regularly
occurring:

Process One:  Long read (or write). Takes 20ms to do seek, latency, and
              stream off. Runs over and over.
Process Two:  Single block read (or write). Typical database row access.
              Optimally, could be submillisecond. Happens more or less
              randomly.


Let's say process one starts, and then process two. Assume, for the sake
of this discussion, that P2's block lies w/in P1's swath. (But it doesn't
have to...)

Now, every time, process two has to wait at LEAST 20ms to complete. In a
queue-reordering system, it could be a lot faster. And I, looking for
disk service times on P2, keep wondering "why does a single disk-block
read keep taking >20ms?"


It doesn't need to be "a read" or "a write". It doesn't need to be
"furious activity" (two processes is not furious, even for a single-user
desktop.)  This is not a "corner case", and while it doesn't take into
account kernel/drive-cache/UBC buffering issues, I think it shines a
light on why command re-ordering might be useful.

YMMV. 



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Greg Stark wrote:


> I think you're being misled by analyzing the write case.
> 
> Consider the read case. When a user process requests a block and that 
> read makes its way down to the driver level, the driver can't just put 
> it aside and wait until it's convenient. It has to go ahead and issue 
> the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a couple of 
milliseconds to see if other requests come in, and then issue the read if none 
do.  If there are already other requests being fulfilled, then it'll schedule 
the request in question just like the rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write 
> requests it just has to sit on them and wait. 10ms is a lifetime for a 
> computer. In that time dozens of other processes could have been 
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes 
from an essentially idle state to one of furious activity.  In other words, 
it's a corner case that I strongly suspect isn't typical in situations where 
SCSI has historically made a big difference.

Once the first request has been fulfilled, the driver can now schedule the rest 
of the queued-up requests in disk-layout order.


I really don't see how this is any different between a system that has tagged 
queueing to the disks and one that doesn't.  The only difference is where the 
queueing happens.  In the case of SCSI, the queueing happens on the disks (or 
at least on the controller).  In the case of SATA, the queueing happens in the 
kernel.

I suppose the tagged queueing setup could begin the head movement and, if 
another request comes in that requests a block on a cylinder between where the 
head currently is and where it's going, go ahead and read the block in 
question.  But is that *really* what happens in a tagged queueing system?  It's 
the only major advantage I can see it having.


> The same thing would happen if you had lots of processes issuing lots 
> of small fsynced writes all over the place. Postgres doesn't really do 
> that though. It sort of does with the WAL logs, but that shouldn't 
> cause a lot of seeking.  Perhaps it would mean that having your WAL 
> share a spindle with other parts of the OS would have a bigger penalty 
> on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any.  When a process 
issues an fsync (or even a sync), the kernel doesn't *have* to drop everything 
it's doing and get to work on it immediately.  It could easily gather a few 
more requests, bundle them up, and then issue them.  If there's a lot of disk 
activity, it's probably smart to do just that.  All fsync and sync require is 
that the caller block until the data hits the disk (from the point of view of 
the kernel). The specification doesn't require that the kernel act on the calls 
immediately or write only the blocks referred to by the call in question.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> ... At some point, if we can work out how to do all the semantics
> properly, it'd probably be possible to replace the insert type check with
> a per-statement check which would be somewhere in between.  That requires
> access to the affected rows inside the trigger which I don't believe is
> available currently.

Not necessarily.  It occurs to me that maybe what we need is "lossy
storage" of the trigger events.  If we could detect that the queue of
pending checks for a particular FK is getting large, we could discard
the whole queue and replace it with one entry that says "run the
wholesale check again when we are ready to fire triggers".  I'm not
sure how to detect this efficiently, though --- the trigger manager
doesn't presently know anything about FKs being different from
any other kind of trigger.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Greg Stark

Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
instead of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a
battery backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid
controllers or just going with SCSI drives.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Rosser Schwarz
while you weren't looking, Kevin Brown wrote:

[reordering bursty reads]

> In other words, it's a corner case that I strongly suspect
> isn't typical in situations where SCSI has historically made a big
> difference.

[...]

> But I rather doubt that has to be a huge penalty, if any.  When a
> process issues an fsync (or even a sync), the kernel doesn't *have* to
> drop everything it's doing and get to work on it immediately.  It
> could easily gather a few more requests, bundle them up, and then
> issue them.

To make sure I'm following you here, are you or are you not suggesting
that the kernel could sit on -all- IO requests for some small handful
of ms before actually performing any IO to address what you "strongly
suspect" is a "corner case"?

/rls

-- 
:wq

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Stephan Szabo

On Thu, 14 Apr 2005, Richard van den Berg wrote:

> Hello Chris,
>
> Thanks for your answers.
>
> Christopher Kings-Lynne wrote:
> > Deferring makes no difference to FK checking speed...
>
> But why then is the speed acceptable if I copy and then manually add the
> FK? Is the check done by the FK so much different from when it is done
> automatically using an active deferred FK?

Yes. Currently, the check done by the FK on insert-type activity is a
per-row check, while the check done when adding an FK acts on the entire
table in one go, which allows better optimization of that case (while
generally being worse for small numbers of inserts, especially on large
tables).  At some point, if we can work out how to do all the semantics
properly, it'd probably be possible to replace the per-row insert check with
a per-statement check, which would fall somewhere in between.  That requires
access to the affected rows inside the trigger, which I don't believe is
available currently.
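The per-row vs. whole-table difference can be sketched with a toy model
(Python; made-up data and counters, not PostgreSQL internals):

```python
# Toy model of the two FK-checking strategies: a per-row lookup fired
# for every inserted row, versus one set-based pass over all the rows
# when the constraint is added afterward.

def per_row_check(inserted_ids, referenced_ids):
    """Per-row insert check: one lookup per inserted row."""
    lookups = 0
    for rid in inserted_ids:
        lookups += 1                 # each insert fires its own check
        if rid not in referenced_ids:
            raise ValueError(f"FK violation: {rid}")
    return lookups

def bulk_check(inserted_ids, referenced_ids):
    """ALTER TABLE ... ADD FOREIGN KEY: one pass over the whole table."""
    missing = set(inserted_ids) - referenced_ids
    if missing:
        raise ValueError(f"FK violations: {sorted(missing)[:5]}")
    return 1                         # conceptually a single set-based scan

referenced = set(range(150_000))     # table B: ~150k referenced ids
inserted = [i % 150_000 for i in range(1_000_000)]  # 1M rows into A

assert per_row_check(inserted, referenced) == 1_000_000
assert bulk_check(inserted, referenced) == 1
```

Both strategies verify the same invariant; only the number of checking
operations differs, which is where the 200s-vs-30min gap comes from.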

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
Josh from commandprompt.com had me alter the config to have
default_statistics_target = 250

Is this somehow related to what you're asking me to do?
I did do an analyze, but have only run the view a few times.

Joel Fradkin
 
-Original Message-
From: Dawid Kuroczko [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 9:21 AM
To: Joel Fradkin
Cc: PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry?

On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> I have done a vacuum and a vacuum analyze.
> I can try again for kicks, but it is not in production, so no new records
> are added and vacuum analyze is run after any mods to the indexes.
>
> I am still pursuing Dell on why the monster box is so much slower than the
> desktop as well.

First thing: do something like:
ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50;
Do that for each column used, and make it even higher than 50 for
many-valued columns.
Then run VACUUM ANALYZE;

Then run the query a couple of times (EXPLAIN ANALYZE also :)), then do:
SET enable_seqscan = off;
and rerun the query -- if it was significantly faster, you will want to do:
SET enable_seqscan = on;
and tweak:
SET random_page_cost = 2.1;
...and play with values.  When you reach the random_page_cost that
suits your data, you will want to put it into postgresql.conf.

I am sorry if this is already known to you. :)  Also, it is a rather
simplistic approach to tuning PostgreSQL, but it is worth doing.  Especially
the statistics part. :)

   Regards,
   Dawid


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Tom Lane
Kevin Brown <[EMAIL PROTECTED]> writes:
> I really don't see how this is any different between a system that has
> tagged queueing to the disks and one that doesn't.  The only
> difference is where the queueing happens.  In the case of SCSI, the
> queueing happens on the disks (or at least on the controller).  In the
> case of SATA, the queueing happens in the kernel.

That's basically what it comes down to: SCSI lets the disk drive itself
do the low-level I/O scheduling whereas the ATA spec prevents the drive
from doing so (unless it cheats, ie, caches writes).  Also, in SCSI it's
possible for the drive to rearrange reads as well as writes --- which
AFAICS is just not possible in ATA.  (Maybe in the newest spec...)

The reason this is so much more of a win than it was when ATA was
designed is that in modern drives the kernel has very little clue about
the physical geometry of the disk.  Variable-size tracks, bad-block
sparing, and stuff like that make for a very hard-to-predict mapping
from linear sector addresses to actual disk locations.  Combine that
with the fact that the drive controller can be much smarter than it was
twenty years ago, and you can see that the case for doing I/O scheduling
in the kernel and not in the drive is pretty weak.
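The best the kernel can do from linear sector addresses alone is an
elevator sort, which this toy sketch illustrates (arbitrary block
numbers; not actual kernel code):

```python
# Elevator scheduling over logical block addresses: service everything
# at or past the head position in one ascending sweep, then wrap around.
# This minimizes long seeks but is blind to track boundaries and
# rotational position -- exactly the information only the drive has.

def elevator_order(pending, head_pos):
    """Order pending block numbers as one ascending sweep, then wrap."""
    ahead = sorted(b for b in pending if b >= head_pos)
    behind = sorted(b for b in pending if b < head_pos)
    return ahead + behind

assert elevator_order([700, 50, 300, 900], head_pos=200) == [300, 700, 900, 50]
```

A drive with the same requests in hand can additionally reorder within a
track to cut rotational latency, which is the win being argued here.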

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
Deferring makes no difference to FK checking speed...

But why then is the speed acceptable if I copy and then manually add the
FK? Is the check done by the FK so much different from when it is done
automatically using an active deferred FK?
Yeah, I think it uses a different query formulation...  Actually, I only
assumed that deferred FKs don't use that - I guess your experiment
proves that.

Well, that's what people do - even pg_dump will restore data and add 
the foreign key afterward...
If I have to go this route, is there a way of automatically dropping and
re-adding FKs? I can probably query pg_constraint and drop the
appropriate ones, but how do I re-add them after the copy/insert?
Actually, you can just "disable" them if you want to be really dirty :)
You have to be confident that the data you're inserting does satisfy
the FK, however; otherwise you can end up with invalid data.

To see how to do that, try pg_dump with --disable-triggers mode enabled. 
 Just do a data-only dump.
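For the drop/re-add question, one scriptable approach is to capture each
FK's definition before dropping it and replay it afterward.  A hedged
sketch follows: the hypothetical rows below stand in for the output of a
catalog query such as `SELECT conname, conrelid::regclass,
pg_get_constraintdef(oid) FROM pg_constraint WHERE contype = 'f';`
(pg_get_constraintdef exists in 7.4, but verify the exact query shape
against your version before relying on it):

```python
# Sketch: build DROP and re-ADD DDL for foreign keys around a bulk COPY.
# The fk_rows input is assumed to come from pg_constraint (see lead-in);
# here it is hard-coded so the sketch is self-contained.

def fk_ddl(fk_rows):
    """Return (drop_statements, add_statements) for the given FK rows."""
    drops, adds = [], []
    for conname, table, condef in fk_rows:
        drops.append(f'ALTER TABLE {table} DROP CONSTRAINT {conname};')
        adds.append(f'ALTER TABLE {table} ADD CONSTRAINT {conname} {condef};')
    return drops, adds

rows = [("a_id_fkey", "a", "FOREIGN KEY (id) REFERENCES b(id)")]
drops, adds = fk_ddl(rows)
assert drops == ['ALTER TABLE a DROP CONSTRAINT a_id_fkey;']
assert adds == ['ALTER TABLE a ADD CONSTRAINT a_id_fkey '
                'FOREIGN KEY (id) REFERENCES b(id);']
```

Run the drops, do the COPY, then run the adds -- which is effectively
what pg_dump's restore ordering does for you.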

Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Richard van den Berg
Hello Chris,
Thanks for your answers.
Christopher Kings-Lynne wrote:
Deferring makes no difference to FK checking speed...
But why then is the speed acceptable if I copy and then manually add the
FK? Is the check done by the FK so much different from when it is done
automatically using an active deferred FK?

Well, that's what people do - even pg_dump will restore data and add the 
foreign key afterward...
If I have to go this route, is there a way of automatically dropping and
re-adding FKs? I can probably query pg_constraint and drop the
appropriate ones, but how do I re-add them after the copy/insert?

Sincerely,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] speed of querry?

2005-04-14 Thread Dawid Kuroczko
On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> I have done a vacuum and a vacuum analyze.
> I can try again for kicks, but it is not in production, so no new records are
> added and vacuum analyze is run after any mods to the indexes.
> 
> I am still pursuing Dell on why the monster box is so much slower than the
> desktop as well.

First thing: do something like:
ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50;
Do that for each column used, and make it even higher than 50 for
many-valued columns.
Then run VACUUM ANALYZE;

Then run the query a couple of times (EXPLAIN ANALYZE also :)), then do:
SET enable_seqscan = off;
and rerun the query -- if it was significantly faster, you will want to do:
SET enable_seqscan = on;
and tweak:
SET random_page_cost = 2.1;
...and play with values.  When you reach the random_page_cost that
suits your data, you will want to put it into postgresql.conf.

I am sorry if this is already known to you. :)  Also, it is a rather simplistic
approach to tuning PostgreSQL, but it is worth doing.  Especially the
statistics part. :)

   Regards,
   Dawid

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
I am new to cross references between tables, and I am trying to
understand how they impact performance. From reading the documentation I
was under the impression that deferring foreign keys would yield about
the same performance as dropping them before a copy, and adding them
after. However, I cannot see this in my test case.
Even if you defer them, it just defers the check, doesn't eliminate it...
I have a table A with an int column ID that references table B column
ID. Table B has about 150k rows, and has an index on B.ID. When trying
to copy 1 million rows into A, I get the following \timings:
1) drop FK, copy (200s), add FK (5s)
2) add FK deferrable initially deferred, copy (I aborted after 30min)
3) add FK deferrable initially deferred, begin, copy (200s), commit (I
aborted after 30min)
How do I explain why test cases 2 and 3 do not come close to case 1? Am
I missing something obvious?
Deferring makes no difference to FK checking speed...
Since the database I am working on has many FKs, I would rather not have
to drop/add them when I am loading large data sets.
Well, that's what people do - even pg_dump will restore data and add the 
foreign key afterward...

Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Use of data within indexes

2005-04-14 Thread Christopher Kings-Lynne
To be more explicit, let's say I have a table with two fields a and b. If
I have an index on (a,b) and I do a request like "SELECT b FROM table
WHERE a=x", will PostgreSQL use only the index, or will it need to also
read the table page for that (those) row(s)?
It must read the table because of visibility considerations.
There might be a reason why this is not possible (I don't know if the 
indexes have all necessary transaction ID information?) but otherwise 
this could possibly provide an interesting performance gain for some 
operations, in particular with some types of joins. Or maybe it already 
does it.
It's already been thought of :)
The 4 or so columns that store visibility information are not in the 
indexes, to do so would require a significant write cost.
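Why the heap visit is unavoidable can be shown with a toy MVCC model
(heavily simplified; real visibility rules involve more than an xmin/xmax
comparison):

```python
# Toy model of why "SELECT b FROM t WHERE a = x" cannot be answered from
# the (a, b) index alone: index entries carry no visibility information,
# so every match must be checked against the heap tuple's xmin/xmax.

heap = {
    # ctid -> (a, b, xmin, xmax); xmax=None means "not deleted"
    1: (10, "old", 100, 200),      # row version deleted by xact 200
    2: (10, "new", 200, None),     # replacement inserted by xact 200
}
index_ab = [((10, "old"), 1), ((10, "new"), 2)]  # (key, ctid) pairs

def visible(xmin, xmax, snapshot_xid):
    """Crude visibility test: created before us, not yet deleted for us."""
    return xmin <= snapshot_xid and (xmax is None or xmax > snapshot_xid)

def select_b_where_a(x, snapshot_xid):
    out = []
    for (a, b), ctid in index_ab:
        if a != x:
            continue
        _, b_heap, xmin, xmax = heap[ctid]     # the extra heap fetch
        if visible(xmin, xmax, snapshot_xid):
            out.append(b_heap)
    return out

assert select_b_where_a(10, snapshot_xid=150) == ["old"]  # old snapshot
assert select_b_where_a(10, snapshot_xid=250) == ["new"]  # new snapshot
```

Both index entries match a=10, but which one is "the" row depends on the
snapshot -- information the index, lacking xmin/xmax, simply doesn't have.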

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
I have done a vacuum and a vacuum analyze.
I can try again for kicks, but it is not in production, so no new records are
added and vacuum analyze is run after any mods to the indexes.

I am still pursuing Dell on why the monster box is so much slower than the
desktop as well.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 1:47 AM
To: Richard Huxton
Cc: Joel Fradkin; PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry? 

Richard Huxton  writes:
> In the first, we match outer.clientnum to inner.clientnum, in the second 
> it's "?column10?" - are you sure the query was identical in each case?
> I'm guessing the unidentified column in query 2 is the reason for the 
> sort a couple of lines below it, which seems to take up a large chunk of 
> time.

The "?column10?" is because EXPLAIN isn't excessively bright about
reporting references to outputs of lower plan nodes.  (Gotta fix that
sometime.)  The real point here is that the planner thought that a scan
plus sort would be faster than scanning an index that exactly matched
the sort order the Merge Join needed ... and it was wrong :-(

So this is just the usual sort of question of "are your stats up to
date, maybe you need to increase stats targets, or else play with
random_page_cost, etc" ...

regards, tom lane


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Richard van den Berg
I am new to cross references between tables, and I am trying to
understand how they impact performance. From reading the documentation I
was under the impression that deferring foreign keys would yield about
the same performance as dropping them before a copy, and adding them
after. However, I cannot see this in my test case.
I have a table A with an int column ID that references table B column
ID. Table B has about 150k rows, and has an index on B.ID. When trying
to copy 1 million rows into A, I get the following \timings:
1) drop FK, copy (200s), add FK (5s)
2) add FK deferrable initially deferred, copy (I aborted after 30min)
3) add FK deferrable initially deferred, begin, copy (200s), commit (I
aborted after 30min)
How do I explain why test cases 2 and 3 do not come close to case 1? Am
I missing something obvious?
Since the database I am working on has many FKs, I would rather not have
to drop/add them when I am loading large data sets.
If it would help I can write this out in a reproducible scenario. I am
using postgresql 7.4.5 at the moment.
Sincerely,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Use of data within indexes

2005-04-14 Thread Jacques Caron
Hi,
Just wondering... Is Postgresql able to use data present within indexes 
without looking up the table data?

To be more explicit, let's say I have a table with two fields a and b. If I
have an index on (a,b) and I do a request like "SELECT b FROM table WHERE
a=x", will PostgreSQL use only the index, or will it need to also read the
table page for that (those) row(s)?

There might be a reason why this is not possible (I don't know if the 
indexes have all necessary transaction ID information?) but otherwise this 
could possibly provide an interesting performance gain for some operations, 
in particular with some types of joins. Or maybe it already does it.

Any hint welcome!
Thanks,
Jacques.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Kevin Brown
Greg Stark wrote:


> I think you're being misled by analyzing the write case.
> 
> Consider the read case. When a user process requests a block and
> that read makes its way down to the driver level, the driver can't
> just put it aside and wait until it's convenient. It has to go ahead
> and issue the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a
couple of milliseconds to see if other requests come in, and then
issue the read if none do.  If there are already other requests being
fulfilled, then it'll schedule the request in question just like the
rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write
> requests it just has to sit on them and wait. 10ms is a lifetime for
> a computer. In that time dozens of other processes could have been
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the
system goes from an essentially idle state to one of furious
activity.  In other words, it's a corner case that I strongly suspect
isn't typical in situations where SCSI has historically made a big
difference.

Once the first request has been fulfilled, the driver can now schedule
the rest of the queued-up requests in disk-layout order.


I really don't see how this is any different between a system that has
tagged queueing to the disks and one that doesn't.  The only
difference is where the queueing happens.  In the case of SCSI, the
queueing happens on the disks (or at least on the controller).  In the
case of SATA, the queueing happens in the kernel.

I suppose the tagged queueing setup could begin the head movement and,
if another request comes in that requests a block on a cylinder
between where the head currently is and where it's going, go ahead and
read the block in question.  But is that *really* what happens in a
tagged queueing system?  It's the only major advantage I can see it
having.
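The rotational advantage in question can be sketched numerically
(made-up geometry and timings, not measured drive data):

```python
# Toy model of the drive-side win: several reads on the SAME track,
# serviced either in arrival order (all a kernel can do) or in
# rotational order (what a drive with the full queue in hand can do).
# Angles in degrees; one full rotation = 360.

ROTATION_MS = 8.0          # roughly one rotation at ~7200 rpm

def service_time(start_angle, order):
    """Total rotational wait to reach each sector in the given order."""
    t, pos = 0.0, start_angle
    for target in order:
        delta = (target - pos) % 360    # wait for the sector to come around
        t += delta / 360 * ROTATION_MS
        pos = target
    return t

requests = [350, 10, 180, 90]               # angular sector positions
naive = service_time(0, requests)           # arrival order
smart = service_time(0, sorted(requests))   # one sweep around the track

assert smart < naive
```

In arrival order the head waits through most of several rotations; in
sorted order everything is picked up in a single pass -- but only the
drive knows which sectors share a track and where they sit.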


> The same thing would happen if you had lots of processes issuing
> lots of small fsynced writes all over the place. Postgres doesn't
> really do that though. It sort of does with the WAL logs, but that
> shouldn't cause a lot of seeking.  Perhaps it would mean that having
> your WAL share a spindle with other parts of the OS would have a
> bigger penalty on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any.  When a
process issues an fsync (or even a sync), the kernel doesn't *have* to
drop everything it's doing and get to work on it immediately.  It
could easily gather a few more requests, bundle them up, and then
issue them.  If there's a lot of disk activity, it's probably smart to
do just that.  All fsync and sync require is that the caller block
until the data hits the disk (from the point of view of the kernel).
The specification doesn't require that the kernel act on the calls
immediately or write only the blocks referred to by the call in
question.
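The "gather a few more requests, then issue them" idea amounts to simple
batching, sketched below (a toy event model with made-up timestamps and
window, not a real I/O scheduler):

```python
# Batch fsync arrivals: an fsync that arrives while a batch window is
# open joins the pending batch rather than forcing its own flush.
# fsync semantics still hold -- each caller blocks until its batch is
# written -- but the kernel issues fewer, larger flushes.

WINDOW_MS = 3.0

def batch_fsyncs(arrival_times_ms):
    """Group fsync arrival times into batches no wider than WINDOW_MS."""
    batches, current = [], []
    for t in sorted(arrival_times_ms):
        if current and t - current[0] > WINDOW_MS:
            batches.append(current)   # window closed: flush this batch
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches

arrivals = [0.0, 1.2, 2.9, 10.0, 11.5]
assert batch_fsyncs(arrivals) == [[0.0, 1.2, 2.9], [10.0, 11.5]]
```

Five fsyncs become two flushes; each caller still waits until its data
is on disk, which is all the specification requires.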


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly