Re: [HACKERS] Acclerating INSERT/UPDATE using UPS

2007-02-12 Thread Gene

I was curious to see how postgres would perform with wal on a tmpfs vs disk
here are some numbers I got from pgbench. Let me know if I did something
stupid, this is the first time I've used pgbench. The wal on tmpfs method is
not significantly faster.

[[ WAL ON TMPFS ]]
pgbench -i -s 10 -U postgres -d benchmark
...
pgbench -Upostgres -s 10 -c 10 -t 1 benchmark
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5817.693724 (including connections establishing)
tps = 5825.646441 (excluding connections establishing)

[[ WAL ON EXT2 14 U320 DRIVE RAID10 WITH BBU (same as data) ]]
pgbench -Upostgres -s 10 -c 10 -t 1 benchmark
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5653.187997 (including connections establishing)
tps = 5660.554438 (excluding connections establishing)

pgbench -Upostgres -s 100 -c 10 -t 1 benchmark
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5536.019864 (including connections establishing)
tps = 5543.834350 (excluding connections establishing)


Re: [HACKERS] Acclerating INSERT/UPDATE using UPS

2007-02-10 Thread Gene

I think it would be great to have this kind of functionality built into
postgres (optional and disabled by default of course) I use postgres mainly
for its querying and concurrency features (a 10x increase in insert/update
speed would be phenomenal) I know most people need 100% data integrity but
as Hideyuki pointed out we all don't need 100%. On our project, when the
power is out, we aren't receiving data anyways... just my two cents. on a
side note, would putting the wal on a tmpfs partition give you something
similar?

On 2/10/07, Joshua D. Drake [EMAIL PROTECTED] wrote:


Hideyuki Kawashima wrote:
 Joshua,

:)

 The reason why I made the Sigres is, the advances of recent non volatile
 memories. Just now we do not usually use non volatile memories. But in
 the near future, situation would change. I think if a non volatile
 memories can be considered as a persistence device, PostgreSQL WAL
 mechanism should be modified.
 However, I do not use such devices usually. Thus I made Sigres which
 requires UPS.

This is actually very interesting. We (www.commandprompt.com) have had
several customers ask us how we can make PostgreSQL more reasonable
within a flash environment.

I agree with you that in the future you will see many such databases
including PostgreSQL living on these devices.

Tom? What do you think? Is there some room for movement here within the
postgresql.conf to make something like sigres usable within PostgreSQL
proper?


 Currently I have just ignored XLogWrite and WALWriteLock, but a friend
 of mine (a Japanese great hacker of PostgreSQL) has more idea to improve
 WAL if a battery supplied memory can be considered as a persistent
device.


We are coming up very quickly on a feature freeze for the next version
of PostgreSQL. If... we can has something out quickly enough and in a
thought out fashion, the hackers may be willing to accept a patch for
8.3.. If not there is always 8.4..

Sincerely,

Joshua D. Drake




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match





--
Gene Hart
cell: 443-604-2679


Re: [HACKERS] default child of partition master

2006-08-20 Thread Gene
Keep in mind if you have multiple rules for a master table, it won't return the number of affected rows as you might expect. This screws up Hibernate which I'm using for my application. It checks the return value to make sure it was inserted properly. Luckily I only need one rule which puts it into the current child table (im partitioning on current timestamp). I suppose I could get around this by using a stored procedure or something but that would not be as portable. I'm looking forward to future versions of PG which automate more of the partitioning features :) keep up the good work!
GeneOn 8/20/06, Tom Lane [EMAIL PROTECTED] wrote:
April Lorenzen [EMAIL PROTECTED] writes: My suggestion is to allow specifying a default destination table in the master partition table definition. This default destination table
 could be changed with ALTER TABLE.The recommended way to do this is with an ON INSERT DO INSTEAD ruleon the master table.The advantage of using rules is that you can haveseveral conditional rules to ensure that insertions go into the right
sub-tables, whereas a default would likely just fail :-(regards, tom lane---(end of broadcast)---TIP 6: explain analyze is your friend
-- Eugene Hart


Re: [HACKERS] [PATCHES] Maintaining cluster order on insert

2006-08-09 Thread Gene
I have a table that inserts lots of rows (million+ per day) int8 as primary key, and I cluster by a timestamp which is approximately the timestamp of the insert beforehand and is therefore in increasing order and doesn't change. Most of the rows are updated about 3 times over time roughly within the next 30 minutes. Should I assume that that all of these updates will be on separate pages unless I perform a cluster (which takes a long time) and performance will suffer due to this? Is it possible to preallocate space on the same page for future updates (whatever the average number of updates may be per row) decreasing the number of page loads for querying?
GeneOn 8/9/06, Jonah H. Harris [EMAIL PROTECTED] wrote:
On 8/9/06, Tom Lane [EMAIL PROTECTED] wrote: UPDATE tries to place the new tuple on the same page it's already on.I think he meant for INSERT.
--Jonah H. Harris, Software Architect | phone: 732.331.1300EnterpriseDB Corporation| fax: 732.331.130133 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/---(end of broadcast)---TIP 9: In versions below 
8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match-- Eugene Hart


Re: [HACKERS] [PATCHES] Maintaining cluster order on insert

2006-08-09 Thread Gene
You are correct the main part I'm worried about is the updates, being so far from the originals. fyi I
am partitioning the tables by the timestamp column,vacuum analyzing once per hour, creating one child
partition per day in a cron job. Because I'm using hibernate for database abstraction (stateless sessions), I can
only have one RULE since having more than one insert rule will not
return the correct number of updated rows which confuses hibernate. The one rule just directs inserts to the latest child partition which seems to work well.The
reason I'm doing the clustering is I was hoping that with the stable non-updating
partitions I could execute a CLUSTER at night (slow...) and it would compact the tables into their most efficient state for querying which always involves a date range. bad idea? In this fillfactor feature, will you be able to set it to 100% once you know that no more updates will occur? Or will doing a cluster effectively do this? Will the fill factor only apply for inserts?
Your best bet might be to partition the table into two subtables, onewith stable data and one with the fresh data, and transfer rows fromone to the other once they get stable.Storage density in the fresh
part would be poor, but it should be small enough you don't care.This sounds interesting, I could create a RULE/INSERT on the unstable table, I will know during the update if it is ready to be put in the stable table. What would be an efficient way to do the transfer? Since the updates occur somewhat randomly, wouldnt the tuples in the stable table then be out of natural timestamp order?
thanks for all of your help and comments! it is greatly appreciated!Gene HartOn 8/9/06, Tom Lane [EMAIL PROTECTED]
 wrote:Gene [EMAIL PROTECTED] writes:
 I have a table that inserts lots of rows (million+ per day) int8 as primary key, and I cluster by a timestamp which is approximately the timestamp of the insert beforehand and is therefore in increasing order and doesn't
 change. Most of the rows are updated about 3 times over time roughly within the next 30 minutes.ISTM you should hardly need to worry about clustering that --- the datawill be in timestamp order pretty naturally.
The main problem you're going to have is the update-3-times bit.Youcould keep updated rows on the same page as the original if you ran thetable at fillfactor 25% (which you'll be able to do in 8.2) ... but
while this might be sane for the leading edge of the table, you hardlywant such low storage density in the stable part.You could reduce the fillfactor requirement if you could vacuum thetable constantly (every 10 minutes or so) but I assume the table is
large enough to make that unattractive.(Eventually we should havea version of vacuum that understands where the dirty stuff is, whichmight make this approach tenable ... but not in 8.2.)Your best bet might be to partition the table into two subtables, one
with stable data and one with the fresh data, and transfer rows fromone to the other once they get stable.Storage density in the freshpart would be poor, but it should be small enough you don't care.
regards, tom lane-- Eugene Hart


Re: [HACKERS] Dreaming About Redesigning SQL

2003-10-16 Thread Gene Wirchenko
[EMAIL PROTECTED] (Seun Osewa) wrote:

[snip]

Sometimes I wonder why its so important to model data in the rela-
tional way, to think of data in form of sets of tuples rather than
tables or lists or whatever.  I mean, though its elegant and based
on mathematical principles I would like to know why its the _right_
model to follow in designing a DBMS (or database).  The way my mind
sees it, should we not rather be interested in what works?

 How do you know it works?  Without the theory and model, you
really do not.

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:
 I have preferences.
 You have biases.
 He/She has prejudices.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])