Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-23 Thread Vivek Khera


On Dec 22, 2005, at 9:44 PM, Juan Casero wrote:

> Agreed.  I have a 13 million row table that gets 100,000 new records
> every week.   There are six indexes on this table.   Right about the
> time when it


I have some rather large tables that grow much faster than this (~1
million rows per day on a table with > 200m rows) and have a few indexes.
I see no such slowness.


Do you really need all those indexes?


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-22 Thread Juan Casero
Agreed.  I have a 13 million row table that gets 100,000 new records every
week.   There are six indexes on this table.   Right about the time when it
reached the 10 million row mark, updating the table with new records started
to take many hours if I left the indexes in place during the update.   Indeed,
there was even some suspicion that the indexes were starting to get corrupted
during the load.  So I decided to first drop the indexes when I needed to
update the table.  Now inserting 100,000 records into the table is nearly
instantaneous, although it does take me a couple of hours to build the indexes
anew.   This is still a big improvement, since at one time it was taking almost
12 hours to update the table with the indexes in place.


Juan

On Thursday 22 December 2005 08:34, Markus Schaber wrote:
> Hi, Madison,
> Hi, Luke,
>
> Luke Lonergan wrote:
> > Note that indexes will also slow down loading.
>
> For large loading bunches, it often makes sense to temporarily drop the
> indices before the load, and recreate them afterwards, at least, if you
> don't have normal users accessing the database concurrently.
>
> Markus



Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-22 Thread Markus Schaber
Hi, Madison,
Hi, Luke,

Luke Lonergan wrote:

> Note that indexes will also slow down loading.

For large loading bunches, it often makes sense to temporarily drop the
indices before the load, and recreate them afterwards, at least, if you
don't have normal users accessing the database concurrently.
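
A minimal sketch of the drop-load-recreate pattern described above. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; the same sequence of statements applies to PostgreSQL, but the table `file_info`, the index `file_info_parent_dir_idx` and the helper `bulk_load` are hypothetical names chosen for illustration, not anything from the posters' setups.

```python
import sqlite3

def bulk_load(conn, rows):
    """Drop the secondary index, insert all rows, then rebuild the index."""
    cur = conn.cursor()
    # Without the index, each inserted row costs no index maintenance.
    cur.execute("DROP INDEX IF EXISTS file_info_parent_dir_idx")
    cur.executemany(
        "INSERT INTO file_info (file_name, file_parent_dir, file_size) "
        "VALUES (?, ?, ?)",
        rows,
    )
    # Rebuild the index once, over the fully loaded table.
    cur.execute(
        "CREATE INDEX file_info_parent_dir_idx ON file_info (file_parent_dir)"
    )
    conn.commit()

# Hypothetical table and data, only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_info "
    "(file_name TEXT, file_parent_dir TEXT, file_size INTEGER)"
)
conn.execute(
    "CREATE INDEX file_info_parent_dir_idx ON file_info (file_parent_dir)"
)
rows = [(f"file{i}", f"/dir{i % 10}", i) for i in range(1000)]
bulk_load(conn, rows)
loaded = conn.execute("SELECT COUNT(*) FROM file_info").fetchone()[0]
index_names = [
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

As the paragraph above notes, this only makes sense when no other users need the index while the load is running.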

Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-21 Thread Luke Lonergan
Madison,


On 12/21/05 11:02 PM, "Madison Kelly" <[EMAIL PROTECTED]> wrote:

> Currently 7.4 (what comes with Debian Sarge). I have run my program on
> 8.0 but not since I have added MySQL support. I should run the tests on
> the newer versions of both DBs (using v4.1 for MySQL which is also
> mature at this point).

Yes, this is *definitely* your problem.  Upgrade to Postgres 8.1.1 or
Bizgres 0_8_1 and your COPY speed could double without even changing fsync
(depending on your disk speed).  We typically get 12-14MB/s from Bizgres on
Opteron CPUs and disk subsystems that can write at least 60MB/s.  This means
you can load 100GB in 2 hours.
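
As a quick check of that arithmetic, here is a small sketch that converts the quoted 12-14MB/s range into load time for 100GB. It assumes 1GB = 1024MB and a sustained rate over the whole load; the function name `load_hours` is made up for this example.

```python
def load_hours(total_gb, rate_mb_per_s):
    """Hours needed to load total_gb at a sustained rate (1 GB = 1024 MB)."""
    seconds = total_gb * 1024 / rate_mb_per_s
    return seconds / 3600

slow = load_hours(100, 12)  # lower end of the quoted range
fast = load_hours(100, 14)  # upper end of the quoted range
print(f"{fast:.2f} to {slow:.2f} hours")
```

Under these assumptions the range works out to a little over 2 hours at 14MB/s and somewhat under 2.5 hours at 12MB/s, so "2 hours" is a round figure for the fast end.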

Note that indexes will also slow down loading.
 
- Luke





Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-21 Thread Luke Lonergan
Madison,

On 12/21/05 10:58 PM, "Madison Kelly" <[EMAIL PROTECTED]> wrote:

> Ah, that makes a lot of sense (I read about the 'fsync' issue before,
> now that you mention it). I am not too familiar with MySQL but IIRC
> MyISAM is their open-source DB and InnoDB is their commercial one, ne?
> If so, then I am running MyISAM.

You can run either storage method with MySQL; I expect the default is
MyISAM.

COPY performance, with or without fsync, was recently sped up by nearly a
factor of two in PostgreSQL.  The Bizgres version (www.bizgres.org,
www.greenplum.com) is the fastest; Postgres 8.1.1 is close, depending on how
fast your disk I/O is (as I/O speed increases, Bizgres gets faster).

fsync isn't really an "issue" and I'd suggest you not run without it! We've
found that "fdatasync" as the wal sync method is actually a bit faster than
fsync if you want a bit better speed.

So, I'd recommend you upgrade to either bizgres or Postgres 8.1.1 to get the
maximum COPY speed.

> Here is the MySQL table. The main difference from the PostgreSQL
> table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.

Shouldn't matter.
 
> mysql> DESCRIBE file_info_1;
> +-----------------+--------------+------+-----+---------+-------+
> | Field           | Type         | Null | Key | Default | Extra |
> +-----------------+--------------+------+-----+---------+-------+
> | file_group_name | varchar(255) | YES  |     | NULL    |       |
> | file_group_uid  | int(11)      |      |     | 0       |       |
> | file_mod_time   | bigint(20)   |      |     | 0       |       |
> | file_name       | varchar(255) |      |     |         |       |
> | file_parent_dir | varchar(255) |      | MUL |         |       |
> | file_perm       | int(11)      |      |     | 0       |       |
> | file_size       | bigint(20)   |      |     | 0       |       |
> | file_type       | char(1)      |      |     |         |       |
> | file_user_name  | varchar(255) | YES  |     | NULL    |       |
> | file_user_uid   | int(11)      |      |     | 0       |       |
> | file_backup     | char(1)      |      | MUL | i       |       |
> | file_display    | char(1)      |      |     | i       |       |
> | file_restore    | char(1)      |      |     | i       |       |
> +-----------------+--------------+------+-----+---------+-------+

What's a bigint(20)?  Are you using "numeric" in Postgresql?
 
> I will try turning off 'fsync' on my test box to see how much of a
> performance gain I get and to see if it is close to what I am getting
> out of MySQL. If that does turn out to be the case though I will be able
> to comfortably continue recommending PostgreSQL from a stability point
> of view.

Again - fsync is a small part of the performance - you will need to run
either Postgres 8.1.1 or Bizgres to get good COPY speed.

- Luke





Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-21 Thread Madison Kelly

Luke Lonergan wrote:

> What version of postgres?
>
> Copy has been substantially improved in bizgres and also in 8.1.
> - Luke

Currently 7.4 (what comes with Debian Sarge). I have run my program on 
8.0 but not since I have added MySQL support. I should run the tests on 
the newer versions of both DBs (using v4.1 for MySQL which is also 
mature at this point).


As others mentioned though, so far the most likely explanation is the 
'fsync' being enabled on PostgreSQL.


Thanks for the reply!

Madison

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Madison Kelly (Digimer)
   TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



Re: [PERFORM] MySQL is faster than PgSQL but a large margin in my

2005-12-21 Thread Madison Kelly

Stephen Frost wrote:

> * Madison Kelly ([EMAIL PROTECTED]) wrote:
> > If the performance difference comes from the 'COPY...' command being
> > slower because of the automatic quoting can I somehow tell PostgreSQL
> > that the data is pre-quoted? Could the performance difference be
> > something else?
>
> I doubt the issue is with the COPY command being slower than INSERTs
> (I'd expect the opposite generally, actually...).  What's the table type
> of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
> alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
> and has problems with data reliability (aiui, equivalent to doing 'fsync
> = false' for Postgres).  InnoDB, again iirc, is transaction safe and
> whatnot, and more akin to the default PostgreSQL setup.
>
> I expect some others will comment along these lines too, if my response
> isn't entirely clear. :)
>
> Stephen


Ah, that makes a lot of sense (I read about the 'fsync' issue before, 
now that you mention it). I am not too familiar with MySQL but IIRC 
MyISAM is their open-source DB and InnoDB is their commercial one, ne? 
If so, then I am running MyISAM.


  Here is the MySQL table. The main difference from the PostgreSQL 
table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.


mysql> DESCRIBE file_info_1;
+-----------------+--------------+------+-----+---------+-------+
| Field           | Type         | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+-------+
| file_group_name | varchar(255) | YES  |     | NULL    |       |
| file_group_uid  | int(11)      |      |     | 0       |       |
| file_mod_time   | bigint(20)   |      |     | 0       |       |
| file_name       | varchar(255) |      |     |         |       |
| file_parent_dir | varchar(255) |      | MUL |         |       |
| file_perm       | int(11)      |      |     | 0       |       |
| file_size       | bigint(20)   |      |     | 0       |       |
| file_type       | char(1)      |      |     |         |       |
| file_user_name  | varchar(255) | YES  |     | NULL    |       |
| file_user_uid   | int(11)      |      |     | 0       |       |
| file_backup     | char(1)      |      | MUL | i       |       |
| file_display    | char(1)      |      |     | i       |       |
| file_restore    | char(1)      |      |     | i       |       |
+-----------------+--------------+------+-----+---------+-------+

  I will try turning off 'fsync' on my test box to see how much of a 
performance gain I get and to see if it is close to what I am getting 
out of MySQL. If that does turn out to be the case though I will be able 
to comfortably continue recommending PostgreSQL from a stability point 
of view.


Thanks!!

Madison

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Madison Kelly (Digimer)
   TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



Re: [PERFORM] MySQL is faster than PgSQL but a large margin in

2005-12-21 Thread Luke Lonergan
What version of postgres?

Copy has been substantially improved in bizgres and also in 8.1.
- Luke

Re: [PERFORM] MySQL is faster than PgSQL but a large margin in my program... any ideas why?

2005-12-21 Thread Kevin Brown
On Wednesday 21 December 2005 20:14, Stephen Frost wrote:
> * Madison Kelly ([EMAIL PROTECTED]) wrote:
> >   If the performance difference comes from the 'COPY...' command being
> > slower because of the automatic quoting can I somehow tell PostgreSQL
> > that the data is pre-quoted? Could the performance difference be
> > something else?
>
> I doubt the issue is with the COPY command being slower than INSERTs
> (I'd expect the opposite generally, actually...).  What's the table type
> of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
> alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
> and has problems with data reliability (aiui, equivalent to doing 'fsync
> = false' for Postgres).  InnoDB, again iirc, is transaction safe and
> whatnot, and more akin to the default PostgreSQL setup.
>
> I expect some others will comment along these lines too, if my response
> isn't entirely clear. :)

Is fsync() on in your postgres config?  If so, that's why you're slower.  The 
default is to have it on for stability (writes are forced to disk).  It is 
quite a bit slower than just allowing the write caches to do their job, but 
more stable.  MySQL does not force writes to disk.




Re: [PERFORM] MySQL is faster than PgSQL but a large margin in my program... any ideas why?

2005-12-21 Thread Stephen Frost
* Madison Kelly ([EMAIL PROTECTED]) wrote:
>   If the performance difference comes from the 'COPY...' command being
> slower because of the automatic quoting can I somehow tell PostgreSQL 
> that the data is pre-quoted? Could the performance difference be 
> something else?

I doubt the issue is with the COPY command being slower than INSERTs
(I'd expect the opposite generally, actually...).  What's the table type
of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
and has problems with data reliability (aiui, equivalent to doing 'fsync
= false' for Postgres).  InnoDB, again iirc, is transaction safe and
whatnot, and more akin to the default PostgreSQL setup.

I expect some others will comment along these lines too, if my response
isn't entirely clear. :)

Stephen




[PERFORM] MySQL is faster than PgSQL but a large margin in my program... any ideas why?

2005-12-21 Thread Madison Kelly

Hi all,

  On a user's request, I recently added MySQL support to my backup 
program which had been written for PostgreSQL exclusively until now. 
What surprises me is that MySQL is about 20%(ish) faster than PostgreSQL.


  Now, I love PostgreSQL and I want to continue recommending it as the 
database engine of choice but it is hard to ignore a performance 
difference like that.


  My program is a perl backup app that scans the content of a given 
mounted partition, 'stat's each file and then stores that data in the 
database. To maintain certain data (the backup, restore and display 
values for each file) I first read in all the data from a given table 
(one table per partition) into a hash, drop and re-create the table, 
then start (in PostgreSQL) a bulk 'COPY..' call through the 'psql' shell 
app.


  In MySQL there is no 'COPY...' equivalent so instead I generate a 
large 'INSERT INTO file_info_X (col1, col2, ... coln) VALUES (...), 
(blah) ... (blah);'. This doesn't support automatic quoting, obviously, 
so I manually quote my values before adding the value to the INSERT 
statement. I suspect this might be part of the performance difference?
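
On the quoting question: in PostgreSQL's default COPY text format, fields are not quoted at all. They are separated by tabs, NULL is written as \N, and a backslash, tab, newline or carriage return inside a value is backslash-escaped. A minimal sketch of preparing one input line that way follows; the helper names `copy_escape` and `copy_line` and the sample row are hypothetical, and this is not taken from the backup program itself.

```python
def copy_escape(value):
    """Render one field for COPY text format; None becomes the NULL marker."""
    if value is None:
        return r"\N"
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    for raw, escaped in (
        ("\\", "\\\\"),
        ("\t", "\\t"),
        ("\n", "\\n"),
        ("\r", "\\r"),
    ):
        text = text.replace(raw, escaped)
    return text

def copy_line(row):
    """Join escaped fields with tabs and terminate the row with a newline."""
    return "\t".join(copy_escape(v) for v in row) + "\n"

# Hypothetical row: a file name containing a tab, a directory, NULL, a size.
line = copy_line(["my file\tname", "/home/madison", None, 4096])
```

Lines built this way can be fed to `COPY ... FROM STDIN`, so no SQL-style quoting of the values is needed on that path.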


  I take the total time needed to update a partition (load old data 
into hash + scan all files and prepare COPY/INSERT + commit new data) 
and divide by the number of seconds needed to get a score I call a 
'U.Rate'. On average on my Pentium3 1GHz laptop I get U.Rate of ~4/500. 
On MySQL though I usually get a U.Rate of ~7/800.


  If the performance difference comes from the 'COPY...' command being 
slower because of the automatic quoting can I somehow tell PostgreSQL 
that the data is pre-quoted? Could the performance difference be 
something else?


  If it would help I can provide code samples. I haven't done so yet 
because it's a little convoluted. ^_^;


  Thanks as always!

Madison


Where the big performance concern is when

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Madison Kelly (Digimer)
   TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
