Re: [PERFORM] filesystem performance with lots of files
On Fri, 2 Dec 2005, Qingqing Zhou wrote:

>> I don't have all the numbers readily available (and I didn't do all the
>> tests on every filesystem), but I found that even with only 1000
>> files/directory ext3 had some problems, and if you enabled dir_hash some
>> functions would speed up, but writing lots of files would just collapse
>> (that was the 80 min run).
>
> Interesting. I would suggest testing a small number of bigger files if the
> target is a database performance comparison. By small number I mean 10^2 -
> 10^3; by bigger I mean file sizes from 8k to 1G (a PostgreSQL data file is
> at most this size under a normal installation).

I agree. That round of tests was done on my system at home, and was in
response to a friend whose rsync over a local LAN took > 10 hours for < 10G
of data. Even so it generated some interesting info. I need to make a more
controlled run at it, though.

> Let's take TPC-C as an example: if we get a TPC-C database of 500 files,
> each one at most 1G (PostgreSQL has this feature/limit in an ordinary
> installation), that gives us a 500G database, which is big enough for your
> current configuration.
>
> Regards,
> Qingqing
Re: [PERFORM] Database restore speed
David,

> Luke, would it help to have one machine read the file and have it connect
> to postgres on a different machine when doing the copy? (I'm thinking that
> the first machine may be able to do a lot of the parsing and conversion,
> leaving the second machine to just worry about doing the writes.)

Unfortunately not - the parsing / conversion core is in the backend, where it
should be IMO, because of the need to do the attribute conversion there in
the machine-native representation of the attributes (int4, float, etc.), in
addition to having the backend convert from the client encoding (like LATIN1)
to the backend encoding (like UNICODE aka UTF8).

There are a few areas of discussion about continued performance increases in
the codebase for COPY FROM; here are my picks:
- More micro-optimization of the parsing and attribute conversion core -
  maybe a 100% speedup in the parse/convert stage is possible
- A user-selectable option to bypass transaction logging, similar to Oracle's
- A well-defined binary input format, like Oracle's SQL*Loader - this would
  bypass most parsing / attribute conversion
- A direct-to-table storage loader facility - this would probably be the
  fastest possible load rate

- Luke
Re: [PERFORM] 15,000 tables
Agreed, and I apologize for the imprecision of my post below. I should have
written: "Best practice seems to be to use a journaling fs, log metadata
only, and put the fs log on separate dedicated spindles."

I've seen enough HD failures that I tend to be paranoid and log the metadata
of the fs dedicated to WAL as well, but that may very well be overkill.

Ron

At 01:57 PM 12/1/2005, Tom Lane wrote:
> Ron <[EMAIL PROTECTED]> writes:
>> Agreed. Also the odds of fs corruption or data loss are higher in a
>> non-journaling fs. Best practice seems to be to use a journaling fs but
>> to put the fs log on dedicated spindles separate from the actual fs or
>> pg_xlog.
>
> I think we've determined that best practice is to journal metadata only
> (not file contents) on PG data filesystems. PG does expect the filesystem
> to remember where the files are, so you need metadata protection, but
> journalling file content updates is redundant with PG's own WAL logging.
>
> On a filesystem dedicated to WAL, you probably do not need any filesystem
> journalling at all --- we manage the WAL files in a way that avoids
> changing metadata for a WAL file that's in active use. A conservative
> approach would be to journal metadata here too, though.
>
> regards, tom lane
Re: [PERFORM] 15,000 tables
On Fri, Dec 02, 2005 at 03:15:00AM -0500, Ron wrote:
> I've seen enough HD failures that I tend to be paranoid and log the
> metadata of the fs dedicated to WAL as well, but that may very well be
> overkill.

Especially since it wouldn't gain anything. Journalling doesn't give you any
advantage whatsoever in the face of a HD failure.

Mike Stone
Re: [PERFORM] COPY into table too slow with index: now an I/O
On Fri, Dec 02, 2005 at 12:15:57AM -0500, Luke Lonergan wrote:
> That's good to know - makes sense. I suppose we might still thrash over a
> 1GB range in seeks if the BG writer starts running at full rate in the
> background, right? Or is there some write combining in the BG writer?

That part your OS should be able to handle. Those writes aren't synced, so
the OS has plenty of opportunity to buffer & aggregate them.

Mike Stone
Re: [PERFORM] 15,000 tables
On 1 Dec 2005, at 16:03, Tom Lane wrote:
> Michael Riess <[EMAIL PROTECTED]> writes:
>> (We NEED that many tables, please don't recommend to reduce them)
>
> No, you don't. Add an additional key column to fold together different
> tables of the same structure. This will be much more efficient than
> managing that key at the filesystem level, which is what you're
> effectively doing now. (If you really have 15000 distinct rowtypes, I'd
> like to know what your database design is...)

Won't you end up with awful seek times if you just want data which was
previously stored in a single table? E.g. whereas before you wanted 1000
contiguous rows from the table, now you want 1000 rows which have 1000 rows
you don't care about in between each one you do want.
Re: [PERFORM] 15,000 tables
On 2 Dec 2005, at 14:16, Alex Stapleton wrote:
> Won't you end up with awful seek times if you just want data which was
> previously stored in a single table? E.g. whereas before you wanted 1000
> contiguous rows from the table, now you want 1000 rows which have 1000
> rows you don't care about in between each one you do want.

I must have had a total and utter failure of intellect for a moment there.
Please ignore that :P
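For what it's worth, Tom Lane's suggestion of folding same-structure tables
together with an extra key column might look roughly like the following
sketch (hypothetical names throughout):

    -- A minimal sketch of the folded design: one extra key column says
    -- which logical table (application) each row belongs to.
    CREATE TABLE wsobject (
        app_id  integer NOT NULL,   -- identifies the application / former table
        obj_id  integer NOT NULL,
        payload text,
        PRIMARY KEY (app_id, obj_id)
    );

    -- Per-application queries stay index-driven because app_id leads the
    -- key; clustering on that index would also keep each application's rows
    -- physically close, which addresses the seek-time worry above.
    SELECT obj_id, payload
      FROM wsobject
     WHERE app_id = 42
     ORDER BY obj_id;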
Re: [PERFORM] 15,000 tables
On 12/1/2005 2:34 PM, Michael Riess wrote:
>> VACUUM FULL was probably always overkill, unless "always" includes
>> versions prior to 7.3...
>
> Well, we tried switching to daily VACUUM ANALYZE and weekly VACUUM FULL,
> but the database got considerably slower near the end of the week.

This indicates that you have FSM settings that are inadequate for that many
tables and eventually the overall size of your database. Try setting those to

    max_fsm_relations = 8
    max_fsm_pages = (select sum(relpages) / 2 from pg_class)

Another thing you might be suffering from (depending on the rest of your
architecture) is file descriptor limits. Especially if you use some sort of
connection pooling or persistent connections like PHP, you will have all the
backends serving multiple of your logical applications (sets of 30 tables).
If on average one backend is called for 50 different apps, then we are
talking 50*30*4 = 6000 files accessed by that backend. The 80/20 rule leaves
1200 files in access per backend, thus 100 active backends lead to 120,000
open (virtual) file descriptors. Now add to that any files that a backend
would have to open in order to evict an arbitrary dirty block.

With a large shared buffer pool and slightly more aggressive background
writer settings, you can mostly avoid regular backends having to evict dirty
blocks. If the kernel settings allow Postgres to keep that many file
descriptors open, you avoid directory lookups.

Jan
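Note that max_fsm_relations must be at least the number of relations (tables
and indexes) whose free space you want tracked, so with 15,000 tables it
needs to be well above 15000, and max_fsm_pages can be estimated with the
pg_class query Jan gives. A rough way to check both, as a sketch assuming an
8.x server:

    -- max_fsm_relations should cover every table, index, and toast table:
    SELECT count(*) AS relations_to_track
    FROM pg_class
    WHERE relkind IN ('r', 'i', 't');

    -- Jan's rule of thumb for max_fsm_pages: roughly half of all pages.
    SELECT sum(relpages) / 2 AS suggested_max_fsm_pages
    FROM pg_class;

    -- A database-wide VACUUM VERBOSE prints a summary at the end showing
    -- how many page slots and relations the free space map actually needs.
    VACUUM VERBOSE;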
[PERFORM] Network performance under windows
Hello,

We used PostgreSQL 7.1 under Linux and recently we have changed to PostgreSQL
8.1 under Windows XP. Our application uses ODBC, and when we try to get some
information from the server through a TCP connection it's very slow. We have
also tried it using psql and pgAdmin III, and we get the same results. If we
try it locally, it runs much faster.

We have been searching the mailing lists; we have found many people with the
same problem, but we haven't found any final solution.

How can we solve this? Any help will be appreciated.

Thanks in advance.

Jordi.
Re: [PERFORM] Network performance under windows
> We used PostgreSQL 7.1 under Linux and recently we have changed to
> PostgreSQL 8.1 under Windows XP. Our application uses ODBC, and when we
> try to get some information from the server through a TCP connection it's
> very slow. We have also tried it using psql and pgAdmin III, and we get
> the same results. If we try it locally, it runs much faster.
>
> We have been searching the mailing lists; we have found many people with
> the same problem, but we haven't found any final solution.
>
> How can we solve this? Any help will be appreciated.

By any chance are you working with large tuples/columns (long text, bytea,
etc.)? Also, please define slow.

Merlin
Re: [PERFORM] pg_dump slow
> That was the command used to restore the database:
>
>   pg_restore.exe -i -h localhost -p 5432 -U postgres -d temp2 -v "D:\d\temp.bkp"
>
> The database was created beforehand using the LATIN1 charset.
>
> With 100 rows you can't feel the test, so I decided to send the whole
> table.
>
> Many thanks,
> Franklin Haut

How are you dumping out your archive? I confirmed an unreasonably slow dump
with pg_dump -Z temp2 > temp2.bkp on a Windows 2000 server. I normally use
bzip to compress my dumps.

Can you measure the time to dump uncompressed and also with bzip, and
compare?

Merlin
Re: [PERFORM] pg_dump slow
> How are you dumping out your archive? I confirmed an unreasonably slow
> dump with pg_dump -Z temp2 > temp2.bkp on a Windows 2000 server. I
> normally use bzip to compress my dumps.
>
> Can you measure the time to dump uncompressed and also with bzip, and
> compare?

Oops... cancel that. I was dumping the wrong database. Dumping your table
from localhost on a dual Opteron win2k server took a few seconds with Z=0
and Z=9.

Merlin
Re: [PERFORM] Open request for benchmarking input (fwd)
"David Lang" <[EMAIL PROTECTED]> wrote > here are the suggestions from the MySQL folks, what additional tests > should I do. > I think the tests you list are enough in this stage, Regards, Qingqing ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] two disks - best way to use them?
I installed another drive in my Linux PC in an attempt to improve performance
on a large COPY to a table with a geometry index.

Based on previous discussion, it seems there are three things competing for
the hard drive:

1) the input data file
2) the pg table
3) the WAL

What is the best way to distribute these among two drives? From Tom's
comments I would think that the pg table and the WAL should be separate. Does
it matter where the input data is?
Re: [PERFORM] Network performance under windows
Dear Merlin,

For instance, we have this table (with 22900 tuples):

CREATE TABLE tbl_empresa
(
  id_empresa int4 NOT NULL DEFAULT nextval(('seq_empresa'::text)::regclass),
  ref_poblacio int4 NOT NULL,
  nom varchar(50) NOT NULL,
  nif varchar(12),
  carrer varchar(50),
  telefon varchar(13),
  fax varchar(13),
  email varchar(50),
  lab_materials int2 DEFAULT 0,
  web varchar(50),
  ref_empresa int4,
  ref_classificacio_empresa int4,
  ref_sector_empresa int4,
  control int2,
  origen_volcat int2,
  data_modificacio date,
  plantilla int4,
  tamany int2,
  autoritzacio_email int2,
  ref_estat_empresa int2,
  CONSTRAINT tbl_clients_pkey PRIMARY KEY (id_empresa),
  CONSTRAINT fk_tbl_empresa_ref_classificacio_emp FOREIGN KEY (ref_classificacio_empresa)
      REFERENCES tbl_classificacio_empresa (id_classificacio_empresa) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT fk_tbl_empresa_ref_empresa FOREIGN KEY (ref_empresa)
      REFERENCES tbl_empresa (id_empresa) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT fk_tbl_empresa_ref_estat_emp FOREIGN KEY (ref_estat_empresa)
      REFERENCES tbl_estat_empresa (id_estat_empresa) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT fk_tbl_empresa_ref_poblacio FOREIGN KEY (ref_poblacio)
      REFERENCES tbl_poblacions (id_poblacio) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT fk_tbl_empresa_ref_sector_emp FOREIGN KEY (ref_sector_empresa)
      REFERENCES tbl_sector_empresa (id_sector_empresa) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT
)
WITH OIDS;

When we select all the data on the local machine, we obtain results in 2-3
seconds approx. Over remote connections:

PostgreSQL 7.1 using pgAdmin II:
  Network traffic generated by the remote application is about 77-80% of a
  10 Mb connection; about 6 seconds.

PostgreSQL 8.1 using pgAdmin III:
  Network traffic generated by the remote application is about 2-4% of a
  10 Mb connection; 12 seconds or more...

I feel that it is a problem with the TCP_NODELAY socket option... but I don't
know.

Josep Maria

Merlin Moncure wrote:
> By any chance are you working with large tuples/columns (long text, bytea,
> etc.)? Also, please define slow.

--
Josep Maria Pinyol i Fontseca
ENDEPRO - Enginyeria de programari
[EMAIL PROTECTED] - http://www.endepro.com
Re: [PERFORM] two disks - best way to use them?
At 01:58 PM 12/2/2005, Rick Schumeyer wrote:
> I installed another drive in my Linux PC in an attempt to improve
> performance on a large COPY to a table with a geometry index. Based on
> previous discussion, it seems there are three things competing for the
> hard drive: 1) the input data file, 2) the pg table, 3) the WAL. What is
> the best way to distribute these among two drives?

Best is to have 3 HDs or HD sets, one for each of the above.

With only 2, and assuming the input file is too large to fit completely into
RAM at once, I'd test to see whether:

a= input on one + pg table & WAL on the other, or
b= WAL on one + pg table & input file on the other

is best.

If the input file can be made 100% RAM resident, then use

c= pg table on one + WAL and input file on the other.

The big goal here is to minimize HD head seeks.

Ron
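One way to split the table and the WAL across the two drives without moving
the whole data directory is a tablespace on the second disk. A minimal
sketch, assuming PostgreSQL 8.0+ and a hypothetical mount point; the WAL
stays in pg_xlog under the original data directory on the first disk:

    -- Keep the data directory (and thus pg_xlog/WAL) on drive 1,
    -- put the COPY target table on drive 2 via a tablespace.
    -- '/mnt/disk2/pgdata' is a hypothetical, postgres-owned directory.
    CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

    CREATE TABLE geo_load (
        id   integer,
        geom text            -- placeholder; the real table uses a geometry column
    ) TABLESPACE disk2;

    -- The big COPY then writes table data to drive 2 while WAL goes to drive 1.
    COPY geo_load FROM '/mnt/disk1/input.dat';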
Re: [PERFORM] Database restore speed
* Luke Lonergan ([EMAIL PROTECTED]) wrote:
>> Luke, would it help to have one machine read the file and have it connect
>> to postgres on a different machine when doing the copy? (I'm thinking
>> that the first machine may be able to do a lot of the parsing and
>> conversion, leaving the second machine to just worry about doing the
>> writes.)
>
> Unfortunately not - the parsing / conversion core is in the backend, where
> it should be IMO because of the need to do the attribute conversion there
> in the machine-native representation of the attributes (int4, float, etc.)
> in addition to having the backend convert from client encoding (like
> LATIN1) to the backend encoding (like UNICODE aka UTF8).

Just a thought, but couldn't psql be made to use the binary mode of libpq
and do at least some of the conversion on the client side? Or does binary
mode not work with copy (that wouldn't surprise me, but perhaps copy could
be made to support it)?

The other thought, of course, is that you could use PITR for your backups
instead of pg_dump...

Thanks,
Stephen
Re: [PERFORM] Network performance under windows
We experienced the same: we had two Win2003 servers - www and db - connected
to the same router through 100 Mbit, and the performance was quite bad. Now
we run the db on the same machine as the web server and everything runs
smoothly.

cheers,
thomas

----- Original Message -----
From: "Josep Maria Pinyol Fontseca" <[EMAIL PROTECTED]>
Sent: Friday, December 02, 2005 6:24 PM
Subject: Re: [PERFORM] Network performance under windows

> When we select all the data on the local machine, we obtain results in 2-3
> seconds approx. Over remote connections: PostgreSQL 7.1 using pgAdmin II,
> about 6 seconds; PostgreSQL 8.1 using pgAdmin III, 12 seconds or more...
>
> I feel that it is a problem with the TCP_NODELAY socket option... but I
> don't know.
Re: [PERFORM] Network performance under windows
"Josep Maria Pinyol Fontseca" <[EMAIL PROTECTED]> wrote > > When we select all data in local machine, we obtain results in 2-3 seconds > aprox. In remote connections: > > Postgresql 7.1 usign pgAdminII: > Network traffic generated with remote applications is about 77-80% in a > 10Mb connection. > 6 seconds aprox. > > Postgresql 8.1 usign pgAdminIII: > Network traffic generated with remote applications is about 2-4% in a 10Mb > connection. > 12 seconds or more... > Have you tried to use psql? And how you "select all data" - by "select count(*)" or "select *"? Regards, Qingqing ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Database restore speed
Stephen,

On 12/2/05 12:18 PM, "Stephen Frost" <[EMAIL PROTECTED]> wrote:
> Just a thought, but couldn't psql be made to use the binary mode of libpq
> and do at least some of the conversion on the client side? Or does binary
> mode not work with copy (that wouldn't surprise me, but perhaps copy could
> be made to support it)?

Yes - I think this idea is implicit in what David suggested, and in my
response as well. The problem is that the way the client does conversions
can potentially differ from the way the backend does. Some of the types in
Postgres are machine-intrinsic and the encoding conversions use on-machine
libraries, each of which precludes the use of client conversion methods
(without a lot of restructuring). We'd tackled this problem in the past and
concluded that the parse / convert stage really belongs in the backend.

> The other thought, of course, is that you could use PITR for your backups
> instead of pg_dump...

Totally - great idea. If this is actually a backup / restore, then PITR plus
a filesystem copy (tarball) is hugely faster than dump / restore.

- Luke
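For anyone wanting to try that route, here is a rough sketch of an online
base backup, assuming PostgreSQL 8.0+ with WAL archiving (archive_command)
already configured; the paths are hypothetical:

    -- 1. Tell the server a base backup is starting:
    SELECT pg_start_backup('nightly_base_backup');

    -- 2. Copy the data directory with any filesystem tool, e.g.
    --    tar czf /backup/base.tar.gz /var/lib/pgsql/data
    --    (the copy does not require stopping the server).

    -- 3. Tell the server the backup is finished so it can archive the
    --    remaining WAL needed to make the copy consistent:
    SELECT pg_stop_backup();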
Re: [PERFORM] Network performance under windows
Yes, with psql, pgAdmin III, and our application with ODBC I experience the
same situation... the statements that I execute are like "select * ..." or
similar.

Qingqing Zhou wrote:
> Have you tried using psql? And how do you "select all data" - with "select
> count(*)" or "select *"?

--
Josep Maria Pinyol i Fontseca
ENDEPRO - Enginyeria de programari
[EMAIL PROTECTED] - http://www.endepro.com
Re: [PERFORM] 15,000 tables
On Thu, Dec 01, 2005 at 08:34:43PM +0100, Michael Riess wrote:
> Well, we tried switching to daily VACUUM ANALYZE and weekly VACUUM FULL,
> but the database got considerably slower near the end of the week.

If you have your FSM configured correctly and you are vacuuming tables often
enough for your turnover, then in regular operation you should _never_ need
VACUUM FULL. So it sounds like your first problem is that. With the 15000
tables you were talking about, though, that doesn't surprise me.

Are you sure more back ends wouldn't be a better answer, if you're really
wedded to this design? (I have a feeling that something along the lines of
what Tom Lane said would be a better answer -- I think you need to be more
clever, because I don't think this will ever work well, on any system.)

A
--
Andrew Sullivan | [EMAIL PROTECTED]
Re: [PERFORM] Database restore speed
* Luke Lonergan ([EMAIL PROTECTED]) wrote:
> On 12/2/05 12:18 PM, "Stephen Frost" <[EMAIL PROTECTED]> wrote:
>> Just a thought, but couldn't psql be made to use the binary mode of libpq
>> and do at least some of the conversion on the client side?
>
> Yes - I think this idea is implicit in what David suggested, and in my
> response as well. The problem is that the way the client does conversions
> can potentially differ from the way the backend does. Some of the types in
> Postgres are machine-intrinsic and the encoding conversions use on-machine
> libraries, each of which precludes the use of client conversion methods
> (without a lot of restructuring). We'd tackled this problem in the past
> and concluded that the parse / convert stage really belongs in the
> backend.

I've used the binary mode stuff before; sure, Postgres may have to convert
some things, but I have a hard time believing it'd be more expensive to do a
network_encoding -> host_encoding (or toasting, or whatever) change than to
do the ascii -> binary change.

Thanks,
Stephen
Re: [PERFORM] Database restore speed
Stephen,

On 12/2/05 1:19 PM, "Stephen Frost" <[EMAIL PROTECTED]> wrote:
> I've used the binary mode stuff before; sure, Postgres may have to convert
> some things, but I have a hard time believing it'd be more expensive to do
> a network_encoding -> host_encoding (or toasting, or whatever) change than
> to do the ascii -> binary change.

From a performance standpoint, no argument, although you're betting that you
can do parsing / conversion faster than the COPY core in the backend can (I
know *we* can :-). It's a matter of safety and generality - in general you
can't be sure that client machines / OSes will render the same conversions
that the backend does in all cases IMO.

- Luke
Re: [PERFORM] Database restore speed
Stephen,

> On 12/2/05 1:19 PM, "Stephen Frost" <[EMAIL PROTECTED]> wrote:
>> I've used the binary mode stuff before; sure, Postgres may have to
>> convert some things, but I have a hard time believing it'd be more
>> expensive to do a network_encoding -> host_encoding (or toasting, or
>> whatever) change than to do the ascii -> binary change.
>
> From a performance standpoint, no argument, although you're betting that
> you can do parsing / conversion faster than the COPY core in the backend
> can (I know *we* can :-). It's a matter of safety and generality - in
> general you can't be sure that client machines / OSes will render the same
> conversions that the backend does in all cases IMO.

One more thing - this is really about the lack of a cross-platform binary
input standard for Postgres IMO. If there were such a thing, it *would* be
safe to do this. The current binary spec is not cross-platform AFAICS; it
embeds native representations of the DATUMs and does not specify a universal
binary representation of same.

For instance - when representing a float, is it an IEEE 32-bit floating
point number in little-endian byte ordering? Or is it IEEE 64-bit? With
libpq, we could do something like an XDR implementation, but the machinery
isn't there AFAICS.

- Luke
Re: [PERFORM] Database restore speed
On Fri, Dec 02, 2005 at 01:24:31PM -0800, Luke Lonergan wrote:
> From a performance standpoint, no argument, although you're betting that
> you can do parsing / conversion faster than the COPY core in the backend
> can

Not necessarily; you may be betting that it's more *efficient* to do the
parsing on a bunch of lightly loaded clients than on your server. Even if
you're using the same code this may be a big win.

Mike Stone
Re: [PERFORM] 15,000 tables
Michael Riess writes:
> Sorry, I should have included that info in the initial post. You're right
> in that most of these tables have a similar structure. But they are
> independent and can be customized by the users.

How about creating 50 databases and giving each its own tablespace? It's not
only a question of whether PostgreSQL can be optimized, but also of how well
your filesystem handles a directory with a large number of files. By
splitting the directories you will likely help the OS, and you will be able
to better determine whether the OS or the DB is at fault for the slowness.
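A minimal sketch of that layout, assuming PostgreSQL 8.0+ and hypothetical
directory and database names (one tablespace and one database per group of
applications, so each group's files land in a separate directory):

    CREATE TABLESPACE ts_group01 LOCATION '/srv/pgts/group01';
    CREATE DATABASE app_group01 TABLESPACE = ts_group01;

    -- Repeat (or script) for the remaining groups:
    CREATE TABLESPACE ts_group02 LOCATION '/srv/pgts/group02';
    CREATE DATABASE app_group02 TABLESPACE = ts_group02;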
Re: [PERFORM] Database restore speed
"Luke Lonergan" <[EMAIL PROTECTED]> writes: > One more thing - this is really about the lack of a cross-platform binary > input standard for Postgres IMO. If there were such a thing, it *would* be > safe to do this. The current Binary spec is not cross-platform AFAICS, it > embeds native representations of the DATUMs, and does not specify a > universal binary representation of same. Sure it does ... at least as long as you are willing to assume everybody uses IEEE floats, and if they don't you have semantic problems translating float datums anyhow. What we lack is documentation, more than functionality. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] 15,000 tables - next step
Hi,

thanks for your comments so far - I appreciate it. I'd like to narrow down my
problem a bit: As I said in the other thread, I estimate that only 20% of the
15,000 tables are accessed regularly, so I don't think that vacuuming or the
number of file handles is a problem. Have a look at this:

content2=# select relpages, relname from pg_class order by relpages desc limit 20;
 relpages |             relname
----------+---------------------------------
    11867 | pg_attribute
    10893 | pg_attribute_relid_attnam_index
     3719 | pg_class_relname_nsp_index
     3310 | wsobjects_types
     3103 | pg_class
     2933 | wsobjects_types_fields
     2903 | wsod_133143
     2719 | pg_attribute_relid_attnum_index
     2712 | wsod_109727
     2666 | pg_toast_98845
     2601 | pg_toast_9139566
     1876 | wsod_32168
     1837 | pg_toast_138780
     1678 | pg_toast_101427
     1409 | wsobjects_types_fields_idx
     1088 | wso_log
      943 | pg_depend
      797 | pg_depend_depender_index
      737 | wsod_3100
      716 | wp_hp_zen

I don't think that postgres was designed for a situation like this, where a
system table that should be fairly small (pg_attribute) is this large.
Re: [PERFORM] Database restore speed
Michael,

On 12/2/05 1:46 PM, "Michael Stone" <[EMAIL PROTECTED]> wrote:
> Not necessarily; you may be betting that it's more *efficient* to do the
> parsing on a bunch of lightly loaded clients than on your server. Even if
> you're using the same code this may be a big win.

If it were possible in light of the issues with client-side parse / convert,
then we should analyze whether it's a performance win. In the restore case,
where we've got a dedicated server with a dedicated client machine, I don't
see why there would be a speed benefit from running the same parse / convert
code on the client versus running it on the server. Imagine a pipeline where
there is a bottleneck: moving the bottleneck to a different machine doesn't
make it less of a bottleneck.

- Luke
[PERFORM] Small table or partial index?
I am in the process of designing a new system. There will be a long list of
words in a table such as:

word table
  word_id  integer
  word     varchar
  special  boolean

Some "special" words are used to determine if some work is to be done, and
they are what we care most about for one type of operation.

Will it be more effective to have a partial index 'where special' or to copy
those special words to their own table? The projected number of non-special
words is in the millions, while the special ones will be in the thousands at
most (under 10K for sure).

My personal view is that performance should be pretty much equal, but one of
my co-workers believes that the smaller table would likely get cached by the
OS since it would be used so frequently, and so would perform better. In both
instances we would be hitting an index of exactly the same size.

The searches will be of the form 'where word = ... and special'.
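A sketch of the two designs under discussion, with hypothetical names; the
'WHERE special' predicate is what lets the planner use the partial index for
the stated search form:

    -- Option 1: one big table with a partial index on the special rows.
    CREATE TABLE word (
        word_id  integer PRIMARY KEY,
        word     varchar NOT NULL,
        special  boolean NOT NULL DEFAULT false
    );
    CREATE INDEX word_special_idx ON word (word) WHERE special;

    -- The partial index can serve queries of the form:
    SELECT word_id FROM word WHERE word = 'example' AND special;

    -- Option 2: copy the special words into their own small table and
    -- index/query that table directly instead.
    CREATE TABLE special_word (
        word_id  integer PRIMARY KEY,
        word     varchar NOT NULL
    );
    CREATE INDEX special_word_word_idx ON special_word (word);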
Re: [PERFORM] Database restore speed
On Fri, 2005-12-02 at 13:24 -0800, Luke Lonergan wrote:
> It's a matter of safety and generality - in general you can't be sure that
> client machines / OSes will render the same conversions that the backend
> does in all cases IMO.

Can't binary values be sent safely cross-platform in DataRow messages? At
least from my ignorant, cursory look at printtup.c, there's a binary-format
code path. float4send in utils/adt/float.c uses pq_sendfloat4. I obviously
haven't followed the entire rabbit trail, but it seems like it happens.

IOW, why isn't there a cross-platform issue when sending binary data from
the backend to the client in query results? And if there isn't a problem
there, why can't binary data be sent from the client to the backend?

Mitch
Re: [PERFORM] Database restore speed
And how do we compose the binary data on the client? Do we trust that the
client encoding conversion logic is identical to the backend's? If there is
a difference, what happens if the same file loaded from different client
machines has different results? Key conflicts when loading a restore from
one machine and not from another?

- Luke
--
Sent from my BlackBerry Wireless Device

-----Original Message-----
From: Mitch Skinner <[EMAIL PROTECTED]>
Sent: Fri Dec 02 22:26:06 2005
Subject: Re: [PERFORM] Database restore speed

> IOW, why isn't there a cross-platform issue when sending binary data from
> the backend to the client in query results? And if there isn't a problem
> there, why can't binary data be sent from the client to the backend?