Re: [PERFORM] Master/Slave, DB separation or just spend $$$?
Hi Scott,

Thanks for the quick reply. I have been staring at *top* for a while and
it's mostly been 40% in userspace and 30% in system. Wait is rather low
and never ventures beyond 1%.

My hardware is a dual-core AMD Athlon64 X2 5000+, 1 GB RAM and a single
160 GB SATA II hard disk drive.

I will go look at Slony now.

Scott, one question though - if my master is constantly changing,
wouldn't the updates from the master to the slave also slow down the
slave?

Kelvin Quee
+65 9177 3635

On Wed, Jul 22, 2009 at 1:42 PM, Scott Marlowe <scott.marl...@gmail.com> wrote:
> On Tue, Jul 21, 2009 at 9:47 PM, Kelvin Quee <kelv...@gmail.com> wrote:
>> Hi Performance Wizards!
>>
>> I need advice on this. I have a db which is being constantly updated
>> and queried by a few computers. We are doing data mining. The machine
>> is a moderately powered machine and its processors constantly hit 90%.
>
> When your CPUs say 90%, is that regular user / sys %, or is it wait %?
> The difference is very important. What kind of hardware are you running
> on, btw? # of CPUs, memory, # of drives, drive type, RAID controller if
> any?
>
>> At the same time, we need to present these data on a web interface.
>> The performance of the web interface is now very sluggish as most of
>> the power is occupied by the mining process.
>>
>> I have thought of a few ways out of this -
>>
>> 1) Buy a mega powered machine (temporary solution, quick fix)
>
> Depends very much on what you're bound by, CPU or IO. If adding a
> couple of 15K SAS drives would double your performance then you don't
> need a super powerful machine.
>
>> 2) Do a master-slave configuration
>
> Often a good choice.
>
>> 3) Separate the DB into 2 - one for pure mining purposes, the other
>> purely for web serving
>>
>> For (2), I do not know if it will be very effective since the master
>> will probably have many changes at any moment. I do not understand how
>> the changes will be propagated from the master to the slave without
>> impacting the slave's performance. Anyone with more experience here?
>>
>> (3) seems ideal but is a very, very painful solution! We can possibly
>> use a message queue system, but again I am not familiar with MQ. Will
>> need to do more research.
>
> That could be a very complex solution.
>
>> If you were me, how would you solve this problem?
>
> Slony, most likely.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Atomic access to large arrays
Hi,

I'm storing historical meteorological gridded data from GFS
(http://www.nco.ncep.noaa.gov/pmb/products/gfs/) into an array field in a
table like this:

CREATE TABLE grid_f_data_i2 (
  -- Specifies the variable and other features of the data
  id_inventory integer REFERENCES grid_d_inventory(id_inventory),
  -- A new grid has been available every 3 hours for the past 5 years
  dh_date timestamp,
  -- Values are scaled to be stored as signed 2-byte integers
  vl_grid smallint[361][720],
  CONSTRAINT meteo_f_gfs_tmp
    PRIMARY KEY (id_inventory, dh_date)
);

The dimensions of each value of field vl_grid are (lat: 361 x lon: 720 =
259920 cells) for a grid of 0.5 degrees (about every 55 km) around the
world. So vl_grid[y][x] stores the value at dh_date of the meteorological
variable specified by id_inventory at the geodesic point

  latitude  = -90 + y * 0.5
  longitude = x * 0.5

The reverse formula for the closest grid point to an arbitrary geodesic
point is

  y = Round((latitude + 90) * 2)
  x = Round(longitude * 2)

Field vl_grid is stored in the TOAST table and gets a good compression
level. PostgreSQL is the only database able to store this huge amount of
data in only 34 GB of disk. It's a really great system.

Queries returning big rectangular areas are very fast, but the target of
almost all queries is to get a historical series for a geodesic point:

SELECT dh_date, vl_grid[123][152]
FROM grid_f_data_i2
WHERE id_inventory = 6
ORDER BY dh_date;

In this case, atomic access to just one cell of each of only a few
thousand rows becomes too slow.

Using the standard layout, the size increases very much:

CREATE TABLE grid_f_data_i2 (
  id_inventory integer REFERENCES grid_d_inventory(id_inventory),
  dh_date timestamp,
  lat smallint,
  lon smallint,
  value smallint
);

This table takes about 24 bytes per row (4+8+2+2+2 = 18 bytes of data
plus per-row overhead) and (lat: 361 x lon: 720 = 259920) rows per grid,
so 6238080 bytes per grid. The uncompressed array design uses
4+8+2*259920 = 519852 bytes per row, but only one row per grid. The
TOAST table compresses these arrays by an average factor of 1/2, so the
total size with arrays is about 24 times smaller than with the standard
layout.

Now I have more than 6 stored grids in 30 GB, but I will probably store
1 million grids or more in 0.5 TB instead of 12 TB. I don't have enough
money to buy or maintain 12 TB of disks.

Please, could somebody answer some of these questions?

- Is it possible to tune some TOAST parameters to get faster atomic
  access to large arrays?
- Could using the EXTERNAL strategy for storing TOAST-able columns solve
  the problem?
- Would atomic access be faster if the table stored vectors of the data
  along the same parallel instead of matrices of global data?

CREATE TABLE grid_f_data_i2 (
  -- Specifies the variable and other features of the data
  id_inventory integer REFERENCES grid_d_inventory(id_inventory),
  -- A new grid has been available every 3 hours for the past 5 years
  dh_date timestamp,
  -- nu_parallel = y = Round((latitude + 90) * 2)
  nu_parallel smallint,
  -- Values are scaled to be stored as signed 2-byte integers
  vl_parallel smallint[],
  CONSTRAINT meteo_f_gfs_tmp
    PRIMARY KEY (id_inventory, nu_parallel, dh_date)
);

- Is there another, faster solution?

Thanks in advance and best regards

--
Víctor de Buen Remiro
Tol Development Team member
www.tol-project.org
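[Editor's note: the index/coordinate formulas above can be sanity-checked with a small sketch. Python is used here purely for illustration; the grid itself lives in PostgreSQL, and the function names are made up for this example.]

```python
# Sketch of the grid index <-> coordinate formulas from the post above.
GRID_STEP = 0.5  # degrees between neighboring grid points

def grid_to_point(y, x):
    """Geodesic point stored at vl_grid[y][x]."""
    return (-90 + y * GRID_STEP, x * GRID_STEP)

def point_to_grid(latitude, longitude):
    """Closest grid cell to an arbitrary geodesic point (the reverse formula)."""
    return (round((latitude + 90) * 2), round(longitude * 2))
```

For the cell used in the example query, `grid_to_point(123, 152)` gives the point (-28.5, 76.0), and `point_to_grid` maps it back to (123, 152).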
Re: [PERFORM] Master/Slave, DB separation or just spend $$$?
On Wed, Jul 22, 2009 at 1:52 AM, Kelvin Quee <kelv...@gmail.com> wrote:
> Hi Scott, Thanks for the quick reply. I have been staring at *top* for
> a while and it's mostly been 40% in userspace and 30% in system. Wait
> is rather low and never ventures beyond 1%.
>
> My hardware is a dual-core AMD Athlon64 X2 5000+, 1 GB RAM and a
> single 160 GB SATA II hard disk drive.

So I take it you're on a tight budget then? I'm guessing you could put a
single quad-core CPU and 8 gigs of RAM in place for a reasonable price.
I'd highly recommend setting up at least software RAID-1 for increased
reliability.

> I will go look at Slony now.

Might be overkill if you can get by on a single reasonably powerful
machine.

> Scott, one question though - if my master is constantly changing,
> wouldn't the updates from the master to the slave also slow down the
> slave?

Yes it will, but the overhead for the slave is much less than for the
master.
Re: [PERFORM] Atomic access to large arrays
"Victor de Buen (Bayes)" <vdeb...@bayesinf.com> writes:
> I'm storing historical meteorological gridded data from GFS
> (http://www.nco.ncep.noaa.gov/pmb/products/gfs/) into an array field
> in a table like this:
>       vl_grid smallint[361][720],

> - Is it possible to tune some TOAST parameters to get faster atomic
>   access to large arrays?

It might save a little bit to make the TOAST chunk size larger, but I'm
not sure you could gain much from that.

> - Could using the EXTERNAL strategy for storing TOAST-able columns
>   solve the problem?

Nope, wouldn't help --- AFAIR array access is not optimized for slice
access. In any case, doing that would give up the compression savings
that you were so happy about.

If your normal access patterns involve vertical rather than horizontal
scans of the data, maybe you should rethink the choice of table layout.
Or maybe the compression is enough to allow you to consider storing the
data twice, once in the current layout and once in a vertical format.

			regards, tom lane
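[Editor's note: since array access is not optimized for slices, reading one cell forces detoasting the whole array value. Rough arithmetic, using the figures from the thread, shows why a vertical (one-row-per-parallel) layout helps; these numbers ignore compression and row overhead.]

```python
# Bytes that must be detoasted to read a single cell under each layout.
CELLS_PER_GRID = 361 * 720   # one global grid, as in vl_grid smallint[361][720]
BYTES_PER_CELL = 2           # smallint

# Matrix layout: the whole array value is detoasted per cell lookup.
full_grid_bytes = CELLS_PER_GRID * BYTES_PER_CELL   # 519840 bytes

# Vertical layout (vl_parallel smallint[]): only one 720-cell vector.
one_parallel_bytes = 720 * BYTES_PER_CELL           # 1440 bytes
```

So each single-cell lookup touches roughly 361 times less data with one row per parallel, at the cost of 361 rows per grid.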
Re: [PERFORM] Atomic access to large arrays
Thank you very much, Tom.

I will try both the 'parallel' vector and 'vertical' strategies.

Regards

2009/7/22 Tom Lane <t...@sss.pgh.pa.us>:
> "Victor de Buen (Bayes)" <vdeb...@bayesinf.com> writes:
>> - Is it possible to tune some TOAST parameters to get faster atomic
>>   access to large arrays?
>
> It might save a little bit to make the TOAST chunk size larger, but
> I'm not sure you could gain much from that.
>
>> - Could using the EXTERNAL strategy for storing TOAST-able columns
>>   solve the problem?
>
> Nope, wouldn't help --- AFAIR array access is not optimized for slice
> access. In any case, doing that would give up the compression savings
> that you were so happy about.
>
> If your normal access patterns involve vertical rather than horizontal
> scans of the data, maybe you should rethink the choice of table
> layout. Or maybe the compression is enough to allow you to consider
> storing the data twice, once in the current layout and once in a
> vertical format.
>
>			regards, tom lane

--
Víctor de Buen Remiro
Consultor estadístico
Bayes Forecast
www.bayesforecast.com
Tol Development Team member
www.tol-project.org
Re: [PERFORM] Master/Slave, DB separation or just spend $$$?
> I have a db which is being constantly updated and queried by a few
> computers. We are doing data mining. The machine is a moderately
> powered machine and its processors constantly hit 90%.
> ...
> 2) Do a master-slave configuration
> 3) Separate the DB into 2 - one for pure mining purposes, the other
> purely for web serving

Why not combine the two (if I'm understanding correctly)? Use Bucardo or
Slony to make two slaves, one for the web servers to hit (assuming they
are read-only queries), and one to act as a data warehouse. Your main
box gets all the updates but has no selects or complex queries to weigh
it down. If the web server does reads and writes, have your app maintain
two database handles.

> For (2), I do not know if it will be very effective since the master
> will probably have many changes at any moment. I do not understand how
> the changes will be propagated from the master to the slave without
> impacting the slave's performance. Anyone with more experience here?

The slave will get the updates as well, but in a more efficient manner,
as there will be no WHERE clauses or other logic associated with the
original update. Bucardo or Slony will simply COPY over the rows as
needed. Keep in mind that both are asynchronous, so changes won't appear
on the slaves at the same time as on the master, but the delay is
typically measured in seconds.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200907221229
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
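[Editor's note: the "maintain two database handles" advice above can be sketched as a tiny routing shim. The class and method names here are hypothetical, and the handles are placeholders; in a real app they would be DB-API connections to the master and the read-only slave.]

```python
class DualHandleRouter:
    """Route write statements to the master handle and plain reads to a
    replica handle, so read-only web queries never touch the master."""

    WRITE_VERBS = {"insert", "update", "delete", "create", "alter",
                   "drop", "truncate"}

    def __init__(self, master, replica):
        self.master = master
        self.replica = replica

    def handle_for(self, sql):
        # The first word of the statement decides which handle gets it.
        verb = sql.lstrip().split(None, 1)[0].lower()
        return self.master if verb in self.WRITE_VERBS else self.replica
```

A real router would also have to pin a session to the master after a write, since the replicas lag by a few seconds.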
Re: [PERFORM] Master/Slave, DB separation or just spend $$$?
kelv...@gmail.com (Kelvin Quee) writes:
> I will go look at Slony now.

It's worth looking at, but it is not always to be assumed that
replication will necessarily improve the scalability of applications;
it's not a magic wand to wave such that presto, it's all faster!

Replication is helpful from a performance standpoint if there is a lot
of query load where it is permissible to look at *somewhat* out-of-date
information. For instance, replication can be quite helpful for pushing
off the load of processing accounting data, where you tend to be doing
analysis on data from {yesterday, last week, last month, last year}, and
where the data tends to be inherently temporal (e.g. you're looking at
transactions with dates on them).

On the other hand, any process that anticipates *writing* to the master
database will be more or less risky to shift over to a
possibly-somewhat-behind 'slave' system, as will be anything that needs
to be consistent with the master state.

--
(reverse (concatenate 'string ofni.secnanifxunil @ enworbbc))
http://www3.sympatico.ca/cbbrowne/spiritual.html
Nondeterminism means never having to say you're wrong. -- Unknown
Re: [PERFORM] Atomic access to large arrays
On Tue, Jul 21, 2009 at 7:43 PM, Victor de Buen (Bayes) <vdeb...@bayesinf.com> wrote:
> I'm storing historical meteorological gridded data from GFS
> (http://www.nco.ncep.noaa.gov/pmb/products/gfs/) into an array field
> in a table like this:
>
> CREATE TABLE grid_f_data_i2 (
>   -- Specifies the variable and other features of the data
>   id_inventory integer REFERENCES grid_d_inventory(id_inventory),
>   -- A new grid has been available every 3 hours for the past 5 years
>   dh_date timestamp,
>   -- Values are scaled to be stored as signed 2-byte integers
>   vl_grid smallint[361][720],
>   CONSTRAINT meteo_f_gfs_tmp
>     PRIMARY KEY (id_inventory, dh_date)
> );
>
> [...]
>
> Queries returning big rectangular areas are very fast, but the target
> of almost all queries is to get a historical series for a geodesic
> point:
>
> SELECT dh_date, vl_grid[123][152]
> FROM grid_f_data_i2
> WHERE id_inventory = 6
> ORDER BY dh_date;
>
> In this case, atomic access to just one cell of each of only a few
> thousand rows becomes too slow.

That's a side effect of your use of arrays. Arrays are very compact, and
ideal if you always want the whole block of data at once, but asking for
a particular point is the downside of your trade-off. I would suggest
experimenting with smaller grid sizes... maybe divide your big grid into
16 (4x4) separate subgrids. These should still TOAST and give decent
compression, but mitigate the impact of single-point lookups somewhat.

merlin
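[Editor's note: the 4x4 subgrid suggestion above amounts to some index arithmetic. A sketch, with made-up names, of mapping a global cell to a subgrid plus local indices; 361 latitude rows don't divide evenly by 4, so the block height is rounded up.]

```python
# Map a 0-based global cell (y, x) to (subgrid id, local y, local x)
# for a 361 x 720 grid split into 4 x 4 subgrids.
LAT_CELLS, LON_CELLS = 361, 720
SUB_ROWS, SUB_COLS = 4, 4
BLOCK_H = -(-LAT_CELLS // SUB_ROWS)  # ceil(361 / 4) = 91
BLOCK_W = LON_CELLS // SUB_COLS      # 720 / 4 = 180

def locate(y, x):
    """Which subgrid holds global cell (y, x), and where inside it."""
    sub = (y // BLOCK_H) * SUB_COLS + (x // BLOCK_W)
    return sub, y % BLOCK_H, x % BLOCK_W
```

The example cell vl_grid[123][152] would land in subgrid 4 at local position (32, 152), so a lookup only detoasts a ~91x180 array instead of the full 361x720 one.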
Re: [PERFORM] Odd performance / query plan with bitmasked field as opposed to equality
On Mon, Jul 13, 2009 at 4:46 PM, Frank Joerdens <fr...@joerdens.de> wrote:
> I can't figure out what is going on below; first of all, this count,
> which returns 1.5 million from a ~2 million row table:
>
> woome=# explain analyze SELECT COUNT(*) FROM webapp_person WHERE webapp_person.permissionflags = B''::bit;
>                                  QUERY PLAN
> -----------------------------------------------------------------
>  Aggregate  (cost=125774.83..125774.84 rows=1 width=0) (actual time=2976.405..2976.405 rows=1 loops=1)
>    ->  Seq Scan on webapp_person  (cost=0.00..122041.10 rows=1493490 width=0) (actual time=0.019..2781.735 rows=1518635 loops=1)
>          Filter: (permissionflags = B''::bit)
>  Total runtime: 2976.475 ms
> (4 rows)

There are two possibilities here: the planner thinks it CAN'T use the
relevant index for this query, or it thinks that the index will be
slower than just seq-scanning the whole table.

To figure out which it is, try EXPLAIN ANALYZE again with enable_seqscan
set to false (note: this is a bad idea in general, but useful for
debugging). If you still get a seqscan anyway, then there's some reason
why it thinks that it can't use the index (which we can investigate). If
that makes it switch to an index scan, then you can try adjusting your
cost parameters. But the first thing is to figure out which kind of
problem you have. In any case, send the output to the list.

Solving this problem will probably shed some light on the other things
in your original email, so I'm not going to specifically address each
one at this point.

...Robert
Re: [PERFORM] regression ? 8.4 do not apply One-Time Filter to subquery
On Wed, Jul 1, 2009 at 8:08 PM, Sergey Burladyan <eshkin...@gmail.com> wrote:
> 8.4 from CVS HEAD:
>
> EXPLAIN ANALYZE select * from (select n, 1 as r from generate_series(1, 10) as n
>   union all select n, 2 from generate_series(1, 10) as n) as x where r = 3;
>                                  QUERY PLAN
> -----------------------------------------------------------------
>  Result  (cost=0.00..30.00 rows=10 width=36) (actual time=90.723..90.723 rows=0 loops=1)
>    ->  Append  (cost=0.00..30.00 rows=10 width=36) (actual time=90.720..90.720 rows=0 loops=1)
>          ->  Function Scan on generate_series n  (cost=0.00..15.00 rows=5 width=36) (actual time=45.191..45.191 rows=0 loops=1)
>                Filter: (1 = 3)
>          ->  Function Scan on generate_series n  (cost=0.00..15.00 rows=5 width=36) (actual time=45.522..45.522 rows=0 loops=1)
>                Filter: (2 = 3)
>  Total runtime: 118.709 ms
> (7 rows)
>
> 8.3.7:
>
> EXPLAIN ANALYZE select * from (select n, 1 as r from generate_series(1, 10) as n
>   union all select n, 2 from generate_series(1, 10) as n) as x where r = 3;
>                                  QUERY PLAN
> -----------------------------------------------------------------
>  Result  (cost=0.00..25.02 rows=2 width=8) (actual time=0.005..0.005 rows=0 loops=1)
>    ->  Append  (cost=0.00..25.02 rows=2 width=8) (actual time=0.004..0.004 rows=0 loops=1)
>          ->  Result  (cost=0.00..12.50 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=1)
>                One-Time Filter: false
>                ->  Function Scan on generate_series n  (cost=0.00..12.50 rows=1 width=4) (never executed)
>          ->  Result  (cost=0.00..12.50 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
>                One-Time Filter: false
>                ->  Function Scan on generate_series n  (cost=0.00..12.50 rows=1 width=4) (never executed)
>  Total runtime: 0.053 ms
> (9 rows)
>
> Is it right?

This might be related to this fix by Tom:

http://archives.postgresql.org/message-id/20090711040933.7a083753...@cvs.postgresql.org

...Robert
Re: [PERFORM] Atomic access to large arrays
Thanks a lot, Merlin.

I will try filling a sample of grids into a new table with different
subgrid sizes in order to find the best trade-off between space and
speed.

Regards

2009/7/22 Merlin Moncure <mmonc...@gmail.com>:
> On Tue, Jul 21, 2009 at 7:43 PM, Victor de Buen (Bayes) <vdeb...@bayesinf.com> wrote:
>> I'm storing historical meteorological gridded data from GFS
>> (http://www.nco.ncep.noaa.gov/pmb/products/gfs/) into an array field
>> in a table like this:
>>
>> [...]
>>
>> Queries returning big rectangular areas are very fast, but the target
>> of almost all queries is to get a historical series for a geodesic
>> point. In this case, atomic access to just one cell of each of only a
>> few thousand rows becomes too slow.
>
> That's a side effect of your use of arrays. Arrays are very compact,
> and ideal if you always want the whole block of data at once, but
> asking for a particular point is the downside of your trade-off. I
> would suggest experimenting with smaller grid sizes... maybe divide
> your big grid into 16 (4x4) separate subgrids. These should still
> TOAST and give decent compression, but mitigate the impact of
> single-point lookups somewhat.
>
> merlin

--
Víctor de Buen Remiro
Consultor estadístico
Bayes Forecast
www.bayesforecast.com
Tol Development Team member
www.tol-project.org
Re: [PERFORM] Master/Slave, DB separation or just spend $$$?
On Wed, Jul 22, 2009 at 12:52 AM, Kelvin Quee <kelv...@gmail.com> wrote:
> I have been staring at *top* for a while and it's mostly been 40% in
> userspace and 30% in system. Wait is rather low and never ventures
> beyond 1%.

Certainly seems like you are CPU-bound.

> My hardware is a dual-core AMD Athlon64 X2 5000+, 1 GB RAM and a
> single 160 GB SATA II hard disk drive.

Looks like you are on a budget, as Scott also suggested - I would mirror
his recommendation to upgrade to a quad-core processor and more memory.
Hopefully your motherboard supports quad-cores so you don't have to
replace that bit, and you should be able to get at least 4 GB of RAM in
there.

If IO load becomes an issue, VelociRaptors are fast and don't cost too
much. Getting a basic RAID-1 will help prevent data loss due to disk
failure - make sure you are making offline backups as well!

> I will go look at Slony now.
>
> Scott, one question though - if my master is constantly changing,
> wouldn't the updates from the master to the slave also slow down the
> slave?

Yes - Slony will increase the load on your source node, as it does take
work to do the replication, so unless you are able to offload your
CPU-heavy read-only queries to the slave machine, it will only bog down
the source node more.

-Dave