Re: [GENERAL] Divert triggers on materialized views
Hi Ewen,

My advice would be to use table partitions: split your history table up into (maybe annual?) partitions, have these inherited into a parent table, which becomes your new "history table" (perhaps instead of a view?). If times are a common component of a WHERE clause, then given that the partitions (except for the current one) are static (no new data), you can also apply a clustered index to the static partitions, which reduces disk activity on sequential data reads. We are running a reasonably effective database with approaching a billion records & twenty years of data using this approach.

There is also a Postgres extension which is optimised for timeseries data, see: https://www.timescale.com/ Some of their optimisation is to build in automatic partitioning, much as described above.

You can use your existing normal view as well, which will only be on the "current" partition, so the overhead of a single large index & table is no longer an issue.

Cheers

Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
NIWA
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz

To ensure compliance with legal requirements and to maintain cyber security standards, NIWA's IT systems are subject to ongoing monitoring, activity logging and auditing. This monitoring and auditing service may be provided by third parties. Such third parties can access information transmitted to, processed by and stored on NIWA's IT systems.
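A minimal sketch of the inheritance-based layout described above (table, column and constraint names are hypothetical, and the CHECK ranges purely illustrative):

```sql
-- Hypothetical sketch of the annual-partition scheme described above.
CREATE TABLE history (obs_time timestamptz NOT NULL, payload text);

-- One child table per year, inheriting from the parent "history table".
CREATE TABLE history_2016 (
    CHECK (obs_time >= '2016-01-01' AND obs_time < '2017-01-01')
) INHERITS (history);
CREATE TABLE history_2017 (
    CHECK (obs_time >= '2017-01-01' AND obs_time < '2018-01-01')
) INHERITS (history);

-- Closed (static) partitions can be physically ordered by time,
-- which is what "clustered index" means in Postgres terms:
CREATE INDEX history_2016_time_idx ON history_2016 (obs_time);
CLUSTER history_2016 USING history_2016_time_idx;
```

With constraint_exclusion enabled, queries with a time range in the WHERE clause only touch the matching children.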
From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Ewen McNeill [pg...@ewen.mcneill.gen.nz]
Sent: Wednesday, October 18, 2017 13:34
To: pgsql-general@postgresql.org
Subject: [GENERAL] Divert triggers on materialized views

PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be defined to divert INSERT/DELETE/UPDATE actions into an underlying table (or other location), creating the effect of a "writeable view" (and I believe in more recent PostgreSQL versions this is pretty much automatic).

However MATERIALIZED VIEWs are implemented differently (effectively as TABLE storage with a saved SQL query used to REFRESH them), and it appears (at least in 9.5, but AFAICT from documentation through 10) no TRIGGERs at all can be created on MATERIALIZED VIEWs. This makes MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances.

In the situation I'm trying to help with:

(a) there's a "history" table which has tens of millions of rows of accumulated daily updates;

(b) there's a "today view" on the history table which shows the current data via some date range checks on the history table, working out to about 200k rows in the view;

(c) there's a daily batch task that ingests the latest data, which has a couple of hundred thousand rows of "daily updates", which for legacy reasons is done via (diverted by triggers) actions on the "today view";

and they've reported that over time their "dashboards" of reports against the "today view" have become slower, presumably as the underlying history table has grown.

Given that almost all of the changes happen in a daily batch update through an easily wrappable process (SFTP, process-on-upload), and the majority of the queries (but not all) are against the current data, turning that "today view" into a MATERIALIZED VIEW seems like an obvious win... except that it breaks the ability to use divert (INSTEAD OF) TRIGGERs, which then means a larger application rewrite.
Unfortunately the dashboard/reporting front end now has the name of the VIEW hardcoded all over the place (changeable only by one-at-a-time GUI point and click :-( ), so pointing the reporting tool at a different MATERIALIZED VIEW does not seem like a realistic option.

Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be supported on MATERIALIZED VIEWs in some later version of PostgreSQL, in a similar manner to (non-MATERIALIZED) VIEWs? Ideally 9.6/10, but even being on a todo list would be useful.

Alternatively, does anyone have any idea of a minimal-change rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same effect without requiring much of the application to be rewritten? The only one that comes to my mind is to make the "today view" an actual table, with AFTER TRIGGERs on it that mirror the same actions into the "history table" -- which would require recreating all the TRIGGERs, and the VIEWs that hook into that "daily" view, but otherwise superficially seems plausible. It stores the data twice, but so does a MATERIALIZED VIEW, and the daily view data is relatively small.
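A minimal sketch of that last workaround, with hypothetical table and column names: a real "today" table, plus an AFTER trigger that mirrors inserts into the history table:

```sql
-- Hypothetical sketch: replace the "today" view with a real table and
-- mirror its inserts into the history table with an AFTER trigger.
CREATE TABLE history (id int, obs_date date, payload text);
CREATE TABLE today   (id int, obs_date date, payload text);

CREATE FUNCTION today_to_history() RETURNS trigger AS $$
BEGIN
    INSERT INTO history VALUES (NEW.id, NEW.obs_date, NEW.payload);
    RETURN NULL;  -- return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER today_mirror AFTER INSERT ON today
    FOR EACH ROW EXECUTE PROCEDURE today_to_history();
```

The reporting tool keeps querying the (now small) "today" table by its existing name, while history accumulates as before; UPDATE/DELETE mirroring would need analogous triggers.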
Re: [GENERAL] $foo $bar is BAD
+1

We should be "open" to including various languages, dialects & colloquialisms in documentation... the measure is whether the meaning is clear. foobar has a long history, as do foo & bar, in the communication of ideas. The alternative would mean no jargon, abbreviations, humour, sarcasm, acronyms, etc... If we refused to use any words with a historical connotation that might offend someone, we might as well forget about documentation altogether.

Brent Wood

From: pgsql-general-ow...@postgresql.org <pgsql-general-ow...@postgresql.org> on behalf of Peter Devoy <pe...@3xe.co.uk>
Sent: Saturday, April 16, 2016 2:00 PM
To: Psql_General (E-mail)
Subject: Re: [GENERAL] $foo $bar is BAD

> Although people commonly use $foo $bar in examples, it is actually a misuse
> of a VERY rude acronym.
> The next time you need to make an example, please try being a little more
> original (or meaningful) with your variable names.

In light of recent CoC decisions, I would like to propose the opposite. I think more expletives would diversify the language of the documentation and lower the barriers to contribution by individuals more disposed to use colourful language due to their cultural and/or socioeconomic background. O:-)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] About COPY command (and probably file fdw too)
You can already do that, natively on Linux/Mac (or by adding some simple tools to try to make Windows useful):

  cat FILE | grep 'filter' | psql -d DB -c "\copy <table> from stdin"

Between grep, sed, tr & awk you can do almost any in-line filtering or text manipulation you are likely to need. Or a bit of Perl/Python...

Brent Wood

From: pgsql-general-ow...@postgresql.org <pgsql-general-ow...@postgresql.org> on behalf of Nicolas Paris <nipari...@gmail.com>
Sent: Friday, May 22, 2015 8:33 AM
To: Stefan Stefanov
Cc: Forums postgresql
Subject: Re: [GENERAL] About COPY command (and probably file fdw too)

Hi,

To me this would be great. Why not the ability to restrict lines too:

COPY stafflist (userid, username, staffid)
FROM 'myfile.txt'
WITH (FORMAT text, DELIMITER E'\t', COLUMNS (1, 2, 7), LINES (2:1000, 2000:3000), ENCODING 'windows-1250')

= subset of full data.

2015-05-21 22:25 GMT+02:00 Stefan Stefanov <stefanov...@abv.bg>:

Hi,

Maybe I need to clarify a little. The suggested option "[SKIP] COLUMNS columnslist" would contain columns' positions in the file, so that only some of the columns in a text file would be read into a table. Example: copy the first, second and seventh columns from myfile.txt into table stafflist. myfile.txt has many columns.
COPY stafflist (userid, username, staffid)
FROM 'myfile.txt'
WITH (FORMAT text, DELIMITER E'\t', COLUMNS (1, 2, 7), ENCODING 'windows-1250')

BR, Stefan

Original message
From: Nicolas Paris <nipari...@gmail.com>
Subject: Re: [GENERAL] About COPY command (and probably file fdw too)
To: Stefan Stefanov <stefanov...@abv.bg>
Sent: 20.05.2015 23:21

2015-05-20 22:16 GMT+02:00 Stefan Stefanov <stefanov...@abv.bg>:

Hi,

I have been using COPY .. FROM a lot these days for reading in tabular data and it does a very good job. Still there is an inconvenience when a (large) text file contains more columns than the target table, or the columns' order differs. I can imagine three ways round and none is really nice:

- mount the file as a foreign table with all the text file's columns, then insert into the target table a select from the foreign table;
- create an intermediate table with all the text file's columns, copy into it from the file, then insert into the target table, and finally drop the intermediate table when no more files are expected;
- remove the unneeded columns from the file with a text editor prior to COPY-ing.

I think that this is happening often in real life and therefore have a suggestion to add this option "[SKIP] COLUMNS columnslist" to the WITH clause of COPY .. FROM. It may be very useful in file fdw too. To be able to re-arrange columns' order would come as a free bonus for users.

Sincerely, Stefan Stefanov

Hi,

I guess it already does (from documentation):

COPY table_name [ ( column_name [, ...] ) ]
    FROM { 'filename' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]

Then you can order the column_name as the source file has.
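Brent's shell-pipeline suggestion above can be sketched concretely. File names, column positions and the table name are hypothetical, and the psql step is shown as a comment since it needs a running server:

```shell
# Build a small tab-separated sample file (hypothetical data).
printf '1\tann\tx\tx\tx\tx\tA7\n2\tbob\tx\tx\tx\tx\tB9\n' > input.txt

# Keep only columns 1, 2 and 7 -- the in-line filtering step.
awk -F'\t' -v OFS='\t' '{print $1, $2, $7}' input.txt > filtered.txt
cat filtered.txt

# The result could then be loaded with psql, e.g.:
# psql -d DB -c "\copy stafflist (userid, username, staffid) from 'filtered.txt'"
```

Any of sed, cut or tr could replace the awk step; the point is that the column selection happens outside COPY.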
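The first workaround in Stefan's list (mounting the file as a foreign table) might look like this; the server, foreign-table and column names are hypothetical:

```sql
-- Hypothetical sketch of the foreign-table workaround: expose the file
-- with file_fdw, then select just the wanted columns into the target.
CREATE EXTENSION file_fdw;
CREATE SERVER filesrv FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE staff_raw (
    userid int, username text, c3 text, c4 text, c5 text, c6 text, staffid int
) SERVER filesrv
  OPTIONS (filename 'myfile.txt', format 'text',
           delimiter E'\t', encoding 'windows-1250');

INSERT INTO stafflist (userid, username, staffid)
SELECT userid, username, staffid FROM staff_raw;
```

Column re-ordering falls out for free, as Stefan notes, since the SELECT list can be in any order.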
Re: [GENERAL] Why is there no object create date is the catalogs?
Yep. Still created once - instantiated repeated times, but created once.

Try federated metadata records: there is only one original creation date, which is an explicit attribute of a record. Last copied, updated and edited are different dates. A creation date can be when the data was first entered into a spreadsheet, or written down... an insert date pertains to the creation of the record as a database tuple, etc... A replica can be copied - but that is the date this instance was created, not the original record.

One question - does an edit explicitly destroy the original object and create a new (child? linked?) object, or a modified version of the original? Answer: yeah/nah - whichever you decide is correct for your use case - there is no universal yes or no answer.

The real issue is confusion about what "created" means. For data audit tracking/provenance, etc. - very important for best-practice data management in many domains - all these are dates representing different actions which can be defined and maintained, but by the user rather than the system (albeit often by triggers representing local business rules). Postgres has all the tools you need to implement whatever audit trails you need for create (when first written on a piece of paper), inserts, updates/edits, etc... but doing this in a standard way to meet all users' needs is a long-standing, unsolved and probably unsolvable issue.

Brent Wood

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Tom Lane [t...@sss.pgh.pa.us]
Sent: Wednesday, May 13, 2015 11:26 AM
To: Melvin Davidson
Cc: Adrian Klaver; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Why is there no object create date is the catalogs?

Melvin Davidson <melvin6...@gmail.com> writes:
> You are over thinking this. An object is only created once!

Yeah? Would you expect that pg_dump followed by pg_restore would preserve the original creation date? What about pg_upgrade?

This has come up many times before, and we've always decided that it was not as simple as it seems at first glance, and that it would be difficult to satisfy all use-cases. Try searching the archives for previous threads.

regards, tom lane
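The user-maintained audit columns Brent describes could be sketched like this; table, column and function names are hypothetical:

```sql
-- Hypothetical sketch: record-level creation/update dates maintained in
-- the schema (via defaults and a trigger) rather than by the system.
CREATE TABLE doc (
    id serial PRIMARY KEY,
    body text,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE FUNCTION touch_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();
    NEW.created_at := OLD.created_at;  -- the creation date stays immutable
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER doc_touch BEFORE UPDATE ON doc
    FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
```

This tracks row creation, not object (DDL) creation - which is exactly the distinction the thread turns on - but it survives dump/restore because the dates live in the data.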
Re: [GENERAL] Partioning with overlapping and non overlapping constraints
Hi Mark,

We have a somewhat similar situation - we have years of continuous data which are managed in PostGIS. The tables are partitioned into annual subsets. The previous (static) years' underlying tables have a clustered index on UTC timestamp; the current year's table has a conventional index. It works well, with 20-odd partitions to date.

An annual basis for partitions may not be ideal in your case, but you have not specified how long it takes for your data to become fixed - or if there is a way the database can tell that records are now static. If there is, a scheduled task which migrates such records from a live to a fixed partition would perhaps be appropriate. Organising your data by UTC timestamp may be the simplest approach for you.

Cheers

Brent Wood
Programme leader: Environmental Information Delivery
NIWA
DDI: +64 (4) 3860529

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Melvin Davidson [melvin6...@gmail.com]
Sent: Tuesday, February 10, 2015 6:01 AM
To: Marc Mamin
Cc: Tim Uckun; pgsql-general
Subject: Re: [GENERAL] Partioning with overlapping and non overlapping constraints

Well, without knowing too much about your application, it certainly sounds like using the metrics_MMDD is the way to go. As for modifying the constraint daily, couldn't you just use WHERE timestamp > current_date - interval '1 day'?

On Mon, Feb 9, 2015 at 5:14 AM, Marc Mamin <m.ma...@intershop.de> wrote:

> I have two partitioning questions I am hoping somebody can help me with. I have a fairly busy metric(ish) table. It gets a few million records per day, the data is transactional for a while but then settles down and is used for analytical purposes later. When a metric is reported both the UTC time and the local times are stored along with the other data belonging to the metric.

Don't you have duplicate information within your UTC, location and local_time data? Maybe you can just attach a timezone to each location...
> I want to partition this table to both make it faster to query and also to spread out the writes. Ideally the partitions would be based on the UTC timestamp and the sending location. For example metrics_location_X_2015_01_01. First problem with this approach is that there could be tens of thousands of locations, so this is going to result in hundreds of thousands of tables. I know there are no upper limits to how many tables there are, but I am thinking this might really get me into trouble later.

With only a few million rows per day, weekly or even monthly partitions without regard to locations should be sufficient for older data. It should be possible to partition your hot data differently; but instead of using one partition per location, you may use a hash/modulo approach to keep the number of partitions to a reasonable count, if required at all (this can be helpful: https://github.com/markokr/pghashlib). Here I would avoid including time information except for the limit between old and hot tables. And depending on the pattern and performance requirements of your analytic queries this may be sufficient (i.e. don't partition on the time at all). With smaller partitions for hot data, it should be quite fast to move them one by one to the old data.

I have no experience with the trigger-based partitioning of Postgres (we handle partitions logically at the application level), so I'm not sure how difficult this approach is. I suppose that you'll need a function that moves data from hot to old partitions and fixes the triggers accordingly.

> Second and possibly more vexing problem is that often the local time is queried. Ideally I would like to put three constraints on the child tables: location id, UTC timestamp and the local time, but obviously the local timestamps would overlap with other locations in the same timezone. Even if I was to only partition by UTC, the local timestamps would overlap between tables. So the questions in a nutshell are:
>
> 1. Should I be worried about having possibly hundreds of thousands of shards?
> 2. Is PG smart enough to handle overlapping constraints on tables and limit its querying to only those tables that have the correct time constraint?

If you partition on the UTC time only, you don't have overlapping. When querying on the local time, the planner will consider all partitions, but an additional index or constraint on this column should be sufficient as long as your partition count remains small.

regards,

Marc Mamin

--
Melvin Davidson
I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.
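Marc's suggestion - partition on the UTC timestamp only, with a plain index covering local-time queries - could be sketched as follows (table, column and range values are hypothetical):

```sql
-- Hypothetical sketch: non-overlapping CHECK ranges on the UTC column,
-- with a secondary index for queries on local time.
CREATE TABLE metrics (
    utc_ts      timestamptz NOT NULL,
    local_ts    timestamp,
    location_id int,
    value       double precision
);

CREATE TABLE metrics_2015_01 (
    CHECK (utc_ts >= '2015-01-01' AND utc_ts < '2015-02-01')
) INHERITS (metrics);

-- Local-time queries scan all partitions, but each scan is index-assisted:
CREATE INDEX metrics_2015_01_local_idx ON metrics_2015_01 (local_ts);

-- With constraint_exclusion = partition, a WHERE clause on utc_ts
-- prunes to the matching children only.
```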
[GENERAL] Postgres char type inconsistency
Looking at the behaviour of char & varchar types, there seems to be an issue. Can anyone explain this behaviour? Is there a bug of some sort?

According to the docs (http://www.postgresql.org/docs/9.3/static/datatype-character.html): "If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string."

Yet chars are not being padded; in fact they lose trailing spaces which are retained by varchars. They also return length()s less than the defined length... which should not be the case for a padded string as defined in the documentation.

fish=# create table test(var3 varchar(3), cha3 char(3));
CREATE TABLE
fish=# insert into test values('1','1');
INSERT 0 1
fish=# insert into test values('2 ','2 ');   -- one space
INSERT 0 1
fish=# insert into test values('3  ','3  '); -- two spaces
INSERT 0 1
fish=# select var3||':' as var3, cha3||':' as char3 from test;
 var3 | char3
------+-------
 1:   | 1:
 2 :  | 2:
 3  : | 3:
(3 rows)

fish=# select length(var3) as v_lgth, length(cha3) as c_lgth from test;
 v_lgth | c_lgth
--------+--------
      1 |      1
      2 |      1
      3 |      1
(3 rows)

So, in summary: varchar stores whatever is fed to it and keeps trailing spaces up to the max length, while the char type trims off trailing spaces and stores a string shorter than the specified length.

Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
NIWA
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz
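For what it's worth, a possible reading of this behaviour (hedged, not a definitive ruling): the pad spaces do exist in storage, but per SQL semantics trailing blanks on character(n) are insignificant, so length() ignores them, and the || operator casts char to text, which removes the padding:

```sql
-- Trailing pad blanks on char(n) are semantically insignificant:
SELECT length('3'::char(3));        -- length() ignores the padding
SELECT '3'::char(3) || ':';         -- || casts to text, dropping the pad
SELECT octet_length('3'::char(3));  -- the stored value still occupies 3 bytes
```

So the stored value is padded as documented; it is the comparison and conversion functions that discard the blanks.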
Re: [GENERAL] synchronize DTAP
Gidday,

There was an interesting presentation at the Portland Postgres Users Group meeting in early Sept, from a guy who demo'd a Postgres database mounted as a FUSE filesystem. Not production ready, but with tables manifesting as directories, databases could be synced using filesystem tools like rsync - which offers intriguing backup & replication possibilities.

http://vimeo.com/105493143 - the demo of the FUSE functionality starts at 39 minutes into the presentation.

Brent Wood

From: pgsql-general-ow...@postgresql.org <pgsql-general-ow...@postgresql.org> on behalf of Willy-Bas Loos <willy...@gmail.com>
Sent: Tuesday, September 30, 2014 8:58 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] synchronize DTAP

Hi,

We have an environment that has a central repository for lookups, which is replicated to several databases, each for different applications. This has been arranged in a DTAP manner. Sometimes it is necessary to synchronize the lookups of one of the DTAP branches with another. But I can't just overwrite one database with a dump from another branch, as the consumer databases will not follow. What I think I need is a way to compute the differences between two databases that have the same schema, and generate insert/update/delete statements from that. Since this seems a pretty generic problem, I thought that I should ask around before I start writing my own scripts. Does anyone know of a script or application that does this?

Cheers

--
Willy-Bas Loos
Re: [GENERAL] advice sought - general approaches to optimizing queries around event streams
We have a similar timeseries database approaching 500m records. We partition the main tables (much like your events) into one-year subsets, with a clustered index on timestamp for all but the live year.

https://blog.engineyard.com/2013/scaling-postgresql-performance-table-partitioning
http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html
http://www.postgresql.org/docs/9.3/static/sql-cluster.html

As discussed here previously, you can also improve performance using hardware - SSDs vs spindles. Also note that tablespaces, with indexes on your faster drives & data on slower ones, can improve performance.

http://www.postgresql.org/docs/9.3/static/manage-ag-tablespaces.html

Also make sure your db server is optimised for the database size & hardware configuration - perhaps allow fewer concurrent users but more resources per user, or see what pgtune recommends. This should help your performance in terms of underlying db efficiency, rather than tweaking your actual queries.

Brent Wood

From: pgsql-general-ow...@postgresql.org <pgsql-general-ow...@postgresql.org> on behalf of Jonathan Vanasco <postg...@2xlp.com>
Sent: Saturday, September 27, 2014 9:02 AM
To: PostgreSQL general
Subject: [GENERAL] advice sought - general approaches to optimizing queries around event streams

I have a growing database with millions of rows that track resources against an event stream. I have a few handfuls of queries that interact with this stream in a variety of ways, and I have managed to drop things down from 70s to 3.5s on full scans and offer .05s partial scans.
No matter how I restructure queries, I can't seem to get around a few bottlenecks, and I wanted to know if there were any tips/tricks from the community on how to approach them.

A simple form of my database would be:

-- 1k of
create table stream (
    id int not null primary key
);

-- 1MM of
create table resource (
    id int not null primary key,
    col_a bool,
    col_b bool,
    col_c text
);

-- 10MM of
create table streamevent (
    id int not null,
    event_timestamp timestamp not null,
    stream_id int not null references stream(id)
);

-- 10MM of
create table resource_2_stream_event (
    resource_id int not null references resource(id),
    streamevent_id int not null references streamevent(id)
);

Everything is running off of indexes; there are no seq scans. I've managed to optimize my queries by avoiding joins against tables, and turning the stream interaction into a subquery or CTE. Better performance has come from limiting the number of stream events (which are only the timestamp and resource_id off a joined table).

The bottlenecks I've encountered have primarily been:

1. When interacting with a stream, the ordering of event_timestamp and deduplicating of resources becomes an issue. I've figured out a novel way to work with the most recent events, but distant events are troublesome:

   using no limit, the query takes 3500 ms
   using a limit of 1, the query takes 320 ms
   using a limit of 1000, the query takes 20 ms

There is a dedicated index on event_timestamp (desc), and it is being used according to the planner... Finding all the records is fine; merging-into and sorting the aggregate to handle the deduplication of records in a stream seems to be the issue (either with DISTINCT or max+group_by).

2. I can't figure out an effective way to search for a term against an entire stream (using a tsquery/gin-based search). I thought about limiting the query by finding matching resources first, then locking it to an event stream, but scanning the entire table for a term takes about 10 seconds on an initial hit; subsequent queries for the same terms end up using the cache, and complete within 20 ms. I get better search performance by calculating the event stream, then searching it for matching documents, but I still have the performance issues related to limiting the window of events.

I didn't include example queries, because I'm more concerned with the general approaches and ideas behind dealing with large data sets than I am with raw SQL right now. I'm hoping someone can enlighten me into looking at new ways to solve these problems. I think I've learned more about postgres/sql in the past 48 hours than I have in the past 15 years, and I'm pretty sure
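One common shape for the deduplication step in (1) is DISTINCT ON, keeping only each resource's most recent event. This is a sketch against the simplified schema quoted above, not necessarily what the poster tried:

```sql
-- Hypothetical sketch: latest event per resource, newest first.
-- DISTINCT ON keeps the first row per resource_id in ORDER BY order.
SELECT DISTINCT ON (r2s.resource_id)
       r2s.resource_id,
       se.event_timestamp
FROM   resource_2_stream_event AS r2s
JOIN   streamevent AS se ON se.id = r2s.streamevent_id
ORDER  BY r2s.resource_id, se.event_timestamp DESC;
```

Whether this beats max+GROUP BY depends on the indexes available; EXPLAIN ANALYZE on both variants is the only reliable arbiter.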
Re: [GENERAL] Linux vs FreeBSD
Not a great help with which Linux to run, nor Postgres focused, but may be of interest - very relevant to the subject line.

Given the likely respective numbers of each OS actually out there, I'd suggest BSD is very over-represented in the high-uptime list, which is suggestive.

http://uptime.netcraft.com/perf/reports/performance/Hosters?orderby=epercent

Cheers,

Brent Wood

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of François Beausoleil [franc...@teksol.info]
Sent: Thursday, April 10, 2014 8:36 AM
To: Bruce Momjian
Cc: Christofer C. Bell; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Linux vs FreeBSD

Le 2014-04-09 à 16:20, Bruce Momjian a écrit :

> On Wed, Apr 9, 2014 at 10:02:07AM -0500, Christofer C. Bell wrote:
> This highlights a more fundamental problem of the difference between a workstation-based OS like Ubuntu and a server-based one like Debian or FreeBSD. I know Ubuntu has a server version, but fundamentally Ubuntu's selection of kernels and feature churn make it less than ideal for server deployments. I am sure someone can post that they use Ubuntu just fine for server deployments, but I continue to feel that Ubuntu is chosen by administrators because it is an OS they are familiar with on workstations, rather than it being the best choice for servers.

I'm not a full-time sysadmin. I chose Ubuntu because I have familiarity with it, and because installing Puppet on it installed the certificates and everything I needed to get going. I tried Debian, but I had to fight and find the correct procedures to install the Puppet certificates and all. Ubuntu saved me some time back then.

Cheers!
François
Re: [GENERAL] SSD Drives
Hi David,

Does the RAID 1 array give any performance benefits over a single drive? I'd guess that writes may be slower, reads may be faster (if balanced), but data security is improved.

Brent Wood

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of David Rees [dree...@gmail.com]
Sent: Friday, April 4, 2014 8:32 AM
To: Merlin Moncure
Cc: bret_st...@machinemanagement.com; PostgreSQL General
Subject: Re: [GENERAL] SSD Drives

On Thu, Apr 3, 2014 at 12:13 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern <bret_st...@machinemanagement.com> wrote:
>> Any opinions/comments on using SSD drives with postgresql?
>
> Here's a single S3700 smoking an array of 16 15k drives (poster didn't realize that; was too focused on synthetic numbers):
> http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd

I just ran a quick test earlier this week on an old Dell 2970 (2 Opteron 2387, 16GB RAM) comparing a 6-disk RAID10 with 10k 147GB SAS disks to a 2-disk RAID1 with 480GB Intel S3500 SSDs, and found the SSDs are about 4-6x faster using pgbench and a scaling factor of 1100. Some sort of MegaRAID controller according to lspci, and it has a BBU. TPS numbers below are approximate.

RAID10 disk array:
 8 clients: 350 tps
16 clients: 530 tps
32 clients: 800 tps

RAID1 SSD array:
 8 clients: 2100 tps
16 clients: 2500 tps
32 clients: 3100 tps

So yeah, even the slower, cheaper S3500 SSDs are way fast. If your write workload isn't too high, the S3500 can work well.
We'll see how the SMART drive lifetime numbers do once we get into production, but right now we estimate they should last at least 5 years, and from what we've seen it seems that SSDs wear much better than expected. If not, we'll pony up and go for the S3700, or perhaps move the xlog back onto spinning disks.

-Dave
Re: [GENERAL] SSD Drives
Have you seen this?

http://it-blog.5amsolutions.com/2010/08/performance-of-postgresql-ssd-vs.html

Brent Wood

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Bret Stern [bret_st...@machinemanagement.com]
Sent: Thursday, April 3, 2014 8:37 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] SSD Drives

Any opinions/comments on using SSD drives with postgresql?
Re: [GENERAL] Mysterious DB reset
One point - a serial datatype will not increment to infinity. As it is effectively a 4-byte integer with a sequence imposed, it can only store values up to MAXINT (2147483647). Above this, nextval will fail with an error (unless the sequence was created with CYCLE, in which case it wraps back around to 1).

You can delay the problem (significantly) by using bigserial (8-byte integer) instead of serial - this has MAXINT = 9223372036854775807.

http://www.postgresql.org/docs/9.2/static/datatype-numeric.html

Otherwise you might run a cron job or trigger to reset the serial values & the sequence when you think it timely. I can't see how this would cause the missing records though.

Cheers

Brent Wood
Programme leader: Environmental Information Delivery
NIWA
DDI: +64 (4) 3860529

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Thom Brown [t...@linux.com]
Sent: Thursday, March 06, 2014 8:01 AM
To: Israel Brewster
Cc: PGSQL Mailing List
Subject: Re: [GENERAL] Mysterious DB reset

On 5 March 2014 18:22, Israel Brewster <isr...@eraalaska.net> wrote:

I have a Postgresql 9.2.4 database containing real-time tracking data for our aircraft for the past week (theoretically). It is populated by two different processes: one that runs every few minutes, retrieving data from a number of sources and storing it in the DB, and one that has an always-on connection to the DB streaming data into the database in realtime (often several records per second). To keep the database size manageable I have a cron job that runs every evening to delete all records that are more than a week old, after archiving a subset of them in permanent storage.

This morning my boss e-mailed me, complaining that only a couple of aircraft were showing up in the list (SELECT distinct(tail) FROM data being the command that populates the list). Upon looking at the data I saw that it only went back to 4am this morning, rather than the week I was expecting.
My first thought was "Oh, I must have a typo in my cleanup routine, such that it is deleting all records rather than only those a week old, and it's just that no one has noticed until now." So I looked at that, but changing the delete to a select appeared to produce the proper results, in that no records were selected: DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days'; Then I noticed something even more odd. My database has an id column, which is defined as a SERIAL. As we all know, a serial is a monotonically increasing number that is not affected by deletes. However, the oldest record in my database, from 4am this morning, had an id of 1. Even though I KNOW there was data in the system yesterday. Even if my DELETE command was wrong and deleted ALL records, that shouldn't have reset the SERIAL column to 1! I also know that I have not been in the database mucking around with the sequence value - to be completely honest, I don't even know the exact command to reset it - I'd have to google it if I wanted to. Also odd is that my cleanup script runs at 1am. I have records of there being new data in the database up to 3:51am, but the oldest record currently in the DB is from 4:45am (as specified by the default of now() on the column). So I know records were added after my delete command ran, but before this reset occurred. So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection? Thanks for any ideas, however wild or off the wall :-) That is odd. Even if it were an unlogged table, and there was a crash, the sequence wouldn't reset, and even if it was running in a very long-running transaction held open by a buggy connection pooler, the sequence would still progress as it's immune to the effects of transactions.
So if all the data went missing, and the sequence reset, the only thing I can think of is: Someone ran: TRUNCATE data RESTART IDENTITY; or someone restored the table structure from a backup that deleted the original table. Do you log DDL? Was the table partitioned? You should also really be on 9.2.7, although I can't think of any bug that's been fixed which could be responsible for this issue. -- Thom -- Please consider the environment before printing this email. NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.
Re: [GENERAL] Multiple Schema in One DB
We use this script as a proxy for psql; the user can run it followed by the schemas they want in the search path on the command line, and the PGOPTIONS value sets it up for them...

#!/bin/bash
# build a comma separated schema list from the command line args
schema=`echo $@ | sed 's/, /,/g' | tr ' ' ',' | sed 's/,,/,/g'`
#echo $schema
export PGOPTIONS="-c search_path=$schema,public,maps"
psql fish

In your case this could perhaps be used by each application to customise the run time environment, so each has its own PGOPTIONS string, and thus, when Postgres is run, its own search path. Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Scott Marlowe [scott.marl...@gmail.com] Sent: Thursday, February 27, 2014 7:32 PM To: itishree sukla Cc: Postgres General Subject: Re: [GENERAL] Multiple Schema in One DB On Wed, Feb 26, 2014 at 11:24 PM, itishree sukla itishree.su...@gmail.com wrote: Hi all, I am in a requirement to have multiple schema in one Database. Schema structure will be the same with respect to all DB objects like tables and procedures. However each schema will be accessed by one application. Problem is, as all DB objects are the same, when an application is calling a schema we have set search_path='schema name' at our procedure level. Now we want to get rid of that. I know we can do it at the ROLE level; the question is, having more than one user-defined schema, is it achievable? I have done a small exercise; it seems to be taking the first schema defined in the set search_path. As I am in urgent need of it, please help me. You can alter each user to have their own search path.
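Scott's alter-user approach can be sketched as follows (the role and schema names are placeholders, not from the thread):

```sql
-- each application connects as its own role; the setting is applied
-- at login, so procedures no longer need to set search_path themselves
ALTER ROLE app_one SET search_path = schema_one, public;
ALTER ROLE app_two SET search_path = schema_two, public;
```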
Re: [GENERAL] union of types in a different category
I prefer the explicit approach used by Postgres - MySQL is simpler, but I'd say simplistic in this area. While it can automate the casting of types/categories of variables, it doesn't always do it the way I want - so I need to be explicit anyway. In your second use case, which fails - do you want numerics cast to strings or vice versa? It can make a difference, so to get what you want rather than otherwise, I prefer to be explicit, in either Postgres or MySQL. Interestingly - I'm curious as to why:

SELECT '1' UNION SELECT 2;
 ?column?
----------
 1
 2
(2 rows)

SELECT '1' UNION SELECT 1;
 ?column?
----------
 1
(1 row)

I didn't think UNION did an implicit distinct - if that is what is happening? Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of James Harper [james.har...@bendigoit.com.au] Sent: Sunday, February 23, 2014 11:52 AM To: pgsql-general@postgresql.org Subject: [GENERAL] union of types in a different category According to clause 3 on http://www.postgresql.org/docs/9.3/static/typeconv-union-case.html regarding union type matching: 3. If the non-unknown inputs are not all of the same type category, fail. So a query SELECT 1 UNION SELECT 1.1 works because 1 and 1.1 are of the same category, and one type has an implicit cast to the other, but the query SELECT '1' UNION SELECT 2 fails because '1' is a string literal and 2 is a number and so they are different categories. Right? Is this an artificial limitation of postgres or is there an underlying technical reason for this behaviour? For my purposes it would be better if the restriction was removed and that the union would work as long as there was an implicit cast that allowed conversion of all fields to the same type.
MSSQL doesn't have this restriction and I'd prefer if I didn't have to rewrite these queries (or create a complete set of mssql compatible types in the same category) when porting applications. Thanks James
Re: [GENERAL] bulk loading table via join of 2 large staging tables
This should help... In each temporary table convert the time parts to a timestamp, then create an index on each of these, then join on the timestamp.

ALTER TABLE rmc ADD COLUMN timer timestamp without time zone;
UPDATE rmc SET timer = (utc_year || '-' || utc_month || '-' || utc_day || ' ' || utc_hour || ':' || utc_minute || ':' || utc_second)::timestamp;
CREATE INDEX rmc_timer_idx ON rmc(timer);

ALTER TABLE gyro ADD COLUMN timer timestamp without time zone;
UPDATE gyro SET timer = (utc_year || '-' || utc_month || '-' || utc_day || ' ' || utc_hour || ':' || utc_minute || ':' || utc_second)::timestamp;
CREATE INDEX gyro_timer_idx ON gyro(timer);

So something like this should work, if you use postgis - which I recommend for GPS data:

SELECT DISTINCT ON (project_id, platform_id, supplier_id, timer)
       2 AS project_id, 1 AS platform_id, 6 AS supplier_id,
       m.timer, m.latitude, m.longitude,
       ST_SetSRID(ST_MakePoint(m.longitude, m.latitude), 4326) AS location,
       m.sog AS speed_over_ground, m.cog AS course_over_ground,
       g.heading
FROM rmc m, gyro g
WHERE m.timer = g.timer;

One comment: if either table has times recorded at better than 1 sec precision (ie - more than one value per second) you might join with the avg() value and group by, to bring the output down to 1 sec values. Cheers Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Seb [splu...@gmail.com] Sent: Tuesday, December 31, 2013 2:53 PM To: pgsql-general@postgresql.org Subject: [GENERAL] bulk loading table via join of 2 large staging tables Hi, I have two large CSV files that need to be merged and loaded into a single table of a database in Postgresql 9.3.
I thought I'd do this by first staging the data in these files in two temporary tables:

---cut here---start--
CREATE TEMPORARY TABLE rmc (
  utc_year character varying(6),
  utc_month character varying(4),
  utc_day character varying(4),
  utc_hour character varying(4),
  utc_minute character varying(4),
  utc_second character varying(8),
  latitude numeric,
  longitude numeric,
  sog numeric,
  cog numeric);

CREATE TEMPORARY TABLE gyro (
  utc_year character varying(6),
  utc_month character varying(4),
  utc_day character varying(4),
  utc_hour character varying(4),
  utc_minute character varying(4),
  utc_second character varying(8),
  heading numeric);
---cut here---end

And the target table in the database looks like this:

---cut here---start--
Table "public.navigation_series"
        Column        |            Type             | Modifiers
----------------------+-----------------------------+-----------------------------------------------
 navigation_record_id | integer                     | not null default nextval('navigation_series_navigation_record_id_seq'::regclass)
 project_id           | integer                     |
 platform_id          | integer                     |
 supplier_id          | integer                     |
 time                 | timestamp without time zone | not null
 longitude            | numeric                     |
 latitude             | numeric                     |
 speed_over_ground    | numeric                     |
 course_over_ground   | numeric                     |
 heading              | numeric                     |
Indexes:
 "navigation_series_pkey" PRIMARY KEY, btree (navigation_record_id)
 "navigation_series_project_id_platform_id_supplier_id_time_key" UNIQUE CONSTRAINT, btree (project_id, platform_id, supplier_id, "time")
Foreign-key constraints:
 "navigation_project_id_fkey" FOREIGN KEY (project_id) REFERENCES projects(project_id) ON UPDATE CASCADE ON DELETE RESTRICT
 "navigation_series_platform_id_fkey" FOREIGN KEY (platform_id) REFERENCES platforms(platform_id) ON UPDATE CASCADE ON DELETE RESTRICT
 "navigation_series_supplier_id_fkey" FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id) ON UPDATE CASCADE ON DELETE RESTRICT
---cut here---end

Loading the temporary tables was very quick (about 3 min; input files were 580 Mb and 3.5 Gb) in psql, using:

\copy gyro FROM 'gyro.csv' CSV
\copy rmc FROM 'rmc.csv' CSV

I then created a temporary view with: CREATE TEMPORARY
VIEW rmc_gyro AS SELECT DISTINCT ON (project_id, platform_id, supplier_id, time) 2 AS project_id, 1 AS platform_id, 6 AS supplier_id, (utc_year || '-' || utc_month || '-' || utc_day || ' ' || utc_hour || ':' || utc_minute || ':' || utc_second)::timestamp AS time, longitude, latitude, sog
Re: [GENERAL] earthdistance
Is there not a precompiled Postgis package you can use? There are a few dependencies - the PROJ.4 libraries you are missing enable projection support - and the package tools automatically manage such dependencies. I know packages are well supported for Debian, Ubuntu/Mint/etc, Suse & Fedora. See: http://trac.osgeo.org/postgis/wiki/UsersWikiInstall Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Olivier Chaussavoine [olivier.chaussavo...@gmail.com] Sent: Saturday, August 10, 2013 10:17 PM To: John R Pierce Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] earthdistance As a simple potential user, I tried to install PostGIS, downloaded all libraries required: proj-4.8.0, gdal-1.10.0, json-c, postgis-2.0.3, geos-3.3.8, libxml2-2.9.0, and tried to build the first library with the simple procedure: ./configure; make; make install. I had a fatal error: make[2]: entering directory '/home/olivier/ob/proj-4.8.0/src' /bin/bash ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -DPROJ_LIB=\"/usr/local/share/proj\" -DMUTEX_pthread -g -O2 -MT jniproj.lo -MD -MP -MF .deps/jniproj.Tpo -c -o jniproj.lo jniproj.c libtool: compile: gcc -DHAVE_CONFIG_H -I. -DPROJ_LIB=\"/usr/local/share/proj\" -DMUTEX_pthread -g -O2 -MT jniproj.lo -MD -MP -MF .deps/jniproj.Tpo -c jniproj.c -fPIC -DPIC -o .libs/jniproj.o jniproj.c:52:26: fatal error: org_proj4_PJ.h: No such file or directory compilation terminated. A problem out of the scope of this list, and probably not difficult. Since I look for simple geographic indexing using imprecise lat,long coordinates that does not deal with precise modeling, and I am afraid of a long install procedure and heavy computations, I also give up. Spatial MySQL indexing seems to be included in pre-built packages. What can we do?
2013/8/10 John R Pierce pie...@hogranch.com On 8/9/2013 5:18 PM, Brent Wood wrote: You might install Postgis to implement very powerful spatial functionality that can easily do what you are asking (plus a whole lot more). indeed, PostGIS is the logical answer, but the OP specifically stated he wanted the functionality without 'sophisticated geographic systems', so I ignored the question. the alternative would be implementing your own spherical geometry functions, and hooking them up to GiST indexing. it's not that hard, but by the time you got all the functionality you need, you'd be half way to PostGIS, so why fight it? -- john r pierce 37N 122W somewhere on the middle of the left coast -- Olivier Chaussavoine
Re: [GENERAL] earthdistance
You might install Postgis to implement very powerful spatial functionality that can easily do what you are asking (plus a whole lot more). http://www.postgis.org Now that v2 installs as a Postgres extension, it is more closely coupled with the underlying database. Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Olivier Chaussavoine [olivier.chaussavo...@gmail.com] Sent: Saturday, August 10, 2013 7:29 AM To: pgsql-general@postgresql.org Subject: [GENERAL] earthdistance I develop a project openbarter that needs to match objects based on a maximum distance between their positions on earth. I saw that the documentation of the extension earthdistance was interesting, but the promise was not in the code. It would be nice to have these functions available independently of sophisticated geographic systems. There is a circle object for flat two dimensional space, but earth deals with spherical caps. It would not be exact, but enough, to suppose that earth is a sphere and that all dimensions - latitude, longitude and distance - are in radians. What would need to be done to adapt the circle type to a new type 'spherical cap' that would allow simple geographic indexing? -- Olivier Chaussavoine
Re: [GENERAL] partial time stamp query
Hi Kirk, We have a (near) real time database for instrument observations from our research vessels. All observations (summarised to one minute intervals - the actual raw data is in netCDF; this database makes for easier access & meets most users' needs) go into a single table, with other tables for metadata about platforms, instruments, etc. Now approaching 350m records, so reasonably substantial. Underlying optimisations include: a partitioned readings table, with a separate partition for each year (now 23 years); a clustered index on timestamp for the previous years' partitions; a largeish filesystem block size - tested to work well with the clustered index & small record size. These generally make a big difference to performance. To address one issue, much like yours - where some users want hourly data for a year, some want daily data for 10 years, some want 1 minute data for the last month (some, no doubt, want one minute data for 20+ years!) - I introduced an integer column called timer. This value is set according to the time (not date) of each record, along the lines of (from memory): an even number of minutes after the hour is 2, 5 minutes is 4, 10 minutes is 8, 15 minutes is 16, 30 minutes is 32, 60 minutes is 64, 6 hourly is 128, 12:00 AM is 256 & 12:00 PM is 512. When any timestamp is in more than one category (eg: 12:00 is all of even, 5, 15, 30 & 60 minutes), the timer value is set to the largest appropriate one. So a request for: 1 minute data is select from table; 2 minute data is select from table where timer >= 2 and timer != 16 and timer != 4; hourly data is select from table where timer >= 64; etc. 5 & 15 minute add a bit of complexity, but we gave the users what they wanted. This has worked well for years now; we have an internal web (mapserver/openlayers based) application allowing users to visualise & download their selected data - they choose from an interval pick list & the SQL is hidden.
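The timer classification described above can be sketched in Python - a reconstruction from the (from memory) description in the post, not NIWA's actual code:

```python
def timer_value(hour, minute):
    """Return the largest interval category a reading's time falls into,
    per the scheme described above (a reconstruction, not the original code)."""
    value = 0
    if minute % 2 == 0:
        value = 2          # even minute
    if minute % 5 == 0:
        value = 4          # 5-minute mark
    if minute % 10 == 0:
        value = 8
    if minute % 15 == 0:
        value = 16
    if minute % 30 == 0:
        value = 32
    if minute == 0:
        value = 64         # on the hour
        if hour % 6 == 0:
            value = 128    # 6-hourly
        if hour == 0:
            value = 256    # midnight
        elif hour == 12:
            value = 512    # midday
    return value
```

A row stamped at 12:00 gets 512 (the largest applicable category), so "hourly data" is simply `timer >= 64`, while finer intervals combine a `>=` bound with a couple of exclusions as in the post.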
Some extra enhancements are the automatic collation of lat & lon gps readings into a Postgis point for each reading record, and the automatic aggregation of daily points into daily track lines, so the track for any selected set of dates can easily be displayed on a map (the platforms are mobile vessels - not fixed sites). You might adapt some of these ideas for your use case? Cheers Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Kirk Wythers [wythe...@umn.edu] Sent: Tuesday, February 05, 2013 5:58 AM To: pgsql-general@postgresql.org Subject: Fwd: [GENERAL] partial time stamp query Thanks. That worked great! Now I am trying to aggregate these same fifteen minute records to hourly. I have tried using date_trunc: date_trunc('hour', derived_tsoil_fifteen_min_stacked.time2), but date_trunc only seems to aggregate the timestamp. I thought I could use AVG(derived_tsoil_fifteen_min_stacked.value) in combination with date_trunc, but I still get 15 minute values, not the hourly average from the four 15 minute records.
rowid | date_trunc | time2 | site | canopy | plot | variable | name | value | avg
2010-07-07_00:00:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 21.06 21.054659424
2010-07-07_00:15:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.96 20.950844727
2010-07-07_00:30:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.88 20.871607666
2010-07-07_00:45:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.8 20.792370605
2010-07-07_01:00:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.72 20.713133545
2010-07-07_01:15:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.64 20.633896484
2010-07-07_01:30:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.55 20.542370605
2010-07-07_01:45:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.47 20.463133545
I was trying to get two records out of this set, with the 'avg' column representing the mean of the first four and the last four 15-minute records. Perhaps date_trunc only works for the timestamp? On Feb 4, 2013, at 8:50 AM, Misa Simic misa.si...@gmail.com wrote: WHERE derived_tsoil_fifteen_min_stacked.time2::date = '2010-07-07'::date On Monday, February 4, 2013, Kirk Wythers wrote: I am trying to write a query that grabs one particular day from a timestamp column. The data are ordered in 15 minute chunks like this: 2010-07-07 12:45:00 2010-07-07 13:00:00 2010-07-07 13:15:00 2010-07-07 13:30:00 etc… If I wanted all records from july 7th 2010, I
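The hourly aggregation Kirk is after needs a GROUP BY over the truncated timestamp - date_trunc alone only rewrites the timestamp, it does not aggregate. A sketch using the table and column names from the thread:

```sql
SELECT date_trunc('hour', time2) AS hour,
       site, canopy, plot, variable,
       avg(value) AS hourly_avg
FROM derived_tsoil_fifteen_min_stacked
WHERE time2::date = '2010-07-07'::date
GROUP BY date_trunc('hour', time2), site, canopy, plot, variable
ORDER BY hour;
```

Each group of four 15-minute rows collapses to one row per hour, with avg(value) computed across the group.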
Re: [GENERAL] COPY column order
You can specify the column order in the copy statement:

psql -d test -c "create table ttt (id serial primary key, name varchar(10), value int);"
echo "10|one" | psql -d test -c "copy ttt (value,name) from stdin with delimiter '|';"
psql -d test -c "select * from ttt;"
 id | name | value
----+------+-------
  1 | one  |    10
(1 row)

HTH Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of MD33 [mdubosfo...@yahoo.com] Sent: Tuesday, May 08, 2012 12:33 AM To: pgsql-general@postgresql.org Subject: [GENERAL] COPY column order Hi there I'm trying to use COPY with HEADER option but my header line in file is in different order than the column order specified in database. Is the column name order necessary in my file ?? thxs -- View this message in context: http://postgresql.1045698.n5.nabble.com/COPY-column-order-tp5690950.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Re: [GENERAL] CPU Load 100% when pg_dump start in Postgresql 8.4
Can you not nice the dump process to free up resources during the dump? Of course this will not free up any locks, and will make them hang around longer as the dump is slowed down. Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Prashant Bharucha [prashantbharu...@yahoo.ca] Sent: Wednesday, April 04, 2012 7:48 AM To: pgsql-general@postgresql.org Subject: [GENERAL] CPU Load 100% when pg_dump start in Postgresql 8.4 Hello Everyone I facing a big problem ,when pg_dump start .CPU load become 100%. DB Size 35 GB running with e commerce web site. Insert transaction record successfully but Update transaction is not going through. Could you please help to figure out where is the problem ? Thanks Prashant -- Please consider the environment before printing this email. NIWA is the trading name of the National Institute of Water Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] huge price database question..
Also look at a clustered index on timestamp Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Jim Green [student.northwest...@gmail.com] Sent: Wednesday, March 21, 2012 2:50 PM To: David Kerr Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] huge price database question.. On 20 March 2012 21:40, David Kerr d...@mr-paradox.net wrote: On 03/20/2012 04:27 PM, Jim Green wrote: Greetings list! I am pretty new to postgresql from mysql and did a fairly extensive search of the list and came up with a few good ones but didn't find the exact same situation as I have now. so I am venturing asking here. I have daily minute stock price data from 2005 on and each day with columns timestamp, open,high,low,close,volume and a few more. each day's data is about 1.2million rows. I want import all the data to postgresql and analyze using R with the help of Rpostgresql. right now I am having about 7000 tables for individual stock and I use perl to do inserts, it's very slow. I would like to use copy or other bulk loading tool to load the daily raw gz data. but I need the split the file to per stock files first before I do bulk loading. I consider this a bit messy. I would seek advise on the following idea: store everything in a big table, partition by month(this gives a reasonable number of partitions) and do bulk loading on the daily file. my queries would consist mostly select on a particular symbol on a particular day. Also in the future, I will import daily data to the db every day. my hardware is 16G Ram, 4x5400rpm raid10 with enough space. Thanks! Jim. Seems like you'd want to do this? http://search.cpan.org/~turnstep/DBD-Pg-2.19.2/Pg.pm#pg_putcopydata COPY support DBD::Pg allows for quick (bulk) reading and storing of data by using the COPY command. 
The basic process is to use $dbh->do to issue a COPY command, and then to either add rows using pg_putcopydata, or to read them by using pg_getcopydata. Thanks! would you comment on the table setup as well? Jim.
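Brent's clustered-index suggestion from earlier in the thread can be sketched as follows (the table and column names here are placeholders for the minute-bar table described by the poster):

```sql
-- physically order the table by its timestamp index; speeds up
-- range scans over contiguous time periods
CREATE INDEX minute_bars_ts_idx ON minute_bars (ts);
CLUSTER minute_bars USING minute_bars_ts_idx;

-- CLUSTER is a one-off rewrite: re-run it (or ANALYZE at least)
-- after each daily bulk load to keep the ordering useful
```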
Re: [GENERAL] what Linux to run
Hmm... I tend to _mostly_ run workstations rather than servers, and pick my distro to suit my application needs. My workplace is a SLES site; I use Open Suse. Given most of my Postgres databases are in fact PostGIS databases, and need to work with a variety of other spatial data & GIS related apps, I have a set of dependencies to work with for every install: Postgres, PostGIS, GEOS, Proj, GDAL, mapserver, Java, Python, QGIS, GMT, etc. I have liased with the package maintainers who look after the Suse GEO repository, and they are generally able to build any required package, for both server & workstation distros (SLED, SLES, OpenSuse). Having robust packages built by people who know more than I do about this area is core to my selection of distro. While I'm aware that Debian, Ubuntu & Fedora also have GIS related repositories, the OpenSuse ones have, for me at least, the best mix of currency, stability & fantastic support. If your goal is to run a robust Postgres server, find the mainstream distro which provides what you want out of the box, so you can run the database, not wrestle with compiling it every time something changes. Only consider compiling your own applications if there is no such distro, or you really want to have that level of control & ownership of the system. Also, if you are running a VM as your server, then under Xen & commercial tools, for example, SLES is fully supported by the hypervisor. Ubuntu isn't. Makes choosing easy... YMMV :-) Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of David Boreham [david_l...@boreham.org] Sent: Sunday, March 04, 2012 3:23 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] what Linux to run On 3/3/2012 7:05 PM, Tom Lane wrote: [ raised eyebrow... ] As the person responsible for the packaging you're dissing, I'd be interested to know exactly why you feel that the Red Hat/CentOS PG packages can never be trusted.
Certainly they tend to be from older release branches as a result of Red Hat's desire to not break applications after a RHEL branch is released, but they're not generally broken AFAIK. No dissing intended. I didn't say or mean that OS-delivered PG builds were generally broken (although I wouldn't be entirely surprised to see that happen in some distributions, present company excluded). I'm concerned about things like : a) Picking a sufficiently recent version to get the benefit of performance optimizations, new features and bug fixes. b) Picking a sufficiently old version to reduce the risk of instability. c) Picking a version that is compatible with the on-disk data I already have on some set of existing production machines. d) Deciding which point releases contain fixes that are relevant to our deployment. Respectfully, I don't trust you to come to the correct choice on these issues for me every time, or even once. I stick by my opinion that anyone who goes with the OS-bundled version of a database server, for any sort of serious production use, is making a mistake. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Please consider the environment before printing this email. NIWA is the trading name of the National Institute of Water Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] running multiple versions
Run them in different locations with different addresses (5432 & 5433, for example) - see this thread: http://archives.postgresql.org/pgsql-admin/2008-02/msg00084.php Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Heine Ferreira [heine.ferre...@gmail.com] Sent: Friday, February 17, 2012 11:11 AM To: pgsql-general@postgresql.org Subject: [GENERAL] running multiple versions Hi Is it possible to run more than one instance of Postgresql on Windows? Can you run different versions simultaneously? If so can you run different minor versions or only different major versions? If possible how do you do this? Thanks H.F.
[GENERAL] Factors Influencing Participant Satisfaction with Free/Libre and Open Source Software Projects
Apologies for the cross posting, but this thesis may be of interest to a wide array of FOSS related lists. It just went public: Title: Factors Influencing Participant Satisfaction with Free/Libre and Open Source Software Projects Author: Chawner, Brenda Abstract: The purpose of this research was to identify factors that affect participants’ satisfaction with their experience of a free/libre open source software (FLOSS) project. ... http://researcharchive.vuw.ac.nz/handle/10063/1710
Re: [GENERAL] Error Importing CSV File
Can you show the output of \d geo_data ? Try 'using' delimiters. Are you doing this as the postgres superuser? COPY can't load from files as a casual user, so you need to pipe the data to COPY ... FROM STDIN instead. The simple script below works for me; the modified copy statement might help?

#! /bin/bash
DB=test
psql -d $DB -c "drop table geo_data;"
# latitude has only one 't'
psql -d $DB -c "create table geo_data (
  zip_code text,
  latitude float8,
  longitude float8,
  city text,
  state text,
  county text);"
echo "96799,-7.209975,-170.7716,PAGO PAGO,AS,AMERICAN SAMOA
96941,7.138297,151.503116,POHNPEI,FM,FEDERATED STATES OF MICRO
96942,7.138297,151.503116,CHUUK,FM,FEDERATED STATES OF MICRO" | \
psql -d $DB -c "copy geo_data from stdin using delimiters ',' null '';"
psql -d $DB -c "select * from geo_data;"

Output:
 zip_code | latitude  | longitude  |   city    | state |          county
----------+-----------+------------+-----------+-------+---------------------------
 96799    | -7.209975 | -170.7716  | PAGO PAGO | AS    | AMERICAN SAMOA
 96941    | 7.138297  | 151.503116 | POHNPEI   | FM    | FEDERATED STATES OF MICRO
 96942    | 7.138297  | 151.503116 | CHUUK     | FM    | FEDERATED STATES OF MICRO
(3 rows)

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Bryan Nelson 07/16/11 7:15 AM I am having problems importing a CSV file of sample data for testing in a web app.

Columns - Types
zip_code - text
lattitude - float8
longitude - float8
city - text
state - text
county - text

Some Sample Data From CSV File:
96799,-7.209975,-170.7716,PAGO PAGO,AS,AMERICAN SAMOA
96941,7.138297,151.503116,POHNPEI,FM,FEDERATED STATES OF MICRO
96942,7.138297,151.503116,CHUUK,FM,FEDERATED STATES OF MICRO

COPY Command:
COPY geo_data FROM 'geo_data2.csv' DELIMITERS ',' CSV;

Error Message:
ERROR: invalid input syntax for integer: 96799
CONTEXT: COPY geo_data, line 1, column id: 96799

I can't figure out why it keeps choking with invalid syntax for integer since the field was created as text. Any and all help greatly appreciated!
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general Please consider the environment before printing this email. NIWA is the trading name of the National Institute of Water Atmospheric Research Ltd.
Re: [GENERAL] Read MS-SQL data into Postgres via ODBC link?
Hi Jonathan,

I haven't done this from MS-SQL, but have into Postgres from Oracle. From a command line client, extract the data from the source table so you get a stream of CSV-style records, then pipe these directly into a psql statement to load them into the target table.

A simple pg2pg example:

psql -d $DB1 -F '|' -Atc "select * from table;" | psql -d $DB2 -c "copy table from STDIN with delimiter '|' null '';"

An MS-SQL example would be similar, with the second part pretty much identical.

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Jonathan Brinkman 07/05/11 12:45 PM
Greetings

I'd like to INSERT data into my Postgresql 8.4.8 table directly from an ODBC-linked MS-SQL table or view. I'm looking at using the Cybertec tool ODBC-LINK (http://www.cybertec.at/en/postgresql_products/odbc-link) but I'm wondering if there isn't a way to do this through Postgresql directly? I saw a post about someone doing a "SELECT * FROM XXX ODBC SOURCE" or something like that (http://archives.postgresql.org/pgsql-odbc/2009-07/msg00032.php) and that would be cool. I don't need to import massive datasets, only 20-30K records at a time.

Thanks much
Jonathan
Re: [GENERAL] Miidpoint between two long/lat points? (earthdistance?)
Why not install PostGIS, with full ellipsoidal projection support, and use the azimuth and distance functions available in SQL?

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Carlo Stonebanks 05/27/11 8:20 AM
Nicely done, Merlin! Hope others with the same problem can find this post. Thanks a lot.

-Original Message-
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: May 26, 2011 9:53 AM
To: Carlo Stonebanks
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Miidpoint between two long/lat points? (earthdistance?)

On Thu, May 26, 2011 at 12:05 AM, Merlin Moncure wrote:
Converted from javascript from here: http://en.wikipedia.org/wiki/Atan2

whoops! meant to say here: http://www.movable-type.co.uk/scripts/latlong.html

merlin
Re: [GENERAL] concatenating with NULLs
Hi Seb,

Use CASE to change nulls to empty strings (or a placeholder) as below. See: http://www.postgresql.org/docs/9.0/static/functions-conditional.html

If you want a placeholder in the result to indicate the presence of a null, try the second SQL:

test=# select (case when 'a' isnull then '' else 'a' end) ||
              (case when 'b' isnull then '' else 'b' end) ||
              (case when NULL is null then '' end);
 ?column?
----------
 ab
(1 row)

test=# select (case when 'a' isnull then '_' else 'a' end) ||
              (case when 'b' isnull then '_' else 'b' end) ||
              (case when NULL is null then '_' end);
 ?column?
----------
 ab_
(1 row)

test=#

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Seb 04/26/11 10:21 AM
Hi,

A query such as:

SELECT 'a' || 'b' || NULL;

returns a NULL. How can I ensure I get 'ab' in the result? I'm trying to concatenate columns and one of them might have NULL values, which I simply want to ignore.

Cheers,
-- Seb
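A more compact alternative, not mentioned in the original reply but standard SQL, is COALESCE, which returns its first non-null argument:

```sql
-- COALESCE substitutes '' for any NULL operand before concatenating.
SELECT coalesce('a', '') || coalesce('b', '') || coalesce(NULL, '');
-- Result: 'ab'
```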
Re: [GENERAL] Postgres 9.1 - Release Theme
I haven't checked to follow this up, but it seems like the sort of announcement one might expect on 1 April.

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Darren Duncan 04/02/11 3:01 PM
I was under the impression that QUEL was actually a good language in some ways, and that it was more relational and better than SQL in some ways. http://en.wikipedia.org/wiki/QUEL_query_languages  Maybe bringing it back would be a good idea, but as an alternative to SQL rather than a replacement. In any event, QUEL was somewhat similar to SQL. -- Darren Duncan

Rajasekhar Yakkali wrote:
"Following a great deal of discussion, I'm pleased to announce that the PostgreSQL Core team has decided that the major theme for the 9.1 release, due in 2011, will be 'NoSQL'. ... the intention is to remove SQL support from Postgres, and replace it with a language called 'QUEL'. This will provide us with the flexibility we need to implement the features of modern NoSQL databases. With no SQL support there will obviously be some differences in the query syntax that must be used to access your data."

hmm.. shock it is. This shift for 9.1, due in mid 2011, is unexpectedly soon :) Curious to understand how this relates to every feature provided at the moment under the RDBMS paradigm: ACID compliance, the features provided by SQL, referential integrity, joins, caching etc. Also, does this shift assume that all use cases fit data access patterns like those of facebook/twitter, or is it intended to address those?

Thanks,
Raj
Re: [GENERAL] Web Hosting
Rimu hosting allows you to install whatever you want, including Postgres, which I have done before now. If your project is in support of Open Source software in any way, ask what discount they can offer; they have been pretty generous in that arena. http://rimuhosting.com/

Like many hosting companies, they allow you to install and run Postgres, but do not provide support for it. Although, given the technical competencies of their support staff, you may find one of them will be able to help anyway.

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Uwe Schroeder 03/06/11 7:05 PM
Godaddy virtual hosting does in fact support postgresql. You have a root account on the virtual server and you can install whatever you want. I run several servers with them and all have postgresql, some virtual, some dedicated servers. Haven't tried their shared servers though, so I can't say anything about those. Hope that helps.

PS: for a company that size their customer support isn't too shabby either.

Uwe

Gentlemen-
Go-daddy *claims* to support postgres http://help.godaddy.com/article/2330
YMMV
Martin
Date: Sat, 5 Mar 2011 16:40:57 -0800
Subject: Re: [GENERAL] Web Hosting
From: m...@kitchenpc.com
To: urlu...@gmail.com
CC: pgsql-general@postgresql.org

On Sat, Mar 5, 2011 at 1:08 PM, matty jones wrote:
I already have a domain name, but I am looking for a hosting company that I can use PG with. The few I have contacted have said that they support MySQL only and won't give me access to install what I need, or they want way too much. I don't need a dedicated host, which so far seems the only way this will work; all the companies I have researched so far that offer shared or virtual hosting only use MySQL. I will take care of the setup and everything myself, but I have already written my code using PG/PHP and I have no intention of switching.

Thanks.

Well there's this list: http://www.postgresql.org/support/professional_hosting

Also, maybe something like Amazon EC2 if you want your own box? I think the small instances are even free..

Mike
Re: [GENERAL] Simple, free PG GUI/query tool wanted
Hi,

pgadmin is still an admin tool, NOT a simple user query tool. I'd suggest PG Access as worth a look, but unfortunately it is no longer supported, and I have never tried it with a recent version of Postgres.

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Elford, Andrew [Ontario] 12/15/10 11:05 AM
http://www.pgadmin.org/download/windows.php

From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Gauthier, Dave
Sent: December 14, 2010 1:38 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Simple, free PG GUI/query tool wanted

Hi:

I'm trying to find a simple, easy-to-use, open (free), Windows-based query tool for PG. This would be something for the novice end user: a means to build a query, execute it, and export results out to a CSV or Excel or something like that. Tools with metadata editors are actually undesirable (they don't need it; it will just serve to get them confused, and could give them info that would get them into trouble). Going through ODBC is undesirable, but understandable if there are no other options. I've looked through the Community_Guide_to_PostgreSQL_GUI_Tools but found nothing that fits that bill (but I may have missed something). I'm looking for something similar to MySQL's Query Browser or SQLyog.
Re: [GENERAL] COPY FROM and INSERT INTO rules
From the 8.3 docs:

"Be aware that COPY ignores rules. ... COPY does fire triggers, so you can use it normally if you use the trigger approach."

HTH,
Brent Wood

All,

I have a rule written on a temp table which will copy the values inserted into it to another table, applying a function. The temp table will be discarded then. The rule I have written works when I use INSERT INTO the temp table. But when I use bulk copy (COPY FROM), the rule doesn't get triggered and data is inserted only into the temp table that I created. Is there a way to invoke a rule when I use COPY FROM instead of INSERT INTO?

TIA,
Sairam

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
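A minimal sketch of the trigger approach the docs suggest (the table and function names here are hypothetical, not from the original thread). Since COPY fires triggers but not rules, a BEFORE INSERT trigger on the staging table can divert and transform each row:

```sql
-- Hypothetical example: rows COPYed into staging_t are transformed
-- and written to final_t instead; COPY fires this trigger, unlike a rule.
CREATE OR REPLACE FUNCTION divert_row() RETURNS trigger AS $$
BEGIN
  INSERT INTO final_t (val) VALUES (upper(NEW.val));
  RETURN NULL;  -- suppress the insert into the staging table itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER divert_ins BEFORE INSERT ON staging_t
  FOR EACH ROW EXECUTE PROCEDURE divert_row();
```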
Re: [GENERAL] Dumping a table from one database and adding it to another
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

James B. Byrne 12/03/10 12:56 PM
I have read the documentation respecting backups but I cannot seem to find any mention of the specific case that I wish performed.

Hi James,

pg_dump can take arguments to dump a specified table, and the output can be filtered/edited in a pipe and passed back to psql accessing a different database on the same or another host.

You may need to set user and other parameters, but a quick example; to dump a table, rename it, and recreate it in a different db on another server on the fly:

pg_dump -h <src_host> -t <src_table> <src_db> | sed 's/<src_table>/<target_table>/g' | psql -h <target_host> -d <target_db>

This dumps the named table in the specified db, changes all occurrences of the original table name to the new one, and runs the resulting SQL in the target db.

By piping through sed you can also change column names, just as this does the table name. But note that if you have users/tables/columns with the same name, this is not going to be straightforward...

HTH,
Brent Wood

I have a pair of tables in a production database that I wish to dump and then restore to a new, different database. I can, and probably will, recreate the tables and column layouts in the new database. Is there a way to load the data dumped from a single table in one database into a new, possibly differently named, table in a different database, using PG utilities?

--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne  mailto:byrn...@harte-lyne.ca
Harte & Lyne Limited  http://www.harte-lyne.ca
9 Brockley Drive  vox: +1 905 561 1241
Hamilton, Ontario  fax: +1 905 561 0757
Canada L8E 3C3
Re: [GENERAL] techniques for bulk load of spatial data
On 2010-11-30 14.29, Mario Corchero wrote:
"Hi, I'm a student of Computer Science. I know different techniques of bulk load, but I need to know how specifically PostgreSQL makes a bulk load of spatial data; could anyone..."

If you are using spatial data in Postgres, this might usefully be addressed to the PostGIS list. Refer to http://www.postgis.org

When you say bulk loading of spatial data, is this hundreds of thousands or billions of records? Are you needing to include coordinate system/projection info? Have you looked at ogr2ogr or shp2pgsql, or SPIT in QGIS, all of which can load data into PostGIS, depending on how big a bulk you are talking about? If your spatial data is available in PostGIS WKB format, you could generate a file to use with the Postgres COPY command.

Regards,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
Re: [GENERAL] median for postgresql 8.3
Hi Maarten,

The best way I know of to do this is not to compute the statistics in SQL itself, but to use a stats capability embedded in your database, so the queries still appear to the user to be done in the db. I don't see how you can easily get the functionality you want without user defined functions or addons. While PL/R is a special addon, and you'd need to create a custom median function to do this, there are very good instructions to follow. I think it might be worth your while if you are looking to retrieve stats from SQL queries.

See PL/R, and the median how-to, at:
http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut01

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

maarten 11/17/10 9:15 AM
Hello everyone,

I was doing some analysis of data to find average delays between some timestamp values etc... When the number of rows the average is computed over is small, this can give distorted values. So I've obviously added a count column to see if the average represents much data. However, I would also like to add the median value to give me a pretty good idea of what's happening even for smaller counts.

I couldn't find such an aggregate function in the manual (version 8.3) and some websearching didn't uncover it either. I was thinking about:

SELECT max(id) FROM test ORDER BY id ASC LIMIT (SELECT count(*)/2 FROM test)

But two things are wrong with that: LIMIT can't use subqueries :( and ORDER BY gives me the error 'must be used in aggregate function etc...', but I can probably work around this by using an ordered subquery instead of the table directly. Furthermore, I need the median for a timestamp column, which would probably complicate things more than when it is a number column. I'd like to be able to do this using only the database. (So no programming functions, special addons etc...)

Any ideas anyone?
regards,
Maarten
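As an aside, beyond what was available in 8.3: later PostgreSQL releases (9.4 and up) added ordered-set aggregates that compute a median directly, with no addons:

```sql
-- Median of a numeric column in PostgreSQL 9.4+.
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY x) FROM test;

-- percentile_disc picks an actual row value rather than interpolating,
-- so it also works for sortable types such as timestamps.
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY ts) FROM test;
```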
Re: [GENERAL] Database INNOVATION
Gotcha. Apologies for the digression; off your exact topic, but consistent with the subject :-)

I'm interested in both: PL/R representational graphics from an analytical perspective, doing more than just retrieving raw or accumulated data with SQL; and also the (mathematical) graph perspective, to support biological taxonomic trees/hierarchies, which do not easily fit the SQL model, although a number of kludges to traverse such structures are around. (I need to look at the Postgres recursive capability for this sometime.)

Cheers,
Brent
DBA/GIS consultant
NIWA, Wellington
New Zealand

Craig Ringer 10/20/10 6:12 PM
On 10/20/2010 12:35 PM, Brent Wood wrote:
"Have a look at PL/R. You can embed a command to generate a graphic using R via a user defined SQL function."

In this case, when I say graph or tree I'm referring to the concept in the graph theory sense, not the plot sense: object graph, not image representation of data.

http://en.wikipedia.org/wiki/Graph_(mathematics)
http://en.wikipedia.org/wiki/Graph_theory

Sorry, I didn't even think to clarify my usage. What I'm talking about is a way to query the database and obtain a representation of matching tuples where each tuple is represented exactly once, and referential relationships between tuples are included in an efficient way. For a simple tree or forest (ie a directed graph with no cycles) this could be an XML/JSON/YAML/whatever document that uses nesting to represent relationships. For more complex graphs, it'd have to be a list of XML/JSON/YAML/whatever representations of each tuple or (if Pg supported it) multiple tabular result sets, one for each tuple type. An edge list could be included to speed mapping out the inter-object references after deserialization. To say this would be nice when dealing with document-in-database storage and certain types of ORM workload is quite an understatement.
Getting rid of all that horrid "multiple left join, filter and de-duplicate" or "n+1 select" crap would be quite lovely. Sure, it's often better to use sane SQL directly, but there are tasks for which ORMs or document-database mappings are a real time and pain saver - it'd just be nice to be able to teach the database their language. Plus, that'd help shut up the NoSQL crowd and separate NoSQL from relaxed- or no-ACID sharded databases, two different things people currently confuse.

In any case, thanks for the tip. It's nice to know that PL/R can be used for such in-database processing when I *do* want to plot data.

--
Craig Ringer
Re: [GENERAL] Database INNOVATION
Have a look at PL/R. You can embed a command to generate a graphic using R via a user defined SQL function. This example is from http://www.varlena.com/GeneralBits/Tidbits/bernier/art_66/graphingWithR.html

HTH
Brent Wood

=====

Graphs can be as easy as '123'. Here's an example where two columns in a table are plotted against each other. Create and populate the table using the following commands:

CREATE TABLE temp (x int, y int);

INSERT INTO temp VALUES(4,6);
INSERT INTO temp VALUES(8,3);
INSERT INTO temp VALUES(4,7);
INSERT INTO temp VALUES(1,5);
INSERT INTO temp VALUES(7,8);
INSERT INTO temp VALUES(2,3);
INSERT INTO temp VALUES(5,1);
INSERT INTO temp VALUES(9,4);

The function f_graph() generates the graph as a pdf document:

CREATE OR REPLACE FUNCTION f_graph() RETURNS text AS '
  str <- pg.spi.exec(''select x as "my a", y as "my b" from temp order by x,y'');
  pdf(''/tmp/myplot.pdf'');
  plot(str, type="l", main="Graphics Demonstration", sub="Line Graph");
  dev.off();
  print(''done'');
' LANGUAGE plr;

Create the graph by invoking this query:

SELECT f_graph();

Craig Ringer said:
Now, personally, if we're talking database innovation, what I'd like to see is a built-in way to get query results straight from the database as graphs of tuples and their relationships. Tabular result sets are poorly suited to some kinds of workloads, including a few increasingly common ones like document-oriented storage and use via ORMs. In particular, the way ORMs tend to do multiple LEFT OUTER JOINs and deduplicate the results, or do multiple queries and post-process to form a graph, is wasteful and slow.
If Pg had a way to output an object graph (or at least tree) natively as, say, JSON, that'd be a marvellous option for some kinds of workloads, and might keep the NoSQL folks from whining quite so much as well ;-)

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
Re: [GENERAL] Selecting top N percent of records.
Something like this should work (but is untested), and does require an extra subquery, so there may be a more efficient way. However, off the top of my head:

select a, b, c
from table
where <where clause>
order by c desc
limit (select count(*)/10 from table where <where clause>);

where c is the "number of sales" column.

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Tim Uckun 10/18/10 3:40 PM
Is there a way to select the top 10% of the values from a column? For example the top 10% best selling items, where number of sales is a column.

Thanks.
Re: [GENERAL] How to force select to return exactly one row
Use a CASE statement to test for a null output, and return whatever you want in the event of it being null, else the actual value. Off the top of my head, something like:

SELECT case when (select somecolumns FROM ko
                  RIGHT JOIN (SELECT 1) _forceonerow ON true
                  LEFT JOIN t1 ON t1.primarykey = ko.t1foreignkey
                  ...
                  LEFT JOIN tn ON tn.primarykey = ko.tnforeignkey
                  WHERE ko.primarykey is null or ko.primarykey = 'someprimarykeyvalue') is not null
            then (select somecolumns FROM ko
                  RIGHT JOIN (SELECT 1) _forceonerow ON true
                  LEFT JOIN t1 ON t1.primarykey = ko.t1foreignkey
                  ...
                  LEFT JOIN tn ON tn.primarykey = ko.tnforeignkey
                  WHERE ko.primarykey is null or ko.primarykey = 'someprimarykeyvalue')
            else 0
       end;

It does require the query to be run twice, so it has extra overhead. You could wrap a function around this to store the result and test that; having stored it, you can use it for the output value without a second query. It all depends on how much overhead there is in the query.

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Andrus 06/22/10 10:12 AM
An autogenerated select statement contains 0 .. n left joins:

SELECT somecolumns FROM ko
LEFT JOIN t1 ON t1.primarykey = ko.t1foreignkey
...
LEFT JOIN tn ON tn.primarykey = ko.tnforeignkey
WHERE ko.primarykey = 'someprimarykeyvalue';

This select can return only 0 or 1 rows, depending on whether a ko row with primary key 'someprimarykeyvalue' exists or not. Problem: if there is no searched primary key row in ko, the select should also return an empty row. To get this result I added a right join:

SELECT somecolumns FROM ko
RIGHT JOIN (SELECT 1) _forceonerow ON true
LEFT JOIN t1 ON t1.primarykey = ko.t1foreignkey
...
LEFT JOIN tn ON tn.primarykey = ko.tnforeignkey
WHERE ko.primarykey is null or ko.primarykey = 'someprimarykeyvalue';

but it still does not return a row if primary key row 'someprimarykeyvalue' does not exist. How to force this statement to return one row always?

Andrus.
Re: [GENERAL] Best way to handle multi-billion row read-only table?
If you will be selecting sets of data within a time range, it should also improve performance if you can build a clustered index on sample_time. It may also be worth looking at whether partitioning by timestamp as well as channel offers any advantages.

Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Justin Graf 02/10/10 3:07 PM
On 2/9/2010 4:41 PM, Asher Hoskins wrote:
"Thanks for that, it looks like partitioning is the way to go. I'm assuming that I should try and keep my total_relation_sizes less than the memory size of the machine?"

This depends on what the queries look like. As others have stated, when partitioning you have to consider how the data is queried.

"If I partition so that each partition holds data for a single channel (and set a CHECK constraint for this) then I can presumably remove the channel from the index, since constraint exclusion will mean that only partitions holding the channel I'm interested in will be searched in a query. Given that within a partition all of my sample_times will be different, do you know if there's a more efficient way to index these?"

Given the timestamp will most likely be in the where clause, NO. On the plus side, it's only 8 bytes.
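A sketch of the inheritance-style partitioning discussed above, using hypothetical table and column names (this is the pre-10 approach, where CHECK constraints drive constraint exclusion):

```sql
-- Hypothetical parent table and one per-channel child partition.
CREATE TABLE samples (channel int, sample_time timestamp, value float8);

-- The CHECK constraint lets the planner skip this child
-- for queries on other channels (constraint exclusion).
CREATE TABLE samples_ch1 (CHECK (channel = 1)) INHERITS (samples);
CREATE INDEX samples_ch1_time ON samples_ch1 (sample_time);

-- For a static (no longer written) partition, clustering on the
-- time index speeds sequential reads over time ranges.
CLUSTER samples_ch1 USING samples_ch1_time;
```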
Re: [GENERAL] db not dumping properly, or at least not restoring
Hi Kirk,

How's it going? You can use pg_dump on the local host to access a db on a remote host; as the output is just SQL, pipe this directly into a psql command, thus replicating/migrating a database.

One note: if you are doing this with a PostGIS db, I find it works better to create an empty target db with PostGIS installed first, then let the constraints on PostGIS objects prevent the old PostGIS being installed in the new db. Or you can copy over the old PostGIS and use the PostGIS upgrade SQL.

Cheers,
Brent
DBA/GIS consultant
NIWA, Wellington
New Zealand

Kirk Wythers kwyth...@umn.edu 10/17/09 4:15 PM
On Oct 16, 2009, at 4:51 PM, Scott Marlowe scott.marl...@gmail.com wrote:

On Fri, Oct 16, 2009 at 11:25 AM, Kirk Wythers kwyth...@umn.edu wrote:
"Any ideas what the problem could be here?"

Use the pg_dump from the target (i.e. newer) pgsql. I.e. if going from 8.3.8 to 8.4.1, use the pg_dump that comes with 8.4.1 to dump the 8.3.8 database.

Can I assume that this is even more critical if going from 8.2 to 8.4?

I usually just do it like so:

(First migrate accounts:)
pg_dumpall --globals -h oldserver | psql -h newserver postgres

I'm a little confused here. Are you saying to use the network connections between the two servers and to pipe the dumpall directly to the psql load?

(then each database:)
createdb -h newserver dbname

Then create new databases on the new server to match the names from the old server?

pg_dump -h oldserver dbname | psql -h newserver dbname
(repeat as needed, save output for error messages)

Then dump each database individually and pipe the dump to the psql load? These two procedures seem to duplicate the goal? Or am I missing something?
Re: [GENERAL] How to get variable out to shell script
On Sun, 2009-09-20 at 16:49 -0500, Alex Gadea wrote:
"I am using psql to call an external sql file that executes a simple select count(*), ie: select into ct count(*) from table; I can't figure out how to make the ct variable available to the shell script once the external sql file completes execution."

Hi Alex,

If you are using bash, for example:

COUNT=`psql -d database -Atc "select count(*) from table;"`

In a shell script, any string enclosed in back-quotes (`) is executed and the result is returned, so the above expression assigns the value returned by the psql command to the variable called COUNT. The -Atc tells psql to return only the unaligned value, with no formatting, column names, etc.

If you store your SQL command outside the script, then you could use:

COUNT=`psql -d database -Atf SQL_file`

HTH,
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
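The same capture pattern, sketched with a stand-in command so it runs without a database (the psql invocation it replaces is shown in the comment):

```shell
#!/bin/sh
# Stand-in for: COUNT=`psql -d database -Atc "select count(*) from table;"`
# Back-quotes run the command and substitute its stdout into the assignment.
COUNT=`echo 42`
echo "count is $COUNT"
```

The same works with the newer `$(...)` form, which nests more cleanly than back-quotes.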
Re: [GENERAL] haversine formula with postgreSQL
A bit out in left field: writing your own haversine in Postgres seems a bit like reinventing a wooden wheel when you can get a free pneumatic one... Any reason not to just install PostGIS and fully support geometries and projections in Postgres? You can build the geometries provided to the functions on the fly from lat/lon coordinates stored as numerics in your SQL, so your DB structures don't even have to change if you don't want them to.

http://www.postgis.org/documentation/manual-1.4/ST_Distance_Sphere.html
http://www.postgis.org/documentation/manual-1.4/ST_Distance_Spheroid.html

HTH
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand

Scott Marlowe scott.marl...@gmail.com 09/18/09 11:35 AM
On Thu, Sep 17, 2009 at 1:16 PM, Jonathan jharah...@gmail.com wrote:

Hi!

I am looking at the PHP/MySQL Google Maps API store locator example here:
http://code.google.com/apis/maps/articles/phpsqlsearch.html

And I'm trying to get this to work with PostgreSQL instead of MySQL. I've (slightly) modified the haversine formula part of my PHP script but I keep getting this error:

Invalid query: ERROR: column "distance" does not exist
LINE 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance ...

I'm new to this, but it doesn't look like I need to create a column in my table for distance, or at least the directions didn't say to create a distance column. Here is my PHP with SQL:

$query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM aaafacilities HAVING distance < '%s' ORDER BY distance LIMIT 0 OFFSET 20",
  pg_escape_string($center_lat),
  pg_escape_string($center_lng),
  pg_escape_string($center_lat),
  pg_escape_string($radius));

Does anyone have any ideas on how I can get this to work?
I'm not sure what is wrong, since it doesn't seem like I need to create a distance column and when I do create one, I get this: Is that really the whole query? Why a having with no group by? Can you do me a favor and print out $query instead of the php stuff? It might help you as well to troubleshoot to see the real query.
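For reference outside SQL, the great-circle formula Jonathan is porting can be sketched as a standalone function (Python here, not from the thread; the 6371 km Earth radius is an assumption — the thread's query uses 3959, i.e. miles):

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption; use 3959 for miles)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))
```

One degree of longitude at the equator comes out near 111 km, a handy sanity check. As the reply above notes, PostGIS's ST_Distance_Sphere does this server-side, which also sidesteps the HAVING-on-an-alias problem in the PHP query.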
[GENERAL] psql crashing - don't know why
Hi... I have a view across 3 tables, total some 5m rows. I can extract parts of the view, entire rows, with a where clause (eg: select * from view where cell_id=10;) If I try to select the entire view (eg: select * from view;) it runs for a while then gives the error msg "Killed" and returns to the system prompt, having exited psql. The log says: 2009-08-10 00:19:01 NZST ben200601 woodb LOG: could not send data to client: Broken pipe 2009-08-10 00:19:48 NZST ben200601 woodb LOG: unexpected EOF on client connection I have written a script along the lines of: get distinct cell_id from table; > file while read cell ; do psql -d db -c "select * from view where cell_id=$cell;" >> output done < file This worked, the number of lines in the output file is the same number as that returned by select count(*) from view; (which works fine), but I'd like to find out the cause of the error. I assume there is some sort of timeout or overflow occurring, but I can't see any indication of what settings I can change to fix this. All the underlying tables have just had vacuum analyse run on them. I'm running PostgreSQL 8.1.4 on x86_64 Linux; I know it is dated, but I'm not in a position to upgrade at this point. Thanks, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand
Re: [GENERAL] commercial adaptation of postgres
Also try Netezza, a data warehouse appliance originally based on Postgres, although it is not the only such Postgres derivative. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Greg Smith gsm...@gregsmith.com 07/24/09 9:10 AM On Mon, 20 Jul 2009, Christophe wrote: On Jul 20, 2009, at 6:56 PM, Dennis Gearon wrote: I once talked to a company that made a custom version of Postgres. It split tables up on columns and also by rows, and had some other custom features. It was enormously faster from what I gathered. I could have sworn it began with the letter 'T', but maybe not. I don't see anything like that on the commercial page of the postgres site. Truviso? http://truviso.com/ We don't do any column-oriented stuff at Truviso. From the description Dennis gave, I'm guessing he's thinking of the Petabyte database at Yahoo: http://it.toolbox.com/blogs/database-soup/2-petabyte-postgresql-24848 -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [GENERAL] indexes on float8 vs integer
Hi Dennis, Is there any reason you are not using PostGIS to store the values as point geometries & use a spatial (GIST) index on them? I have tables with hundreds of millions of point features which work well. On-disk data volume is not really worth optimising for with such systems; I suggest flexibility, ease of implementation & overall performance should be more valuable. If you need to store & query coordinates, then a map-based tool seems relevant, and there are plenty of tools to do this sort of thing with PostGIS data, such as Mapserver & GeoServer at the back end & OpenLayers in the front end. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Scott Marlowe scott.marl...@gmail.com 07/12/09 10:31 PM On Sat, Jul 11, 2009 at 10:15 PM, Dennis Gearon gear...@sbcglobal.net wrote: Anyone got any insight or experience in the speed and size of indexes on integer (4 byte) vs float (8 byte)? For a project that I'm on, I'm contemplating using an integer for: Latitude Longitude In a huge, publicly searchable table. In the INSERTs, the representation would be equal to: IntegerLatOrLong = to_integer( float8LatOrLong * to_float(100) ); This would keep it in a smaller (4 byte vs 8 byte) representation with simple numeric comparison for indexing values while still providing 6 decimals of precision, i.e. 4.25 inches of resolution, what google maps provides. I am expecting this table to be very huge. Hey, I want to be the next 'portal' :-) Dennis Gearon Well, floats can be bad if you need exact math or matching anyway, and math on them is generally slower than int math. OTOH, you could look into numeric to see if it does what you want. Used to be way slower than int, but in recent versions of pgsql it's gotten much faster. Numeric is exact, where float is approximate, so if having exact values stored is important, then either using int and treating it like fixed point, or using numeric is usually better.
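Dennis's fixed-point idea can be sketched as a pair of conversion helpers (Python, not from the thread). Note that six decimal places implies a scale factor of 10**6 rather than the 100 shown in his pseudocode, and 180 * 10**6 still fits comfortably in a signed 4-byte integer:

```python
SCALE = 10**6  # six decimal places: roughly 0.11 m of latitude resolution

def to_fixed(degrees: float) -> int:
    """Encode a lat/lon value as a 4-byte-friendly integer."""
    return round(degrees * SCALE)

def to_degrees(fixed: int) -> float:
    """Decode the stored integer back to degrees."""
    return fixed / SCALE

# Largest magnitude needed: 180 * 10**6 = 180_000_000 < 2**31 - 1,
# so the encoded value fits a PostgreSQL int4 column.
```

As Scott's reply says, the integer form gives exact comparisons and cheaper index math; the trade-off is that every boundary value in a query must be encoded with the same scale.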
Re: [GENERAL] 10 TB database
Hi Artur, Some general comments: I'd look at partitioning and tablespaces to better manage the files where the data is stored, but also look at some efficiently parallelised disks behind the filesystems. You might also look at optimising the filesystem & OS parameters to increase efficiency as well, so it is a mix of hardware/OS/filesystem & db setup to optimise for such a situation. For data retrieval, clustered indexes may help, but as this requires a physical reordering of the data on disk, it may be impractical. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Artur a_wron...@gazeta.pl 06/16/09 3:30 AM Hi! We are thinking to create some stocks-related search engine. It is an experimental project just for fun. The problem is that we expect to have more than 250 GB of data every month. This data would be in two tables. About 50.000.000 new rows every month. We want to have access to all the data, mostly for generating user-requested reports (aggregating). We would have about 10TB of data in three years. Do you think it is possible to build this with postgresql, and have any idea how to start? :) Thanks in advance, Artur
Re: [GENERAL] Change view definition - do not have to drop it
I believe Postgres only checks the output types & column names for each column in the view. If, as you suggest, you convert these in your view to a standard appropriate datatype, you could then recreate the view with different input column datatypes. eg: in the countries_simpl table, cat is a bigint datatype, gid is an int: bgmaps=# create view v_test as select cat from countries_simpl; CREATE VIEW bgmaps=# create or replace view v_test as select cat::bigint from countries_simpl; CREATE VIEW bgmaps=# create or replace view v_test as select (cat::char)::bigint from countries_simpl; CREATE VIEW bgmaps=# create or replace view v_test as select (cat::int)::bigint from countries_simpl; CREATE VIEW bgmaps=# create or replace view v_test as select cat::char from countries_simpl; ERROR: cannot change data type of view column cat bgmaps=# create or replace view v_test as select gid from countries_simpl; ERROR: cannot change name of view column cat bgmaps=# create or replace view v_test as select gid::bigint from countries_simpl; ERROR: cannot change name of view column cat bgmaps=# create or replace view v_test as select gid::bigint as cat from countries_simpl; CREATE VIEW HTH, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Emi Lu em...@encs.concordia.ca 06/03/09 10:45 AM Now I need to update view1's definition to create or replace view view1 as select col1, col2 from new_table; However, col1 in new_table is not bpchar. This gives me a headache! There are tens of dependent views based on view1, so I cannot just drop view1 and recreate it. How can I redefine view1 without dropping it and recreating it? Cast the new column to bpchar? If you want to change the output column type of the view, you have to drop and recreate it. Thanks, Tom. Ok, I will not change the view type, just keep bpchar for now. Just a wish: if >=8.4 could allow changing the view type, that would be great!
-- Lu Ying
Re: [GENERAL] Online Backups PostGre (rsync for Windows)
Hi, There are a few rsync-on-Windows options; just google "rsync windows". One we've found works well is DeltaCopy, which may meet your requirements. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Adam Ruth adamr...@mac.com 05/02/09 1:01 PM Cygwin comes with rsync on Windows. On 02/05/2009, at 4:06 AM, John R Pierce wrote: Joshua D. Drake wrote: Well that's just it. Out of the box it doesn't actually work. PostgreSQL only gives you the facilities to roll your own PITR solution. You can look at PITR Tools: https://projects.commandprompt.com/public/pitrtools It doesn't quite work on Windows due to lack of rsync and signaling differences but could give you an idea of how to move forward with your own implementation. Quite possibly 'robocopy' from Microsoft somewhere (doesn't come with Windows, but was part of an admin kit or something) would be a workable replacement for the rsync part.
Re: [GENERAL] Proper entry of polygon type data
Hi Peter, If you want to use Postgres to store/manage/query spatial data, I strongly recommend you look at PostGIS, not the native Postgres geometry types. Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Peter Willis pet...@borstad.com 03/24/09 10:35 AM Hello, I would like to use 'polygon' type data and am wondering about the entry format of the vertex coordinates. Are the coordinates of the polygon type to be entered one entry per polygon vertex, or one entry per polygon edge segment? For example: I have a triangle with vertex corners A, B, C. One entry per vertex format suggests INSERT INTO my_table (my_polygon_column) VALUES ( ((Ax,Ay),(Bx,By),(Cx,Cy)) ); One entry per edge format suggests INSERT INTO my_table (my_polygon_column) VALUES ( ((Ax,Ay),(Bx,By),(Bx,By),(Cx,Cy),(Cx,Cy),(Ax,Ay)) ); Which entry format is the correct one? If per vertex format is the correct one, do I need to 'close' the path by entering the first vertex again at the end of the list? ie: INSERT INTO my_table (my_polygon_column) VALUES ( ((Ax,Ay),(Bx,By),(Cx,Cy),(Ax,Ay)) ); Thanks, Peter
Re: [GENERAL] SRID conflict, PostgreSQL 8.3
This should be addressed to the PostGIS list. However, you are spatially joining two geometries, and they need to be in the same coordinate system. The column the_geom has a defined SRID (spatial reference id) when created in its original table. Your hard-coded POLYGON in the SQL below has an SRID of -1 (the last value in the argument). You can fix this by either changing the -1 to be the same number as the SRID specified for the_geom, or by setting this arbitrarily to -1 for this operation (as below): SELECT whatever FROM a view into 'catalog' WHERE ((TRUE AND TRUE) AND SetSrid(the_geom, -1) && GeometryFromText('POLYGON ((-83.28 26.07, -83.28 28.26, -81.59 28.26, -81.59 26.07, -83.28 26.07))', -1)); (This assumes that the_geom & the coordinates you specify in the query are in fact in the same coordinate system. Note that an SRID of -1 means unknown coordinate system.) I'm not sure of the relevance of the (TRUE AND TRUE) in the where clause; it seems redundant, as it will always return true. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand csmith csm...@insequenceinc.com 03/21/09 8:57 AM Hello, I serve a Geospatial IS project that for some years has used PostgreSQL and PostGIS. A strategy/scheme that has worked for all previous versions has failed with a recent upgrade to 8.3 (e.g. 8.3.6-1). Relation "catalog" is a large geospatially-enabled aggregation of data with a variety of SRIDs embedded within the_geom attribute values. Querying a view into the catalog which describes a subset of its tuples with identical SRIDs (the uniqueness of the SRID associated with this view's data is easily demonstrable with an ad-hoc query) has always worked smoothly.
With the 8.3 engine, an error is introduced: SELECT whatever FROM a view into 'catalog' WHERE ((TRUE AND TRUE) AND the_geom && GeometryFromText('POLYGON ((-83.28 26.07, -83.28 28.26, -81.59 28.26, -81.59 26.07, -83.28 26.07))', -1)) results in this error: Operation on two geometries with different SRIDs The result of the GeometryFromText routine has, of course, but one SRID, thus the SRID from the_geom must be the culprit. It's as if the query is considering tuples in catalog outside of the view's domain. (note: I can offer further evidence of this behavior: removing all tuples from catalog other than those returned by a query against the view eliminates the conflict/error). Can someone comment on this mystery/phenomenon vis-a-vis PostgreSQL version 8.3 (PostGIS 1.3.5)? Many thanks, Christopher Smith
Re: [GENERAL] Uploading data to postgresql database
ogr2ogr can write most formats to most other formats. It can certainly write to a PostGIS database & read KML, so if it can write to shape, it can write direct to PostGIS. You just need to set your output format to PostGIS. Note: depending on where you got GDAL (ogr2ogr) from, it may or may not have the PostGIS drivers compiled in; if it doesn't, you can compile it yourself against Postgres/PostGIS to enable this on your platform. Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Subha Ramakrishnan su...@gslab.com 03/18/09 7:04 PM Hi, Thanks for the reply. I did take a look at ogr2ogr which can convert kml to shape. But i was wondering if there's some direct way..:) And by upload I meant adding geometry data to the DB. Thanks regards, Subha Stephen Frost wrote: * Subha Ramakrishnan (su...@gslab.com) wrote: So far, I have been using shp2pgsql to upload data from shape files. I don't want to convert it to shape and then upload it. Have you looked at ogr2ogr? It looks to support KML as a format, and has PostGIS support, though I'm not sure if it can handle the direction you're asking for. Stephen
Re: [GENERAL] surprising results with random()
Or perhaps: CREATE OR REPLACE VIEW test_view AS SELECT (random()*3)::int as test_value; At least in this case, that should give the same result: in this case 1/3 should be 1, 1/3 should be 2 & 1/3 should be 3. In your case 1/3 = 1, 1/2 of the remainder (1/2 * 2/3 = 1/3) = 2, and the remaining 1/3 = 3. Although I'm guessing the original intent is to NOT generate an equal distribution, but I'm not sure what distribution is required. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand ries van Twisk p...@rvt.dds.nl 02/24/09 12:13 PM Jessi, should the function not look like this??? CREATE OR REPLACE VIEW test_view AS SELECT CASE WHEN random() < .3 THEN '1' WHEN random() < .5 THEN '2' ELSE '3' END AS test_value FROM client; On Feb 23, 2009, at 5:09 PM, Jessi Berkelhammer wrote: Hi, I have a view in which I want to randomly assign values if certain conditions hold. I was getting surprising results. Here is a (very) simplified version of the view, which seems to indicate the problem: CREATE OR REPLACE VIEW test_view AS SELECT CASE WHEN random() . THEN '1' WHEN random() . THEN '2' ELSE '3' END AS test_value FROM client ; It seems this should generate a random number between 0 and 1, and set test_value to '1' if this first generated number is less than .. Otherwise, it should generate another random number, and set test_value to '2' if this is less than .. And if neither of the random numbers are less than ., it should set test_value to '3'. It seems to me that there should be a relatively even distribution of the 3 values. However when I run this, the values are always similar to what is below: X_test= select test_value, count(*) from test_view group by 1 order by 1;

 test_value | count
------------+-------
 1          | 23947
 2          | 16061
 3          | 32443

Why are there significantly fewer 2s? I understand that random() is not truly random, and that the seed affects this value. But it still confuses me that, no matter how many times I run this, there are always so few 2s.
If it is generating an independent random number in the second call to random(), then I don't know why there are so many more 1s than 2s. Thanks! -jessi -- Jessi Berkelhammer Downtown Emergency Service Center Computer Programming Specialist
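The behaviour Jessi observed follows from each WHEN arm calling random() independently: with thresholds t1 and t2, the branch probabilities are P(1) = t1, P(2) = (1 - t1) * t2, P(3) = (1 - t1) * (1 - t2). A quick simulation (Python, not from the thread; the 1/3 thresholds below are a hypothetical choice, picked because they reproduce the 1/3 : 2/9 : 4/9 shape of the reported counts):

```python
import random

def case_value(t1, t2, rng):
    # Mirrors the view's CASE: each WHEN draws its own random number.
    if rng.random() < t1:
        return 1
    if rng.random() < t2:
        return 2
    return 3

def distribution(t1, t2, n=100_000, seed=0):
    """Empirical branch probabilities over n trials."""
    rng = random.Random(seed)
    counts = {1: 0, 2: 0, 3: 0}
    for _ in range(n):
        counts[case_value(t1, t2, rng)] += 1
    return {k: v / n for k, v in counts.items()}

# Analytically: P(1)=t1, P(2)=(1-t1)*t2, P(3)=(1-t1)*(1-t2).
# With t1 = t2 = 1/3 the split is 1/3 : 2/9 : 4/9 -- branch 2 is rarest,
# matching the surprise in the thread.
```

Pre-computing a single random value per row, as the reply suggests, collapses the two draws into one and removes the skew.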
Re: [GENERAL] Appending \o output instead of overwriting the output file
Thanks Tom, That will do the trick. Perhaps \o+ as a future fix for this? Brent Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Tom Lane t...@sss.pgh.pa.us 02/18/09 7:46 PM Brent Wood b.w...@niwa.co.nz writes: Using \o to redirect output to a file from the psql command line, is there any way to have the output appended to the output file, rather than overwriting it? This is pretty grotty, but it works: \o | cat >> target Maybe we should provide another way in future... regards, tom lane
Re: [GENERAL] Appending \o output instead of overwriting the output file
I'd be happy with either... >> is UNIX-ese for append, which is OK, and if anyone uses the command-line MSDOS/command prompt, it does the same there. But if we are to follow this logic, then \o > file should overwrite/create, etc... which is perhaps a bit excessive. I think that having \o write to a file and \o+ add to the file is simple & intuitive for those folk who aren't familiar with the command line. The + means \o is adding to a file rather than just (over)writing a file, which I find pretty consistent with + in the other \ commands. However, I think introducing a >> into \ syntax is new & different & quite inconsistent with the other \ commands. But if either can be added I'll be happy :-) I'll just have to wait for Novell to formally support whichever version provides it, which shouldn't be much more than 3 years or so after it is released... At home I can use it straight away... Thanks, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Tom Lane t...@sss.pgh.pa.us 02/19/09 10:19 AM John R Pierce pie...@hogranch.com writes: Tom Lane wrote: Brent Wood b.w...@niwa.co.nz writes: Perhaps \o+ as a future fix for this? I'd prefer \o >> file but maybe I'm too steeped in unix-isms. \o+ is reasonably consistent with the other \ command usages... Not really; none of the other commands interpret + as meaning "append to an existing file". They tend to take it as meaning "do something *in addition to* what you normally do", not to do something that is significantly different from the base command. regards, tom lane
[GENERAL] Appending \o output instead of overwriting the output file
Hi, Using \o to redirect output to a file from the psql command line, is there any way to have the output appended to the output file, rather than overwriting it? Thanks, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand
Re: [GENERAL] Update table with random values from another table
I'm not sure if that query will do what you want, but to make it work, one thing you might try is to pre-calculate the random values for each record, then order by those, eg: select trip_code, random() as rand from obs order by rand; works for me, so the following might for you: UPDATE users SET t_firstname = x.firstname, t_surname = x.lastname, t_username = x.username, t_contact = x.email FROM (select firstname, lastname, username, email, random() as rand from testnames order by rand) x WHERE t_firstname <> x.firstname; Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Rory Campbell-Lange r...@campbell-lange.net 02/17/09 4:33 PM I have a test system for which I need to replace actual users' data (in 'users') with anonymised data from another table ('testnames') on postgres 8.3. The tricky aspect is that each row from testnames must be randomised to avoid each row in users being updated with the same value. I've been trying to avoid a correlated subquery for each column I am trying to update, and instead trying the following, which never returns. There are 2000 records in testnames and about 200 in users. UPDATE users SET t_firstname = x.firstname, t_surname = x.lastname, t_username = x.username, t_contact = x.email FROM (select firstname, lastname, username, email from testnames order by random()) x WHERE t_firstname <> x.firstname; Any help much appreciated Rory
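The intended effect — each user paired with a distinct random testnames row — can be sketched outside SQL (Python; `anonymise`, the data shapes, and the sampling approach are illustrations, not the thread's code):

```python
import random

def anonymise(users, testnames, seed=None):
    """Pair each user with a distinct, randomly chosen test record,
    mirroring the UPDATE ... FROM (SELECT ... ORDER BY random()) idea."""
    rng = random.Random(seed)
    # sample() picks without replacement, so no test row is reused;
    # it requires len(testnames) >= len(users).
    picks = rng.sample(testnames, len(users))
    return list(zip(users, picks))
```

With the sizes in the thread (2000 test rows, about 200 users) the without-replacement constraint holds comfortably.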
[GENERAL] how to implement a foreign key type constraint against a not unique column
Hi, I have a table with a column of ID's (integer), these are unique except where they = -1 (column 1) I have a partial unique index where the value is not -1 to enforce this. I want to use this column as a foreign key on a column in another table (column 2), but cannot without a full unique index. Is there any way to add an equivalent constraint to a foreign key which restricts entries in column 2 to values in column 1? I tried a check where obs_id in (select id from ..), but subqueries are not supported in a check. I believe it is possible by using a table with nulls for the -1 values with a unique index on it as the foreign key, then a view which uses case or coalesce to present the nulls as -1, but this seems a cumbersome workaround. Thanks, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand
Re: [GENERAL] Relational database design book
It might be useful to look at the capabilities of the Informix Timeseries Datablade (http://www-01.ibm.com/software/data/informix/blades/) if you want to look at ways of enhancing the temporal data capabilities of Postgres. Cheers, Brent Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand Chris Browne cbbro...@acm.org 12/16/08 10:05 AM rshep...@appl-ecosys.com (Rich Shepard) writes: [2] Strangely enough -- to me, at least -- the lack of full support for date- and time-based SQL in database tools such as PostgreSQL is puzzling. Virtually all business-related databases (think accounting systems as a prime example) depend on dates. So do many scientific databases. The support for temporality in PostgreSQL seems above average as far as I can see... PostgreSQL has pretty nice time types between the timestamptz type and interval. What strikes me as being missing is the ability to create temporally-aware foreign keys. That is, suppose the schema is: create table table1 ( nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', constraint dating_t1 check (from_date < to_date) -- probably some other data... ); I'd like to be able to do two more things: a) Treat the date range as part of the primary key (which isn't forcibly hard), b) Have references to table1 that point to the time range for the nearly_pk value but which are a little more liberal with the dates.
create table table2 ( t2pk integer primary key, nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', -- And have a foreign key that requires that -- for each tuple in table2 the combination (nearly_pk, from_date, to_date) -- is *contained* by relevant ranges of (nearly_pk, from_date, to_date) -- on table1 foreign key (nearly_pk) references table1(nearly_pk) with temporal (table2.from_date, table2.to_date) contained_by (table1.from_date, table1.to_date) ); I don't think the syntax there is necessarily quite right; I'm just hoping to express the idea successfully. I could presumably do this with a trigger; have been trying to avoid that thus far. There are, of course, other ways of treating temporality; that is part of why it's early to treat this approach as worth putting into syntax. -- output = (cbbrowne @ acm.org) http://cbbrowne.com/info/finances.html When the grammar checker identifies an error, it suggests a correction and can even makes some changes for you. -- Microsoft Word for Windows 2.0 User's Guide, p.35:
Re: [GENERAL] how to group several records with same timestamp into one line?
You need to use a self relation, not a group by, as no data are being aggregated into a new single value, which is what the group by achieves. This joins a table to itself, so that columns in it can be replicated. The key is that the where clause in each case needs to just select one channel, so each alias acts like a similar table but contains different rows. Because you used a char() instead of varchar() for channel, you may find your string 'channel1' has spaces in it to pad it to the specified length, in which case the where clauses below can use like '%channel1%' instead of = 'channel1', or you can strip the spaces before the comparison, eg: where trim(a.channel)='channel1'. I hope this makes sense. eg: select a.data, a.unit, b.data, b.unit, c.data, c.unit, d.data, d.unit, a.create_on from record_data a, record_data b, record_data c, record_data d where a.channel='channel1' and b.channel='channel2' and c.channel='channel3' and d.channel='channel4' and b.create_on = a.create_on and c.create_on = a.create_on and d.create_on = a.create_on; Thus table a comprises all records from channel1, etc... and they are joined on a common timestamp. NOTE: if any of the channels are null for any given timestamp, you will get no record for that timestamp using this syntax, even if other channels had data, because the query uses an inner join. If this is a problem then you'll need to reword the query as an outer join.
HTH, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand zxo102 ouyang [EMAIL PROTECTED] 11/13/08 3:15 AM Hi everyone, My data with the same timestamp 2008-11-12 12:12:12 in postgresql are as follows:

 rowid | data | unit | channel  | create_on
-------+------+------+----------+---------------------
 1     | 1.5  | MPa  | channel1 | 2008-11-12 12:12:12
 2     | 2.5  | M3   | channel2 | 2008-11-12 12:12:12
 3     | 3.5  | M3   | channel3 | 2008-11-12 12:12:12
 4     | 4.5  | t    | channel4 | 2008-11-12 12:12:12

I would like to group them into one line with SQL like: 1.5 MPa 2.5 M3 3.5 M3 4.5 t 2008-11-12 12:12:12 The table is defined as CREATE TABLE record_data ( rowid serial NOT NULL, data double precision, unit character(10), channel character(20), create_on timestamp ) Is that possible? Thanks for your help in advance. Ouyang
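The rows-to-columns reshaping the self-join performs can be sketched outside SQL (Python, with hypothetical data mirroring the thread; the fixed channel list is an assumption):

```python
from collections import defaultdict

# Hypothetical rows mirroring the thread: (data, unit, channel, create_on)
rows = [
    (1.5, "MPa", "channel1", "2008-11-12 12:12:12"),
    (2.5, "M3",  "channel2", "2008-11-12 12:12:12"),
    (3.5, "M3",  "channel3", "2008-11-12 12:12:12"),
    (4.5, "t",   "channel4", "2008-11-12 12:12:12"),
]

def pivot(rows, channels=("channel1", "channel2", "channel3", "channel4")):
    """One output tuple per timestamp, channels as columns -- the same
    shape the SQL self-join produces."""
    by_ts = defaultdict(dict)
    for data, unit, channel, ts in rows:
        # strip() handles char()-padded channel values, as the reply warns.
        by_ts[ts][channel.strip()] = (data, unit)
    out = []
    for ts in sorted(by_ts):
        cells = [by_ts[ts].get(ch) for ch in channels]
        if all(cells):  # inner-join semantics: skip incomplete timestamps
            out.append((*cells, ts))
    return out
```

As with the SQL version, a timestamp missing any channel is dropped; relaxing the `all(cells)` check gives the outer-join behaviour mentioned in the note.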
Re: [GENERAL] how to best resync serial columns
Thanks Erik... I found an alternative to psql copy to stdout | psql copy from stdout. I used pg_dump -n schema | psql. This approach replicated the entire schema, rather than just the table contents, into the new database, and therefore copied over all the seq data as well. It worked well in this situation. Thanks for the reply, I'll note it for future reference. Cheers, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Erik Jones [EMAIL PROTECTED] 11/11/08 8:03 PM On Nov 10, 2008, at 6:48 PM, Brent Wood wrote: Hi, I have a number of tables with serial columns as a primary key. I'm looking to add lots of records via copy, but should reset the serial counters to the appropriate value after this. Is there a simple way to do this, or do I just update the last_value column in each seq table to the max(id) from the relevant table?

You shouldn't edit the sequence table directly. To set a sequence's value you should use the setval(seqname, seqval) function like so: SELECT setval('some_seq', 1000);

Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k

NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
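For the resync itself, setval() can be pointed straight at the table's max(id), so nothing needs to be edited by hand. A sketch, assuming a table mytable with a serial column id (pg_get_serial_sequence() looks up the backing sequence, so the sequence name need not be hard-coded):

```sql
-- after the bulk COPY: make the next nextval() return max(id) + 1
SELECT setval(pg_get_serial_sequence('mytable', 'id'),
              (SELECT max(id) FROM mytable));
```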
Re: [GENERAL] ordered pg_dump
It isn't guaranteed, but I think a clustered index on the attrs you want the dump ordered by will give an ordered dump. This may depend on your filesystem, and on what else your system is doing at the time, as interrupted disk reads may disrupt the sequence. It has worked for me on Suse Linux with Reiser FS when the dump was the only (major) process running. You can also cut out the data text of the dump, if you used COPY format not inserts, then use sort & awk to order the records appropriately, then paste them back in (at least on Linux/UNIX you can; on Windows this would need Cygwin installed). None of which is ideal or robust; having pg_dump able to generate ordered dumps natively would be useful. Cheers, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Josh Williams [EMAIL PROTECTED] 11/11/08 8:04 PM On Mon, 2008-11-10 at 17:05 -0800, Jeff Davis wrote: Is there any interest in an optional mode for pg_dump to order the output so that it's easier to use diff? I don't think it would make the output 100% deterministic, but it would make it easier to at least compare the data for small databases. That'd be cool. I'd done some poking around on the topic a little while back. The goal was to make the output more predictable so that backups would be more efficient, specifically with a product that does binary diffs of some sort. I may still have some notes somewhere if you're interested. But I believe the idea was to use COPY with a SELECT statement. The non-trivial part was to figure out a proper ordering to use. Or did you plan on combining it with -t, where you could then specify the ordering for each table? I think this has been brought up before, but I couldn't find the thread, so I don't know what conclusion was reached. Regards, Jeff Davis (... Plus, you potentially get a free CLUSTER on a reload.)
- Josh Williams

NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
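The COPY-with-SELECT idea mentioned above is available in PostgreSQL 8.2 and later, and gives deterministic per-table data ordering without post-processing the dump. A sketch, with table and column names assumed:

```sql
-- dump one table's rows in a stable order (8.2+ COPY (query) syntax)
COPY (SELECT * FROM mytable ORDER BY id) TO STDOUT;
```

From the shell, something like psql -d mydb -c "COPY (SELECT * FROM mytable ORDER BY id) TO STDOUT" > mytable.copy handles the per-table data; it doesn't replace pg_dump for the schema itself.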
[GENERAL] how to best resync serial columns
Hi, I have a number of tables with serial columns as a primary key. I'm looking to add lots of records via copy, but should reset the serial counters to the appropriate value after this. Is there a simple way to do this, or do I just update the last_value column in each seq table to the max(id) from the relevant table? Thanks, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Defining string to represent null values in select
Hi, I can specify the text used to represent a null value in output from copy, but I'd like to do something similar in select output, eg: all NULL values are represented by NA or NaN. I can't find anything in the docs about this. This could be managed using case statements around all the columns in the query, but is there a simpler way, like setting a system variable to specify this? Thanks, Brent Wood -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Specifying text to substitute for NULLs in selects
Thanks guys, I'm aware of those options; what I was wondering was if there is a more generic way, for example the Empress RDBMS allows 'set MSNULLVALUE NA', and all NULLs will from then on be output as NA. The COPY option is closest to a generic setting, but doesn't work with a select query, just a table dump. I guess something like the following will work from the shell, although it is hardly elegant :-)...

psql -d DB -Atc "select '', attr, attr, attr, '' from <table>;" | sed 's/||/|NA|/' | sed 's/^|//' | sed 's/|$//' > data.txt

Slightly simpler than the case statement approach in Postgres is COALESCE(), eg: select COALESCE(attr,'NA') as attr from table; but this still needs to be applied to every column in the output which may have nulls, rather than being a generic one-off setting. A view using COALESCE() may be the easiest way for users to have this capability automatically.

Thanks, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Said Ramirez [EMAIL PROTECTED] 11/08/08 12:34 PM I think you are more after something like SELECT CASE WHEN foo IS NULL THEN 'NA' ELSE foo END FROM bar. -Said

Ivan Sergio Borgonovo wrote: On Thu, 6 Nov 2008 17:44:42 -0800 (PST) [EMAIL PROTECTED] wrote: Hi, I can specify the text used to represent a null value in output from copy, but I'd like to do something similar in select output, eg: all NULL values are represented by NA or NaN. I can't find anything in the docs about this. This could be managed using case statements around all the columns in the query, but is there a simpler way, like setting a system variable to specify this?
wtw_drupal=# create schema test;
CREATE SCHEMA
wtw_drupal=# create table test.test(c1 text);
CREATE TABLE
wtw_drupal=# insert into test.test values(null);
INSERT 0 1
wtw_drupal=# insert into test.test values('test');
INSERT 0 1
wtw_drupal=# \copy test.test to stdout null as 'BANANA'
BANANA
test
wtw_drupal=# drop schema test cascade;
NOTICE: drop cascades to table test.test
DROP SCHEMA

everything clearly explained in the COPY manual: http://www.postgresql.org/docs/8.1/static/sql-copy.html -- Ivan Sergio Borgonovo http://www.webthatworks.it

NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
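The COALESCE() view suggested above might be sketched like this (table and column names are assumptions; note that non-text columns need a cast before substituting a text value like 'NA'):

```sql
create view mytable_na as
select coalesce(attr1, 'NA')       as attr1,  -- text column
       coalesce(attr2::text, 'NA') as attr2   -- cast needed for non-text columns
  from mytable;
```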
Re: [GENERAL] Specifying text to substitute for NULLs in selects
Thanks Adrian, That's perfect!! Cheers, Brent

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Adrian Klaver [EMAIL PROTECTED] 11/08/08 1:49 PM On Friday 07 November 2008 4:05:08 pm Brent Wood wrote: Thanks guys, I'm aware of those options; what I was wondering was if there is a more generic way, for example the Empress RDBMS allows 'set MSNULLVALUE NA', and all NULLs will from then on be output as NA. The COPY option is closest to a generic setting, but doesn't work with a select query, just a table dump. I guess something like the following will work from the shell, although it is hardly elegant :-)...

psql -d DB -Atc "select '', attr, attr, attr, '' from <table>;" | sed 's/||/|NA|/' | sed 's/^|//' | sed 's/|$//' > data.txt

Slightly simpler than the case statement approach in Postgres is COALESCE(), eg: select COALESCE(attr,'NA') as attr from table; but this still needs to be applied to every column in the output which may have nulls, rather than being a generic one-off setting. A view using COALESCE() may be the easiest way for users to have this capability automatically. Thanks, Brent Wood

Using psql http://www.postgresql.org/docs/8.2/interactive/app-psql.html

lfnw=# \a \t \f ',' \pset null 'NA'
Output format is unaligned.
Showing only tuples.
Field separator is ",".
Null display is "NA".
lfnw=# SELECT null,1;
NA,1

-- Adrian Klaver [EMAIL PROTECTED]

NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] postgres/postgis
You might try the Geo repository for OpenSUSE. This includes builds in 32 & 64 bit for several other distros as well as OpenSUSE for the GIS related applications, but you will probably need to compile Postgres (that version) from scratch. Which is unfortunate, as the default build by compiling Postgres puts files in different places than the standard RPM packages, so if you do compile Postgres, you may also be better off compiling PostGIS/proj/GDAL/GEOS/etc from scratch as well to ensure everything works together. The OpenSUSE geo repository is at: http://download.opensuse.org/repositories/Application:/Geo/ HTH, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Eduardo Arévalo [EMAIL PROTECTED] 10/07/08 6:34 AM hello, is it compatible to install postgresql-8.3.4-1-linux-x64 with postgis-1.3.3? Is postgis there for 64-bit architecture?? Are the proj4 & GEOS libraries there for 64-bit architecture?? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] foreign key problem
Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Jorge Godoy [EMAIL PROTECTED] 09/17/08 1:36 AM Em Monday 15 September 2008 19:05:25 [EMAIL PROTECTED] escreveu: Hi, I need a foreign key (or equivalent) where the referenced table cannot have a unique constraint. For fisheries surveys we have a table of catch records. For a given event, several species are caught, and are weighed by one or more methods. Thus a given event may have more than one record for the same species, as parts of the catch may be weighed differently. When we sample a species for measuring, the event is recorded, but as the sample may include individuals from throughout the catch, it does not necessarily relate to the catch-with-method table. Looks like you need a composite primary key here, i.e. a primary key for the category of the record will have more than one column (such as species and method of catch). With that you'll be able to uniquely identify the event and then associate it with the record.

Thanks Jorge, There are many catches per event, one for each species/method, so a composite key would be on event/species/method for the catch. For lengths it would be on event/species (there is no catch weigh method here). There should only be a length record for a matching event/species in the catch (normally constrained via a foreign key), but I cannot create a composite key on catch without including method, so there is no possible unique key on catch to match to the primary key on length. -- Jorge Godoy [EMAIL PROTECTED] -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL TPC-H test result?
My 02c, Pg does itself no favours by sticking with such pessimistic defaults, and a novice user wanting to try it out will find tweaking the pg configuration files for performance quite complicated. Given the general increase in typical hardware specs these days, perhaps the default pg specs could be set for higher spec systems? Or perhaps the standard install could come with 2 or 3 versions of the config files, and the user can simply rename/invoke the one that fits their system best? I figure (somewhat simplistically) that most settings are more related to available memory than anything else, so perhaps config files for typical 1Gb, 4Gb & 8Gb systems could be provided out of the box to make initial installs simpler? Cheers, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Andrew Sullivan [EMAIL PROTECTED] 09/10/08 3:47 AM On Tue, Sep 09, 2008 at 07:59:49PM +0800, Amber wrote: I read something from http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html Given that the point of that study is to prove something about performance, one should be leery of any claims based on an out of the box comparison. Particularly since the box their own product comes out of is compiled from CVS checkout. Their argument seems to be that people can learn how to drive CVS and to compile software under active development, but can't read the manual that comes with Postgres (and a release of Postgres well over a year old, at that). I didn't get any further in reading the claims, because it's obviously nothing more than a marketing effort using the principle that deriding everyone else will make them look better. Whether they have a good product is another question entirely.
A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
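By way of illustration of the per-memory-size configs suggested above, a memory-oriented fragment for a dedicated ~4Gb machine might look like the following (values are assumptions to show the idea, not tuning advice; the memory-unit syntax needs 8.2+):

```
# postgresql.conf fragment - illustrative starting points for a ~4Gb box
shared_buffers = 1GB            # ~25% of RAM is a common starting point
effective_cache_size = 3GB      # planner hint: RAM the OS will use for caching
work_mem = 16MB                 # per sort/hash, per backend, so keep modest
maintenance_work_mem = 256MB    # vacuums and index builds
```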
Re: [GENERAL] Oracle and Postgresql
I agree with David's comment. The business I work for recently proposed a FOSS based solution for a client, but were told no FOSS solutions would be considered. We had a subcontractor offer a support contract for an application based on the same FOSS components, but with a support contract. This was perfectly acceptable to our client, who now has a working solution, for which they pay annual maintenance, in return for a lack of internal liability. For many businesses, risk avoidance is a core driver. What they require is someone else to blame if things go wrong, hence the companies making a living with contracts for Postgres support. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand David Fetter [EMAIL PROTECTED] 09/01/08 6:09 PM On Sun, Aug 31, 2008 at 10:44:38PM -0400, Guy Rouillier wrote: M2Y wrote: Why most enterprises prefer Oracle than Postgres even though it is free and has a decent enough user community. Databases are a critical part of many companies' business. I work for telecom company, and if we were to lose our databases, we'd be out of business, period. So, free and decent enough are not good enough. If you are going to bet your business on anything, you want to be as sure as possible that it is reliable and that you can expect quick action if it should break. What they want to have is a huge entity they can blame when everything goes wrong. They're not interested in the actual response times or even in the much more important time-to-fix because once they've blamed Oracle, they know the responsibility is no longer on their shoulders. Cheers, David. -- David Fetter [EMAIL PROTECTED] http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: [EMAIL PROTECTED] Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] What's size of your PostgreSQL Database?
I have a db (tables with up to 5,000,000 records, up to 70 columns x 1,500,000 records), around 50Gb of disk space for the database (incl data, indexes, etc). Most records have PostGIS geometry columns, which work very well. For read performance this is on a (2 yr old) Linux box with 2x software RAID 0 (striped) WD 10,000RPM Raptor drives. FWIW bonnie gives reads at about 150Mb/sec from the filesystem. We have been more than happy with performance, though the 4Gb of RAM helps. For data security, pg_dump backs it up every second day onto another 250Gb drive on the box; this is copied over the LAN to another server which is backed up to tape every day. It works for us :-) Cheers, Brent Wood

Ow Mun Heng [EMAIL PROTECTED] 08/19/08 4:00 PM On Mon, 2008-08-18 at 11:01 -0400, justin wrote: Ow Mun Heng wrote: -Original Message- From: Scott Marlowe [EMAIL PROTECTED] If you're looking at read only / read mostly, then RAID5 or 6 might be a better choice than RAID-10. But RAID 10 is my default choice unless testing shows RAID-5/6 can beat it. I'm loading my slave server with RAID-0 based on 3 IDE 7200 Drives. Is this worse off than a RAID 5 implementation? I see no problem using RAID-0 on a purely read only database where there is a copy of the data somewhere else. RAID 0 gives performance. If one of the 3 drives dies it takes the server down and loss of data will happen. The idea behind RAID 1/5/6/10 is if a drive does fail the system can keep going, giving you time to shut down and replace the bad disk, or if you have hot swappable just pull and replace. I'm looking for purely read-only performance and since I didn't have the bandwidth to do extensive testing, I didn't know whether a RAID 1 or a RAID 0 will do the better job. In the end, I decided to go with RAID 0 and now, I'm thinking if RAID 1 will do a better job.
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Query help
If I read this correctly, you want the output sorted by config_id, start_day (day), start_time, thus:

select config_id, start_day as day, start_time, end_time
from config
order by config_id, start_day, start_time;

Cheers, Brent Wood

novice [EMAIL PROTECTED] 08/15/08 3:55 PM Hi, I have a table:

select id, config_id, start_day, end_day, start_time, end_time from config;

 id | config_id | start_day | end_day | start_time | end_time
----+-----------+-----------+---------+------------+----------
  1 |       101 | Mon       | Sun     | 08:30:00   | 18:00:00
  2 |       101 | Mon       | Sun     | 18:00:00   | 22:00:00
  3 |       555 | Mon       | Fri     | 08:30:00   | 16:00:00

I'd like to write a query to generate the following... is it possible at all?

 config_id | day | start_time | end_time
-----------+-----+------------+----------
       101 | Mon | 08:30:00   | 18:00:00
       101 | Mon | 18:00:00   | 22:00:00
       101 | Tue | 08:30:00   | 18:00:00
       101 | Tue | 18:00:00   | 22:00:00
       101 | Wed | 08:30:00   | 18:00:00
       101 | Wed | 18:00:00   | 22:00:00
       101 | Thu | 08:30:00   | 18:00:00
       101 | Thu | 18:00:00   | 22:00:00
       101 | Fri | 08:30:00   | 18:00:00
       101 | Fri | 18:00:00   | 22:00:00
       101 | Sat | 08:30:00   | 18:00:00
       101 | Sat | 18:00:00   | 22:00:00
       101 | Sun | 08:30:00   | 18:00:00
       101 | Sun | 18:00:00   | 22:00:00
       555 | Mon | 08:30:00   | 18:00:00
       555 | Tue | 08:30:00   | 18:00:00
       555 | Wed | 08:30:00   | 18:00:00
       555 | Thu | 08:30:00   | 18:00:00
       555 | Fri | 08:30:00   | 18:00:00

Thanks -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] storing latitude and longitude
Hi Mark, Look at PostGIS to do this properly. It adds full OGC spec support for managing & querying spatial data within a Postgres database. It is an option included with the Windows Postgres installer, but is generally extra packages under Linux. Cheers, Brent Wood

mark [EMAIL PROTECTED] 07/11/08 5:34 AM hi.. i want to store latitude and longitude in a users table.. what is the best data type to use for this? i want to be able to use this info to find users within a distance.. how do i go about doing this? thanks -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
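A minimal sketch of the PostGIS approach (table, column and coordinate values here are assumptions, and the geography cast in the distance test needs a reasonably recent PostGIS):

```sql
CREATE TABLE users (
  id   serial PRIMARY KEY,
  name text
);
-- add a WGS84 (lat/long, SRID 4326) point column
SELECT AddGeometryColumn('users', 'geom', 4326, 'POINT', 2);

-- note the argument order: longitude first, then latitude
INSERT INTO users (name, geom)
VALUES ('alice', ST_SetSRID(ST_MakePoint(174.78, -41.29), 4326));

-- users within ~10km of a point (geography cast gives metres, PostGIS 1.5+)
SELECT name
  FROM users
 WHERE ST_DWithin(geom::geography,
                  ST_SetSRID(ST_MakePoint(174.8, -41.3), 4326)::geography,
                  10000);
```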
Re: [GENERAL] To store and retrive image data in postgresql
aravind chandu [EMAIL PROTECTED] 07/08/08 10:30 AM Hello, I need to store an image in a postgresql database and after that I need to retrieve the image back. Can you please help me how to do this?

Hi, I have been involved in building a few applications to manage this capability. In the 3 cases to date, we store the info about the image in the database, including the name/path/URL to the image, and have an application retrieve the image for us from this data. We have not (yet) found any added value in storing the image itself as a binary object in the database. Images are stored as files on the system. One (relatively) easy way to demonstrate this is using QGIS. This is able to display data stored in PostGIS tables (PostGIS adds spatial support to Postgres, so we can store a point location with an image name/path). We can plot a map in QGIS showing the points representing photographs, and via the action capability in QGIS, we can create a command to click an image point on the map & display it, retrieving the path/name from the database in order to do so. One PostGIS/ImageJ application we are working on does store ImageJ ROI files as binary objects in the database, but the images they are derived from are still stored outside the database as files, with the path/name stored as database fields as a pointer to the image. HTH, Brent Wood -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Temporary Tables and Web Application
Hi Tim, Off the top of my head, from somewhat left field, using filesystems to manage this sort of effect: would real tables in a tablespace defined on a ramdisk meet this need? So the functionality/accessibility of a physical table is provided, along with the performance of a filesystem actually residing in memory. Presumably viable if you have the memory to spare & know the size of the temp tables won't exceed this. You could also mount a tablespace on a physical disk with a filesystem which has delayed/deferred writes to disk, so that if it is created & deleted quickly enough, it is never actually written to disk, but just generally sits in the cache. Cheers, Brent Wood

Bill Moran [EMAIL PROTECTED] 06/06/08 8:01 AM In response to Tim Tassonis [EMAIL PROTECTED]: Bill Moran wrote: In response to Tim Tassonis [EMAIL PROTECTED]: Now, with apache/php in a mpm environment, I have no guarantee that a user will get the same postgresql session for a subsequent request, thus he will not see the temporary table. Is there a way to create temporary tables in another way, so they are visible between sessions, or do I need to create real tables for my purpose? And is the performance penalty big for real tables, as they have been written to disk/read from disk? Build a framework that creates the tables in a special schema, and then can access them through any session. Use some method to generate unique table names and store the names in the HTTP session. Create some sort of garbage collection routine that removes tables when they're no longer needed. The details of exactly how you pull this off are going to depend heavily on the rest of your application architecture. What you describe is what I referred to as create real tables. I've done that and it works, but I wondered if there's something similar built in postgres apart from classical temporary tables. Not that I'm aware of. If you keep the mailing list in the CC, others can answer as well.
-- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ [EMAIL PROTECTED] Phone: 412-422-3463x4023 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
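The ramdisk tablespace idea above can be sketched as follows (mount point and names are assumptions; anything in such a tablespace is lost on reboot, so it only suits disposable scratch tables):

```sql
-- assumes a tmpfs/ramdisk mounted at /mnt/pg_ram, writable by the postgres user
CREATE TABLESPACE ram_space LOCATION '/mnt/pg_ram';

-- a scratch table whose storage lives in RAM
CREATE TABLE session_scratch (
  session_id text,
  payload    text
) TABLESPACE ram_space;
```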
[GENERAL] PG 8.3 review in Linux Mag
You can't read the online article without an account, but the headlines might still be of interest (or you can buy the magazine :-) http://www.linux-mag.com/id/5679 Cheers, Brent Wood -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_dump and schemas
Hi Rusty, Try passing the output through a utility like sed, already there under Linux, but versions that work under Windows are available (eg, cygwin). eg, using a pipe:

pg_dump -d <db> | sed 's/public/test1/g' > dump.sql

or converting a pg_dump output file (where pg_dump created dump.sql):

cat dump.sql | sed 's/public/test1/g' > dump2.sql

With tools like these freely available, you don't really need to spend time reinventing them in your database applications. Of course, if you have the public schema name used elsewhere in your database, you may need to get a bit creative in your use of sed, but it can pretty well always be made to do this sort of operation. Cheers, Brent Wood

Rusty Conover [EMAIL PROTECTED] 05/04/08 8:42 AM Hi All, Is there a way to pass a parameter to pg_dump that would make the produced dump be loaded into a different schema rather than the one it is being dumped from? Basically be able to say dump out of public, but write the dump so its restored to say test1. Thanks, Rusty -- Rusty Conover InfoGears Inc. http://www.infogears.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_dump and schemas
Hi Rusty, In which case can you not include the text around the schema & table creation commands to ensure other instances of public do not match the string? I'm not sure exactly what the pg_dump output contains, but you could use something like:

cat pg_dump.sql | sed 's/CREATE SCHEMA public/CREATE SCHEMA new_schema/' | \
  sed 's/CREATE TABLE public./CREATE TABLE new_schema./' > new_schema.sql

This should avoid any ambiguity as to which instances of public you want changed in the pg_dump sql file. I think adding support for changing schema names in pg_dump would make it unnecessarily complex, as why just schemas? Also rename databases, tables, columns, index names, change comments... I've yet to find something like this I couldn't do with sed, and if there was, there is always awk for the truly desperate :-) pg_dump generates the dump, reliably, simply & safely. Any change you want from the original is not, IMHO, the role of a backup program. That should ONLY back up a replica of your data. Make changes afterwards if you like, but a backup program shouldn't modify your data. Just my 02c, I ain't no Postgres developer, so I'm not speaking for them in this. Cheers, Brent Wood

Hi Brent, It's not the best solution, because we could have fields containing public in their names and sed would happily change those to test1 as well. I'm looking for a safer solution, that's why it should be a part of pg_dump. Rusty

On Apr 5, 2008, at 12:41 AM, Brent Wood wrote: Hi Rusty, Try passing the output through a utility like sed, already there under Linux, but versions that work under Windows are available (eg, cygwin). eg, using a pipe: pg_dump -d <db> | sed 's/public/test1/g' > dump.sql -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] (FAQ?) JOIN condition - 'WHERE NULL = NULL'
Ian Sillitoe [EMAIL PROTECTED] 03/04/08 5:49 AM I'm trying to JOIN two tables (well a table and a resultset from a PL/pgsql function) where a joining column can be NULL.

In a join, no value can be ascribed to a null field, so the equivalence fails. You can do tests like IS NULL, which strictly speaking is a test for meeting a condition (that of not having any value), not a test for equivalence. NULL = NULL does not evaluate to true; it evaluates to NULL (unknown), so rows whose join columns are NULL will never match in an equi-join. The simplest approach is perhaps to have a value which does not occur naturally (like -1) as a substitute for nulls in the relevant columns. I believe this can be achieved via a view in your case (pun intended :-), but which may be less efficient if performance is an issue:

create view depth_v as
select d.id,
       d.name,
       case when (d.depth1 is null) then -1 else d.depth1 end as depth1,
       case when (d.depth2 is null) then -1 else d.depth2 end as depth2,
       case when (d.depth3 is null) then -1 else d.depth3 end as depth3,
       case when (d.depth4 is null) then -1 else d.depth4 end as depth4,
       case when (d.depth5 is null) then -1 else d.depth5 end as depth5
from depth_table d;

You could then join against this view instead of your underlying table, eg:

select c.* from get_cathcode('1.10.8') c JOIN depth_v t USING(depth1, depth2, depth3, depth4);

The view will not have any NULL values in the depth fields, so the join should work. see: http://www.postgresql.org/docs/8.2/static/functions-conditional.html (Incidentally, if you are storing bathymetry or CTD data, I'd be interested in seeing your db structures, as I may be doing some work in that area soon :-) HTH, Brent Wood -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
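The same substitution view can be written more compactly with COALESCE(), covered on the same conditional-expressions page (a sketch against the same hypothetical depth_table):

```sql
create view depth_v as
select id, name,
       coalesce(depth1, -1) as depth1,
       coalesce(depth2, -1) as depth2,
       coalesce(depth3, -1) as depth3,
       coalesce(depth4, -1) as depth4,
       coalesce(depth5, -1) as depth5
  from depth_table;
```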
Re: [GENERAL] Using tables in other PostGreSQL database
I have mixed feelings. I agree that pretty much equivalent functionality CAN be delivered using schemas, but some RDBMSs do not have this restriction. Business cases & preferences do not necessarily follow database design preferences or capabilities, so irrespective of whether a schema approach CAN work, any user is entitled to ask whether an alternative approach is possible. With enough such users, the capability may well be implemented. I am looking to convince a business which does not use schemas, but does use separate databases, to move to Postgres, & having to shift from this paradigm is an issue for them. They are perfectly entitled to require such an approach; if we want to increase the user base of Postgres, we need to meet more users' needs (simplistic I know, but I think a valid concern). For example, any user may need to be able to create databases, add data & use referentials in a corporate database of lookup records. Using schemas, everyone needs to have create table privileges to the database by default, then this needs to be denied by schema (is this possible??? - never tried yet). Cross db queries allow read only access to corporate metadata for data integrity rules to be applied to any database any user creates. Agreed, not a common requirement, but one where schemas are less flexible & less secure. Cheers, Brent Wood

Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand

Scott Marlowe [EMAIL PROTECTED] 29/03/08 4:37 AM On Thu, Mar 27, 2008 at 11:29 PM, Swaminathan Saikumar [EMAIL PROTECTED] wrote: I am fairly new to Postgres. However, I have to say that I agree with Barry's comments. The real problem here is that you are not using the db properly. You should have one db with all these data in it in different schemas. PostgreSQL provides you with the ability to segregate these data via schemas and fine grained (by the table) ACLs. Your refusal to use multiple schemas in one database due to some perceived problem with them all being in the same database is what's causing your issues.
Put your data into various schemas in one database and you can then use access control to decide who sees what. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] IBM investing in EnterpriseDB
Add Informix to the list of IBM's RDBMS products. Also note that where Postgres has PistGIC as an OGC compliant geodata extension, IBM already supports this in both DB2 & Informix, so an even higher degree of interoperability is there for geospatial data. Brent Wood   Alex Vinogradovs [EMAIL PROTECTED] 27/03/08 8:20 AM Shouldn't forget IBM got DB2. Could be they are just seeking additional userbase in opensource market space... On Wed, 2008-03-26 at 12:12 -0700, Ron Mayer wrote: Clodoaldo wrote: ...IBM is investing... What does it mean for Postgresql? One cool thing it means is that there are now *two* companies (thanks again Fujitsu) bigger than Oracle backing (to some extent) Postgres. And now one company bigger than Microsoft. Yeah, this doesn't affect the community much. But it sure comes in useful when your CFO calls you into a meeting and says "Hey, I just had lunch with our Microsoft rep and he asked why we're running some unsupported freeware database." Your CFO wouldn't want to run your company on a database - like Oracle 10i and MySQL and SQLServer - that are only backed by little (under $50B revenue) guys, would he? :-)
Re: [GENERAL] IBM investing in EnterpriseDB
I need to learn to type!!! try PostGIS (how did that become PistGIC? I have no idea) Brent Wood   [EMAIL PROTECTED] 27/03/08 1:44 PM Add Informix to the list of IBM's RDBMS products.. Also note that where Postgres has PistGIC as an OGC compliant geodata extension, IBM already supports this in both DB2 & Informix, so an even higher degree of interoperability is there for geospatial data. Brent Wood
[GENERAL] Postgres development
Hi all, I'm interested in finding out what would be involved in enhancing Postgres to allow queries run in one database in a cluster to access & join with tables in other databases in the cluster, ie: cross-database join support. This would be very useful, and depending on cost, I may be able to arrange for funds to cover this development. Thanks, Brent Wood
Re: [GENERAL] Linux distro
On Wed, 2007-08-01 at 13:29 +0200, [EMAIL PROTECTED] wrote: Hello, I bought a Dell server and I am going to use it for installing PostgreSQL 8.2.4. I always used Windows so far and I would like now to install a Linux distribution on the new server. Any suggestion on which distribution? Fedora, Ubuntu server, Suse or others? Thanks in advance, Paolo Saudin   My 0.02c: I'm pretty promiscuous when it comes to distros, as I want to run applications, not fuss with an OS, so whichever distro works best for me gets used. I've recently played with Ubuntu, Mandriva, Debian, OpenSUSE, SLED, Fedora Core, Simply Mepis & a few others (including running Postgres/PostGIS on them all). I don't think it really matters for PostgreSQL; most distros will run it fine. If you want a genuine basic server setup, maybe without any GUI, then avoid distros which focus more specifically on desktop ease of use. Perhaps look at BSD? If you want a workstation system, where there needs to be a good mix of desktop & server capabilities, a more generic system is preferable. If you want to set up essentially a desktop system, but run PostgreSQL on it, then any popular desktop distro will work. While Ubuntu & Mandriva (for example) focus on ease of use, they also have less commonly used server versions. OpenSUSE is the distro I currently prefer; it seems to do all I want better than the others I've tried recently. All the server stuff with a good set of desktop apps. I suggest you look at www.distrowatch.com to see their comments (but remember everyone has different likes & dislikes, so treat any review with caution, as your opinion may vary). HTH, Brent Wood ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Normal distribution et al.?
Jan Danielsson wrote: Andrej Ricnik-Bay wrote: On 6/18/07, Jan Danielsson [EMAIL PROTECTED] wrote: UPDATE foo SET value=value+normdistsample(10, 0.2) WHERE id=1; Something like this? http://www.joeconway.com/plr/ That looks too good to be true. Many thanks! See http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut01 for a new intro; pretty basic, but a good place to start. Brent Wood
Re: [GENERAL] Command line export or copy utility?
Francisco Reyes wrote: Does anyone know of any export or copy utility that runs on FreeBSD? I basically need a program that will connect to one database, do a select and copy the result to a second database. There are a few ways, from memory (so I might have the odd syntax error): To replicate a table, run pg_dump on one machine pointing at the host/db to export & pipe the output to psql with the host name of the target db: pg_dump -h host0 -d db0 -t table ... | psql -h host1 -d db1 You can do similar data streams from one db to another with (if the target table exists): psql -c "copy table to STDOUT" ... | psql ... -c "copy table from STDIN" To do this with the results of a query to subset the data will require the pre-building of the target table, but you can do: psql -h host0 -d db0 -F'|' -Atc "select ...;" | psql -h host1 -d db1 -c "copy table from STDIN with delimiters = '|';" Cheers, Brent Wood
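Since the archive stripped the quoting, the three transfer patterns above can be sketched as a script. The host, database and table names (host0/host1, db0/db1, mytable, columns a and b) are placeholders rather than anything from the original post, and the commands need live servers, so this sketch only writes them to a file and syntax-checks it:

```shell
# Write the three transfer patterns to a script; all names are placeholders.
cat > /tmp/pg_copy_patterns.sh <<'EOF'
# 1. replicate a whole table
pg_dump -h host0 -d db0 -t mytable | psql -h host1 -d db1

# 2. stream a table with COPY on both ends (target table must already exist)
psql -h host0 -d db0 -c "COPY mytable TO STDOUT;" \
  | psql -h host1 -d db1 -c "COPY mytable FROM STDIN;"

# 3. stream the result of an arbitrary query (target table pre-built)
psql -h host0 -d db0 -F'|' -Atc "SELECT a, b FROM mytable WHERE a > 0;" \
  | psql -h host1 -d db1 -c "COPY mytable FROM STDIN WITH DELIMITER '|';"
EOF
# Syntax-check without executing anything against a database:
sh -n /tmp/pg_copy_patterns.sh && echo "patterns parse OK"
```

Pattern 3 is the most flexible, but note the field separator given to psql with -F must match the DELIMITER given to COPY on the receiving end.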
Re: [GENERAL] Views- Advantages and Disadvantages
Ashish Karalkar wrote: Thanks Brent for your reply. What about the disadvantages, performance issues? As far as I'm aware, performance is the only real disadvantage. I tend to break DB design into stages: ER modelling to define the entities/relationships the DB needs to store/represent; normalize this to maximise data integrity & minimise duplication/redundancy; de-normalise to meet performance & access requirements. In the latter stage, views are generally used, unless there are performance restrictions, when a de-normalised schema may be applied. Cheers, Brent
Re: [GENERAL] Views- Advantages and Disadvantages
Joshua D. Drake wrote: Brent Wood wrote: Ashish Karalkar wrote: Thanks Brent for your reply. What about the disadvantages, performance issues? As far as I'm aware, performance is the only real disadvantage. What performance are we talking about here? Executing from a view, although it has *some* overhead, I don't even know that it is worth considering in most cases. Sorry, I didn't express that clearly... A view as a filter on a single table, or a simple two table join, is generally quite fast enough, and has no noticeable overhead over a query. A complex self relation plus some joins instantiated via a view can be slow. But a query to produce the same output from the same underlying tables will be just as slow. In such cases (like some data warehousing), a redesigned, denormalised table structure can be implemented to provide the same look as the view would have, but as a physical table, well indexed and much faster than the view, without the joins being required. My comment was meant to indicate that a complex view which is too slow may be replaced by a denormalised physical table, which is faster, not that a view is significantly slower than the same query upon the same tables. To paraphrase: the view/query performance is dependent on the underlying table structures & relationships, and these are what may need to change if a view is too slow. In many cases, such changes can make the original view redundant. Cheers, Brent
Re: [GENERAL] PG on Debian 4.0.x ?
Andreas wrote: Hi, I compiled the latest and greatest PG server on SUSE systems until now. Now I want to stay with the official binaries of the Debian project because I plan to evaluate if it is really that good for a server as everybody is telling me. It should get a server with only the minimum of programs that don't belong to the dedicated task, like in this case run a PG server. Especially I don't want any development stuff like gcc on this server for security reasons. So I can't build it myself - at least not with this box. BTW ... the Debian installer had an anonymous option SQL server and installed PostgreSQL 8.1.8 as default without further question. This is cool because I - being a known pessimist - expected to find MySQL. ;-) Do you know where I find PG 8.2.4 and pgAdmin 1.6.3 binaries for Debian 4.0.x?   Hi Andreas, I have spent the last couple of weeks looking at Debian 4 as a GIS workstation using PostGIS, GRASS, GEOS, GDAL/OGR, Proj.4, QGIS, etc. I figured it was worth a look because I'd heard so many good things about apt vs yast for package management & the Debian GIS group is really trying to provide a good GIS capability in this distro. I don't really wanna get into distro wars, so this is simply my opinion, but I had lots of issues with Debian, especially in that I needed some of the latest versions of several packages, which I built from source anyway, just as on Suse. I am currently using OpenSuse, but also use/have used Ubuntu, Mandriva, Kubuntu, Fedora, SLED & SimplyMepis, so I'm not too fixated on any particular distro, just use what seems easiest for me at the time. It was great having pretty recent versions of GRASS, PostGIS, etc installed & working very easily, but they weren't the versions I needed. I tried Ubuntu & SimplyMepis to see if the setup tools, etc, made things easier, but these were not compatible with the Debian repositories anyway (kernel panics on trying to start after auto updates).
I've gone back to OpenSuse 10.2, as I don't see YAST package management as very different to apt; the number of supported packages is certainly much less, but it just works better for me. As far as a GIS workstation is concerned, I'd say that if you need to build from source to get current versions, Debian has no real advantages. If you want to get something working quickly & easily, but don't need the latest versions, Debian works well. Cheers, Brent Wood
Re: [GENERAL] Utility of OIDs in postgres
Richard Huxton wrote: Jasbinder Singh Bali wrote: Hi, What would be the benefit of creating tables with OIDs as against ones not with OIDs? Does giving a unique identifier to each row inserted have some extra efficiency factor involved, or what? OIDs are used by the various system tables. Historically, all user tables had them too. There's no reason to use them in a new system - they offer no advantages over an ordinary integer primary key. Generally this is correct. However I can show one case where they are very useful: Table 1: oid, id, category, name, desc, f1, f2, f3, f4 Table 2: oid, id, category, name, desc, f1, f2 Table 3: oid, id, category, name, desc, f1, f2, f3, f4, f5, f6, f7, ... ID is a serial int as primary key. create view v_demo as select oid, name, category, desc from table1 union select oid, name, category, desc from table2 union select oid, name, category, desc from table3; As oid is unique across all tables (in fact all database objects), but serial is unique within a table, there are odd cases like this where using an oid in each table ensures an automatic unique key in the view. So oids can be useful. This is a real case, for listing objects identified in seabed photos: table1 is substrates, table2 is scampi burrow types, table3 is taxa. The user is presented with a pick list on the view, & the oid is used to specify what has been identified. The underlying tables contain detail about each category of object. We could do something much more complicated in an application, but with oids there is no need. A similar example could be power stations: all have name, output, contacts, etc, but depending on whether it is coal/gas/hydro/geothermal, etc, the attributes will vary. So a table for each type, with a view enabling common access to the common fields. Cheers, Brent Wood
Re: [GENERAL] Temporal Units
Listmail wrote: On Sun, 29 Apr 2007 22:33:37 +0200, Rich Shepard [EMAIL PROTECTED] wrote: On Sun, 29 Apr 2007, Martijn van Oosterhout wrote: Then I'm afraid you haven't indicated your requirements properly. All I can see is that the interval type does exactly what you want. It can store days, weeks, months or any combination thereof. You can multiply them and add them to dates and all works as expected. How does one define 'shift' with intervals? 0.33 DAY? If I'm following this correctly, then interval & extract(timepart) can be used to provide all the required functionality. If you know what hours your shifts (periods) cover, and you want to check that you have a value for that shift today (ignoring for now issues with shifts only on 5 day working weeks & holidays, ...) then you can do something like (roughly): if (select count(*) from table where extract(day from ts) = today and extract(hour from ts) >= extract(hour from shift_start) and extract(hour from ts) <= extract(hour from shift_start + shift_length)) = 0 then a value is missing. So for any check, you want to ensure you have no periods without a value, which can only be done at the end of each period. If you have a table defining each period, a cron job can run (for example) hourly, identifying any periods which ended in the last hour and did not have a value. Or periods about to end in the next interval which do not yet have a value, to prompt before the period ends. The trickier part is how to specify your periods, and which days/hours/months/etc are included. Each period needs to be defined by data which allows a start and a finish date/time, expressed in a generic way which is relative to now, to be determined. So for any now, we can evaluate which periods are about to end or have just ended. Cheers, Brent Wood
Re: [GENERAL] PG service restart failure (start getting ahead of stop?)
George Pavlov wrote: We have a nightly restart of one PG database. Today it failed and I can't seem to understand why and how to prevent this in the future (nor can I reproduce the problem). If you get the PID of the postgres server process, then invoke a STOP, then have a loop waiting for that PID to disappear before running your START, it may do what you want. Or someone may have a more elegant solution :-) Brent Wood   We have a line in a shell script that calls /etc/init.d/postgresql restart. In the shell script's log from this invocation I have: Stopping postgresql service: [FAILED] Starting postgresql service: [ OK ] The outcome of this was that the service was not running (despite the [ OK ] on the second line). The query log from that run ends with: 2007-04-23 03:03:59 PDT [23265] LOG: received fast shutdown request 2007-04-23 03:03:59 PDT [23265] LOG: aborting any active transactions 2007-04-23 03:03:59 PDT [26749] FATAL: terminating connection due to administrator command ... snipped more lines like the one above ... 2007-04-23 03:03:59 PDT [24090] LOG: could not send data to client: Broken pipe 2007-04-23 03:03:59 PDT [26700] FATAL: terminating connection due to administrator command ... snipped more lines like the one above ... 2007-04-23 03:04:13 PDT [26820] FATAL: the database system is shutting down ... snipped more lines like the one above ... 2007-04-23 03:06:10 PDT [23269] LOG: shutting down 2007-04-23 03:06:10 PDT [23269] LOG: database system is shut down 2007-04-23 03:06:13 PDT [23267] LOG: logger shutting down end of log So it looks like the STOPPING of the service actually succeeded, albeit it took a while (more than the usual sessions open?). The STARTING is the one that actually failed (is that because the STOP was still in process?). The question is why -- in a RESTART situation wouldn't/shouldn't the START part wait for the STOP part to complete (regardless of how long it takes)? Also what can we do to avoid this in the future?
We can issue a separate STOP and then a START X minutes later, but how long an X? It would seem that a RESTART is really what I want... TIA, George
Re: [GENERAL] where clause help
Ketema wrote: i have a record set like below:
num_prods|num_open_issues|num_provisioned|num_canceled
1|0|1|0
2|0|0|2
3|0|1|1 *
2|0|1|1
1|0|0|1
2|0|0|0 *
3|3|0|0
3|0|0|3
3|1|0|2
3|2|0|1
2|0|2|0
Of the list above only row 3 and row 6 (marked *) should be returned. Plain english definition: with a result set like above, eliminate all rows that should not show up on the provision list. Provision List Definition: all rows that have products that need provisioning. Provisioning means it's NOT canceled and it does NOT have an open issue. If I understand this correctly, we start with: where num_canceled < num_prods and num_open_issues < num_prods. Some facts: num_open_issues + num_provisioned + num_canceled will never be more than num_prods; no individual column will ever be more than num_prods. Then in addition to this, we also only retrieve records where: num_open_issues + num_provisioned + num_canceled < num_prods, and num_open_issues < num_prods (already there above, no need to have it twice), and num_provisioned < num_prods, and num_canceled < num_prods (already there above, no need to have it twice), giving the query:
select * from table
where num_open_issues < num_prods
  and num_provisioned < num_prods
  and num_canceled < num_prods
  and (num_open_issues + num_provisioned + num_canceled) < num_prods;
With (I think) the result of: records 1 & 11 fail as num_provisioned is not < num_prods; records 2 & 8 fail as num_canceled is not < num_prods; record 3 passes all constraints; records 4, 5, 9 & 10 fail as num_open_issues + num_provisioned + num_canceled is not < num_prods; record 6 passes all constraints; record 7 fails as num_open_issues is not < num_prods. Is this what you were after? Brent Wood
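The WHERE clause can be sanity-checked without a database by running the sample rows through awk, which applies the same four comparisons (row numbers added for readability; this mimics the SQL, it is not from the original post):

```shell
# Columns: num_prods|num_open_issues|num_provisioned|num_canceled
printf '%s\n' \
  '1|0|1|0' '2|0|0|2' '3|0|1|1' '2|0|1|1' '1|0|0|1' '2|0|0|0' \
  '3|3|0|0' '3|0|0|3' '3|1|0|2' '3|2|0|1' '2|0|2|0' |
awk -F'|' '$2 < $1 && $3 < $1 && $4 < $1 && ($2 + $3 + $4) < $1 { print NR ": " $0 }'
# Prints only rows 3 and 6:
#   3: 3|0|1|1
#   6: 2|0|0|0
```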
Re: [GENERAL] selective export for subsequent import (COPY)
chrisj wrote: Thanks Brent, very much appreciated, your first suggestion is perfect. The translate suggestion assumes that there are no commas in the data, but that is why I wanted to use tab. I figured as much :-) Note that you can use -F '|' for a pipe symbol, or use any other character as the field separator in the psql command line, then change that to a tab with tr, if you do have commas in the data. It also scripts up nicely: ... FSEP='|' ; psql -d ... -F $FSEP | tr $FSEP '\t' > $FILE ... Brent   Brent Wood wrote: chrisj wrote: Thanks Alan, This helped a lot, but ideally I want a tab field delimiter and -F '\t' does not seem to work, any ideas?? I noticed one other post on this same problem of the fieldsep '\t' not working but the only advice offered was to use \pset. Can \pset be used on the command line? I can only get it to work within the psql command processor. You can always have a text file (file.sql): \pset ... select ... then run the commands in a single client connection with psql database -Atf file.sql This runs a file of sql commands in a single psql connection instead of opening a new connection for every -c command. Thus the result of the \pset is still in force when the next sql statement is executed. Or run your command as it is & pipe the output through tr to translate the commas to tabs. You can see what tr does using echo 1,2 | tr , '\t' eg: psql $DATABASEHOST -U $DATABASEUSER -d $DATABASE -q -P footer -A -F , -c $DETAIL_SQL | tr , '\t' > table.csv Cheers, Brent Wood
Re: [GENERAL] selective export for subsequent import (COPY)
chrisj wrote: I would like to do a selective export of a number of tables from a large database to import into a smaller (test) DB. I know about: psql dbname -tc "select * from tableX where whatever" > tableX.dat You might try psql dbname -Atc "select * from tableX where whatever" > tableX.dat to produce un-aligned output, if this is your problem. Brent Wood   but unless I put it through a sed script, this file cannot be easily used for import. It feels like I am re-inventing the wheel. Does anybody know a better way or have a good sed script?
Re: [GENERAL] Postgres data/form entry tool
Richard Huxton wrote: David Potts wrote: David Potts wrote: Any platform! ie Web, X, L/Unix console, Mac or even Weandozz. pgadmin, phppgadmin - although these are pitched more towards admin, they let you edit table data. PGAccess provides a spreadsheet-like view of tables which users can type data into; it works pretty well. It does not have the admin functionality of pgadmin, which might be a good thing in your case. Perhaps a simple PHP data entry form for a web based tool. I haven't tried these, but they may also work OK (on a Linux/KDE platform): knoda http://www.knoda.org/ rekall http://www.thekompany.com/projects/rekall/ Maybe someone who has actually tried these could comment? Cheers, Brent Wood
Re: [GENERAL] selective export for subsequent import (COPY)
chrisj wrote: Thanks Alan, This helped a lot, but ideally I want a tab field delimiter and -F '\t' does not seem to work, any ideas?? I noticed one other post on this same problem of the fieldsep '\t' not working but the only advice offered was to use \pset. Can \pset be used on the command line? I can only get it to work within the psql command processor. You can always have a text file (file.sql): \pset ... select ... then run the commands in a single client connection with psql database -Atf file.sql This runs a file of sql commands in a single psql connection instead of opening a new connection for every -c command. Thus the result of the \pset is still in force when the next sql statement is executed. Or run your command as it is & pipe the output through tr to translate the commas to tabs. You can see what tr does using echo 1,2 | tr , '\t' eg: psql $DATABASEHOST -U $DATABASEUSER -d $DATABASE -q -P footer -A -F , -c $DETAIL_SQL | tr , '\t' > table.csv Cheers, Brent Wood
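The tr trick sidesteps the -F '\t' quoting problem entirely: emit unaligned output with a safe separator and translate it to tabs afterwards. Below, the psql output is emulated with printf so the pipeline can be demonstrated without a database (mydb and $SQL in the comment are placeholders, not from the original post):

```shell
# A real run would be something like:
#   psql -d mydb -q -A -t -F ',' -c "$SQL" | tr ',' '\t' > table.tsv
# Here we emulate the comma-separated unaligned output and translate it:
printf 'id,name\n1,alpha\n2,beta\n' | tr ',' '\t'
# Prints tab-delimited rows:
#   id	name
#   1	alpha
#   2	beta
```

Like the \pset approach, this only works cleanly when the chosen separator never appears inside the data itself.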
Re: [GENERAL] seeking: advice on reordering table
Andrej Ricnik-Bay wrote: On 4/12/07, Jonathan Vanasco [EMAIL PROTECTED] wrote: can anyone suggest a good way to reorder the table? everything that i can think of involves creating a new table, which means I'd have to redo all the constraints. The common thing to do is to write your query in such a way that what interests you most is put left ... // Jonathan Vanasco   Create a clustered index on the attribute you want it ordered by. (I'm not sure if a clustered index can be a composite, but the docs or someone else here will let you know :-) Note that any inserts (or updates on the attr with the clustered index) will cause the ordering to be broken, in which case you'll need to re-cluster. Cheers, Brent Wood
Re: [GENERAL] NEWBIE: How do I get the oldest date contained in 3 tables
Lorenzo Thurman wrote: I have three tables using date fields. I want to retrieve the oldest date contained in the tables. Can someone show me an example of a query that would do that? You could write a custom function doing the same sort of thing, or (perhaps more portable) use generic sql views like: create view min_dates as select min(date1) as date_1 from table1 union select min(date2) as date_1 from table2 union select min(date3) as date_1 from table3; then either: select min(date_1) from min_dates; or create view min_date as select min(date_1) as min_date from min_dates; then just: select min_date from min_date; Cheers, Brent Wood
Re: [GENERAL] inserting 4800 records at a time
We use an application that generates 4800 points for the graph of a waveform. We capture this data and display it to the user. Now we want to save all this information to a database. I have tried to create a record for each point, but insertion/retrieval is slow. I thought that maybe I could save one record per graph and save all the points as a large string, but there would be 148k characters in the string. Then I'm still not sure what the performance would be like. Would the use of BLOBs be a better way to go here? Any ideas on what the best approach would be for us?   I strongly recommend the use of PostGIS for storing (and managing/querying) point geometries. If you do take this approach there are several advantages, not least the large number of supporting applications. For example, OGR now supports GMT (in SVN right now), so you can plot your spatial timeseries data from the command line with data-driven scripts. A simplistic example: LIST=`psql $DB -A -t -c "select distinct species from table;"` ; for SPP in $LIST ; do ogr2ogr -f GMT -nln data data.gmt PG:dbname=db -sql "select point, catch from table where species = '$SPP'" ; psxy data.gmt -R -JM ... > ${SPP}.ps ; done This approach allows maps/plots to be generated automagically from the data, as GMT is a commandline package for plotting data & ogr2ogr can generate GMT format data from a PostGIS table. As far as loading is concerned, are you loading as separate inserts or using copy? A bulk load via copy is generally much faster. Cheers, Brent Wood
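The copy-vs-inserts point can be sketched as follows: write the 4800 points to a tab-delimited file and load them with a single COPY rather than 4800 INSERT statements. The table and column names (waveform_points, waveform_id, seq, value) are hypothetical, and the psql call is commented out since it needs a live database:

```shell
# Generate 4800 synthetic points in COPY's default tab-delimited text format.
awk 'BEGIN { for (i = 0; i < 4800; i++) printf "1\t%d\t%f\n", i, sin(i/100) }' > points.tsv
wc -l < points.tsv   # 4800 rows ready to load
# One round trip instead of 4800 (hypothetical table name):
# psql -d mydb -c "COPY waveform_points (waveform_id, seq, value) FROM STDIN;" < points.tsv
```

COPY avoids per-statement parse/plan overhead and, inside a single transaction, per-row commit costs, which is where the bulk of the INSERT-per-point slowdown usually comes from.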