[GENERAL] datomic db as foreign data wrapper?
Does anyone have experience accessing a datomic database using a foreign data wrapper? I'd be quite content with read-only and discarding the transaction data. The real goal is to explore data in datomic using SQL. Thanks, Reece
[GENERAL] plans for plpython in RDS?
Does anyone know if there are plans to support plpython in Amazon's RDS? I (approximately) understand the issue, but I don't know if there's any effort to remedy the problem or, rather, I shouldn't bother hoping. Thanks, Reece
Re: [GENERAL] automatically refresh all materialized views?
On Fri, Feb 14, 2014 at 10:15 AM, Merlin Moncure wrote: > yeah -- you could do this with some gymnastics and some dynamic SQL. > If I were lazy (check), I would just encode the order in the name of > the view somehow. > Thanks. That's exactly what I do already. Apparently, I'm even lazier than you! In case anyone out there is looking for a convention, here's one that I find useful: _v for views intended for interactive queries, _dv (defining views) for views that are too slow to use interactively, and _mv for the materialized _dv views.
[GENERAL] automatically refresh all materialized views?
Is there a way to automatically refresh all materialized views, preferably in order of dependency? I should be able to chase pg_depend entries to create this ordering, right? Thanks, Reece
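As a sketch of the pg_depend-chasing idea: a materialized view depends on its source relations through its rewrite rule, so a recursive query over pg_depend and pg_rewrite can assign each matview a depth and order refreshes accordingly. This is untested and assumes PostgreSQL 9.3+ (where relkind 'm' marks matviews) and no circular dependencies:

```sql
-- List materialized views so that dependencies come before dependents.
WITH RECURSIVE deps AS (
    SELECT c.oid, c.relname, 0 AS depth
    FROM pg_class c
    WHERE c.relkind = 'm'
    UNION ALL
    SELECT c.oid, c.relname, d0.depth + 1
    FROM deps d0
    JOIN pg_depend d  ON d.refobjid = d0.oid       -- something depends on matview d0
    JOIN pg_rewrite r ON r.oid = d.objid           -- via a rewrite rule ...
    JOIN pg_class c   ON c.oid = r.ev_class
                     AND c.relkind = 'm'           -- ... belonging to another matview
    WHERE c.oid <> d0.oid                          -- ignore self-dependencies
)
SELECT relname
FROM deps
GROUP BY relname
ORDER BY max(depth);
```

Each resulting name could then be fed to REFRESH MATERIALIZED VIEW from a DO block or a driving script.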
Re: [GENERAL] password-less access, without using pg_hba
On Fri, Feb 7, 2014 at 8:27 AM, Steve Crawford <scrawf...@pinpointresearch.com> wrote: > Ignoring the scary security issues ... One of the niceties of an RDS deployment is that I don't care much about the security issues: the machine is not in our VPC, there's only public data on it, and I presume that AWS has isolated the instance to their satisfaction. From my point of view, it's an ideal way to make data public and way better than running it ourselves. > If you can't access pg_hba.conf how about just sticking pgbouncer or > similar in the middle and have your users connect through that? I like the pgbouncer idea in principle, but it means more work for me than I'm willing to take on for this use. Thanks everyone for the input. I'll stick with an advertised password. -Reece
Re: [GENERAL] password-less access, without using pg_hba
On Thu, Feb 6, 2014 at 6:37 PM, David Johnston wrote: > Doubtful. > Yeah, that's what I had assumed too. The question is motivated entirely by what I think would make it easier for users. In principle it's not difficult to give people a password (as I do now), but in practice it's a barrier that I'd like to eliminate. -Reece
[GENERAL] password-less access, without using pg_hba
I'd like to provide public access, without a password, to a database hosted on Amazon RDS. I'm familiar with using pg_hba.conf to enable trust (no) authentication for a user. pg_hba.conf is not available to DBAs on RDS. Is there any other way to achieve password-less login in postgresql? I tried alter user password NULL. Thanks, Reece
Re: [GENERAL] equivalent of mysql's SET type?
On Wed, Mar 9, 2011 at 9:16 AM, Merlin Moncure wrote: > create type validation_flags as > ( > cluster bool, > freq bool > ); > Wow. That solution is nearly sexy, and far and away better than any solution that I would have come up with. Thanks, Merlin! -Reece
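For anyone finding this thread later, here is a rough sketch of Merlin's composite-type suggestion in use; the table and column names are hypothetical:

```sql
-- Composite type holding the set's flags (from Merlin's suggestion).
CREATE TYPE validation_flags AS (
    cluster bool,
    freq    bool
);

-- Hypothetical table using the composite type.
CREATE TABLE variation (
    variation_id serial PRIMARY KEY,
    validation   validation_flags
);

INSERT INTO variation (validation) VALUES (ROW(true, false));

-- Reading a field of a composite column requires parentheses.
SELECT variation_id FROM variation WHERE (validation).cluster;
```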
Re: [GENERAL] equivalent of mysql's SET type?
On Tue, Mar 8, 2011 at 9:41 PM, John R Pierce wrote: > why not just have a set of booleans in the table for these individual > on/off attributes? wouldn't that be simplest? I like that approach, but I think it's unlikely to fly in this specific case for a couple reasons. First, there are actually 8 factors (I edited for clarity... sorry about that). The original database is actively developed (released approximately quarterly). I will need an approach that minimizes my burden when they edit the set's factors. And, I'd like to be compatible with mysql syntax and semantics for sets. If you hold your nose for a moment, you'll be able to read the following without becoming ill: mysql uses comma-delimited strings to assign and query set types (though they are stored internally as bit vectors). So, one does validation_status = 'cluster,freq' to set those bits or validation_status like '%freq%' to query. Much to my chagrin, emulating this interface will make migration easier. However, implementing this string interface to set/get boolean columns is just too offensive to whatever modest design sensibilities I have. (For more pleasure reading, see http://dev.mysql.com/doc/refman/5.0/en/set.html. I particularly like the *warning* issued when one tries to add a value that's not part of the set.) -Reece
Re: [GENERAL] equivalent of mysql's SET type?
On Tue, Mar 8, 2011 at 5:51 PM, Steve Atkins wrote: > tsvector or intarray might be other options, depending on how you're going > to query it and whether those are really strings or more enum-ish. > > The more SQLy way of doing it would be to have that stored in a separate > table, with a foreign key to this one. > Using a separate table is what I've got now in an experimental conversion. This works well enough and might save me the bother of an enum update. Perhaps I should just move on. Thanks Steve. -Reece
Re: [GENERAL] equivalent of mysql's SET type?
On Tue, Mar 8, 2011 at 5:52 PM, Darren Duncan wrote: > Try starting with an enum type to define the possible values: > > CREATE TYPE Consequence_Type > AS ENUM ('ESSENTIAL_SPLICE_SITE','STOP_GAINED', > 'STOP_LOST','COMPLEX_INDEL','SPLICE_SITE'); > > ... and then you could try using "ARRAY OF Consequence_Type" or some such. > Arrays occurred to me, but they don't give a set (i.e., a consequence type can appear more than once) unless I write the code to dedupe the array. However, if I were going to put that much effort into it, I might as well represent the set directly and obviate the dedupe. Or, am I missing something from your suggestion? -Reece
[GENERAL] equivalent of mysql's SET type?
I'm considering porting a MySQL database to PostgreSQL. That database uses MySQL's SET type. Does anyone have advice about representing this type in PostgreSQL? MySQL DDL excerpt:

CREATE TABLE `transcript_variation` (
  `transcript_variation_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `transcript_stable_id` varchar(128) NOT NULL,
  ...
  `consequence_type` set('ESSENTIAL_SPLICE_SITE','STOP_GAINED','STOP_LOST','COMPLEX_INDEL','SPLICE_SITE')
) ENGINE=MyISAM AUTO_INCREMENT=174923212 DEFAULT CHARSET=latin1;

I'm considering implementing this as a new type based on a bit vector, but I wonder if anyone has a more flexible general solution. Thanks, Reece
Re: [GENERAL] pg_dump issues
On 02/02/2010 03:41 PM, DM wrote: > You were right, after zipping the dump file it came out to 6.9G Also check out the "custom" pg_dump format (pg_dump -Fc ...), which is compressed. There are caveats regarding portability across PG versions with the custom format, which you may want to research. -Reece -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deleting 100 rows which meets certain criteria
On Wed, 2009-12-30 at 09:11 -0800, shulkae wrote: > I want to delete all rows exceeding 400 hours (10 days) and I want to > limit deletion of only 100 records at a time. If your table has a primary key or you can contrive one, then the

DELETE FROM tbl WHERE pk IN (SELECT pk FROM tbl WHERE hours>400)

construct sketched by others will work fine. The much harder case is where you have a table that doesn't have a pk. For instance, if you have a table like

 name | hours
------+-------
 bob  |   400
 bob  |    20

and you naively do

DELETE FROM tbl WHERE name IN (SELECT name FROM tbl WHERE hours>400)

then you'll delete *all* bob rows, which I suspect is not what you want. In cases without a pk, try something like

BEGIN;
CREATE TABLE tbl2 AS SELECT * FROM tbl WHERE hours<=400;
TRUNCATE tbl;
INSERT INTO tbl SELECT * FROM tbl2;
COMMIT;

The effect is to select the rows you want to keep and replace the contents of tbl. The above will work if you have indexes or views that depend on tbl, but there are simpler variations on this theme if that is not a concern. And for a completely different approach, consider creating a view:

CREATE VIEW tbl_le_400 AS SELECT * FROM tbl WHERE hours<=400;

Then use tbl_le_400 in lieu of tbl for your selects. -Reece
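For completeness, another approach I've seen for pk-less tables is the system column ctid, which identifies a physical row and is unique within a single statement (though not stable across updates or VACUUM FULL). A sketch, with the same hypothetical table:

```sql
-- Delete at most 100 of the matching rows, even without a primary key.
DELETE FROM tbl
WHERE ctid IN (
    SELECT ctid FROM tbl WHERE hours > 400 LIMIT 100
);
```

Repeating this statement until it deletes zero rows gives the batched behavior the original poster asked for.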
Re: [GENERAL] Q: data modeling with inheritance
On Fri, 2009-07-03 at 11:29 -0700, David Fetter wrote: > I'm missing what you're doing here that foreign keys don't cover. > Could you send along your DDL? No DDL yet... I'm just in the thinking stages. FKs technically would do it, but would become unwieldy. The intention was to have subclasses of each of the variant, association, and phenotype tables. That leads to the polymorphic key problem. > Just generally, I've only found table inheritance useful for > partitioning. "Polymorphic" foreign key constraints can be handled > other ways such as the one sketched out below. That answers the question -- I do want polymorphic foreign keys. Dang. Thanks, Reece
Re: [GENERAL] Q: data modeling with inheritance
On Thu, 2009-07-02 at 19:19 -0700, Nathan Boley wrote: > Is an association, for example, an experiment that establishes a > dependent relationship? So could there be multiple associations > between variant and phenotype? Exactly. You might have one group say that allele X "causes" some trait, whereas another group might report a more precise increase in odds ratio (for example) for the same genotype/phenotype. > Is your concern that the number of joins will grow exponentially in > the number of variants and phenotypes? Not the number of joins, but the number of association subclasses. If I have Nv variant subclasses and Np phenotype subclasses, I'd need Nv * Np association subclasses -- and that's multiplied again by the number of association types. > So all variants would be stored in the variants table, all phenotypes are in > the phenotypes table, and you join through association. Thanks. I had considered that too and that's probably what I'll end up using. -Reece
[GENERAL] Q: data modeling with inheritance
This is a question about data modeling with inheritance and a way to circumvent the limitation that primary keys are not inherited. I'm considering a project to model genomic variants and their associated phenotypes. (Phenotype is a description of the observable trait, such as disease or hair color.) There are many types of variation, many types of phenotypes, and many types of association. By "type", I mean that they have distinct structure (column names and inter-row dependencies). The abstract relations might look like this:

variant            association              phenotype
-------            -----------              ---------
variant_id <------ variant_id      +------> phenotype_id
genome_id          phenotype_id ---+        short_descr
strand             origin_id (i.e., who)    long_descr
start_coord        ts (timestamp)
stop_coord

There are several types of variants, such as insertions, deletions, inversions, copy-number variants, single nucleotide polymorphisms, translocations, and unknowable future genomic shenanigans. Phenotypes might come from ontologies or controlled vocabularies that need a graph structure; other domains might be free text. Each is probably best-served by a subclass table. Associations might be quantitative or qualitative, and would come from multiple origins. The problem that arises is the combinatorial nature of the schema design coupled with the lack of inherited primary keys. In the current state of PG, one must (I think) make joining tables (association subclasses) for every combination of referenced foreign keys (variant and phenotype subclasses). So, how would you model this data? Do I ditch inheritance? Thanks, Reece
Re: [GENERAL] Providing an alternative result when there is no result
On Mon, 2009-05-18 at 15:13 -0400, Joshua Berry wrote: > Is there an easy and efficient way to return a boolean false for a > query that returns no result, and true for one that does return a > result? Presuming that you're not using the values in temp_table, I think you should be using PERFORM * WHERE ...; and then IF FOUND ... ELSE ... END IF; See here: http://www.postgresql.org/docs/8.3/static/plpgsql-statements.html and also follow the link to section 38.5.5. -Reece
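A minimal sketch of that PERFORM/FOUND idiom in a plpgsql function; the table and column names are hypothetical:

```sql
CREATE OR REPLACE FUNCTION row_exists(p_id integer) RETURNS boolean AS $$
BEGIN
    -- PERFORM runs the query and discards the result;
    -- FOUND then reports whether any row was produced.
    PERFORM 1 FROM some_table WHERE id = p_id;
    RETURN FOUND;
END;
$$ LANGUAGE plpgsql;
```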
Re: [GENERAL] text column indexing in UTF-8 database
On Thu, 2009-03-12 at 17:32 -0700, Steve Atkins wrote: > If A=B then lower(A) = lower(B), and if A like B then lower(A) like > lower(B). > > So, if nothing else, you could rewrite "where alias = 'Foo'" as > "where lower(alias) = lower('Foo') and alias='Foo'" and take advantage > of the lower() functional index. Good idea. Thanks. The niggling remaining problem is that the DB is open to a SQL-savvy audience and it'd be nice to avoid telling them to casefold their predicates. For regexps, lower(alias) ~* lower(regexp) won't work because extended regexps might contain character classes (e.g., \S != \s). And, I guess that alias ~* regexp requires a seqscan because the index isn't ordered over ~* (right?). How about lower(alias) ~* regexp ? Is PG smart enough to know that that ordering is well defined? Is my head on straight there? Thanks again, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] text column indexing in UTF-8 database
On Thu, 2009-03-12 at 18:02 -0700, Jeff Davis wrote: > It looks like an index using text_pattern_ops can be used for equality > (see my test case below). Odd. I can't reproduce your test case. I noticed that I edited out the version and platform from my OP. (A: 8.3.6, x86_64 linux). You're on 8.3.6, or do you happen to be testing on the 8.4 branch? I see this:

r...@rkh=> \i tpo-test.sql
 version
 PostgreSQL 8.3.6 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.0 (SUSE Linux)

select name,setting from pg_settings where name~'locale|encoding';
      name       | setting
-----------------+---------
 client_encoding | UTF8
 server_encoding | UTF8
(2 rows)

\!echo $LANG
en_US.UTF-8

create temp table a(t text);
CREATE TABLE
insert into a values('foo');
INSERT 0 1
set enable_seqscan='f';
SET
create index a_t_tpo on a (t text_pattern_ops);
CREATE INDEX
analyze a;
ANALYZE
explain analyze select * from a where t = 'foo';
 QUERY PLAN
 Seq Scan on a  (cost=1.00..10001.01 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=1)
   Filter: (t = 'foo'::text)
 Total runtime: 0.047 ms
(3 rows)

create index a_t on a (t);
CREATE INDEX
analyze a;
ANALYZE
explain analyze select * from a where t = 'foo';
 QUERY PLAN
 Index Scan using a_t on a  (cost=0.00..8.27 rows=1 width=4) (actual time=0.061..0.062 rows=1 loops=1)
   Index Cond: (t = 'foo'::text)
 Total runtime: 0.099 ms
(3 rows)

script at http://harts.net/reece/tpo-test.sql -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] text column indexing in UTF-8 database
Do I really need 4 indexes per column to handle the 4 combinations of {equality,like/regexp} x {unfolded,casefolded} in a UTF-8 encoded database? I have a column that I'd like to be able to search with equality and regexp (or like), optionally casefolded. The database is UTF-8 encoded. The table and index defs are below. Jeff Davis gave me a tip to use text_pattern_ops on indexes to speed up regexp and like; that worked beautifully. But I discovered a caveat that t_p_o apparently doesn't handle equality. Thus, I think I need distinct indexes for the 4 cases above. Right? Thanks, Reece

r...@csb-dev=> \d pannotation
              Table "unison.pannotation"
     Column     |           Type           | Modifiers
----------------+--------------------------+----------------------------
 pannotation_id | integer                  | not null default
 origin_id      | integer                  | not null
 alias          | text                     | not null
 descr          | text                     |
 tax_id         | integer                  |
 added          | timestamp with time zone | not null default timenow()
Indexes:
    ...
    "pannotation_alias" btree (alias)
    "pannotation_alias_cf" btree (lower(alias))
    "pannotation_alias_cf_tpo" btree (lower(alias) text_pattern_ops)
    "pannotation_alias_tpo" btree (alias text_pattern_ops)
    ...
where those indexes are defined as:

r...@csb-dev=> \x
r...@csb-dev=> select indexname,indexdef from pg_indexes where indexname~'^pannotation_alias';
-[ RECORD 1 ]----------------------------------------------------------
indexname | pannotation_alias_cf_tpo
indexdef  | CREATE INDEX pannotation_alias_cf_tpo ON pannotation USING btree (lower(alias) text_pattern_ops)
-[ RECORD 2 ]----------------------------------------------------------
indexname | pannotation_alias_tpo
indexdef  | CREATE INDEX pannotation_alias_tpo ON pannotation USING btree (alias text_pattern_ops)
-[ RECORD 3 ]----------------------------------------------------------
indexname | pannotation_alias
indexdef  | CREATE INDEX pannotation_alias ON pannotation USING btree (alias)
-[ RECORD 4 ]----------------------------------------------------------
indexname | pannotation_alias_cf
indexdef  | CREATE INDEX pannotation_alias_cf ON pannotation USING btree (lower(alias))

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Pet Peeves?
My two: * lack of PK/unique indexes on inherited tables (workarounds possible but annoying) * lack of auto-tuning or tuning tools (or perhaps my lack of awareness of them?) -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] Unison Protein Database manuscript
Dear PostgreSQL friends- I spoke at OSCON 2005 on the Unison Protein Database, which is built on PostgreSQL. Unison's grown a lot in complexity, size, and (internal) user base since then. The whole thing -- database, most content, web interface, and command line tools -- is Open Source. A paper on Unison was accepted to the Pacific Symposium on Biocomputing and will appear shortly. I'll also be giving a talk on it at PSB on Jan 9. Links: http://unison-db.org/ -- the web site, docs, downloads http://harts.net/reece/pubs/ -- the manuscript http://psb.stanford.edu/ -- the PSB conference site I'm open to all feedback and problem reports. Cheers, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Stroring html form settings
On Fri, 2008-09-26 at 09:23 -0700, Dianne Yumul wrote: > If the structure of each report did not differ this would work great > but unfortunately it does. Dianne- Would an EAV model work? I'm thinking something like (form, field, value) tuples. For example:

formA,ckbox1,true
formA,input1,initial value
formB,textarea1,enter your long comment here

You could easily extend this with additional columns, or normalize it as you see fit. Another possibility is to use hstore hashes for form fields/values. See http://www.postgresql.org/docs/8.3/static/hstore.html Does either of those suffice? -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
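A sketch of that EAV layout as DDL; the table and column names are hypothetical:

```sql
-- One row per (form, field) pair; the value is stored as text and
-- interpreted by the application.
CREATE TABLE form_settings (
    form_name  text NOT NULL,
    field_name text NOT NULL,
    value      text,
    PRIMARY KEY (form_name, field_name)
);

INSERT INTO form_settings VALUES
    ('formA', 'ckbox1',    'true'),
    ('formA', 'input1',    'initial value'),
    ('formB', 'textarea1', 'enter your long comment here');

-- Fetch all settings for one form.
SELECT field_name, value FROM form_settings WHERE form_name = 'formA';
```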
Re: [GENERAL] PostgreSQL process architecture question.
On Wed, 2008-09-10 at 00:02 -0600, Scott Marlowe wrote: > Unless you have either a small data set or a very powerful RAID array, > most of the time you won't be CPU bound anyway. But it would be nice to > see some work come out to parallelize some of the work done in the > back end. I would have agreed with this several years ago, but many folks now buy enough RAM to reduce the impact of IO. We're routinely CPU-bound on small queries, and even on some large ones, on a 32GB / 16-core Opteron box that serves a ~200GB database (on-disk tables+indexes). Does anyone know of research/references on query optimizers that include parallelization as part of the cost estimate? I can envision how PostgreSQL might parallelize a query plan that was optimized with an assumption of one core. However, I wonder whether cpu and io costs are sufficient for efficient parallel query optimization -- presumably contention for memory (for parallel sorts, say) becomes critical. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] partitioned table insert triggers
On Mon, 2008-06-23 at 10:18 +0900, ITAGAKI Takahiro wrote: > You need to do so in 8.3: > EXECUTE 'INSERT INTO ' || ctable || > ' VALUES( (' || quote_literal(new) || '::' || ptable || > ').*)'; > > In 8.4, the next release, you will be able to use a better way: > EXECUTE 'INSERT INTO ' || ctable || ' VALUES( ($1).* )' USING new; Domo arigato, Itagaki-san. (I think I got that right.) The 8.3 method works like a charm. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
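For the archives, here is the full trigger with the 8.3 workaround applied. This is a sketch based on Itagaki-san's snippet; it assumes the partition tables (postgres_log_YYYY_MM) already exist:

```sql
CREATE OR REPLACE FUNCTION postgres_log_insert_trigger()
RETURNS TRIGGER AS $$
DECLARE
    ptable text := 'postgres_log';
    ctable text := ptable || '_' || to_char(NEW.session_start_time, 'YYYY_MM');
BEGIN
    -- Expand NEW via a quoted literal cast back to the parent row type;
    -- this sidesteps the "NEW used in query that is not in a rule"
    -- error under 8.3.
    EXECUTE 'INSERT INTO ' || ctable
         || ' VALUES ( (' || quote_literal(NEW) || '::' || ptable || ').* )';
    RETURN NULL;   -- suppress the insert into the parent table
END;
$$ LANGUAGE plpgsql;
```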
[GENERAL] partitioned table insert triggers
I'm trying to create a date-partitioned version of the postgres_log table that is described here: http://www.postgresql.org/docs/current/static/runtime-config-logging.html In lieu of hardcoding the partition table names in the trigger, as shown in the table partitioning docs, I wanted to dynamically compute the table name. I tried:

CREATE OR REPLACE FUNCTION postgres_log_insert_trigger()
RETURNS TRIGGER AS $$
DECLARE
    c text;
    ptable text := 'postgres_log';
    ctable text := ptable || '_' || to_char(NEW.session_start_time, 'YYYY_MM');
BEGIN
    c := 'INSERT INTO ' || ctable || ' VALUES (NEW.*)';
    EXECUTE c;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

I get:

ERROR:  NEW used in query that is not in a rule
CONTEXT:  SQL statement "INSERT INTO measurement_postgres_log_2008_05 VALUES (NEW.*)"
PL/pgSQL function "postgres_log_insert_trigger" line 7 at EXECUTE statement

I didn't really think that would work exactly for the reason given, but I didn't know what else to try. Now that my only idea is gone, the question for y'all is: how? (Okay, I actually do have one more idea: construct the insert values list with quote_literal. I'm hoping there's a better way.) Thanks, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Annoying messages when copy sql code to psql terminal
On Thu, 2008-06-05 at 22:28 -0400, Merlin Moncure wrote: > As others have noted, you have tabs in your sql source. I'd advise if > possible, not to use the tab character in sql, for this and other > reasons. Tabs in SQL are a problem only if you copy-paste. If your editor and psql can see the same files (i.e., are running on the same machine or you're saving to a network fs), it's more reliable to save the file and read it in psql with \i or with psql -f. (\i and -f won't try to do tab completion, of course.) -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Copying data from a table to another database
On Sun, 2008-05-25 at 12:24 +0100, Pedro Doria Meunier wrote: > Now I've migrated the entire 'shebang' to another server and need to > update the history table on the new server with data from the old > server > for every different record. I'm not sure I understand this completely. Would COPY work? For example:

$ psql -h <oldhost> -d <olddb> -c 'copy <schema>.<table> to stdout' \
  | psql -h <newhost> -d <newdb> -c 'copy <schema>.<table> from stdin'

This presumes that a table with the same structure already exists in the new database. If you have made schema changes, or you need only a subset of rows, you can specify an appropriate select statement to the copy command on the "old" database. See the documentation for COPY. Also consider a statement like this:

=> INSERT INTO newtable SELECT * FROM oldtable EXCEPT SELECT * FROM newtable;

I'm assuming that you populate a temporary oldtable in the new db (perhaps using the COPY method above). This won't work if there are intentionally duplicated rows in your table. The pipe assumes a Unix-ish box. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Short-circuiting FK check for a newly-added field
On Tue, May 20, 2008 at 02:25:15PM -0400, Decibel! wrote: > I need to add a field to a fairly large table. In the same alter statement > I'd like to add a FK constraint on that new field. Is there any way to > avoid the check of the table that the database is doing right now? The > check is pointless because the newly added field is nothing but NULLs. > I don't see the problem. FK constraints don't fire on NULL values. I think you might be imagining that a problem exists when it doesn't. If the FK column is created as NOT NULL, there is a problem. The best way to handle this case is to add the column (allowing nulls), populate the columns, then alter the column to make it NOT NULL. You can wrap that all in a transaction if you like. > This is version 8.1.mumble. > You can get the version using select version(). -Reece -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
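A sketch of that add-then-backfill sequence; the table and column names are hypothetical:

```sql
BEGIN;
-- Adding a nullable FK column is cheap: the referential check does not
-- fire on NULLs, so no scan of the parent table is needed.
ALTER TABLE child ADD COLUMN parent_id integer REFERENCES parent (parent_id);

-- Backfill with the real values for your data.
UPDATE child SET parent_id = 1;

-- Only now enforce NOT NULL, which requires one scan of child.
ALTER TABLE child ALTER COLUMN parent_id SET NOT NULL;
COMMIT;
```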
Re: [GENERAL] how to modify a view
On Wed, 2008-05-21 at 02:43 +0930, Shane Ambler wrote: > Yes - The only way to do it is DROP VIEW... then CREATE VIEW... - the > catch is any other views that depend on the view you wish to drop. Tip: It's sometimes useful to rename an existing view or table to deprecate it. (I use names like _someview_v_20080520.) When you rename a table or view, the definition of any views which depend on them will be updated appropriately. This is particularly handy when you have several layers of views and don't want to break current behavior. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
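A sketch of the rename-to-deprecate convention; the names are hypothetical, and ALTER TABLE ... RENAME also works on views in releases of this era:

```sql
-- Existing objects.
CREATE TABLE t (a int, b int);
CREATE VIEW someview AS SELECT a FROM t;

-- Deprecate the current definition; any views that depend on someview
-- automatically track the rename.
ALTER TABLE someview RENAME TO _someview_v_20080520;

-- Recreate under the original name with the new definition.
CREATE VIEW someview AS SELECT a, b FROM t;
```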
Re: [GENERAL] Link tables from different databases
> I want it so that every operation on db1.table1 (ie. INSERT, UPDATE, DELETE) > also happens to db2.table2, but similarly I want it that application can do > those operations on db2.table2 but actually it does it on db1.table1 behind > the scene. If the tables could be in the same database but in different schemas, you should be able to arrange a view in schema2 (called "table2") with rules that operate on table1 in schema1. See the manual for a discussion of rules and views. If you really want these tables in separate *databases* (same machine or not), then you probably want dblink, a stiff drink, and a long, hard think about the prudence of such a design. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Unloading a table consistently
On Sat, 2008-05-03 at 09:11 -0700, Christophe wrote: > I will have a log table which, once a day or so, is copied to a file > (for movement to a data warehouse), and the log table emptied. ... > Is there a better way to handle this kind of thing that I'm > missing? Anyone care to comment about whether a table partition might be better here? Could an insert rule on a table partition automatically create the inherited table, if needed, and insert there? Thanks, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Scripting function definitions as SQL?
On Sun, 2008-05-11 at 06:12 -0700, Postgres User wrote: > Has anyone written a function that scripts out all the functions in a > database as full SQL statements (Create Function.) You could pg_dump the schema in the "custom" format (-Fc), then call pg_restore with -l to get the TOC, grep the TOC for functions, and feed that back into pg_restore with -L. It sounds like a lot, but it's pretty easy in practice, like so:

$ sudo -u postgres pg_dump -Fc -s mydb >mydb.pgdfc
$ pg_restore -l mydb.pgdfc >mydb.toc
$ grep -E '^[0-9]+; [0-9]+ [0-9]+ FUNCTION' mydb.toc >mydb-fx.toc
$ pg_restore -L mydb-fx.toc mydb.pgdfc

The output of pg_restore is sql. This technique is extremely useful for other kinds of schema elements as well. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] tsearch2 and hyphenated terms
On Fri, 2008-04-11 at 22:07 +0400, Oleg Bartunov wrote: > We have the same problem with names in astronomy, so we implemented > dict_regex http://vo.astronet.ru/arxiv/dict_regex.html > Check it out ! Oleg- This gets me a lot closer. Thank you. I have two remaining problems. The first problem is that 'bcl-w' and 'bcl-2' are parsed differently, like so:

[EMAIL PROTECTED]> select * from ts_debug('english','bcl-w');
      alias      |           description           | token |  dictionaries  |  dictionary  | lexemes
-----------------+---------------------------------+-------+----------------+--------------+---------
 asciihword      | Hyphenated word, all ASCII      | bcl-w | {english_stem} | english_stem | {bcl-w}
 hword_asciipart | Hyphenated word part, all ASCII | bcl   | {english_stem} | english_stem | {bcl}
 blank           | Space symbols                   | -     | {}             |              |
 hword_asciipart | Hyphenated word part, all ASCII | w     | {english_stem} | english_stem | {w}

[EMAIL PROTECTED]> select * from ts_debug('english','bcl-2');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | bcl   | {english_stem} | english_stem | {bcl}
 int       | Signed integer  | -2    | {simple}       | simple       | {-2}

One option would be to write a new parser/modify wparser_def.c to make the InHyphenWordFirst state accept p_isdigit or p_isalnum on the first character (I think I got this right). This would achieve Tom's initial inkling that Bcl-2 might be parsed as a numhword and (to me) it seems more congruent with the asciihword class. Perhaps a more broadly useful modification is for the lexer to also emit whitespace-delimited tokens (period). asciihword almost does the trick, but it too requires a post-hyphen alphabetic character. The second problem is with quantifiers on PCRE's regexps. I initially implemented a dict_regex with a conf line like

(\w+)-(\w{1,2}) $1$2

I can make simpler expressions work (eg., (bcl)-(\w)). I think it must be related to the README caveat regarding PCRE partial matching mode, which I didn't understand initially.
However, I don't see that it's possible to write a general regexp like the one I initially tried. Do you have any suggestions? Thanks again. I'm very impressed with tsearch2. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] tsearch2 and hyphenated terms
I'd like to use tsearch2 to index protein and gene names. Unfortunately, such names are written inconsistently and sometimes with hyphens. For example, MCL-1 and MCL1 are semantically equivalent but with the default parser and to_tsvector, I see this:

[EMAIL PROTECTED]> select to_tsvector('MCL1 MCL-1');
       to_tsvector
-------------------------
 '-1':3 'mcl':2 'mcl1':1

For the purposes of indexing these names, I suspect I'd get the majority of cases by removing a hyphen when it's followed by 1 or 2 chars from [a-zA-Z0-9]. Does that require a custom parser? Thanks, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
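One possible workaround without a custom parser, sketched here and untested: normalize the text with regexp_replace before calling to_tsvector (and apply the same normalization to query terms), implementing the drop-the-hyphen heuristic described above:

```sql
-- Remove a hyphen followed by 1-2 trailing alphanumerics, so that
-- 'MCL-1' normalizes to 'MCL1' before indexing.
SELECT to_tsvector('english',
       regexp_replace('MCL1 MCL-1',
                      E'-([a-zA-Z0-9]{1,2})(?=\\s|$)', E'\\1', 'g'));
```

The lookahead restricts the rewrite to hyphens near the end of a word; the same expression could be loosened or tightened to match more of the naming conventions in the corpus.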
Re: [GENERAL] Survey: renaming/removing script binaries (createdb, createuser...)
Although I, too, am not fond of the current command names, I'm not aware of a naming conflict that is serious enough to warrant renaming. Do we have even one example of a significant naming conflict? Renaming executables seems likely to create much more confusion than it will solve. I loathe the idea of confounding years of newsgroup/mailing list wisdom, especially for newcomers. The specter of having to ask questions like "Which executable naming scheme are you using?" is also unpalatable. (Remember, this is the group that will engage in multi-message threads about postgres, postgresql, PostgreSQL, PG, postgre, etc. We'll need a new list for the fallout of renaming executables.) Perhaps one way to sidestep this issue is for you (Zdeněk) to create a pgFoundry project that provides the command line interface you propose by wrapping the existing postgresql binaries. This would provide the more rational names you seek. Furthermore, it could be used to provide the *sole* interface to postgresql if postgresql is configured with a --prefix that sequesters it from system paths. In the meritocracy of open source, your interface might become the official interface later on. I completely "get" your motivation and I appreciate your effort to tidy up. However, renaming is going to create problems, not solve them. (And, it's not even clear that there is a problem.) -Reece P.S. My responses: 1. a (do nothing) (or b if there really must be a change) 2. a (multiple times daily) 3. consistent with 1 4. c Blue. No, red. Aaah... -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Postgresql partitioning
On Fri, 2008-03-21 at 23:00 -0400, Ram Ravichandran wrote:
> I assume that primary key uniqueness is not tested across tables. Right?

That's correct. It's on the TODOs:

Inheritance
* Allow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys

(at http://www.postgresql.org/docs/faqs.TODO.html )

I wonder whether you might be able to achieve the benefits of partitioning and the simplicity of single-table updates by using a view with an update rule. This would allow you to embed into the database the logic for moving rows between partitions when the partition criterion changes. I've not done this myself, so I'm, um, not speaking from experience.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] postgre vs MySQL
On Tue, 2008-03-11 at 06:47 -0700, rrahul wrote:
> Any major clients of the two.
> You can add you own points too.

Perhaps someone can comment on current MySQL backup procedures. I believe that MySQL used to (still does?) require shutdown to be backed up. I don't know whether this was true for all engines or whether it might have been fixed. Having to shut down a database to make a backup is a non-starter for anything other than a toy (or read-only) database.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] executing query results withing psql
Is there a way to generate and directly execute query results within psql? Essentially, I want eval within psql.

For example, a lot of us generate DDL commands using queries, like this:

=> SELECT 'GRANT SELECT ON VIEW "'||table_schema||'.'||table_name||'" TO budget_view;'
   FROM information_schema.views WHERE table_schema='budget';

To execute such results, I do one of these:
* in a shell, pipe between psqls (ie, psql -Atc 'select ...' | psql)
* within psql, execute and copy-paste
* within psql, use \g file, then \i file
* within psql, use \g |psql

I like the \g + \i option most for consistency and robustness (I don't have to specify the db connection or worry about connecting to the wrong db), but what I really want is a hypothetical \eval to execute the query and then execute the results. Any such thing?

Thanks,
Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] dynamic crosstab
On Wed, 2008-02-13 at 14:04 +0100, Tino Wildenhain wrote:
> Well after all you want a CSV not a table. You could shortcut this
> with a generic query which creates array out of your "columns"
> and join them to a CSV line. This would just be outputted as
> one single column from database.

Depending on your use case, this may be a better way. In psql:

=> \copy (select col1,col2,col3 from data) TO data.csv CSV HEADER

or on the command line:

$ psql -c '\copy (select col1,col2,col3 from data) TO data.csv CSV HEADER'

Strictly speaking, the CSV formatting isn't being done in the database but rather by psql.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] Good F/OSS license summary
Folks- A recent thread ( http://archives.postgresql.org/pgsql-general/2008-01/msg01595.php ) suggests that there's a lot of interest and knowledge about F/OSS licenses on this list. So... Does anyone know of an authoritative, complete, and current license comparison? After a lot of trolling, I've been able to satisfy one or two of those criteria, but not all. -*-mode:opinion-*- It's disappointing that we as a community have yet to converge on just a few licenses that differ only by the few characteristics that *really* matter (e.g., virality, commercializability, code reciprocation/share-alike, etc). The glut of license options and license compatibility issues create enormous confusion and even some unfortunate internecine FUD -- never mind the real uncertainty caused by the lack of case law in this area. I long for a Creative Commons-like family of licenses for code. Thanks, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Oracle Analytical Functions
create table data (
    client_id integer,
    datetime timestamp not null
);
create index data_client_id on data(client_id);

copy data from STDIN DELIMITER ',';
122,2007-05-01 12:00:00
122,2007-05-01 12:01:00
455,2007-05-01 12:02:00
455,2007-05-01 12:03:00
455,2007-05-01 12:08:00
299,2007-05-01 12:10:00
299,2007-05-01 12:34:00
\.

CREATE OR REPLACE FUNCTION visits (
    OUT client_id INTEGER,
    OUT datetime_1 TIMESTAMP,
    OUT datetime_2 TIMESTAMP,
    OUT dur INTERVAL
) RETURNS SETOF RECORD LANGUAGE plpgsql AS $_$
DECLARE
    rp data%ROWTYPE;   -- previous data table record
    r  data%ROWTYPE;   -- data table record, more recent than rp
BEGIN
    rp = (NULL,NULL);
    FOR r IN SELECT * FROM data ORDER BY client_id,datetime LOOP
        IF rp.client_id = r.client_id THEN
            client_id  = r.client_id;
            datetime_1 = r.datetime;
            datetime_2 = rp.datetime;
            dur        = r.datetime - rp.datetime;
            RETURN NEXT;
        END IF;
        rp = r;
    END LOOP;
    RETURN;
END;
$_$;

[EMAIL PROTECTED]> select * from visits() order by client_id,datetime_1;
 client_id |     datetime_1      |     datetime_2      |   dur
-----------+---------------------+---------------------+----------
       122 | 2007-05-01 12:01:00 | 2007-05-01 12:00:00 | 00:01:00
       299 | 2007-05-01 12:34:00 | 2007-05-01 12:10:00 | 00:24:00
       455 | 2007-05-01 12:03:00 | 2007-05-01 12:02:00 | 00:01:00
       455 | 2007-05-01 12:08:00 | 2007-05-01 12:03:00 | 00:05:00
(4 rows)

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
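The bookkeeping the plpgsql function does — walk the rows in order, remembering the previous row and emitting a pair whenever two adjacent rows share a client_id — amounts to this (a Python sketch, just to illustrate the logic; not part of the database code):

```python
from datetime import datetime, timedelta

def visits(rows):
    """Pair each event with the previous event for the same client.

    rows: iterable of (client_id, datetime) tuples, in any order.
    Yields (client_id, datetime_1, datetime_2, dur), where datetime_1
    is the later event and datetime_2 the earlier one, matching the
    plpgsql function above.
    """
    prev = None
    for cid, t in sorted(rows):          # ORDER BY client_id, datetime
        if prev is not None and prev[0] == cid:
            yield (cid, t, prev[1], t - prev[1])
        prev = (cid, t)
```

Feeding it the sample data above yields the same four rows as the SQL version.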
Re: [GENERAL] OT - pg perl DBI question
On Tue, 2008-01-29 at 13:28 -0500, Geoffrey wrote: > Are you speaking from personal experience, or just of the lack of > maintenance? I'm guessing lack of maintenance... the SFPUG hasn't maintained David for a very long time. We think he lives at Casa Donde, but no one is really sure where that is. -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] (un)grouping question
On Mon, 2008-01-21 at 22:05 +0200, Andrei Kovalevski wrote:
> How can you garantee that "somevalue + random()" won't duplicate other
> unique values in this column? ;)

Like the inane memo from the boss says, "expect the unexpected" and "tell me all the unpredictable issues that will happen". If uid is unique then something like

=> select uid,somevalue,somevalue||'-'||uid from mytable

is much better.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] (un)grouping question
On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
> uid|somevalue
> 1|11
> 2|44
> 3|31
> 4|44
> 5|71
> 6|33
> 7|33
> 8|44
> 9|14
>
> would like to remove the duplicate values in the column somevalue.
> doing this by just adding a random number is perfectly fine, however
> i want to retain at least one of the original values of somevalue. Any
> ideas how to do this in in a query?

There's certainly no need for a random number hack. Instead, use a query like 'select distinct on (somevalue) * from mytable;' . Although DISTINCT is standard, the DISTINCT ON (cols) variant is a PostgreSQL-ism. See http://www.postgresql.org/docs/8.2/static/queries-select-lists.html#QUERIES-DISTINCT .

-Reece, your heterographic brother

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
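For anyone unfamiliar with DISTINCT ON, its effect is "keep one row per key" — the first row encountered per key in the query's ordering. A Python sketch of that semantics (illustrative only, not database code):

```python
def distinct_on(rows, key):
    """Keep the first row seen for each key value, analogous to
    SELECT DISTINCT ON (key) ... -- which keeps one row per key,
    chosen by whatever ORDER BY you supply (or arbitrarily without one)."""
    seen = set()
    out = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            out.append(row)
    return out
```

Applied to the (uid, somevalue) data above with key=somevalue, it keeps rows 1, 2, 3, 5, 6, and 9.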
Re: [GENERAL] Deadlock in Postgres 8.2
On Sun, 2008-01-20 at 16:54 -0500, Greg Smith wrote:
> You can look at who has locks on what using pg_locks, see
> http://www.postgresql.org/docs/8.2/static/view-pg-locks.html

I use the following view to ferret out locking problems. Briefly, look for rows with state='WAIT', then find the RUN rows on the same relation that hold a conflicting (e.g., exclusive) lock.

-Reece

CREATE OR REPLACE VIEW pgutils.locks AS
SELECT l.pid, d.datname AS "database", n.nspname AS "schema",
       c.relname AS relation, l.locktype, l."mode",
       CASE l."granted" WHEN true THEN 'RUN'::text ELSE 'WAIT'::text END AS state,
       a.usename, a.current_query,
       to_char(now() - a.query_start, 'HH24:MI:SS'::text) AS duration
  FROM pg_locks l
  JOIN pg_database d ON l."database" = d.oid
  JOIN pg_class c ON l.relation = c.oid
  JOIN pg_namespace n ON c.relnamespace = n.oid
  JOIN pg_stat_activity a ON l.pid = a.procpid
 ORDER BY l.pid, d.datname, n.nspname, c.relname, l."granted";

eg=> select * from pgutils.locks ;
  pid  | database |   schema   |          relation          | locktype |      mode       | state | usename |         current_query         | duration
-------+----------+------------+----------------------------+----------+-----------------+-------+---------+-------------------------------+----------
 15716 | csb-dev  | pg_catalog | pg_class                   | relation | AccessShareLock | RUN   | rkh     | select * from pgutils.locks ; | 00:00:00
 15716 | csb-dev  | pg_catalog | pg_class_oid_index         | relation | AccessShareLock | RUN   | rkh     | select * from pgutils.locks ; | 00:00:00
 15716 | csb-dev  | pg_catalog | pg_class_relname_nsp_index | relation | AccessShareLock | RUN   | rkh     | select * from pgutils.locks ; | 00:00:00
...

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] double free corruption?
On Fri, 2007-12-28 at 12:33 -0300, marcelo Cortez wrote:
> i received the follow message from backend ,it's this
> a bug? ...
> *** glibc detected *** postgres: postgres richelet
> 201.235.11.133(2504) SELECT: double free or corruption
> (!prev): 0x0845d7e8 ***
> === Backtrace: =

Does this happen to be on Novell SLES/SLED 10? I saw this bug sporadically and in several applications 6-9 months ago. I don't think I ever knew the cause or even what specifically tickled this problem, but my recollection is that a Novell patch set fixed it. I remember seeing a lot of these in /var/log/messages (assuming you're logging locally). Consider looking there to see if you have other instances of this bug with other applications.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] comparing rows
On Mon, 2007-12-10 at 13:31 -0800, hjenkins wrote:
> I would like to take a timeseries of data and extract the rows of data
> flanking the gaps in it. So I need to compare timestamps from two adjacent
> rows, and determine if the interval is greater than the standard sampling
> interval.

It often helps for us to have a snippet of a table definition to frame replies. I'll assume that you have a "data" table with a timestamp column called "ts".

I suspect you could use a subquery, like this:

=> select D1.ts as ts1,
          (select ts from data D2 where D2.ts>D1.ts order by D2.ts limit 1) as ts2
   from data D1;

(Note the ORDER BY in the subquery — without it, LIMIT 1 returns an arbitrary later row rather than the adjacent one.) I'm uncertain about the performance of this subquery in modern PGs.

If this query works for you, then you can wrap the whole thing in a view or another subquery in order to compute ts2-ts1, like this:

=> select ts1,ts2,ts2-ts1 as delta from ( <the query above> ) X;

Filter that on the delta (eg, WHERE ts2-ts1 > your standard sampling interval) and you'll get only the timestamps of adjacent rows with large gaps. The easiest way to get the associated data is to join on the original data table where ts1=ts or ts2=ts.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
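The intent of the SQL above — pair each sample with the next one, keep pairs whose spacing exceeds the sampling interval — can be sketched outside the database like so (illustration only):

```python
def flanking_rows(timestamps, max_interval):
    """Return (earlier, later) pairs of adjacent samples whose spacing
    exceeds max_interval -- i.e., the rows flanking each gap."""
    ts = sorted(timestamps)
    return [(a, b) for a, b in zip(ts, ts[1:]) if b - a > max_interval]
```

With samples at 0, 10, 20, 45, 55, 90 and a sampling interval of 10, the gaps are flanked by (20, 45) and (55, 90).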
Re: [GENERAL] any way for ORDER BY x to imply NULLS FIRST in 8.3?
On Tue, 2007-11-06 at 11:38 +0400, rihad wrote:
> Is there any way to "hardcode" the NULL handling in an index (as per
> http://www.postgresql.org/docs/8.3/static/indexes-ordering.html) so
> that
> SELECT * FROM t ORDER BY foo automatically implies NULLS FIRST (and,
> similarly so that SELECT * FROM t ORDER BY foo DESC automatically
> implies NULLS LAST)? Thing is, I'm using PHP Symfony/Propel to
> generate
> their SQL and have no easy way to influence how they do so.

As Tom already stated, the ordering of NULLs with respect to non-NULLs is defined by the implementation. However, it's not clear that you've considered a clause like 'ORDER BY (foo IS NULL), foo', which I believe is not implementation dependent. (In SQL2003 draft, true is defined to sort before false. I can't find a similar statement in SQL92 or SQL99.) Bear in mind that the ordering of rows with the same value (incl. NULL) of foo is arbitrary. And, I guess that the equivalence of this query on two systems will depend on the collating locale also (but I'm a neophyte in this area).

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
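The effect the OP wants from the ORDER BY (foo IS NULL), foo trick — nulls grouped at the front, everything else ascending — looks like this outside the database (a sketch for illustration; which way a given implementation sorts booleans is exactly the caveat discussed above):

```python
def nulls_first(values):
    """Order values with NULLs (None) first, then the non-null values
    ascending -- the nulls-first ordering discussed in the post."""
    return [v for v in values if v is None] + sorted(v for v in values if v is not None)
```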
Re: [GENERAL] getting list of tables from command line
On Wed, 2007-10-31 at 08:01 -0700, Craig White wrote:
> I wanted a way to
> query a database and get a text file with just the table names and
> cannot figure out a way to do that.

This should do the trick:

$ psql -Atc "select table_schema||'.'||table_name from information_schema.tables"

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Multiple versions on same box
On Mon, 2007-10-08 at 17:34 -0700, Ralph Smith wrote:
> What is the best way to upgrade? Use pg_dumpall (using 8.2's
> program), afterwards redirect that file into v8.2 via psql v8.2?

There are lots of ways to do this. My favorite is to bring up a new cluster on port 5433 and then pipe data between them, roughly like this:

$ /path/to/7.4/pg_dumpall -p5432 ... | /path/to/8.2/psql -p5433 -qaf- >restore.log 2>&1

This obviates the intermediate dump file, which can be large. As a precaution, I remove the PATH elements that locate pg (thus requiring that I specify the full path) and I unset PG* variables.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Why the ERROR: duplicate key violates uniqueconstraint "master_pkey" is raised? - Is this a Bug?
On Tue, 2007-09-25 at 10:06 +0530, Anoo Sivadasan Pillai wrote: > But since you can’t control the order, it is likely that the operation > will sometimes and will succeed sometimes, with the same data.. Weird > I feel . Not so weird (IMO). Many databases will return rows unordered unless you specify otherwise. Once one understands that uniqueness constraints are evaluated immediately and that rows are visited unordered, it's not so weird. > The work around given by you will work, Does it mean that whenever we > are updating a unique key we have to take this precaution ? Probably. I suspect that you're using unique or primary keys to imply some meaning that would be better stored elsewhere, and definitely NOT in a primary key. Perhaps you should consider refactoring your design. You're doing something wrong if you find that you need to update primary keys routinely. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?
On Mon, 2007-09-24 at 12:50 +0530, Anoo Sivadasan Pillai wrote:
> CREATE TABLE master ( m1 INT primary key , m2 VARCHAR(100)) ;
> INSERT INTO master VALUES ( 1, 'm1' ) ;
> INSERT INTO master VALUES ( 2, 'm2' ) ;
> UPDATE master SET m1 = m1 + 1;
> Update fails with the message - ERROR: duplicate key violates unique
> constraint "master_pkey"

Primary key constraints are not deferred and are not deferrable (try: \x, then select * from pg_constraint where conname~'master_pkey'). This means that the constraint is checked immediately for each row updated. Success will depend on the order in which postgresql decides to visit rows: if it visits all rows in descending order, you'll be fine, but that's unlikely and uncontrollable (AFAIK).

Here's a sketch of an easy workaround. You might have to modify it for your particular range of m1.

begin;
update master set m1 = -m1;
update master set m1 = -m1 + 1;
commit;

You could just as easily add N to m1, then subtract (N-1) from m1. You'll need N>max(m1).

Good luck,
Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
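Why the visit order matters, and why the two-pass workaround sidesteps it, can be simulated directly (a toy model, not PostgreSQL internals: each row's new value is checked against the values currently present):

```python
def update_rows(keys, f, visit_order):
    """Simulate an UPDATE under an immediately-checked unique constraint:
    rows are visited one at a time, and each new value must not collide
    with any value currently in the table."""
    table = set(keys)
    for k in visit_order:
        table.remove(k)
        new_key = f(k)
        if new_key in table:
            raise ValueError(f"duplicate key {new_key}")
        table.add(new_key)
    return table
```

With m1 = m1 + 1 over {1, 2}, ascending order fails (1 becomes 2, which still exists) while descending order succeeds; the negate-then-negate-and-offset passes succeed in any order.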
Re: [GENERAL] Seq Scan
On Fri, 2007-06-01 at 18:24 +0100, Tyler Durden wrote: > It uses Index Scan for id>20 and Seq Scan for id>10?! Based on the statistics pg has for your table, and on the cost of using the index, the cost based optimizer decided that it's more efficient to seq scan all of the rows than to incur the index overhead. These decisions are not always correct, but they're usually quite good. If you don't believe it, try: # set enable_seqscan=off; # explain analyze ... # set enable_seqscan=on; # explain analyze ... -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Make for PgSQL?
On Thu, 2007-05-31 at 09:01 +0200, Vincenzo Romano wrote:
> Scripts have been named so that the lexicographical order of filenames brings
> the information about dependencies.
>
> I've been playing with the GNU Make itself but it's quite hard to keep track
> of changes and to re-load a single SQL file that changed.

Expressing the dependencies is the hardest part. Once you have the dependencies, the make part ought to be straightforward (except for learning the bugaboos of make). "Keeping track of the changes" might look something like this make snippet:

.DELETE_ON_ERROR:
.SUFFIXES:

%.log: %.sql
	psql ... -f $< >$@

(Bless the pg folks with mountains of chocolate and mountain dew for returning meaningful exit codes even for DDL changes, as opposed to, say, sqlplus.)

Then you need to express your dependencies in a way that make can understand. The most general way to do this is with a list like:

common.log: utils.log
fx1.log: common.log utils.log
fx2.log: fx1.log

etc. Finally, you'll want a list of all log targets so that you can type something like "make update" or whatever to reload as needed. You can get that with, for example:

SQL_FILES:=$(wildcard sqldir/*.sql)
TARGETS:=$(SQL_FILES:.sql=.log)

.PHONY: update
update: ${TARGETS}

> Is there any hint about "best prectices", software tools and the likes?

I don't know anything about best practices (ahem). However, it occurs to me that it wouldn't be hard to move your dependency encoding into the SQL itself, such as

-- requires: utils.sql common.sql
create or replace function ...

Then you'd automatically generate a file of sql dependencies using a perl one-liner (or whatever).

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
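That last step — scraping "-- requires:" headers out of the SQL files and emitting make dependency lines for the %.log targets — is a few lines in any scripting language. A hypothetical Python version (the header convention and filenames are the ones invented above, not an existing tool):

```python
import re

def make_deps(sql_sources):
    """Turn '-- requires: a.sql b.sql' headers into make dependency lines
    for the %.log reload targets sketched above.

    sql_sources: {filename: file contents}."""
    deps = []
    for fname, text in sorted(sql_sources.items()):
        m = re.search(r'^--\s*requires:\s*(.+)$', text, re.MULTILINE)
        reqs = m.group(1).split() if m else []
        target = fname[:-len('.sql')] + '.log'
        prereqs = ' '.join(r[:-len('.sql')] + '.log' for r in reqs)
        deps.append((target + ': ' + prereqs).rstrip())
    return deps
```

Writing the returned lines to a file and `include`-ing it from the Makefile closes the loop.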
Re: [GENERAL] Command line export or copy utility?
On Tue, 2007-05-22 at 18:07 -0400, Francisco Reyes wrote:
> Does anyone know of any export or copy utility that runs on FreeBSD?
> I basically need a program that will connect to one database, do a
> select and copy the result to a second database.

Two options:

1) if you want a whole table or schema, a pipe works nicely:

eg$ pg_dump -t mytable | psql

2) As of 8.2, you can formulate COPY commands with subqueries. For example:

eg$ psql -c 'COPY (SELECT origin_id,origin FROM origin WHERE is_public order by 1) TO STDOUT'

eg$ psql -c 'COPY (SELECT x FROM a WHERE x%2=1) TO STDOUT' \
  | psql -c 'COPY a FROM STDIN;'

The only wrinkle is what to do when you need the DDL for the table itself (say, when you want to create the same table with a subset of the rows). The way I do this is to pg_dump the schema (-s) in the custom format (-Fc). Then, I generate a table of contents with pg_restore -l, edit the TOC to include only the entries I want, and then rerun pg_restore with -L.

Good luck,
Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Random Sample
On Fri, 2007-05-18 at 15:36 -0500, [EMAIL PROTECTED] wrote:
> How do I pull a random sample of either 100 records or 5% of the
> population of a table?

If you can be a little flexible about the number of samples, you can try

select * from table where random()<=0.05;

Of course, there's nothing that guarantees that you'll get 5% and this only works reasonably for large N. On the other hand, if N were small, you probably wouldn't be asking for a random sample. You could also try

select * from table order by random() limit 100;

That'll be expensive, but get you exactly 100 (if your table has >= 100 rows, of course).

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
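The trade-off between the two queries — variable-size Bernoulli sampling versus an exact-size sample that costs a full sort — is easy to see in miniature (a Python sketch, purely illustrative):

```python
import random

def sample_fraction(rows, p):
    """Bernoulli sample: keep each row with probability p, like
    WHERE random() <= p. The sample size varies around p*N."""
    return [r for r in rows if random.random() <= p]

def sample_exact(rows, n):
    """Exactly n rows, like ORDER BY random() LIMIT n -- at the cost
    of shuffling (sorting) the whole input."""
    shuffled = sorted(rows, key=lambda _: random.random())
    return shuffled[:n]
```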
Re: [GENERAL] Vacuum non-clustered tables only
On Wed, 2007-05-16 at 15:40 -0700, Glen Parker wrote:
> Is there a semi-easy way vacuum all tables in a database *except*
> those that are clustered?

Yes, it can be done "semi-easily". Here's an example:

select N.nspname, C.relname
  from pg_class C
  join pg_namespace N on C.relnamespace=N.oid
 where relkind='r'
   and not exists (select * from pg_index I
                    where C.oid=I.indrelid and I.indisclustered);

Many of us build statements in SQL itself. For instance, replace the first line above with

select 'vacuum '||N.nspname||'.'||C.relname||';' as vacuum_cmd

and now the query returns executable statements. You probably want to restrict by namespace/schema name as well. If you put all of that into a view, it's then easy to do something like

$ psql -c 'select vacuum_cmd from view' | psql -aX

which uses one connection to select the script, and another to execute it.

Good luck,
Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Views- Advantages and Disadvantages
On Wed, 2007-05-09 at 12:02 -0700, Dann Corbit wrote:
> Views can hide important information from the optimizer (especially
> index information).

I believe that you're mistaken, and you can see it rather easily by explaining a select on a view (or even a view of views). For example:

[EMAIL PROTECTED]> \d palias
            View "unison.palias"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 palias_id   | integer                  |
 pseq_id     | integer                  |
 origin_id   | integer                  |
 alias       | text                     |
 descr       | text                     |
 tax_id      | integer                  |
 ref_pseq_id | integer                  |
 added       | timestamp with time zone |
View definition:
 SELECT pa.palias_id, pv.pseq_id, pa.origin_id, pa.alias, pa.descr,
        pa.tax_id, pv.ref_pseq_id, pv.added
   FROM paliasorigin pa, pseqalias pv
  WHERE pv.palias_id = pa.palias_id AND pv.is_current = true;

[EMAIL PROTECTED]> explain select * from palias where tax_id=9606;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Hash Join  (cost=701397.95..1634572.27 rows=1293890 width=137)
   Hash Cond: (pv.palias_id = pa.palias_id)
   ->  Seq Scan on pseqalias pv  (cost=0.00..474670.85 rows=20706650 width=20)
         Filter: is_current
   ->  Hash  (cost=647199.80..647199.80 rows=1692012 width=121)
         ->  Bitmap Heap Scan on paliasorigin pa  (cost=33808.65..647199.80 rows=1692012 width=121)
               Recheck Cond: (tax_id = 9606)
               ->  Bitmap Index Scan on paliasorigin_tax_id_idx  (cost=0.00..33385.65 rows=1692012 width=0)
                     Index Cond: (tax_id = 9606)
(9 rows)

Long ago I compared a few views with their inlined counterparts and the upshot is that there is exactly or practically zero difference.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] IF function?
On Tue, 2007-05-01 at 07:38 -0700, novnov wrote:
> I didn't express myself clearly. Is there an IIF type function one can use
> inline in a view?
>
> SELECT UserName, IIF(mybooleanfield = true, "It's true", "It's not true") as
> SayIt FROM tblUser

I think people understood you. The part that may not be obvious to you is that case can be used in the select list, as in:

SELECT UserName,
       CASE mybool WHEN TRUE THEN $$It's true$$ ELSE $$It's not true$$ END AS veracity
  FROM sometable;

or even

SELECT UserName,
       $$It's$$ || CASE mybool WHEN TRUE THEN '' ELSE ' not' END || ' true' AS veracity
  FROM sometable;

Nearly anything you can express in a select statement can also be turned into a view. (The only exception that comes to mind is that select allows cols with the same name, but views don't.)

If you really want a function, that's not hard to write for the case of consistent types:

[EMAIL PROTECTED]> create or replace function iif(boolean,text,text)
returns text language sql as
'select case $1 when true then $2 else $3 end';
CREATE FUNCTION
Time: 71.242 ms

[EMAIL PROTECTED]> select iif(true,'yep','nope');
 iif
-----
 yep
(1 row)

Time: 1.468 ms

[EMAIL PROTECTED]> select iif(false,'yep','nope');
 iif
------
 nope
(1 row)

[EMAIL PROTECTED]> select $$It's$$ || iif(true,'',' not') || ' true';
 ?column?
-----------
 It's true
(1 row)

The function is NOT declared strict: it can return null on null input. I think that is the right behavior for the boolean, but not for the return strings.

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] Schema as versioning strategy
On Wed, 2007-04-25 at 12:47 -0400, Owen Hartnett wrote:
> I want to "freeze" a snapshot of the database every year (think of
> end of year tax records). However, I want this frozen version (and
> all the previous frozen versions) available to the database user as
> read-only.

First, I'd rename the current-year schema to a more meaningful name (eg, taxes2006). Each year you could do a schema-only dump of the current year, tweak the schema name in the dump to reflect the new year, and restore just the schema into the same database. The benefit of this approach is that the data stay in place (ie, you don't dump public and restore into a new schema).

Conceptually, something as simple as the following pipe might suffice to dump, rename, and restore into a new schema:

$ pg_dump -s -n taxes2006 | sed -e 's/taxes2006/taxes2007/g' | psql -qa

(This is a little dangerous because I've assumed that the string 'taxes2006' occurs only as a schema name. I've also assumed Unix/Linux and I have no idea what you'd do on a windows box.)

PostgreSQL doesn't have a read-only mode per se. The closest you can get is to write a script to revoke insert/update/delete on all tables in an archived schema; that's also pretty easy:

$ psql -Atc "select 'REVOKE INSERT,UPDATE,DELETE ON '|| nspname||'.'|| relname||' FROM someuser;' from pg_class C join pg_namespace N on C.relnamespace=N.oid and N.nspname='taxes2006' WHERE C.relkind='r'" \
  | psql -qa

(In general, you should reduce everything to a one-liner.)

One of the advantages of having archived schemas in a single database is that you'll be able to write queries that involve multiple years. You wouldn't be able to do that (easily*) if you archived the full database.

-Reece

* This is where David Fetter will mention dblink.

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] digest data types?
Does anyone have postgresql types for message digests, especially md5 and sha1? Obviously I could store these as text (as I currently do), but I'm particularly interested in custom types that store digests as binary blobs and provide conversion to/from text. Am I correct in assuming that the space saved by storing digests as binary (1/2 the size of hex) will substantially improve index ins/upd/del performance, particularly when the digest itself is a large fraction of the rest of the row size?

Thanks,
Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] storing checksums in a database
I'd like to store several kinds of checksums in a database and would appreciate some advice about the best way to do this. I will use these checksums for uniqueness constraints and for searching. I currently store md5s as text and I could store other checksums as text also, but it occurs to me that there are at least two other options. One option is to store binary blobs instead. This would save 2x on storage over hex. For example, sha1 is a 160 bit hash; that's 20 bytes as binary or 40 (8-bit) chars as hex. As longer hashes are used, I imagine that the space savings might have significant impact on index building and query optimization decisions. The downside is that one would need to encode/decode for lookups based on the hash. Another option is to create a custom type for hex digests that would provide type conversion as necessary. These would provide the space benefit of storing as a binary and the usability of having postgresql Do The Right Thing depending on the type of the representations being compared. I can't think of any downsides. So, the questions: 1) Am I worrying about nothing, or is the size of a hash significant? In my case, the rows consist of a protein sequence (~350 chars, as text), a timestamp, an integer PK, and an integer length. The text md5 is 32 text chars, of course, to which I might add sha1 (40 text chars). Thus, the size of hashes is a sizable fraction of the rest of the row. 2) Does anyone know of postgresql digest data types that provide these conversions? If not, I'll do it myself... but I'd rather steal of course. Thanks, Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 ./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush kernel warning: universe consuming too many resources. Killing. universe killed due to catastrophic leadership. Try -uspres carter.
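The 2x figure is easy to check: hex encoding spends two characters per byte of digest (the sequence below is a made-up stand-in for a real protein):

```python
import hashlib

seq = b"MKTAYIAKQR" * 35          # mock ~350-residue protein sequence

md5 = hashlib.md5(seq)
sha1 = hashlib.sha1(seq)

assert len(md5.digest()) == 16    # binary storage: 16 bytes
assert len(md5.hexdigest()) == 32 # text storage: 32 chars
assert len(sha1.digest()) == 20   # binary: 20 bytes
assert len(sha1.hexdigest()) == 40
```

So against a ~350-byte sequence column, md5+sha1 costs 72 bytes as hex but only 36 as binary.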
Re: [GENERAL] Dealing with table names in functions
On Thu, 2007-03-22 at 09:40 -0400, Alvaro Herrera wrote:
> Pass the optionally qualified name and cast it to regclass. It will
> work correctly when the name is not qualified, applying search_path,
> and it will also work when the name is qualified.

Is there a way to get names that are always qualified, irrespective of the search_path? Once one has the oid, it's an easy function to write, but I suspect this machinery already exists. For example, I'd like the hypothetical cast:

[EMAIL PROTECTED]> set search_path = 'unison';
SET
[EMAIL PROTECTED]> select 'pseq'::regclass::oid::FQregclass;
  regclass
-------------
 unison.pseq
(1 row)

-Reece

-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush
kernel warning: universe consuming too many resources. Killing.
universe killed due to catastrophic leadership. Try -uspres carter.
Re: [GENERAL] deleting a foreign key that has no references
On Mon, 2007-03-19 at 13:03 -0700, Reece Hart wrote: > The harder and more general case is to build such a query dynamically > from pg_depends ... > See the pg_depends documentation at Apologies. I intended to write pg_constraint and the documentation at http://www.postgresql.org/docs/8.2/interactive/catalog-pg-constraint.html -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 ./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush kernel warning: universe consuming too many resources. Killing. universe killed due to catastrophic leadership. Try -uspres carter.
Re: [GENERAL] deleting a foreign key that has no references
On Mon, 2007-03-19 at 11:12 -0500, Glen W. Mabey wrote:
> What I'm hoping to find out is whether there is some way to directly
> find out how many (using a SELECT query) references there are to a
> key.

In the easy case when your schema doesn't change often, you can just hard-code a query of the FK tables and add up the row counts. I bet something like 'select count(*) from (select * from FKtable1 UNION ALL select * from FKtable2 ...)' will work (and I'm guessing that the UNION ALL will optimize well). Obviously, you want indexes on the FKs.

The harder and more general case is to build such a query dynamically from pg_depends. A good start would be to write a function that returns an sql query like the above to count the referents of PKtable(PKcolumn). If you can declare this function stable or immutable (I'm not sure of this), then it might not be too painful to generate the query within the trigger itself. Otherwise, you might have to store/update these queries in a separate table after every DDL change.

See the pg_depends documentation at http://www.postgresql.org/docs/8.2/interactive/catalog-pg-depend.html

-Reece
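As a sketch of the "generate the query" idea, here is a hypothetical helper (table and column names are made up; note postgres requires an alias on the derived table, and the WHERE clauses restrict the count to references to one particular key):

```python
def count_referents_sql(pk_value_param, fk_cols):
    """Build a query counting rows that reference a given key across
    several FK tables.  fk_cols is a list of (table, column) pairs;
    pk_value_param is a placeholder such as '$1'.  All names here are
    hypothetical -- a real version would read them from pg_constraint."""
    parts = [
        f"SELECT {col} AS k FROM {tbl} WHERE {col} = {pk_value_param}"
        for tbl, col in fk_cols
    ]
    # UNION ALL (not UNION) avoids a needless de-duplication pass.
    return "SELECT count(*) FROM (" + " UNION ALL ".join(parts) + ") AS refs"

sql = count_referents_sql("$1", [("fktable1", "pk_id"), ("fktable2", "pk_id")])
print(sql)
```

The resulting string could be EXECUTEd from a plpgsql trigger, or regenerated after each DDL change as described above.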
Re: [GENERAL] Postgres Database size
On Sun, 2007-03-18 at 00:40 +0530, Mageshwaran wrote:
> how to find the size of a particular database in postgres...

The old way was to use du or similar. Recent versions (I believe >=8.1, but check the release notes to be sure) provide several useful functions for this:

pg_column_size
pg_database_size
pg_relation_size
pg_size_pretty
pg_tablespace_size
pg_total_relation_size

For example:

[EMAIL PROTECTED]> select datname,pg_size_pretty(pg_database_size(oid)) from pg_database;
           datname           | pg_size_pretty
-----------------------------+----------------
 postgres                    | 3905 kB
 csb                         | 113 GB
 template0                   | 3840 kB
 csb-dev                     | 124 GB
 csb-dev-snapshot-2007-03-08 | 123 GB
 csb_02_02_2007              | 121 GB
 template1                   | 3840 kB
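For intuition about what pg_size_pretty does with the raw byte counts, here is a rough Python analogue (illustrative only; postgres's exact thresholds and rounding may differ):

```python
def size_pretty(n_bytes: int) -> str:
    """Roughly mirror pg_size_pretty(bigint): step up a unit once the
    value reaches 10240 of the current unit.  Illustrative sketch, not
    postgres's exact implementation."""
    units = ["bytes", "kB", "MB", "GB", "TB"]
    size = float(n_bytes)
    i = 0
    while size >= 10240 and i < len(units) - 1:
        size /= 1024.0
        i += 1
    return f"{int(round(size))} {units[i]}"

print(size_pretty(3998720))        # 3905 kB
print(size_pretty(121 * 1024**3))  # 121 GB
```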
[GENERAL] Oracle-PostgreSQL Coexistence/Interoperability (segue from PostgreSQL to Oracle)
On Fri, 2007-03-16 at 14:20 -0400, Jonah H. Harris wrote:
> As Ezequias asked about migrating an application, I'm not quite sure
> why we're discussing this. Using HSODBC to move data permanently is
> quite good assuming you have no data type issues.

I missed the OP and therefore the link to app migration. However, HSODBC and its limitations were mentioned, and therefore this discussion seems germane. No matter... I've posted this as a new thread.

Perhaps you've implicitly answered my question -- you apparently think HSODBC's primary utility is to "move data permanently". That's fine, but far short of my desire to have an efficient way to expose schema and data in a postgresql database to oracle's clients -- in real time, with pg functions, sane type conversions, and reasonable join performance. I appreciate the challenges of the latter for an optimizer, but it's technically possible (and, psst, could be very lucrative).

I'm not trying to permanently migrate anything, but rather I'm trying to coexist seamlessly with Oracle and lessen the barriers to transitioning to postgresql (or v/v). Interoperability is really the only option when one has to contend with the integration of multiple database apps/schemas from multiple developers and with Oracle-specific dependencies, such as I do. Pragmatically speaking, you can't just migrate such a tangle wholesale -- if there's no other route, it'll never happen (and, sniffle, I'll be inextricably bound to Oracle forever).

-Reece
Re: [GENERAL] [SQL] PostgreSQL to Oracle
On Thu, 2007-03-15 at 16:20 -0400, Robert Treat wrote:
> Do you find [HSODBC] works well? I've used it from some older Oracle
> instances connecting back into PostgreSQL and the results I had have
> been flakey at best.

For us too. We tried in early 2006 with Oracle 8 via hsodbc to postgresql 7.4.6 and had several problems. From what I remember:

- Oracle complained about SQLError when using unixODBC 3.0. This necessitated using 2.5. http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0902/en/html/dbpgen9/0291.htm

- We had type conversion problems. I can't remember many details of this, but I *think* they may have been text/varchar and timestamp/date. We don't use blobs.

- It took our admins a fair bit of time to get it working. It probably didn't help that Oracle was running on AIX.

- The real nail in the coffin seemed to be that there was no way to push predicates to the remote side. This meant that a remote relation was materialized locally in order to join with local data, and that we'd never get reasonable performance for cross-database joins (which, perhaps, it's not designed to do).

I'm particularly curious about this last problem. Without pushing predicates to the remote side, it's hard for me to imagine that HSODBC would be terribly useful for anything but building a local warehouse. This seems so odd that I think I must be wrong, but I couldn't find any documentation to support or discourage cross-database joins.

Can someone out there enlighten me? What's the intent for HSODBC and what are its limitations? Can one access remote server-side functions via HSODBC (eg, sqlplus> select col1,plpgsqlfx(col1) from remotetable)?

Thanks,
Reece
Re: [GENERAL] orphaned PGDATA/base/ subdirectories
On Tue, 2007-03-13 at 18:16 -0400, Alvaro Herrera wrote:
> On the other hand it would be scary to have the elevator "change its
> mind" on the middle of carrying you to 20th and suddenly opening the
> floor for dumping you to the basement instead. Which one would think
> is pretty much what must have happened here ...

Actually, I got off at floor 7-and-a-half and I discovered a great view through Tom Lane's eyeballs. Here's what it looks like: Jeez, I have the postgresql source memorized (all versions)... my hands move rapidly between 5 keyboards, I can watch 5 monitors at the same time, and I can respond to 15 emails while coding on 3 branches and running regression tests... this is great!

-Reece
Re: [GENERAL] orphaned PGDATA/base/ subdirectories
On Tue, 2007-03-13 at 17:44 -0400, Tom Lane wrote:
> are you in the habit of
> banging on ^C repeatedly? I couldn't reproduce such a problem in a
> small amount of testing though.

You mean C-c doesn't work like elevator buttons? It wouldn't surprise me if I hit it repeatedly; I certainly remember at least one case where it seemed like I was being ignored and hit C-c repeatedly. It's also quite possible that I logged out from ssh (with ~.), and I have no idea what signal (signals?) that ultimately sends. And, last night I sent KILL when I was convinced that I was being ignored and was about to restart the cluster anyway, for which I'd expect remnants.

However, I'm certain of the observation that disk full led to remnants of a partial copy. That job was running in the background when I wasn't logged in.

Thanks,
Reece
[GENERAL] orphaned PGDATA/base/ subdirectories
setup: postgresql 8.1.8, x86_64, gnu/linux

I recently used createdb -T to copy a large database and the process failed due to disk full, but the space wasn't freed after that failure. That led me to wonder how much of my disk usage was from failed or interrupted operations.

In the 8.1.8 manual, Chapter 50, I read:

For each database in the cluster there is a subdirectory within PGDATA/base, named after the database's OID in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there.

So, in pg_database I see:

[EMAIL PROTECTED]> select oid,datname from pg_database order by 1;
   oid   |           datname
---------+-----------------------------
       1 | template1
   10792 | template0
   10793 | postgres
 1235174 | csb_02_02_2007
 1379984 | csb-dev
 2121234 | csb
 2296808 | csb-dev-snapshot-2007-03-07
 2422198 | csb-dev-snapshot-2007-03-08
 2558026 | csb-dev-snapshot-2007-03-13
(9 rows)

And in PGDATA/base:

[EMAIL PROTECTED]:/srv/postgresql/base> ls -l
total 207
drwx------ 3 postgres postgres  2640 2007-03-12 22:07 1
drwx------ 2 postgres postgres  2576 2007-02-16 17:21 10792
drwx------ 3 postgres postgres  2640 2007-03-12 20:48 10793
drwx------ 3 postgres Bioinfo  21472 2007-03-12 20:47 1235174
drwx------ 2 postgres Bioinfo   3872 2006-10-19 21:25 1363637   ?
drwx------ 2 postgres Bioinfo   1888 2006-10-20 08:32 1371793   ?
drwx------ 3 postgres Bioinfo  31336 2007-03-12 20:48 1379984
drwx------ 3 postgres Bioinfo  21336 2007-03-12 20:51 2121234
drwx------ 3 postgres Bioinfo  30720 2007-03-13 08:57 2296808
drwx------ 3 postgres Bioinfo  30680 2007-03-12 20:47 2422198
drwx------ 2 postgres Bioinfo  21304 2007-03-12 18:50 2549963   ?
drwx------ 2 postgres Bioinfo  20504 2007-03-12 20:47 2549977   ?
drwx------ 2 postgres Bioinfo   4320 2007-03-12 20:47 2550014   ?
drwx------ 2 postgres Bioinfo  18448 2007-03-13 03:57 2558026

I've put ? after the base/ subdirs that have no correspondent in pg_database. The manual clearly says that each row in pg_database will have a corresponding subdirectory in base/.

Is the converse also true, ie, should every subdir in base/ also have a row in pg_database? If so, am I safe to rm the unreferenced subdirectories?

FWIW, I know that I've interrupted (^C) a few createdb -T operations during the course of this installation, and I recently had the disk full error. My best guess is that these orphaned directories result from those interruptions and failure.

Thanks,
Reece
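The "compare base/ against pg_database" step can be sketched mechanically; this hypothetical helper takes the directory names and oids from the listing and query above (in newer postgres releases, contrib's oid2name tool serves a similar purpose):

```python
def orphaned_base_dirs(base_dirs, db_oids):
    """Return base/ subdirectories that have no matching pg_database oid.
    base_dirs: names of subdirectories of PGDATA/base (strings);
    db_oids: oids from 'select oid from pg_database'."""
    known = {str(oid) for oid in db_oids}
    return sorted(d for d in base_dirs if d not in known)

# The oids and directory names from the session above:
oids = [1, 10792, 10793, 1235174, 1379984, 2121234, 2296808, 2422198, 2558026]
dirs = ["1", "10792", "10793", "1235174", "1363637", "1371793", "1379984",
        "2121234", "2296808", "2422198", "2549963", "2549977", "2550014",
        "2558026"]
print(orphaned_base_dirs(dirs, oids))
# → ['1363637', '1371793', '2549963', '2549977', '2550014']
```

The five results are exactly the '?'-marked directories in the listing.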
[GENERAL] pg_restore and database comments
I periodically take snapshots of my current development database to facilitate recovery from screw-ups. I do so roughly like this:

createdb -EUTF-8 db-snapshot
pg_dump -Fc db | pg_restore -d db-snapshot

For which I get the fairly innocuous error:

pg_restore: WARNING: database comments may only be applied to the current database

I'm not aware of any other part of a database dump that references the source database by name (nor should it, since this would cause problems when restoring to a different database name).

Is there a better way to snapshot a database? createdb -T db db-snapshot seemed like a good idea, but that requires the template db to not be in use (ie, no logged-in users).

I'm running 8.1.8 and will move to 8.2.3 shortly.

Thanks,
Reece
Re: [GENERAL] Database deadlock/hanging
On Wed, 2007-03-07 at 10:08 -0600, John Gateley wrote:
> I'm guessing it's something related to table locks. ...
> Any pointers on what I should be looking for to prevent this from
> happening again? What information I should be tracking to figure
> out what is exactly happening?

Your inserts almost certainly have a table or index exclusively locked, thereby causing a backlog of selects. You can fish current and waiting locks out of pg_locks, but those use internal identifiers rather than names. Here's a view that will make pg_locks more readable:

[EMAIL PROTECTED]> CREATE OR REPLACE VIEW pgutils.locks AS
SELECT l.pid, d.datname AS "database", n.nspname AS "schema",
       c.relname AS relation, l.locktype, l."mode",
       CASE l."granted" WHEN true THEN 'RUN'::text ELSE 'WAIT'::text END AS state,
       a.usename, a.current_query,
       to_char(now() - a.query_start, 'HH24:MI:SS'::text) AS duration
FROM pg_locks l
JOIN pg_database d ON l."database" = d.oid
JOIN pg_class c ON l.relation = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
JOIN pg_stat_activity a ON l.pid = a.procpid
ORDER BY l.pid, d.datname, n.nspname, c.relname, l."granted";

[EMAIL PROTECTED]> select * from pgutils.locks;
  pid  | database |   schema   |      relation      | locktype |      mode       | state | usename | current_query | duration
-------+----------+------------+--------------------+----------+-----------------+-------+---------+---------------+----------
 28434 | csb-dev  | pg_catalog | pg_class           | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
 28434 | csb-dev  | pg_catalog | pg_class_oid_index | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
 28434 | csb-dev  | pg_catalog | pg_locks           | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
 28434 | csb-dev  | pg_catalog | pg_namespace       | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
 28434 | csb-dev  | pg_catalog | pg_stat_activity   | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
 28434 | csb-dev  | pgutils    | locks              | relation | AccessShareLock | RUN   | rkh     |               | 00:00:21
(6 rows)

-Reece
Re: [GENERAL] Union Query Improvement
On Tue, 2007-02-13 at 05:46 -0800, Ray Bannon wrote:
> Select ID, plan_name from table/view
> Where plan_name = 'A'
> And rownum = 1
> UNION ...
>
> Ad infinitum for about 100 iterations.
>
> Any way to write this more efficiently?

I assume that "table/view" in your OP really refers to different tables or views; otherwise, this is a trivial query that can be collapsed to "select ... where plan_name in ('A','B','C') ..." or perhaps just "select ... where rownum=1".

As Tom said, UNION ALL may be more appropriate for your needs. See http://www.postgresql.org/docs/8.1/interactive/queries-union.html

You should also consider inherited tables and table partitioning. See http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html Partitioning offers two administrative benefits: consistency of child tables and an implied UNION ALL of the children when one selects on the parent.

-Reece
Re: [GENERAL] Foreign Key Identification
From: Ashish Karalkar
To: pgsql-general ( at ) postgresql ( dot ) org
Subject: Foreign Key Identification
Date: Wed, 10 Jan 2007 08:23:28 -0800 (PST)

> Hello All,
> Is there any means to list out the foreign key tables which are linked
> with a primary key? What i want to do is something as follows: ...
> select tablename from ? where Foreign key = TableAPK
> and this should list me

I needed to do something similar. I wrote some views to do this and packaged 'em up as something I call pgutils. See:

http://archives.postgresql.org/pgsql-general/2006-11/msg00752.php
http://harts.net/reece/pgutils/

-Reece
Re: [GENERAL] select union with table name
On Thu, 2006-12-28 at 19:09 +0700, Akbar wrote:
> select blue.name from blue union select red.name from red
> give me this:
> name
> 'blabla'
> 'bubu'
> 'haha'
> 'kkk'
>
> I want this:
> name      table_name
> 'blabla'  blue
> 'bubu'    blue
> 'haha'    red
> 'kkk'     red
>
> Could I?

select name, 'blue' as "table_name" from blue
union all
select name, 'red' as "table_name" from red;

Note the 'all' after union... I suspect you'll want that, or should at least consider it.

-Reece
Re: [GENERAL] (bash the search engine) Don't split on underscore
On Tue, 2006-12-19 at 13:25 +0100, Hannes Dorbath wrote:
> I think it would be useful to adjust the parser to not split on underscores:
>
> In case I'd like to lookup PG's to_number() function I won't get
> anything useful. Number is contained nearly everywhere and to is
> configured as stop word. Same with most other functions.

That would be useful and would almost certainly result in better specificity. A counterexample is searching for "information schema", for which you'd probably want hits on "information_schema" as well.

-Reece
Re: [GENERAL] Anyone? Best way to authenticate postgres against
On Mon, 2006-12-18 at 20:23 -0600, Derrick wrote:
> Subject: [GENERAL] Anyone? Best way to authenticate postgres against
> active directory?

I know very little about AD, its kerberos implementation, and interoperability limitations. I have been using kerberos with postgresql for years and think it's nearly the holy grail of authentication. I know that linux can authenticate against AD domains, so I presume that one could also get postgresql to do so. Here are two bookmarks that I've saved and that you might find useful:

http://chrisp.de/en/rsrc/kerberos.html
http://www.windowsnetworking.com/articles_tutorials/Authenticating-Linux-Active-Directory.html

If you can make that work, the community might appreciate a summary of how you did it.

-Reece
Re: [GENERAL] Let's play bash the search engine
On Mon, 2006-12-18 at 15:47 -0800, Joshua D. Drake wrote:
> Take a look at let us know what you think and how it performs for you.

Terrific. Fast and meaningful.

I echo Thomas' request to have docs limited to a version (or, better, the most recent). Perhaps archived docs should be searched via a separate page entirely. Most of the queries I did hit what I expected, except that the docs were for old versions. (In fact, I don't think 8.2 docs ever showed up first.)

I tried "defer constraints" and got a few not-too-useful hits. However, "deferred constraints" returned meaningful links. Is that a stemmer problem?

-Reece
Re: [GENERAL] MySQL drops support for most distributions
On Wed, 2006-12-13 at 14:21 -0500, Tom Lane wrote:
> I think the real criterion for big companies is not so much whether
> you're supporting your "own" product as whether you're big enough to
> be worth suing if things go wrong.

This is a common misunderstanding, and it is incorrect, at least in my experience. I work at a company with >10K people. I oversee computer architecture and operations for Research (~800 people) and I work very closely with our large IT group.

In order to understand how we purchase hardware, software, or support, you have to understand what's important to us. A successful company must focus on its products and not on irrelevant details about how they get produced and delivered. Employees may personally care about the detailed means to the product, but successful companies and their managers -- and, ultimately, customers and stockholders -- do not. The major concerns for our purchases include: 1) Does it meet our functional requirements? 2) Does it integrate with our existing infrastructure? 3) Can we identify a support channel? and 4) What's the risk relative to other options?

These days, OSS packages frequently exceed the functionality of proprietary alternatives. Apache is an irrefutable example. Big vendors often have proven track records for (2) and (3), but it's not the bigness per se that appeals. We choose small vendors when that's appropriate for a need. Whom we sue when things go wrong is almost never a consideration during purchasing. If a relationship goes south, a suit is unlikely to address our primary goal, the product.

Now, lest you think I'm a corporate troll on the pg lists, I should tell you that I'm probably among the most visible and vocal open source supporters here. I've long railed against proprietary software -- not because of support issues but because I view *some* proprietary software as a real threat to our long-term success. What's important is that our data are usable in ways we see fit, without encumbrance from vendors. This is not the goal of big vendors who depend on lock-in for growth.

The EnterpriseDB folks have the right strategy. Nobody wants Oracle itself, but rather they want database services that behave like Oracle (er, except the parts that annoy). If I can't tell that I'm not talking to Oracle but am getting the "right" answers, why should I care? Cheaper too? Even better. Oracle should be scared, because it seems inevitable that their database business will be commoditized out of existence.

Concern for risk is perhaps the most elusive problem for OSS providers and supporters. Companies don't like risk, and *any* change to a working process is a risk. Much to my chagrin, this risk makes it difficult to unseat even mediocre products. We should all cheer EnterpriseDB's success in booking some big-name companies. Their successes will establish PostgreSQL as a reliable, cost-effective, and empowering alternative to proprietary databases and therefore decrease the risk concerns.

The only reason I spent this much time weighing in is because I'm thrilled with PostgreSQL (er, sorry Tom, Postgres) and appreciate and respect the terrific work done in this community. Thank you.

Cheers,
Reece
Re: [GENERAL] Copy command to load data into a PostgreSQL DB
On Fri, 2006-11-17 at 15:46 -0500, Wm.A.Stafford wrote:
> If we use '\x05' as suggested by the
> Fastreader User guide. PostgreSQL complains "ERROR: COPY delimiter
> must be a single character".
>
> Any help or suggestions would be appreciated.

\x05, aka control-e, ctl-e, ^e, C-e, etc, is merely one way to represent the *single* ASCII character 5. When a manual says to type \x05, it almost surely means that some program will interpret those four printable chars as a single non-printable character rather than as the literal 4-char string. Different programs use different representations of control characters, and \x is very common, but the problem is that postgres's copy command doesn't understand this syntax for non-printable characters.

No matter, copy will work for you, and in fact you can use TWO ways to represent control-e.

Option 1: Instead of '\x05', type E'\x05', that is:
- letter E
- single quote
- the 4-char string \x05
- single quote

E is postgresql's way of indicating that the string will be interpreted in a way that does not conform to the SQL spec. This results from the pg developers being pedantic about conformance and refusing to extend the standard carelessly (I love 'em for this attitude).

Option 2: Instead of '\x05', type 'C-vC-e'. By this I mean:
- single quote
- control-v
- control-e
- single quote

The magic here is that control-v means "take the next character verbatim". In effect, you're typing ASCII char 5 (a single character) literally into the single quotes rather than \x05 or any other representation of it. I'm pretty sure that readline is responsible for this interaction, and therefore this probably doesn't work on readline-less installations.

-Reece
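The escaped-versus-literal distinction can be sanity-checked in any language that interprets \x escapes; e.g., in Python:

```python
# The four printable characters \x05, when interpreted as an escape,
# denote the single ASCII control character 5 (control-e).
delim = "\x05"
print(len(delim))       # 1 -- a single character, acceptable to COPY
print(ord(delim))       # 5
print(delim == chr(5))  # True

# The raw four-character string is something else entirely, and is
# what COPY rejects as "must be a single character".
literal = r"\x05"
print(len(literal))     # 4
```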
Re: [GENERAL] R and postgres
On Tue, 2006-11-07 at 00:22 -0800, Joe Conway wrote:
> I think you want this:
> http://bioconductor.org/packages/1.9/bioc/html/RdbiPgSQL.html
> The bioconductor project is now maintaining RdbiPgSQL. I think many
> people also use RODBC to connect R with Postgres.

Joe- I almost sent you an off-list email before I posted because I just knew you'd know the answer. Thank you.

-Reece
Re: [GENERAL] database name aliases?
Woody and Albe-

I wasn't aware of pg_service -- that does solve my original problem. Thanks for the replies.

-Reece
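For anyone landing here later: the pg_service mechanism is a client-side connection file read by libpq; a minimal sketch (host and names hypothetical) might look like:

```ini
; pg_service.conf -- connection "aliases" resolved by libpq.
; Connect with: psql "service=dbname"   (or: PGSERVICE=dbname psql)
[dbname]
host=db.example.com
port=5432
dbname=dbname_1-3
```

Repointing the [dbname] entry at the current dbname_1-3 gives the alias behavior asked about in this thread, at the cost of editing a client-side file when instances rotate.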
Re: [GENERAL] database name aliases?
On Mon, 2006-11-06 at 16:43 -0800, Jeff Davis wrote:
> You can use "ALTER DATABASE name RENAME TO newname;". Does that help?

This is what I do now to evolve from development to staging to production, as well as to deprecate versions. That indeed solves most of the problem.

Aliases might solve two problems. The first is to address the oft-recurring problem of wanting to be able to refer simultaneously to an instance and more generally to a concept (e.g., HEAD in cvs, or /etc/alternatives/ for system executables, etc). That is, one could refer to a specific db version/instance as well as to a name for the "most recent" version (or dev, stage, prod, or whatever). The second goal is more practical: postgres doesn't allow a database to be renamed while it's in use, and that prohibition causes minor scheduling problems when rotating instances. I imagine that db aliases would affect only new connections.

Thanks,
Reece
Re: [GENERAL] R and postgres
On Mon, 2006-11-06 at 19:13 -0800, Adrian Klaver wrote:
> You might want to take a look at PL/R, a procedural language for R in
> Postgres. http://www.joeconway.com/plr/

PL/R is indeed neat, but I want to go the other way: within R, fetch data from postgres and manipulate it (plot, histogram, etc). I appreciate that I could expose this through postgres, but that's a little awkward for prototyping, and I'm not keen to add load to my postgres box.

-Reece
[GENERAL] R and postgres
I'd like to get R to talk to postgresql, but my head's spinning among a web of broken links, way-outdated web pages, project deprecation announcements, and a terrible signal:link ratio. Rdbi and RdbiPgSQL seem to be the answer, despite both being apparently defunct projects. What is the Right Thing for a guy who wants R to talk to postgresql?

Thanks, and send aspirin,
Reece
Re: [GENERAL] database name aliases?
On Tue, 2006-11-07 at 04:54 +0500, Talha Khan wrote:
> You can do this by using a script. I mean whenever you are creating a
> new version of the database, you can do that by running a script that
> creates a database and then stores the name of that database in an
> environment variable. Now you can run psql by connecting to a database
> through that variable.

I think I understand the gist of your proposal, but psql was just one client example. In principle, I'd want the same alias to be exposed to psql, perl DBI, odbc, jdbc, etc. I really think this should be done in the database itself to ensure consistency.

-Reece
[GENERAL] database name aliases?
I'd like to be able to have several versions of a database available concurrently, and one database alias that refers to the most recent of these. For example:

dbname_1-1
dbname_1-2
dbname_1-3
dbname -> dbname_1-3

and

$ psql -d dbname

would connect to dbname_1-3. Any ideas?

Thanks,
Reece
Re: [GENERAL] not null across multiple columns
On Thu, 2006-11-02 at 19:10 -0500, Matthew Terenzio wrote:
> suppose you wanted to be certain that either one of two or more
> columns were present
>
> Like in a user table, either a username or an email need to be present
> to create a row.
>
> You can't use not null because it's an either or situation.
>
> what's the best method to accomplish this sort of constraint?

See check constraints: http://www.postgresql.org/docs/8.1/interactive/ddl-constraints.html#AEN1954

I suppose you seek something like this:

create table people (
    name text,
    email text,
    constraint valid_name_or_email
        check (name is not null or email is not null)
);

Cheers,
Reece
Re: [GENERAL] DELETE performance issues
On Thu, 2006-11-02 at 11:27 -0500, Tom Lane wrote:
> Are you absolutely sure about that last? Unindexed foreign key
> references are the explanation nine times out of ten when someone
> complains about deletes being slow.

This was certainly the major cause when I had slow deletes. (The other problem was chaining of cascading deletes.) In order to help correct such problems, I wrote some views to identify unindexed, cascading foreign keys. An example:

[EMAIL PROTECTED]> select * from pgutils.foreign_keys_missing_indexes;
 fk_namespace | fk_relation |  fk_column  | fk_indexed | pk_namespace | pk_relation |  pk_column  | pk_indexed | ud ...
--------------+-------------+-------------+------------+--------------+-------------+-------------+------------+--------
 gong         | node        | alias_id    | f          | gong         | alias       | alias_id    | t          | cn ...
 taxonomy     | node        | division_id | f          | taxonomy     | division    | division_id | t          | cc ...
 gong         | alias       | go_id       | f          | gong         | node        | go_id       | t          | cc ...
etc...

ud is an abbreviation for the update and delete constraint types (cascade, set null, restrict, etc).

In this view, "indexed" means that the column is the first or only column in some index, i.e., pg_index.indkey[0] = pg_attribute.attnum. I suppose that one might want to distinguish the indexing cases more precisely as unindexed, sole-column index, first col of a multi-col index, second col of a multi-col index, etc, but I didn't do so. The views were originally written for 7.4 and I don't know what's appropriate for current multicolumn index behavior.

The code is in http://harts.net/reece/pgutils/ .

-Reece
Re: [GENERAL] Deleting Problem
On Tue, 2006-10-31 at 10:24 +1100, Jamie Deppeler wrote:
> Here is my problem: I have delete triggers on table5 and table6 which
> update summary information on table1, table2, table3, and table4 when a
> row in table5 or table6 gets deleted. The problem I am faced with is
> that when, e.g., a table1 record gets deleted, I get the following
> error:
>   ERROR: insert or update on table5 violates foreign key constraint
>   "table4".
> They are all linked with a cascade delete.

As Stephen said, it's difficult to help without more details about the
triggers and the structure of your transaction. However, if you expect
that the constraints are consistent at the end of the transaction,
perhaps the problem is merely the order in which constraints are
checked. See
http://www.postgresql.org/docs/8.1/interactive/sql-set-constraints.html
and consider something like SET CONSTRAINTS ALL DEFERRED.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
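Note that SET CONSTRAINTS only affects constraints declared DEFERRABLE, so the foreign keys may need to be recreated first. A sketch with invented constraint and column names:

```sql
-- recreate the FK as deferrable (names here are placeholders)
alter table table5 drop constraint table5_table4_fk;
alter table table5 add constraint table5_table4_fk
  foreign key (table4_id) references table4 (table4_id)
  on delete cascade deferrable initially immediate;

begin;
set constraints all deferred;
delete from table1 where table1_id = 42;  -- checks wait until commit
commit;
```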
Re: [GENERAL] creating a dumpfile from a view
On Mon, 2006-10-30 at 16:02 -0800, Richard Yen wrote:
> Hi, I'm trying to create a dumpfile for a client. The data is gathered
> from about 7 tables, and I need to output all the columns as the
> client wishes.
[snip]
> Would anyone know how to dump the data from the view? I tried the
> following, but it doesn't work:

I often export data with psql and feed it to a spreadsheet, like this:

$ psql -F'<TAB>' -Ac 'select col1,col2 from aview' >aview.csv
$ OOo aview.csv

Perhaps that will work for you. <TAB> stands for a literal tab
character, which in bash is generated by typing ctrl-v, then tab;
ctrl-v inserts the next character literally. I dunno about the ill
begotten csh-derivatives. Fortunately, I've never had conflicts between
the delimiter choice and the data. You may not be so lucky.

If you're truly looking for the view data in pg_dump format, the only
thing I can think of is to materialize the view and dump that. A
similar effect could be obtained by synthesizing a create table
statement appropriate for the psql data export as above and using the
copy command to load the data. That could all be wrapped into a single
file to be passed to psql for loading.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
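The materialize-and-dump route might look like this (table and database names are placeholders):

```sql
-- materialize the view into a real table
create table aview_dump as select * from aview;

-- then, from the shell, dump just that table:
--   pg_dump --table=aview_dump mydb > aview.dump

-- and drop the table when done
drop table aview_dump;
```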
Re: [GENERAL] Postgresql 6.13
On Fri, 2006-10-13 at 14:54 +0100, Richard Huxton wrote:
> Greg's mostly right, but the Vic-20 port is available as a type-in
> program listing in issue 3 of "Your Computer".

Don't bother. I already typed it in and have it on cassette tape here
somewhere. I'll send it via parcel post with two 13 cent stamps as soon
as I finish watching War Games on my betamax. Is Carter still president?

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Re: [GENERAL] using schema's for data separation
On Fri, 2006-09-29 at 09:39 -0500, Erik Jones wrote:
> What we've done is to use table inheritance. So, we have a group of
> base account data tables and whenever a new account is added they get
> a set of tables that inherit from these base tables. This works well
> in that whenever we need a global schema change to any of these tables
> we can just alter that pertinent base table and the change will
> cascade down to the child tables.

Many DDL commands do not propagate to the child tables -- foreign keys,
for example, are not inherited. As a consequence, using inheritance for
Chris' purpose has the important caveat that referential integrity will
be difficult to ensure with standard FK constraints. I believe that
other types of constraints and rules are also not propagated to child
tables. Erik, do you have a clever solution to this aspect of using
inheritance?

On Thu, 2006-09-28 at 22:59 -0700, snacktime wrote:
> The connections to the database are from an unprivileged user, and
> everything goes through functions that run at the necessary
> privileges.

Someone out there must have implemented row-level security for
postgresql (EnterpriseDB?). You could implement this yourself by adding
a usesysid column to each table and using a combination of column
defaults, views, triggers, or rules to set usesysid on insert and
update and to require that the usesysid column matches the current user
on select and delete. This probably improves the consistency of your
security policy (over having the policy in functions). On the other
hand, I could be blowing smoke -- it seems like this outta work, but I
haven't actually done this myself. I don't actually do any real work
myself.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not match
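A bare-bones sketch of the per-user-column idea, storing the user name rather than the numeric sysid for simplicity (table, view, and column names are invented; rules or triggers would still be needed to route insert, update, and delete through the view):

```sql
create table accounts_private (
  usename name not null default current_user,
  balance numeric
);

-- users query the view, not the table; the view filters to their rows
create view accounts as
  select balance
  from accounts_private
  where usename = current_user;

revoke all on accounts_private from public;
grant select on accounts to public;
```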
[GENERAL] text formatting in a query, a la sprintf
I often desire text and number formatting that's not available with
functions like to_char. I'd really like to have functionality akin to
sprintf, such as:

$ select no_hits, term,
    sprintf('http://google.com?q=%s', replace(queryterm,' ','+')) as url
  from queries;
$ select event, sprintf('%.2g', probability) as prob from event_probs;
$ select sprintf('<td bgcolor="%s">%s</td>', color, content) as td_cell
  from cells;

I considered a plperl function to do this, but plperl can't take
anyarray, and that means that I'd have to resort to ugliness like
coding for specific numbers of args or encoding the args within a
single string... ick! Is there a way to get more flexible formatting as
in these examples within a backend query, even if not with sprintf per
se?

Thanks,
Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
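For what it's worth, the "specific numbers of args" ugliness looks roughly like this: one overloaded function per arity, with Perl's own sprintf doing the work. A sketch (function names chosen for illustration):

```sql
create or replace function sprintf(text, text) returns text as $$
  return sprintf($_[0], $_[1]);
$$ language plperl immutable;

create or replace function sprintf(text, text, text) returns text as $$
  return sprintf($_[0], $_[1], $_[2]);
$$ language plperl immutable;

-- ...and so on, one definition for each arity actually used, e.g.:
select sprintf('http://google.com?q=%s', replace(queryterm,' ','+'))
from queries;
```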
Re: [GENERAL] text datum VARDATA and strings
On Mon, 2006-08-14 at 15:51 -0400, Tom Lane wrote:
> Whose 7.x manual? This stuff has been there since we invented the
> "version 1" function call convention, which was 7.3 or before. There
> is some documentation in the SGML docs, but really we kind of expect
> you to look at the standard built-in functions to see how things are
> done...

The PostgreSQL manual. I wrote these functions early in the 7.x series
and I don't know which manual version exactly. For example, sec 9.5.4
of http://www.postgresql.org/docs/7.3/static/xfunc-c.html shows code
for concat_text, which I remember was the basis for what I wrote. I now
see and understand the text regarding detoasting and the 'DatumGetXXX'
macros; the relevance of these was not obvious to me at the time.

> So it's best not to assume that you know what is inside a text datum,
> if possible,

Okay. Does that mean that the code in 9.5.4 should have a warning to
that effect?

> > 3) Is there any reason to believe that the code below is
> > problematic?
> The only thing I'd suggest is that checking for a null return from
> palloc is a waste of effort. It doesn't return to you if it runs out
> of memory.

Okay. Thanks for the advice, Tom.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
[GENERAL] text datum VARDATA and strings
Michael Enke recently asked in pgsql-bugs about VARDATA and C strings
(BUG #2574: C function: arg TEXT data corrupt). Since that's not a bug,
I've moved this follow-up to pgsql-general.

On Mon, 2006-08-14 at 11:27 -0400, Tom Lane wrote:
> The usual way to get a C string from a TEXT datum is to call textout,
> eg
>   str = DatumGetCString(DirectFunctionCall1(textout, datumval));

Yikes! I've been accessing VARDATA text data like Michael for years
(code below). I account for length and don't expect null-termination,
but I don't use anything like Tom's suggestion above. (I always try to
do what Tom says because that usually hurts less.)

I have three questions:

1) I based everything I did on examples lifted nearly verbatim from a
7.x manual, and I bet Michael did similarly. I've never heard of
DatumGetCString, DirectFunctionCall1, or textout. Are these and other
treasures documented somewhere?

2) Does DatumGetCString(DirectFunctionCall1(textout, datumval)) do
something other than null-terminate a string? All of the strings are
from [-A-Z0-1*]; server_encoding has been either SQL_ASCII or UTF8 in
case that's relevant.

3) Is there any reason to believe that the code below is problematic?
Thanks,
Reece

#include "postgres.h"
#include "fmgr.h"
#include <ctype.h>
#include <string.h>

static char* clean_sequence(const char* in, int32 n);

PG_FUNCTION_INFO_V1(pg_clean_sequence);
Datum
pg_clean_sequence(PG_FUNCTION_ARGS)
{
    text* t0;   /* in */
    text* t1;   /* out */
    char* tmp;
    int32 tmpl;

    if ( PG_ARGISNULL(0) ) {
        PG_RETURN_NULL();
    }
    t0 = PG_GETARG_TEXT_P(0);
    tmp = clean_sequence( VARDATA(t0), VARSIZE(t0) - VARHDRSZ );
    tmpl = (int32) strlen(tmp);

    /* copy temp sequence into new pg variable */
    t1 = (text*) palloc( tmpl + VARHDRSZ );
    if (!t1) {
        elog( ERROR, "couldn't palloc (%d bytes)", tmpl + VARHDRSZ );
    }
    memcpy(VARDATA(t1), tmp, tmpl);
    VARATT_SIZEP(t1) = tmpl + VARHDRSZ;
    pfree(tmp);
    PG_RETURN_TEXT_P(t1);
}

/* clean_sequence -- strip non-IUPAC symbols
   The intent is to strip non-sequence data which might result from
   copy-pasting a fasta file or some such.
   in:  char*, length
   out: char*, |out| <= length, NULL-TERMINATED
   out is palloc'd memory; caller must free
   allow chars from IUPAC std 20 + selenocysteine (U) + ambiguity (BZX)
   + gap (-) + stop (*)
*/
#define isseq(c) ( ((c)>='A' && (c)<='Z' && (c)!='J' && (c)!='O') \
                   || ((c)=='-') \
                   || ((c)=='*') )

static char*
clean_sequence(const char* in, int32 n)
{
    char* out;
    char* oi;
    int32 i;

    out = palloc( n + 1 );  /* w/null */
    if (!out) {
        elog( ERROR, "couldn't palloc (%d bytes)", n + 1 );
    }
    for( i = 0, oi = out; i <= n-1; i++ ) {
        char c = toupper(in[i]);
        if ( isseq(c) ) {
            *oi++ = c;
        }
    }
    *oi = '\0';
    return(out);
}

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0