[SQL] C functions and int8?
I have written a few Postgres extension functions in C, and want to modify some of them to return an int8. However, I don't see any int8 definition in postgres.h. (I have the 7.0.2 RPMs installed.) How should I accomplish this?
Re: [SQL] C functions and int8?
That's an int8 meaning "eight bit integer". I want to work with an int8 meaning "64 bit integer", as described in the docs:
http://www.postgresql.org/users-lounge/docs/7.0/user/datatype.htm#AEN942
So how do I return one of these suckers from a C function?

Forest

On Thu, 21 Sep 2000 10:32:50 +0200 (CEST), Karel Zak wrote:
>On Thu, 21 Sep 2000, Forest Wilkinson wrote:
>> I have written a few Postgres extension functions in C, and want to modify
>> some of them to return an int8. However, I don't see any int8 definition
>> in postgres.h. (I have the 7.0.2 RPMs installed.) How should I
>> accomplish this?
>
> in the source tree: src/include/c.h
>
>    typedef signed char int8;    /* == 8 bits */
>
> ... but I'm not sure if this file is included in the RPM package.
>
> Karel
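For the 64-bit type, the in-tree name is int64, defined alongside int8 in src/include/c.h. Under the 7.0-era version-0 calling convention, pass-by-reference results such as int64 are returned as palloc'd pointers. A hedged sketch (untested; the function name is illustrative, and the convention should be checked against the xfunc chapter of your release's docs):

```c
#include "postgres.h"

/* Registered with something like:
 *   CREATE FUNCTION my_big_answer() RETURNS int8
 *       AS '/path/to/my_funcs.so' LANGUAGE 'c';
 */
int64 *
my_big_answer(void)
{
    /* int64 is pass-by-reference here, so allocate the result
     * in palloc'd memory rather than returning it by value. */
    int64 *result = (int64 *) palloc(sizeof(int64));

    *result = 42;    /* any 64-bit value */
    return result;
}
```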
[SQL] SQL functions not locking properly?
I'm having a problem with functions written in SQL. Specifically, they
don't seem to be adhering to Postgres locking rules. For the record, I'm
using postgres 7.0.2, installed from RPMs, on Red Hat 6.2.
Here's what I'm seeing:
(psql input represented by '<<'; output represented by '>>'.)
session1<< create table idseq
session1<< (
session1<< name varchar(32) not null,
session1<< id int8 not null default 0
session1<< );
session1>> CREATE
session1<< insert into idseq values ('myid');
session1>> INSERT 20700 1
Each row in the table is supposed to represent a named numeric sequence,
much like the sequences built into postgres. (Mine use an int8 though,
so their values can be much higher.)
session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
session1>> CREATE
The idea here is that the select...for update within the nextid() function
will establish a row level lock, preventing two concurrent function calls
from overlapping.
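For comparison, the same logic might be written in plpgsql (an untested sketch; the function name nextid_pl is illustrative, and 7.0-era plpgsql syntax should be double-checked):

```sql
create function nextid_pl(varchar) returns int8 as '
declare
    seqname alias for $1;
    r record;
begin
    -- lock the row, then increment it within the same transaction
    select into r id from idseq where name = seqname for update;
    update idseq set id = r.id + 1 where name = seqname;
    return r.id + 1;
end;
' language 'plpgsql';
```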
Next, I test with two sessions as follows:
session1<< begin;
session1>> BEGIN
session2<< begin;
session2>> BEGIN
session1<< select nextid('myid');
session1>> nextid
session1>>
session1>> 1
session1>> (1 row)
session2<< select nextid('myid');
(session2 blocks until session1 completes its transaction)
session1<< commit;
session1>> COMMIT
(session2 resumes)
session2>> nextid
session2>>
session2>> 0
session2>> (1 row)
What gives??? I expected the second call to nextid() to return 2!
session1<< select * from idseq;
session1>> name | id
session1>> ------+----
session1>> myid | 1
session1>> (1 row)
session2<< select * from idseq;
session2>> name | id
session2>> ------+----
session2>> myid | 1
session2>> (1 row)
As you can see, my nextid() function is not synchronized the way I hoped.
I don't know why though. Can someone help?
Thanks,
Forest Wilkinson
[SQL] Re: SQL functions not locking properly?
Tom Lane wrote:
>Hmm. If you do the same commands without wrapping them in an SQL
>function, they operate as expected. I'll bet there's some nasty
>interaction between the FOR UPDATE support and the way that SQL
>functions twiddle the current-command counter. Sigh, another bug.
>
>You might try it in plpgsql to see if that language has the same
>bug (and please report back the results!). If so, the only workaround
>right now is not to use a function, or possibly to code it in C using
>the SPI interface.
I just happen to have a C function that has a similar problem.
In this case, I am again trying to approximate the behavior of postgres
sequences. The main difference is that I'm using alphanumerics, instead
of 64 bit integers, to represent sequential values. The values are very
much like base 36 numbers.
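The "base 36" arithmetic itself is independent of the locking question. A minimal sketch of how such an increment might look (assuming digits 0-9 then a-z, most significant digit first; this is an illustration, not the poster's actual code):

```c
#include <string.h>

/* One base-36 digit: '0'-'9' then 'a'-'z'.  Returns the incremented
 * digit and sets *carry when the digit wraps from 'z' back to '0'. */
static char bump36(char c, int *carry)
{
    *carry = 0;
    if (c == '9')
        return 'a';
    if (c == 'z') {
        *carry = 1;
        return '0';
    }
    return c + 1;
}

/* Increment a base-36 string in place, most significant digit first.
 * buf must have room for one extra character in case the value grows
 * a digit (e.g. "zz" -> "100"). */
static void uid_increment(char *buf)
{
    int i = (int) strlen(buf) - 1;
    int carry = 1;

    while (i >= 0 && carry) {
        buf[i] = bump36(buf[i], &carry);
        i--;
    }
    if (carry) {                 /* overflowed the leading digit */
        memmove(buf + 1, buf, strlen(buf) + 1);
        buf[0] = '1';
    }
}
```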
The table used to store these alphanumeric sequences looks like this:
create table my_uid(
    name    varchar(32) NOT NULL PRIMARY KEY,       -- sequence name
    uid     varchar(32) NOT NULL default(text '0'), -- current value
    min_uid varchar(32) NOT NULL default(text '1'),
    max_uid varchar(32) NOT NULL default(text ''),
    cycle   bool NOT NULL default('f'));
My next_uid() function is written in C, and has an interface like the
built-in nextval() function. It originally did the following:
(Get the desired sequence name from the function arguments;
e.g. "testuid".)
SELECT \"uid\", \"min_uid\", \"max_uid\", \"cycle\" FROM \"my_uid\" WHERE
\"name\" = 'testuid' FOR UPDATE LIMIT 1;
(Detect & report any errors.)
(Calculate a new sequence value according to the row selected;
e.g. "newval".)
UPDATE \"my_uid\" SET \"uid\" = 'newval' WHERE \"name\" = 'testuid';
(Detect & report any errors.)
(Return the new value.)
However, with just those two queries, next_uid() fails during concurrent
calls on the same uid sequence. Here's a log of what happens:
(<< indicates input; >> indicates output.)
session1 << begin;
session1 >> BEGIN
session2 << begin;
session2 >> BEGIN
session1 << select next_uid('foo');
session1 >> next_uid
session1 >> --
session1 >> 1
session1 >> (1 row)
session2 << select next_uid('foo');
(session2 blocks until session1 commits its transaction)
session1 << commit;
session1 >> COMMIT
(session2 resumes)
session2 >> ERROR: testuid: update query processed 0 rows
That error message is generated by my error trapping within next_uid().
It is reporting that SPI_processed is <= 0, right after executing the
update query I listed above. After committing both transactions,
examination of the rows in the database confirms that session2 didn't
update any rows.
As you can see, the SELECT...FOR UPDATE followed by UPDATE does not work
properly during concurrent access. However, I can achieve the desired
behavior by executing this query before the other two, within the
next_uid() function:
UPDATE \"my_uid\" SET \"uid\" = \"uid\" WHERE \"name\" = 'testuid';
Once I modified the next_uid() code to execute this new query before doing
the rest of its work, concurrent calls worked as expected. So, that's
what I'm doing in my production code. But it's really quite ugly, and I
shouldn't have to do that! I can't explain why this approach works, short
of saying that row level locking is broken in postgres. Furthermore, I am
not confident that this workaround avoids the problem in all cases.
BTW, I'm using postgres 7.0.2 now, but I discovered this problem (and the
workaround) with 6.5.2 or 6.5.3.
>I'm up to my armpits in subselect-in-FROM right now, but will put this
>on my to-do list. Will look at it in a week or two if no one else has
>fixed it before then...
>
> regards, tom lane
Thanks, Tom. Please let me know when there's a fix.
I can provide more detailed C source code if you need it, but I think the
relevant parts of the code are expressed in this message.
Forest Wilkinson
[SQL] transactions surrounding extension functions
According to the postgres 7 docs:

>By default, Postgres executes transactions in unchained mode (also
>known as "autocommit" in other database systems). In other words, each
>user statement is executed in its own transaction and a commit is
>implicitly performed at the end of the statement (if execution was
>successful, otherwise a rollback is done).

Does this mean that when I call a function I wrote, which is composed of several queries, each of those queries will be executed in its own transaction? Or will the statement containing the function call be executed in its own transaction, thereby including all the queries composing my function in that same transaction?

This is important when I have a function that performs a SELECT...FOR UPDATE, followed by an UPDATE to the selected rows. If I call that function without surrounding it with BEGIN and END, the former behavior would be dangerous, while the latter would be desirable.

Thanks,
Forest Wilkinson
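Either way, wrapping the call in an explicit transaction block removes the ambiguity; a minimal sketch using the nextid() function from the earlier posts:

```sql
BEGIN;
-- all queries inside nextid() now run in this one transaction,
-- regardless of how autocommit treats a bare function call
SELECT nextid('myid');
COMMIT;
```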
[SQL] how to index a numeric(12,2) column?
I'd like to create an index on a column whose type is NUMERIC(12,2). There appears to be no default operator class for the numeric type. What class should I use instead? My guess is that something like this might work:

CREATE INDEX foo_idx on foo (bar int8_ops);

Will that work properly? Will it do me any good? Is there a better way?
Re: [SQL] how to index a numeric(12,2) column?
>> I'd like to create an index on a column whose type is NUMERIC(12,2).
>> There appears to be no default operator class for the numeric type.
>
>Uh, what version are you using? Works fine for me in 7.0.2.

Sorry; I hit send before adding that information. I'm using postgres 6.5.3 on i386 Red Hat 6.1. It allows me to create an index on a NUMERIC(12,2) field using the int8_ops class, but I'm wondering if this might have some undesirable hidden side effects.

CREATE TABLE foo (id INTEGER, bar NUMERIC(12,2));
CREATE INDEX foo_idx on foo (bar int8_ops);
Re: [SQL] Using a postgres table to maintain unique id?
On 13 Nov 2000 10:30:55 PST, Dr Frog wrote:
>create sequence seq_name;
>
>there are additional options
>start sql and type

What are the "SQL" and "TYPE" options? I don't see reference to them in the docs. Can I use the TYPE option to create a sequence that's based on int8 (64 bit integer) instead of int4? That's exactly what I've been wanting! What version of postgres supports this?
[SQL] alter table add column implementation undesirable?
A coworker told me that the postgres implementation of ALTER TABLE ADD COLUMN creates an inefficient database. He said it results in a table whose new column is stored someplace other than the rest of the columns. (A hidden auxiliary table?) Is this true in postgres 6.5.3? 7.x? Was it ever true?

Forest Wilkinson
[SQL] unreferenced primary keys: garbage collection
I have a database in which five separate tables may (or may not) reference any given row in a table of postal addresses. I am using the primary / foreign key support in postgres 7 to represent these references.

My problem is that, any time a reference is removed (either by deleting or updating a row in one of the five referencing tables), no garbage collection is being performed on the address table. That is, when the last reference to an address record goes away, the record is not removed from the address table. Over time, my database will fill up with abandoned address records.

I suppose I could write procedural code in my client application, to check for abandonment when a reference is removed, but that would require examining each of the five referencing tables. I consider this a messy option, and I expect it would be rather inefficient.

I thought of attempting to delete the address record any time a reference to it is removed, and relying on foreign key constraints to prevent the deletion if it is referenced elsewhere. However, I believe postgres will force the entire transaction block to be rolled back in such cases, thus nullifying all the other work done in the transaction. This is clearly undesirable.

Isn't there some way to tell postgres *not* to roll back my transaction if a particular DELETE fails due to referential integrity? Are there any other options that might help me?

Regards,
Forest Wilkinson
Re: [SQL] unreferenced primary keys: garbage collection
Jan,

Thanks for the reply, but your solution is rather unattractive to me. It requires that, any time a reference to an address id is changed, five tables be searched for the address id. This will create unwanted overhead every time a change is made. In order to make those searches even remotely fast, I'd have to add indexes to every one of those tables, which will mean an additional performance hit on table inserts. Moreover, if a new table is created that references address ids, and the maintainer at the time forgets to rewrite those trigger functions, the system will break.

I'd much rather be able to simply attempt a delete of any given address, relying on referential integrity to prevent the delete if the address is still being referenced. I don't see why postgres has to treat such a situation as a fatal error. If postgres issued (for example) a warning instead of an error here, I'd be home free! Hasn't there been some talk on the lists about this lately?

Forest

Jan Wieck wrote:
>> While this behaviour makes sense in your case, it's not
>> subject to referential integrity constraints. You could
>> arrange for it with custom trigger procedures, checking all
>> the five tables on DELETE or UPDATE on one of them.

Forest Wilkinson wrote:
>> > I have a database in which five separate tables may (or may not) reference
>> > any given row in a table of postal addresses. I am using the primary /
>> > foreign key support in postgres 7 to represent these references.
>> >
>> > My problem is that, any time a reference is removed (either by deleting or
>> > updating a row in one of the five referencing tables), no garbage
>> > collection is being performed on the address table. That is, when the
>> > last reference to an address record goes away, the record is not removed
>> > from the address table. Over time, my database will fill up with
>> > abandoned address records.
Re: [SQL] unreferenced primary keys: garbage collection
On Wed, 24 Jan 2001 00:26:58 -0500 (EST), Michael Fork wrote:
>One other method is to set up the foreign keys as ON DELETE RESTRICT,
>then outside of your transaction block issue a DELETE FROM address WHERE
>add_id = 1; If there are still records in the other tables referencing
>this record, it will error out and nothing will happen, however if no
>related records are left, the delete will succeed (you have to do it
>outside of a transaction, otherwise I believe it will roll back on the
>error if other rows are found to be referencing the primary key)

Yes, that's the approach I originally posted. The rollback issue is the thing I'm complaining about. The code in question gets called from within a parent function, which uses a single transaction block for all of its operations. This means that executing a query outside a transaction block (or within a separate one) is not an option.

I want to be able to tell postgres not to roll back the whole transaction just because my delete attempt fails. I can think of three ways to do this:

1. Allow the delete to fail without throwing a fatal error. (Perhaps a warning would suffice.)
2. Allow the client to tell postgres not to roll back if a specified query produces an error.
3. Implement nested transactions.

Forest
[SQL] possible row locking bug in 7.0.3 & 7.1
I'm having a problem with functions written in SQL. Specifically, they
don't seem to be adhering to Postgres locking rules. For the record, I'm
using postgres 7.0.3, installed from RPMs, on Red Hat 6.2. I got the same
results with postgres 7.1 beta 6, installed from sources.
Here's what I'm seeing:
(psql input represented by '<<'; output represented by '>>'.)
session1<< create table idseq
session1<< (
session1<< name varchar(32) not null,
session1<< id int8 not null default 0
session1<< );
session1>> CREATE
session1<< insert into idseq values ('myid');
session1>> INSERT 18734 1
Each row in the table is supposed to represent a named numeric sequence,
much like the sequences built into postgres. (Mine use an int8 though,
so their values can be much higher.)
session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
session1>> CREATE
The idea here is that the select...for update within the nextid() function
will establish a row level lock, preventing two concurrent function calls
from overlapping.
Next, I test with two sessions as follows:
session1<< begin;
session1>> BEGIN
session2<< begin;
session2>> BEGIN
session1<< select nextid('myid');
session1>> nextid
session1>>
session1>> 1
session1>> (1 row)
session2<< select nextid('myid');
(session2 blocks until session1 completes its transaction)
session1<< commit;
session1>> COMMIT
(session2 resumes)
session2>> nextid
session2>>
session2>> 0
session2>> (1 row)
What gives??? I expected the second call to nextid() to return 2!
session2<< commit;
session2>> COMMIT
session2<< select * from idseq;
session2>> name | id
session2>> ------+----
session2>> myid | 2
session2>> (1 row)
session1<< select * from idseq;
session1>> name | id
session1>> ------+----
session1>> myid | 2
session1>> (1 row)
As you can see, my nextid() function is not synchronized the way I hoped.
I don't know why, though. Can someone help? I'm going to try out some of my
SPI functions with 7.1 beta 6, to see if they exhibit a locking problem as
well.
Thanks,
Forest Wilkinson
Re: [SQL] possible row locking bug in 7.0.3 & 7.1
On Tuesday 27 March 2001 15:14, Tom Lane wrote:
> Forest Wilkinson <[EMAIL PROTECTED]> writes:
> > session1<< create function nextid( varchar(32)) returns int8 as '
> > session1<< select * from idseq where name = $1::text for update;
> > session1<< update idseq set id = id + 1 where name = $1::text;
> > session1<< select id from idseq where name = $1::text;
> > session1<< ' language 'sql';
> > [ doesn't work as expected in parallel transactions ]
[snip]
> The workaround for Forest is to make the final SELECT be a SELECT FOR
> UPDATE, so that it's playing by the same rules as the earlier commands.
> But I wonder whether we ought to rethink the MVCC rules so that that's
> not necessary. I have no idea how we might change the rules though.
> If nothing else, we should document this issue better: SELECT and SELECT
> FOR UPDATE have different visibility rules, so you probably don't want
> to intermix them.

My, that's ugly. (But thanks for the workaround.)

If I remember correctly, UPDATE establishes a lock on the affected rows, which will block another UPDATE on the same rows for the duration of the transaction. If that's true, shouldn't I be able to achieve my desired behavior by removing the initial SELECT...FOR UPDATE, as follows?

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select id from idseq where name = $1::text;
' language 'sql';

Or, would I still have to add FOR UPDATE to that final SELECT?

Forest
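With Tom's workaround applied, the original function would look like this (an untested sketch; the only change is FOR UPDATE on the final SELECT, so that all three statements follow the same visibility rules):

```sql
create function nextid(varchar(32)) returns int8 as '
    select * from idseq where name = $1::text for update;
    update idseq set id = id + 1 where name = $1::text;
    select id from idseq where name = $1::text for update;
' language 'sql';
```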
Re: [HACKERS] Re: [SQL] possible row locking bug in 7.0.3 & 7.1
On Thursday 29 March 2001 22:15, Tom Lane wrote:
> > Just looked in heapam.c - I can fix it in two hours.
> > The question is - should we do this now?
>
> This scares the hell out of me.
>
> I do NOT think we should be making quick-hack changes in fundamental
> system semantics at this point of the release cycle.

Although I'm the one who is being bitten by this bug, I tend to agree.

> The problem went unnoticed for two full release cycles

I first reported the problem on 25 September 2000, on the pgsql-sql list, message subject "SQL functions not locking properly?" I was using 7.0.2 at the time. Also, I seem to remember that a problem of this nature bit me in 6.5.x as well.

> it can wait another cycle for a fix that has been considered, reviewed,
> and tested. Let's not risk making things worse by releasing a new
> behavior we might find out is also wrong.

Good point. How long is the next cycle likely to take?

Forest
[SQL] ON UPDATE CASCADE overhead?
If I add ON UPDATE CASCADE to my foreign key definitions, how much will it affect performance for queries that don't trigger the cascade?

Cheers,
Forest
[SQL] are NEW and OLD rule attributes broken?
(Postgres 7.0.3, linux kernel 2.4.2, i386, red hat 7.1)
I'm trying to build rules to automatically populate several tables with
references to any new rows inserted into a primary key table. The primary
key is a sequence. Here's what's going on:
mydb=# create table foo (fooid serial primary key, foonote text);
NOTICE: CREATE TABLE will create implicit sequence 'foo_fooid_seq' for
SERIAL column 'foo.fooid'
NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'foo_pkey' for
table 'foo'
CREATE
mydb=# create table bar (fooid integer references foo (fooid) deferrable
initially deferred, barnote text);
NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
mydb=# create rule brule as on insert to foo do insert into bar (fooid,
barnote) values ( new.fooid, 'blah');
CREATE 30351 1
mydb=# begin;
BEGIN
mydb=# insert into foo (foonote) values ('test row');
INSERT 30353 1
mydb=# select * from foo;
fooid | foonote
---+--
2 | test row
(1 row)
mydb=# select * from bar;
fooid | barnote
---+-
1 | blah
(1 row)
mydb=# commit;
ERROR: referential integrity violation - key referenced from bar
not found in foo
What's going on here? My rule inserted a new row into bar, but the fooid
it used was not the new value inserted into foo. It looks like my rule is
inserting "nextval('foo_fooid_seq')" into the bar table, rather than
inserting the new primary key from the foo table. Is this the intended
behavior? How can I get the result I want?
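A commonly cited explanation is that a rule behaves like a macro: new.fooid is substituted with the column's default expression, so nextval('foo_fooid_seq') is evaluated a second time inside the rule. A workaround often suggested in that case (an untested sketch against 7.0.3) is to reference currval(), which returns the value nextval() already produced in the current session:

```sql
create rule brule as on insert to foo
    do insert into bar (fooid, barnote)
       values (currval('foo_fooid_seq'), 'blah');
```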
Cheers,
Forest
[SQL] Empty queries guaranteed to work?
Tom Lane mentioned in this post that an empty query can be sent to the server to determine whether the connection is still good:
http://archives.postgresql.org/pgsql-hackers/2001-10/msg00643.php

Is a query of "" guaranteed to work as long as the connection is good? What about ";" or " "?

Background: I am maintaining some postgres client code (C++/libpq) that, during exception handling for a failed query, must determine whether the database connection is still good. This check is currently done by sending a "select version()" query and checking the result. However, even that simple query fails with PGRES_FATAL_ERROR when the connection is still good, if executed in an aborted transaction. (I have no idea why in the world a fatal error would be reported, when the connection is perfectly good and a rollback is all that's needed.) I need to be able to distinguish this situation from a real fatal error, and PQstatus() appears to be unreliable for this purpose.

A "" query that returns PGRES_EMPTY_QUERY seems to be a good indicator that the connection is good, even within aborted transactions. Now I just need to know whether this is documented and guaranteed to work. Suggestions of alternative methods are welcome.
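A minimal sketch of the probe described above, using only libpq calls that exist in this era (PQexec, PQresultStatus, PQclear); whether PGRES_EMPTY_QUERY is guaranteed here is exactly the open question, so treat this as an illustration rather than a contract:

```c
#include <libpq-fe.h>

/* Returns nonzero if the connection still answers an empty query,
 * even inside an aborted transaction block. */
static int connection_alive(PGconn *conn)
{
    PGresult *res = PQexec(conn, "");
    int alive = (res != NULL &&
                 PQresultStatus(res) == PGRES_EMPTY_QUERY);

    if (res != NULL)
        PQclear(res);
    return alive;
}
```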
[SQL] how to determine array size
I need to enumerate the constraints on any given column in a table, so I'm examining pg_constraint to get the relevant information. The conkey array contains a list of constrained columns, and although I am able to check conkey[1] for constraints on a single column, I would like to properly handle multi-column constraints.

How do I determine the size of the conkey array? I haven't found any field that looks like it contains the number of values in conkey. Do I have to check each element of the array sequentially, until I get a NULL value from one of them? (Section 5.12 of the User's Guide seems to forbid this: "A limitation of the present array implementation is that individual elements of an array cannot be SQL null values.")

Moreover, that method doesn't give me a nice way of selecting all constraints on a specific column, as I would have to write clauses like this:

... WHERE conkey[1] = blah OR conkey[2] = blah OR conkey[3] = blah OR conkey[4] = blah ...

Can someone offer a better way?
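Later releases can answer both questions directly; this is a hedged sketch, since pg_constraint exists only from 7.3 onward and array_upper and the `= ANY (array)` syntax are only reliably available in 7.4-era releases:

```sql
-- number of columns covered by each constraint
SELECT conname, array_upper(conkey, 1) AS ncols
FROM pg_constraint;

-- all constraints mentioning attribute number 2 of their table
SELECT conname
FROM pg_constraint
WHERE 2 = ANY (conkey);
```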
Re: [SQL] how to determine array size
>> I need to enumerate the constraints on any given column in a table, so
>> I'm examining pg_constraint to get the relevant information. The
>> conkey array contains a list of constrained columns, and although I am
>> able to check conkey[1] for constraints on a single column, I would
>> like to properly handle multi-column constraints.
>
>You may have an easier time dealing with pg_get_constraintdef() than
>trying to get the info from the source.

Is pg_get_constraintdef() documented somewhere? I'd like to know its arguments, return format, and whether it will be supported in future postgres releases. From what I see in pg_dump.c, it appears to accept an oid from the pg_constraint table, and to work only with foreign key constraints. True?

That might be useful in some special-case code, but I really want a method that will work with all types of constraint. (The idea here is to enumerate all constraints on a table, along with their names, types, and constrained columns.)
[SQL] How to get a self-conflicting row level lock?
I have become maintainer of a program that uses PostgreSQL 6.5.2 for database functionality. It is littered with code blocks that do the following:

1. SELECT * FROM some_table WHERE foo = bar FOR UPDATE;
2. -- Choose a new value for some_field, which might or might not be based on its original value.
3. UPDATE some_table SET some_field = new_value WHERE foo = bar;

I'm worried about concurrent process synchronization. According to the PostgreSQL docs on the LOCK command, SELECT ... FOR UPDATE acquires a "ROW SHARE MODE" lock, which is not self-conflicting. This tells me that when two processes execute the same code block concurrently, this can happen:

1. Process A selects the desired row for update. It now has a copy of the original values in that row.
2. Process B does the same. (This is allowed because ROW SHARE MODE locks do not conflict with each other.) It now has a copy of the original values in that row.
3. Process A chooses a new value for the desired field, based on the original value.
4. Process B does the same.
5. Process A updates the row with its new value, and exits.
6. Process B updates the row with its new value, overwriting the changes made by process A.

Is it true that SELECT ... FOR UPDATE only acquires a ROW SHARE MODE lock, and that it isn't self-conflicting? How can I acquire a self-conflicting row level lock? What is the proper way to perform operations like those I'm describing?

Thank you,
Forest
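If a self-conflicting table-level lock is acceptable, the LOCK command offers SHARE ROW EXCLUSIVE mode, which does conflict with itself. A sketch using the placeholder names from the message above (some_table, foo, bar, some_field are stand-ins, not real objects):

```sql
BEGIN;
-- only one transaction at a time can hold this mode on the table
LOCK TABLE some_table IN SHARE ROW EXCLUSIVE MODE;
SELECT * FROM some_table WHERE foo = bar;
-- compute new_value from the row just read
UPDATE some_table SET some_field = new_value WHERE foo = bar;
COMMIT;
```

This is coarser than a row lock: it serializes all such code blocks on the table, not just those touching the same row, so it trades concurrency for safety.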
[SQL] non-cachable 'C' language functions
(using postgresql 6.5.2)

I have created a set of postgres extension functions in C (which use SPI to perform queries), and added them to my database with something like this:

CREATE FUNCTION my_next_uid(text) RETURNS text
AS '/usr/lib/pgsql/my_uids.so' LANGUAGE 'c';

My functions are designed to behave like nextval() and friends, except that they operate on a varchar field in a predetermined table, rather than a database sequence. For example, my_next_uid() should always return a unique value, incrementing the current value in said table each time it is called. So far, my functions appear to work correctly.

However, looking at their entries in the pg_proc table, I see that their "proiscachable" fields are set to true. This worries me, because my understanding is that postgres will re-use values returned by cachable functions, which is undesirable. (In order for my_next_uid() to be useful, it must retrieve a new value each time it is used.) Is my understanding correct? What should I do about it? The postgresql 6.5.2 docs for CREATE FUNCTION don't tell me how to make my functions non-cachable.

Regards,
Forest Wilkinson
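One workaround discussed in this era (hedged: editing system catalogs directly is unsupported, so try it on a scratch database first and re-check the flag afterward) is to clear the flag by hand in pg_proc:

```sql
-- mark the function non-cachable so the planner re-evaluates it
-- on every call (unsupported catalog hack; 6.5-era sketch)
UPDATE pg_proc SET proiscachable = 'f'
WHERE proname = 'my_next_uid';
```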
