[HACKERS] OIDs, CTIDs, updateable cursors and friends
Hi all, I'm in the process of writing an OLE DB provider for postgres. I am, right now, at the point where updating an entry becomes an issue. Ideally, I would open an updateable cursor for a table/view, and use that. Unfortunetly, Postgres doesn't seem to support those at all. As an alternative, I was thinking of using some unique field to identify, with certanity, the line in question. This will allow me to use an update command to modify said line, in case there is a need with the OLE DB client. My quetion is this - what can I use? I have tried to find some docs regarding OIDs and CTIDs, but the docs seem scarce. If I understand this correctly - OID is meant to identify a row in a table, though it is not guarenteed to be unique across tables, or even inside a given table. A CTID is meant to identify the physical location at which a row is stored. Are these correct? Would adding OID to the rows returned by each Select call, and then doing update blah where oid=xxx when I'm requested to update the row sound like a reasonable stategy, in lieu of updateable cursors? Can anyone suggest a better way? Shachar -- Shachar Shemesh Lingnu Open Systems Consulting http://www.lingnu.com/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
Shachar Shemesh [EMAIL PROTECTED] writes: Would adding OID to the rows returned by each Select call, and then doing update blah where oid=xxx when I'm requested to update the row sound like a reasonable stategy, in lieu of updateable cursors? Can anyone suggest a better way? If you're in control of the database schema and can ensure that all tables will have OIDs enabled and you can add a unique index on OID on all these tables then yes. But it's not ideal. If OID wraps around you'll get errors from unique key violations. A better strategy is to pull the primary key columns from information_schema and use those columns. This would be more work but would work on any table with a primary key. This won't work for tables without primary keys, but in that case, arguably, updating records doesn't really make sense anyways. Something like this, though I'm not really very familiar with the information_schema. db= SELECT ordinal_position,column_name FROM information_schema.table_constraints AS a JOIN information_schema.key_column_usage AS b USING (constraint_schema,constraint_name) WHERE a.constraint_type = 'PRIMARY KEY' AND a.table_schema = 'public' AND a.table_name = 'country' ORDER BY ordinal_position; ordinal_position | column_name --+-- 1 | country_code (1 row) -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
I believe the ODBC driver uses CTID for this sort of problem. CTID is guaranteed to exist and to be fast to access (since it's a physical locator). Against this you have the problem that concurrent updates of the record will move it, leaving your CTID invalid. However, that IIRC the ctid access follows the chain up to the currently valid tuple ? I thought the only enemy of ctid access was vacuum ? Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes: IIRC the ctid access follows the chain up to the currently valid tuple ? No. I think Hiroshi or someone put in a function you can use to follow the chain, but a simple WHERE ctid = whatever won't do it. In any case, if you're not holding an open transaction then you have to be prepared to have the dead tuple vacuumed out from under you, in which case you'd not be able to follow the chain anyway. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
-Original Message- From: Shachar Shemesh [mailto:[EMAIL PROTECTED] Sent: 18 February 2004 13:18 To: Hackers; PostgreSQL OLE DB development Subject: [HACKERS] OIDs, CTIDs, updateable cursors and friends Would adding OID to the rows returned by each Select call, and then doing update blah where oid=xxx when I'm requested to update the row sound like a reasonable stategy, in lieu of updateable cursors? Can anyone suggest a better way? Ignoring potential OID wraparound problems (which we do in pgAdmin) this should work, assuming there is an OID column. I would suggest trying the following methods in sequence: 1) Use the tables primary key. 2) Use the OID (and check that only one record will be affected). 3) Build a where clause based on all known original values (and check that only one record will be affected). 4) Fail with an appropriate error. 2 3 can potentially affect more than one record, but even Microsoft code runs into that problem from time to time and fails with an appropriate error message. In pgAdmin II we used to ask the user if they wanted to update all matching rows, but of course that is not appropriate in a driver. Regards, Dave. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
Dave Page wrote: -Original Message- From: Shachar Shemesh [mailto:[EMAIL PROTECTED] Sent: 18 February 2004 13:18 To: Hackers; PostgreSQL OLE DB development Subject: [HACKERS] OIDs, CTIDs, updateable cursors and friends Would adding OID to the rows returned by each Select call, and then doing update blah where oid=xxx when I'm requested to update the row sound like a reasonable stategy, in lieu of updateable cursors? Can anyone suggest a better way? Ignoring potential OID wraparound problems (which we do in pgAdmin) this should work, assuming there is an OID column. I would suggest trying the following methods in sequence: 1) Use the tables primary key. I would, except I'm not sure how many queries I would need in order to find what the primary key is. Also, what happens if the primary key is not a part of the fields returned by the query? 2) Use the OID (and check that only one record will be affected). That may work. Do a query for how many would be affected. Then again, I'm currently not inside a transaction. The plan was not to be inside a transaction unless I needed to. I'm not sure how safe this is to perform many queries. 3) Build a where clause based on all known original values (and check that only one record will be affected). Again - what happens when I'm not inside a transaction? 4) Fail with an appropriate error. 2 3 can potentially affect more than one record, but even Microsoft code runs into that problem from time to time and fails with an appropriate error message. In pgAdmin II we used to ask the user if they wanted to update all matching rows, but of course that is not appropriate in a driver. Regards, Dave. The doc mentions something about making the OID column unique. Would that not cause other problems? What happens if I define the OID field as unique, and I get a wraparound and an attempt to put a new field in with existing value? Would the OID skip to the next unique per table, or would the insert fail? Shachar -- Shachar Shemesh Lingnu Open Systems Consulting http://www.lingnu.com/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
-Original Message- From: Shachar Shemesh [mailto:[EMAIL PROTECTED] Sent: 18 February 2004 14:10 To: Dave Page Cc: Hackers; PostgreSQL OLE DB development Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends I would, except I'm not sure how many queries I would need in order to find what the primary key is. Well this is the only safe way to update a specific record. To find the pkey, look for an index on the table in pg_index with indisprimary = true. The indkey column holds an array of pg_attribute.attnum's that are in the index iirc. Also, what happens if the primary key is not a part of the fields returned by the query? Add them as you proprosed to do with the OID, or fall back to the next method. ADO etc. normally fail to update rows if the programmer hasn't included a suitable key in the recordset. 2) Use the OID (and check that only one record will be affected). That may work. Do a query for how many would be affected. Then again, I'm currently not inside a transaction. The plan was not to be inside a transaction unless I needed to. I'm not sure how safe this is to perform many queries. Should be perfectly safe. 3) Build a where clause based on all known original values (and check that only one record will be affected). Again - what happens when I'm not inside a transaction? You might find a new row that wasn;t there before but is now, or vice versa. The doc mentions something about making the OID column unique. Would that not cause other problems? What happens if I define the OID field as unique, and I get a wraparound and an attempt to put a new field in with existing value? Would the OID skip to the next unique per table, or would the insert fail? It is not the drivers place to mess with peoples schemas, but yes, it could cause an insert to fail following wraparound. Regards, Dave. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
Dave Page wrote: -Original Message- From: Shachar Shemesh [mailto:[EMAIL PROTECTED] Sent: 18 February 2004 14:10 To: Dave Page Cc: Hackers; PostgreSQL OLE DB development Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends I would, except I'm not sure how many queries I would need in order to find what the primary key is. Well this is the only safe way to update a specific record. To find the pkey, look for an index on the table in pg_index with indisprimary = true. The indkey column holds an array of pg_attribute.attnum's that are in the index iirc. I'll have a look at that. How would updateable cursors do it? By locking the row? Also, what happens if the primary key is not a part of the fields returned by the query? Add them as you proprosed to do with the OID, or fall back to the next method. ADO etc. normally fail to update rows if the programmer hasn't included a suitable key in the recordset. So, basically, I would not be able to update a table that has no primary key? 2) Use the OID (and check that only one record will be affected). That may work. Do a query for how many would be affected. Then again, I'm currently not inside a transaction. The plan was not to be inside a transaction unless I needed to. I'm not sure how safe this is to perform many queries. Should be perfectly safe. What happens if I check how many would be updated, and get 1 as a result. I then actually do it, but between asking and performing, someone added a second row that matches the criteria? 3) Build a where clause based on all known original values (and check that only one record will be affected). Again - what happens when I'm not inside a transaction? You might find a new row that wasn;t there before but is now, or vice versa. But what if someone else changes some of the known values of my row? The doc mentions something about making the OID column unique. Would that not cause other problems? What happens if I define the OID field as unique, and I get a wraparound and an attempt to put a new field in with existing value? Would the OID skip to the next unique per table, or would the insert fail? It is not the drivers place to mess with peoples schemas, but yes, it could cause an insert to fail following wraparound. Then it's not a good enough solution, even if the driver did have the prorogative to change the table. Regards, Dave. Ok, it seems to me there are several options here. 1. Find out which is the primary key for the table. What happens if the primary key is a multi-row thing? What happens if there is no primary key? 2. If I'm in a transaction, use OID for the insert after checking with a select that I'm only affecting one row. If I'm not in a transaction - perform the update in a generated transaction, and roll it back if there is more than one row affected. I like 1 better, frankly. Dillemas dillemas dillemas. -- Shachar Shemesh Lingnu Open Systems Consulting http://www.lingnu.com/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
-Original Message- From: Shachar Shemesh [mailto:[EMAIL PROTECTED] Sent: 18 February 2004 14:56 To: Dave Page Cc: Hackers; PostgreSQL OLE DB development Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends I'll have a look at that. How would updateable cursors do it? By locking the row? Dunno, we don't have them! So, basically, I would not be able to update a table that has no primary key? Yes, unless you feel back to the value matching type update. Realistically though, how can anyone expect to edit data successfully unless they have defined a key to identify rows with? Whilst it's nice to get it to work 100% of the time no matter how brain dead the schema it's not that practical. What happens if I check how many would be updated, and get 1 as a result. I then actually do it, but between asking and performing, someone added a second row that matches the criteria? It'll update both rows unless you do it in one transaction. But what if someone else changes some of the known values of my row? The update will fail to find any rows. This is almost certainly what happens when MS Access starts pinting #DELETED# in rows of a linked table. Ok, it seems to me there are several options here. 1. Find out which is the primary key for the table. What happens if the primary key is a multi-row thing? What happens if there is no primary key? I guess you mean multicolumn? No different, you just need all columns in your WHERE clause. If there is no pkey (and I would be inclined to say if there is none in the user's query and not try to add it yourself) then you fail with an error. 2. If I'm in a transaction, use OID for the insert after checking with a select that I'm only affecting one row. If I'm not in a transaction - perform the update in a generated transaction, and roll it back if there is more than one row affected. I like 1 better, frankly. Dillemas dillemas dillemas. 1 is definitely better and is the only way that is guaranteed to be safe. Thinking about it more, that is almost certainly the position a driver should take. In pgAdmin we can afford a little artistic licence (no pun intended) because no one will be using pgAdmin as a driver to connect another program to a database, plus we can ask the user what action to take if we don't know if the result will be exactly what was intended. You do not have that luxury in a driver of course. Regards, Dave ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html